I think that reading all numbers as doubles is fine as an interim step. This will work for very large numbers, though it has the traditional floating-point problems with very large financial values. I don't think we are worried much yet about people talking about amounts > $10^17.
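
To make the tradeoff concrete, here is a rough sketch in Python rather than Drill's own reader code. Python floats are IEEE 754 doubles, so the precision behavior should be the same as with a Java double. The sample records and variable names are made up for illustration; parse_int is a standard option of Python's json module.

```python
import json

# Hypothetical records mixing a decimal, a plain zero, and a very large integer.
records = '[{"amount": 1.5}, {"amount": 0}, {"amount": 9007199254740993}]'

# parse_int=float makes the json module read every number as a double.
rows = json.loads(records, parse_int=float)
amounts = [row["amount"] for row in rows]

# Every value now has the same type, so a 0 after a 1.5 is not a type conflict.
all_doubles = all(isinstance(a, float) for a in amounts)

# Integers above 2**53 (about 9 * 10**15) do not always survive the conversion.
lost_precision = int(amounts[2]) != 9007199254740993
```

With these inputs all three values come back as floats, and the last one is rounded to the nearest representable double, so the original integer is not recovered. That is the kind of loss I mean above, and it only shows up for values far larger than what we expect to see for now.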

On Mon, Jan 26, 2015 at 5:17 PM, Jacques Nadeau <[email protected]> wrote:

> Writing zero int to a float column should be allowed. Basically, if we
> found a float previously and then we run across a zero, that should be
> accepted. This doesn't fix the situation where the first value was zero
> but definitely fixes many situations. I'm up for a second option to treat
> all numbers as doubles but I'm not in support of it for the default as once
> we finish embedded types, this would be our desired behavior.
>
> On Mon, Jan 26, 2015 at 1:36 PM, Jason Altekruse <[email protected]>
> wrote:
>
> > Hello Drillers,
> >
> > I am currently working on improving the error reporting in the JSON reader
> > to help users with files that Drill cannot read using the default
> > configuration today.
> >
> > As a part of this change I think it may be useful to change the default
> > behavior for reading numbers in JSON documents. Currently we fail on a
> > simple case with reading numbers with decimal points and then hit a value
> > of 0 (or any number without a decimal point) in a later record. The reason
> > for the current behavior is to allow better precision in the case of files
> > with only integers. The issue however is that we currently fail on the
> > basic case with a mix of integers and decimal numbers. See [1] for more
> > discussion on this.
> >
> > I propose that we switch the JSON reader to read all numbers as doubles by
> > default. The reader already contains a workaround that allows lossless
> > casting to integers and decimal types with some extra computational
> > overhead using all_text_mode, see more info below. [2]
> >
> > Please share your thoughts on this change.
> >
> > [1] https://issues.apache.org/jira/browse/DRILL-1460
> > [2] https://issues.apache.org/jira/browse/DRILL-2071
> >
> > -Jason
