Hello Drillers,

I am currently working on improving the error reporting in the JSON reader
to help users with files that Drill cannot read using the default
configuration today.

As part of this change I think it may be useful to change the default
behavior for reading numbers in JSON documents. Currently we fail on a
simple case: the reader sees numbers with decimal points and then hits a
value of 0 (or any number without a decimal point) in a later record. The
current behavior exists to allow better precision for files that contain
only integers. The issue, however, is that we fail on the basic case of a
file that mixes integers and decimal numbers. See [1] for more discussion
on this.
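
To make the failure concrete, here is a minimal Python sketch. It is not
Drill's reader code: read_column_strict and read_column_as_double are
hypothetical helpers that only imitate the two behaviors, assuming a reader
that fixes a column's type from the first value it sees.

```python
import json

def read_column_strict(lines, field):
    """Hypothetical reader: the column type is fixed by the first value."""
    column_type = None
    values = []
    for line in lines:
        value = json.loads(line)[field]
        if column_type is None:
            column_type = type(value)
        elif type(value) is not column_type:
            # Imitates the failure: a later record carries a different
            # numeric type than the one the column started with.
            raise TypeError(
                f"column {field!r} changed from {column_type.__name__} "
                f"to {type(value).__name__}"
            )
        values.append(value)
    return values

def read_column_as_double(lines, field):
    """Hypothetical reader: every number is widened to a double."""
    return [float(json.loads(line)[field]) for line in lines]

# A decimal value followed by a value without a decimal point.
records = ['{"a": 1.5}', '{"a": 0}']

try:
    read_column_strict(records, "a")
    strict_failed = False
except TypeError:
    strict_failed = True

doubles = read_column_as_double(records, "a")
```

With these two records the strict helper raises, while the all-doubles
helper reads both values.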

I propose that we switch the JSON reader to read all numbers as doubles by
default. Users who need exact integer or decimal values already have a
workaround: all_text_mode reads every value as text, which allows lossless
casting to integer and decimal types at the cost of some extra
computational overhead. See [2] for more info.
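
The precision trade-off can be sketched in Python as well. This only
illustrates the arithmetic and is not Drill code: via_double and via_text
are hypothetical helpers standing in for "read as double, then cast" and
"read as text, then cast".

```python
from decimal import Decimal

def via_double(text):
    """Parse as a 64-bit double first, then cast to an integer."""
    return int(float(text))

def via_text(text):
    """Keep the original text and cast it directly to an integer."""
    return int(text)

# 2**53 + 1 is the smallest positive integer a double cannot represent.
big = "9007199254740993"

lossy = via_double(big)    # rounds to the nearest representable double
exact = via_text(big)      # keeps every digit

# The same applies to decimals: casting from text keeps the written value.
exact_decimal = Decimal("0.1")
```

Integers up to 2**53 survive the round trip through a double, so the loss
only affects very large values; the text path stays exact for those.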

Please share your thoughts on this change.

[1] https://issues.apache.org/jira/browse/DRILL-1460
[2] https://issues.apache.org/jira/browse/DRILL-2071

-Jason
