All,

This issue with JSON apparently still exists. As we migrate to the EVF JSON reader, could we add a unit test that covers it?

Thx,
-- C

> Begin forwarded message:
>
> From: "Idan Sheinberg (Jira)" <[email protected]>
> Subject: [jira] [Comment Edited] (DRILL-5769) IndexOutOfBoundsException when querying JSON files
> Date: January 24, 2020 at 4:23:00 PM EST
> To: [email protected]
> Reply-To: [email protected]
>
>
> [ https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023274#comment-17023274 ]
>
> Idan Sheinberg edited comment on DRILL-5769 at 1/24/20 9:22 PM:
> ----------------------------------------------------------------
>
> [[email protected]] More than 2 years after, the exact behavior/issue you've spotted still exists.
> Might I add it only happens for DFS storage (S3 is fine) and for JSON input (CSV was tested to be fine).
>
>
> was (Author: sheinbergon):
> [[email protected]] More than 2 years after, the exact behavior/issue you've spotted still exists.
>
>> IndexOutOfBoundsException when querying JSON files
>> --------------------------------------------------
>>
>> Key: DRILL-5769
>> URL: https://issues.apache.org/jira/browse/DRILL-5769
>> Project: Apache Drill
>> Issue Type: Bug
>> Components: Server, Storage - JSON
>> Affects Versions: 1.10.0
>> Environment: *jdk_8u45_x64*
>> *single drillbit running on zookeeper*
>> *Following options set to TRUE:*
>> drill.exec.functions.cast_empty_string_to_null
>> store.json.all_text_mode
>> store.parquet.enable_dictionary_encoding
>> store.parquet.use_new_reader
>> Reporter: David Lee
>> Assignee: Jinfeng Ni
>> Priority: Major
>> Fix For: Future
>>
>> Attachments: 001.json, 100.json, 111.json
>>
>>
>> *Running the following SQL on these three JSON files fail: *
>> 001.json 100.json 111.json
>> select t.id
>> from dfs.`/tmp/???.json` t
>> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
>> *Error:*
>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024))
>> Fragment 0:0 [Error Id: xxxx.xxxx...
>> *However running the same SQL on two out of three files works:*
>> select t.id
>> from dfs.`/tmp/1??.json` t
>> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
>> select t.id
>> from dfs.`/tmp/?1?.json` t
>> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
>> select t.id
>> from dfs.`/tmp/??1.json` t
>> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
>> *Changing the selected column from t.id to t.* also works: *
>> select *
>> from dfs.`/tmp/???.json` t
>> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
>
>
>
> --
> This message was sent by Atlassian Jira
> (v8.3.4#803005)
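
A note for whoever writes the unit test: it may help to pin down which attachments each wildcard in the report actually selects. The sketch below is only an approximation. It assumes the `?` in a dfs path matches exactly one character, as in standard glob syntax, and uses Python's `fnmatch` as a stand-in for Drill's path matching. `files_selected` is a hypothetical helper, not part of Drill.

```python
from fnmatch import fnmatchcase

# File names taken from the attachments list in the Jira report.
FILES = ["001.json", "100.json", "111.json"]

# Wildcard patterns taken from the queries in the Jira report.
PATTERNS = ["???.json", "1??.json", "?1?.json", "??1.json"]


def files_selected(pattern, names=FILES):
    """Return the names matched by a glob pattern.

    Assumption: '?' matches exactly one character (standard glob rules).
    This is a stand-in for Drill's own path matching, not Drill code.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


for pattern in PATTERNS:
    print(pattern, "->", files_selected(pattern))
```

Under that assumption, `?1?.json` selects only 111.json, and 001.json and 100.json are never queried together without 111.json. A test covering every pair, plus all three files together, might therefore narrow down the trigger better than the original patterns do.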
