[ https://issues.apache.org/jira/browse/DRILL-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082787#comment-14082787 ]
Rahul Challapalli commented on DRILL-1239:
------------------------------------------

Thanks for the jira, Andy. It looks like we can consistently reproduce this issue regardless of whether the data is nested or not. Once the JSON data exceeds Drill's batch size, we hit this issue.

> java.lang.AssertionError When performing select against nested JSON > 60,000 records
> ------------------------------------------------------------------------------------
>
>                 Key: DRILL-1239
>                 URL: https://issues.apache.org/jira/browse/DRILL-1239
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>    Affects Versions: 0.4.0
>         Environment: Seen both on standalone (OSX host, 16 GB RAM) as well as on a cluster in AWS:
> 5 nodes, centos-6.5, 64 GB RAM, 2 SSDs/node for mfs/dfs. Running MapR 3.1.1.
>            Reporter: Andy Pernsteiner
>
> Using a JSON file with contents like so for each record:
> {code}
> {"trans_id":999999,"date":"11/03/2012","time":"09:07:05","user_info":{"cust_id":2,"device":"AOS4.3","state":"tx"},"marketing_info":{"camp_id":14,"keywords":["it","i","wants","yes","things","few","like"]},"trans_info":{"prod_id":[167,145,5,487,290],"purch_flag":"false"}}
> {code}
> First I set the following to get more verbose output:
> {quote}
> 0: jdbc:drill:> alter session set `exec.errors.verbose`=true;
> {quote}
> Then performed a simple select via sqlline:
> {quote}
> select * from dfs.`/mapr/drillram/JSON/large/mobile.json`;
> <50,000+ rows of output>
> | 56184 | 03/11/2013 | 14:19:10 | {"cust_id":4,"device":"IOS5","state":"va"} | {"camp_id":15,"keywords":["young"]} | {"prod_id |
> | 56185 | 07/03/2013 | 14:30:38 | {"cust_id":1518,"device":"AOS4.4","state":"wi"} | {"camp_id":11,"keywords":["so","way","okay |
> | 56186 | 07/07/2013 | 10:41:04 | {"cust_id":97279,"device":"IOS5","state":"ga"} | {"camp_id":7,"keywords":[]} | {"prod_id":[9 |
> Query failed: Failure while running fragment.
> null
> [4407eef7-06aa-4cf9-9962-a2f187ce8f17]
> Node details: ip-172-16-1-111:31011/31012
> java.lang.AssertionError
> 	at org.apache.drill.exec.vector.complex.WriteState.fail(WriteState.java:37)
> 	at org.apache.drill.exec.vector.complex.impl.AbstractBaseWriter.inform(AbstractBaseWriter.java:62)
> 	at org.apache.drill.exec.vector.complex.impl.RepeatedBigIntWriterImpl.inform(RepeatedBigIntWriterImpl.java:108)
> 	at org.apache.drill.exec.vector.complex.impl.RepeatedBigIntWriterImpl.setPosition(RepeatedBigIntWriterImpl.java:130)
> 	at org.apache.drill.exec.vector.complex.impl.SingleListWriter.setPosition(SingleListWriter.java:700)
> 	at org.apache.drill.exec.vector.complex.impl.SingleMapWriter.setPosition(SingleMapWriter.java:153)
> 	at org.apache.drill.exec.vector.complex.impl.SingleMapWriter.setPosition(SingleMapWriter.java:153)
> 	at org.apache.drill.exec.vector.complex.impl.VectorContainerWriter.setPosition(VectorContainerWriter.java:66)
> 	at org.apache.drill.exec.store.easy.json.JSONRecordReader2.next(JSONRecordReader2.java:80)
> 	at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:148)
> 	at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:116)
> 	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:59)
> 	at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:98)
> 	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:49)
> 	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:116)
> 	at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:250)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> java.lang.RuntimeException: java.sql.SQLException: Failure while trying to get next result batch.
> 	at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
> 	at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
> 	at sqlline.SqlLine.print(SqlLine.java:1809)
> 	at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
> 	at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:889)
> 	at sqlline.SqlLine.begin(SqlLine.java:763)
> 	at sqlline.SqlLine.start(SqlLine.java:498)
> 	at sqlline.SqlLine.main(SqlLine.java:460)
> {quote}
> If I re-run the same query against a smaller version of the same dataset (<50,000 records), I don't hit the issue. So far I've tried modifying the DRILL_MAX_DIRECT_MEMORY and DRILL_MAX_HEAP variables to see if I could find something that works, but neither seems to make a difference. Note: the error appears the same if I run in standalone mode.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
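For anyone trying to reproduce: a dataset in the shape of the sample record above can be generated with a short script. This is a hypothetical sketch, not part of the original report — the field names and example values come from the sample record, but the value distributions, the `mobile.json` filename, and the record count of 60,001 (just past the ~60,000-record threshold the title mentions) are assumptions:

```python
import json
import random

def make_record(trans_id):
    # Shape copied from the sample record in the report; specific
    # values are randomized placeholders (assumption, not from the bug).
    return {
        "trans_id": trans_id,
        "date": "11/03/2012",
        "time": "09:07:05",
        "user_info": {
            "cust_id": random.randint(1, 100000),
            "device": random.choice(["AOS4.3", "AOS4.4", "IOS5"]),
            "state": random.choice(["tx", "va", "wi", "ga"]),
        },
        "marketing_info": {
            "camp_id": random.randint(1, 20),
            "keywords": random.sample(
                ["it", "i", "wants", "yes", "things", "few", "like"],
                k=random.randint(0, 4)),
        },
        "trans_info": {
            "prod_id": [random.randint(1, 500)
                        for _ in range(random.randint(0, 5))],
            "purch_flag": "false",
        },
    }

def write_dataset(path, n=60001):
    # One JSON object per line, matching the file layout Drill reads;
    # n > 60,000 should cross the failing record count per the title.
    with open(path, "w") as f:
        for i in range(1, n + 1):
            f.write(json.dumps(make_record(i)) + "\n")

if __name__ == "__main__":
    write_dataset("mobile.json")
```

Querying the generated file with the same `select *` via sqlline should then hit the AssertionError, while a run with `n=50000` should not, matching the behavior described above.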