Khurram Faraaz created DRILL-7100:
-------------------------------------

             Summary: parquet RecordBatchSizerManager : IllegalArgumentException: the requested size must be non-negative
                 Key: DRILL-7100
                 URL: https://issues.apache.org/jira/browse/DRILL-7100
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Parquet
    Affects Versions: 1.15.0
            Reporter: Khurram Faraaz

The table has string columns that can range from 1024 bytes to 32 MB in length. Drill should be able to handle such wide string columns when querying Parquet.

Hive Version 2.3.3
Drill Version 1.15

{noformat}
CREATE TABLE temp.cust_bhsf_ce_blob_parquet (
    event_id DECIMAL,
    valid_until_dt_tm string,
    blob_seq_num DECIMAL,
    valid_from_dt_tm string,
    blob_length DECIMAL,
    compression_cd DECIMAL,
    blob_contents string,
    updt_dt_tm string,
    updt_id DECIMAL,
    updt_task DECIMAL,
    updt_cnt DECIMAL,
    updt_applctx DECIMAL,
    last_utc_ts string,
    ccl_load_dt_tm string,
    ccl_updt_dt_tm string
)
STORED AS PARQUET;
{noformat}

The source table is stored in ORC format.

Failing query.
{noformat}
SELECT event_id, BLOB_CONTENTS FROM hive.temp.cust_bhsf_ce_blob_parquet WHERE event_id = 3443236037

2019-03-07 14:40:17,886 [237e8c79-0e9b-45d6-9134-0da95dba462f:frag:1:269] INFO o.a.d.exec.physical.impl.ScanBatch - User Error Occurred: the requested size must be non-negative (the requested size must be non-negative)
org.apache.drill.common.exceptions.UserException: INTERNAL_ERROR ERROR: the requested size must be non-negative
{noformat}

Snippet from drillbit.log file
{noformat}
[Error Id: 41a4d597-f54d-42a6-be6d-5dbeb7f642ba ]
    at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:293) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:69) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:93) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:297) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:284) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_181]
    at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_181]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669) [hadoop-common-2.7.0-mapr-1808.jar:na]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:284) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
Caused by: java.lang.IllegalArgumentException: the requested size must be non-negative
    at org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkArgument(Preconditions.java:135) ~[drill-shaded-guava-23.0.jar:23.0]
    at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:224) ~[drill-memory-base-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:211) ~[drill-memory-base-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.vector.VarCharVector.allocateNew(VarCharVector.java:394) ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:250) ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.vector.AllocationHelper.allocatePrecomputedChildCount(AllocationHelper.java:41) ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:54) ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.store.parquet.columnreaders.batchsizing.RecordBatchSizerManager.allocate(RecordBatchSizerManager.java:165) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.allocate(ParquetRecordReader.java:276) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.physical.impl.ScanBatch.internalNext(ScanBatch.java:221) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:271) [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
    ... 21 common frames omitted
2019-03-07 14:40:17,887 [237e8c79-0e9b-45d6-9134-0da95dba462f:frag:1:269] INFO o.a.d.e.w.fragment.FragmentExecutor - 237e8c79-0e9b-45d6-9134-0da95dba462f:1:269: State change requested RUNNING --> FAILED
2019-03-07 14:40:17,887 [237e8c79-0e9b-45d6-9134-0da95dba462f:frag:1:269] ERROR o.a.d.e.physical.impl.BaseRootExec - Batch dump started: dumping last 2 failed batches
{noformat}
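
A possible explanation, which nothing in this report confirms: the byte count requested for the VarChar data buffer is computed with 32-bit arithmetic and wraps to a negative value when values are as wide as 32 MB. The sketch below only illustrates that general failure mode. The class, the method names, and the batch row count of 100 are hypothetical and are not taken from Drill's source.

```java
public class AllocationSizeOverflow {

    // Hypothetical stand-in for the non-negative size check seen in the trace.
    // This is NOT Drill's BaseAllocator code; it only reuses the error message.
    static void checkRequestedSize(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("the requested size must be non-negative");
        }
    }

    // Multiplies in 32-bit arithmetic, which wraps silently on overflow.
    static int totalBytesInt(int valueCount, int bytesPerValue) {
        return valueCount * bytesPerValue;
    }

    // Multiplies in 64-bit arithmetic and throws ArithmeticException
    // if the result does not fit in an int.
    static int totalBytesChecked(int valueCount, int bytesPerValue) {
        return Math.toIntExact((long) valueCount * bytesPerValue);
    }

    public static void main(String[] args) {
        int bytesPerValue = 32 * 1024 * 1024; // 32 MB, the upper bound given in the report
        int valueCount = 100;                 // assumed row count, for illustration only

        int wrapped = totalBytesInt(valueCount, bytesPerValue);
        System.out.println("32-bit product: " + wrapped);

        try {
            checkRequestedSize(wrapped);
        } catch (IllegalArgumentException e) {
            System.out.println("size check failed: " + e.getMessage());
        }

        try {
            totalBytesChecked(valueCount, bytesPerValue);
        } catch (ArithmeticException e) {
            System.out.println("checked multiply failed early: " + e.getMessage());
        }
    }
}
```

If the cause is of this kind, an overflow-checked multiply (or a cap on the per-batch byte budget) would surface the problem where the size is computed instead of at the allocator's precondition.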