[ https://issues.apache.org/jira/browse/DRILL-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539354#comment-16539354 ]
ASF GitHub Bot commented on DRILL-6517: --------------------------------------- Ben-Zvi opened a new pull request #1373: DRILL-6517: Hash-Join: If not OK, exit early from prefetchFirstBatchFromBothSides URL: https://github.com/apache/drill/pull/1373 When a running query is cancelled (e.g., when Disk is full while spilling), many next() calls return a STOP outcome. Any Hash-Join that is "sniffing" its incoming batches (by repeatedly calling next()) would proceed to update the memory manager regardless of the outcomes returned. In some abnormal outcomes (like STOP), the batch's container is **not initialized**. The update call needs the batch's row count, which for *RemovingRecordBatch* is implemented (differently) by invoking the container's getRowCount() -- so in this case the call fails. The Fix: Move all the checks for the abnormal conditions immediately after the "sniffing", thus returning early and avoiding the update of the memory manager. Longer term question: How to implement getRowCount() for the record batch consistently. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > IllegalStateException: Record count not set for this vector container > --------------------------------------------------------------------- > > Key: DRILL-6517 > URL: https://issues.apache.org/jira/browse/DRILL-6517 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow > Affects Versions: 1.14.0 > Reporter: Khurram Faraaz > Assignee: Boaz Ben-Zvi > Priority: Critical > Fix For: 1.14.0 > > Attachments: 24d7b377-7589-7928-f34f-57d02061acef.sys.drill > > > TPC-DS query is Canceled after 2 hrs and 47 mins and we see anĀ > IllegalStateException: Record count not set for this vector container, in > drillbit.log > Steps to reproduce the problem, query profile > (24d7b377-7589-7928-f34f-57d02061acef) is attached here. > {noformat} > In drill-env.sh set max direct memory to 12G on all 4 nodes in cluster > export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"} > and set these options from sqlline, > alter system set `planner.memory.max_query_memory_per_node` = 10737418240; > alter system set `drill.exec.hashagg.fallback.enabled` = true; > To run the query (replace IP-ADDRESS with your foreman node's IP address) > cd /opt/mapr/drill/drill-1.14.0/bin > ./sqlline -u > "jdbc:drill:schema=dfs.tpcds_sf1_parquet_views;drillbit=<IP-ADDRESS>" -f > /root/query72.sql > {noformat} > Stack trace from drillbit.log > {noformat} > 2018-06-18 20:08:51,912 [24d7b377-7589-7928-f34f-57d02061acef:frag:4:49] > ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: > IllegalStateException: Record count not set for this vector container > Fragment 4:49 > [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Record count not set for this vector container > Fragment 4:49 > [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361) > [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216) > [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327) > [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Record count not set for this > vector container > at com.google.common.base.Preconditions.checkState(Preconditions.java:173) > ~[guava-18.0.jar:na] > at > org.apache.drill.exec.record.VectorContainer.getRecordCount(VectorContainer.java:394) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.getRecordCount(RemovingRecordBatch.java:49) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.RecordBatchSizer.<init>(RecordBatchSizer.java:690) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.RecordBatchSizer.<init>(RecordBatchSizer.java:662) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.JoinBatchMemoryManager.update(JoinBatchMemoryManager.java:73) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.JoinBatchMemoryManager.update(JoinBatchMemoryManager.java:79) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides(HashJoinBatch.java:242) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema(HashJoinBatch.java:218) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:152) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch(HashJoinBatch.java:276) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides(HashJoinBatch.java:238) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema(HashJoinBatch.java:218) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:152) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:137) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema(HashAggBatch.java:119) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:152) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:137) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:103) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:93) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:294) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:281) > ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_161] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_161] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) > ~[hadoop-common-2.7.0-mapr-1707.jar:na] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:281) > [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] > ... 4 common frames omitted > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)