[ https://issues.apache.org/jira/browse/DRILL-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835829#comment-17835829 ]
ASF GitHub Bot commented on DRILL-8480: --------------------------------------- rymarm commented on PR #2897: URL: https://github.com/apache/drill/pull/2897#issuecomment-2048081465 @cgivre Actually, there is one more thing, that I would fix in the scope of this PR: https://github.com/apache/drill/blob/a726a4544dfbf1427f41fb916d3d976bd511189b/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java#L396-L399 These few lines seem to be kludge which may not work in some very rare cases, but I forgot what issue occurs if remove it. I would like to take a look at this, but I can do this in a separate PR because the current issue is completely fixed with the current changes. > Cleanup before finished. 0 out of 1 streams have finished > --------------------------------------------------------- > > Key: DRILL-8480 > URL: https://issues.apache.org/jira/browse/DRILL-8480 > Project: Apache Drill > Issue Type: Bug > Affects Versions: 1.21.1 > Reporter: Maksym Rymar > Assignee: Maksym Rymar > Priority: Major > Attachments: 1a349ff1-d1f9-62bf-ed8c-26346c548005.sys.drill, > tableWithNumber2.parquet > > > Drill fails to execute a query with the following exception: > {code:java} > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Cleanup before finished. 0 out of 1 streams have > finished > Fragment: 1:0 > Please, refer to logs for more information. > [Error Id: 270da8f4-0bb6-4985-bf4f-34853138881c on > compute7.vmcluster.com:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:395) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:245) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:362) > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > Caused by: java.lang.IllegalStateException: Cleanup before finished. 0 out of > 1 streams have finished > at > org.apache.drill.exec.work.batch.BaseRawBatchBuffer.close(BaseRawBatchBuffer.java:111) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:71) > at > org.apache.drill.exec.work.batch.AbstractDataCollector.close(AbstractDataCollector.java:121) > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91) > at > org.apache.drill.exec.work.batch.IncomingBuffers.close(IncomingBuffers.java:144) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:567) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:417) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:240) > ... 5 common frames omitted > Suppressed: java.lang.IllegalStateException: Cleanup before finished. > 0 out of 1 streams have finished > ... 15 common frames omitted > Suppressed: java.lang.IllegalStateException: Memory was leaked by > query. Memory leaked: (32768) > Allocator(op:1:0:8:UnorderedReceiver) 1000000/32768/32768/10000000000 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > at > org.apache.drill.exec.ops.BaseOperatorContext.close(BaseOperatorContext.java:159) > at > org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:77) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:571) > ... 7 common frames omitted > Suppressed: java.lang.IllegalStateException: Memory was leaked by > query. Memory leaked: (1016640) > Allocator(frag:1:0) 30000000/1016640/30016640/90715827882 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > at > org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581) > at > org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:574) > ... 7 common frames omitted {code} > Steps to reproduce: > 1.Enable unequal join: > {code:java} > alter session set `planner.enable_nljoin_for_scalar_only`=false; {code} > 2. Disable join optimization to prevent Drill from flipping sides of > join that may break the query execution because the NestedLoopJoin operator > that executes unequal joins supports only the left join. > {code:java} > alter session set `planner.enable_join_optimization`=false; {code} > 3. Execute join one side which is UNION ALL with DISTINCT: > {code:java} > SELECT * > FROM ( > ( > SELECT DISTINCT > log_number > FROM > dfs.tmp.`tableWithNumber2.parquet` > ) > UNION ALL > ( > SELECT DISTINCT > log_number > FROM > dfs.tmp.`tableWithNumber2.parquet` > ) > ) t1 > LEFT JOIN ( > SELECT > log_number AS server_number > FROM > dfs.tmp.`tableWithNumber2.parquet` > ) t3 > ON ( > t3.server_number >= t1.log_number > ) > {code} > {{Parquet file for the reproduce: [^tableWithNumber2.parquet]. It contains 10 > 000 rows, with a single column of random double values. The file was > generated by Drill:}} > {code:java} > apache drill> select *, sqlTypeOf(log_number) as log_number_column_type from > dfs.tmp.`tableWithNumber2.parquet` limit 10; > +------------+------------------------+ > | log_number | log_number_column_type | > +------------+------------------------+ > | 4.0 | DOUBLE | > | 5.0 | DOUBLE | > | 4.0 | DOUBLE | > | 3.0 | DOUBLE | > | 4.0 | DOUBLE | > | 0.0 | DOUBLE | > | 0.0 | DOUBLE | > | 8.0 | DOUBLE | > | 3.0 | DOUBLE | > | 4.0 | DOUBLE | > +------------+------------------------+ > 10 rows selected (0.175 seconds) {code} > Also attaching the profile of the failed query. -- This message was sent by Atlassian Jira (v8.20.10#820010)