[ 
https://issues.apache.org/jira/browse/DRILL-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186143#comment-16186143
 ] 

ASF GitHub Bot commented on DRILL-5564:
---------------------------------------

GitHub user KulykRoman opened a pull request:

    https://github.com/apache/drill/pull/967

    DRILL-5564: Added finally block for stopWait() to avoid all situation…

    …s where Drill able to miss stopWait() in case of exceptions (it can lead 
to assertions).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/KulykRoman/drill DRILL-5564

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/967.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #967
    
----
commit c0e567cefe7ad9bf9745ed41fb87abba60dbf1b9
Author: Roman Kulyk <[email protected]>
Date:   2017-09-29T17:26:39Z

    DRILL-5564: Added finally block for stopWait() to avoid all situations 
where Drill able to miss stopWait() in case of exceptions (it can lead to 
assertions).

----


> IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: buffer space 
> (16674816) + prealloc space (0) + child space (0) != allocated (16740352)
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-5564
>                 URL: https://issues.apache.org/jira/browse/DRILL-5564
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.11.0
>         Environment: 3 node CentOS cluster
>            Reporter: Khurram Faraaz
>            Assignee: Roman Kulyk
>
> Run a concurrent Java program that executes TPCDS query11
> while the above concurrent java program is under execution
> stop foreman Drillbit (from another shell, using below command)
> ./bin/drillbit.sh stop
> and you will see the IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: 
>  and another assertion error, in the drillbit.log
> AssertionError: Failure while stopping processing for operator id 10. 
> Currently have states of processing:false, setup:false, waiting:true.       
> Drill 1.11.0 git commit ID: d11aba2 (with assertions enabled)
>  
> details from drillbit.log from the foreman Drillbit node.
> {noformat}
> 2017-06-05 18:38:33,838 [26ca5afa-7f6d-991b-1fdf-6196faddc229:frag:23:1] INFO 
>  o.a.d.e.w.fragment.FragmentExecutor - 
> 26ca5afa-7f6d-991b-1fdf-6196faddc229:23:1: State change requested RUNNING --> 
> FAILED
> 2017-06-05 18:38:33,849 [26ca5afa-7f6d-991b-1fdf-6196faddc229:frag:23:1] INFO 
>  o.a.d.e.w.fragment.FragmentExecutor - 
> 26ca5afa-7f6d-991b-1fdf-6196faddc229:23:1: State change requested FAILED --> 
> FINISHED
> 2017-06-05 18:38:33,852 [26ca5afa-7f6d-991b-1fdf-6196faddc229:frag:23:1] 
> ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: AssertionError: 
> Failure while stopping processing for operator id 10. Currently have states 
> of processing:false, setup:false, waiting:true.
> Fragment 23:1
> [Error Id: a116b326-43ed-4569-a20e-a10ba03d215e on centos-01.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> AssertionError: Failure while stopping processing for operator id 10. 
> Currently have states of processing:false, setup:false, waiting:true.
> Fragment 23:1
> [Error Id: a116b326-43ed-4569-a20e-a10ba03d215e on centos-01.qa.lab:31010]
>         at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
>  ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:295)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:264)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_91]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_91]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> Caused by: java.lang.RuntimeException: java.lang.AssertionError: Failure 
> while stopping processing for operator id 10. Currently have states of 
> processing:false, setup:false, waiting:true.
>         at 
> org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:101)
>  ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:409)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Failure while stopping processing for 
> operator id 10. Currently have states of processing:false, setup:false, 
> waiting:true.
>         at 
> org.apache.drill.exec.ops.OperatorStats.stopProcessing(OperatorStats.java:167)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:255) 
> ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.test.generated.HashJoinProbeGen31.executeProbePhase(HashJoinProbeTemplate.java:119)
>  ~[na:na]
>         at 
> org.apache.drill.exec.test.generated.HashJoinProbeGen31.probeAndProject(HashJoinProbeTemplate.java:225)
>  ~[na:na]
>         at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:237)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.test.generated.HashJoinProbeGen80.executeProbePhase(HashJoinProbeTemplate.java:119)
>  ~[na:na]
>         at 
> org.apache.drill.exec.test.generated.HashJoinProbeGen80.probeAndProject(HashJoinProbeTemplate.java:225)
>  ~[na:na]
>         at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:237)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.test.generated.HashAggregatorGen166.doWork(HashAggTemplate.java:312)
>  ~[na:na]
>         at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext(HashAggBatch.java:146)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105) 
> ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:92)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95) 
> ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:234)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:227)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at java.security.AccessController.doPrivileged(Native Method) 
> ~[na:1.8.0_91]
>         at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_91]
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
>  ~[hadoop-common-2.7.0-mapr-1607.jar:na]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:227)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         ... 4 common frames omitted
> 2017-06-05 18:38:33,933 [Drillbit-ShutdownHook#0] ERROR 
> o.a.d.exec.server.BootStrapContext - Pool did not terminate
> 2017-06-05 18:38:33,943 [Drillbit-ShutdownHook#0] ERROR 
> o.a.d.exec.server.BootStrapContext - Error while closing
> java.lang.IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: buffer 
> space (16674816) + prealloc space (0) + child space (0) != allocated 
> (16740352)
>         at 
> org.apache.drill.exec.memory.BaseAllocator.verifyAllocator(BaseAllocator.java:719)
>  ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.memory.BaseAllocator.verifyAllocator(BaseAllocator.java:614)
>  ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.memory.BaseAllocator.verifyAllocator(BaseAllocator.java:614)
>  ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.memory.BaseAllocator.verifyAllocator(BaseAllocator.java:585)
>  ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:480) 
> ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:247)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
> [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
> [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:159) 
> [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.server.Drillbit$ShutdownThread.run(Drillbit.java:253) 
> [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> 2017-06-05 18:38:33,944 [Drillbit-ShutdownHook#0] INFO  
> o.apache.drill.exec.server.Drillbit - Shutdown completed (11439 ms).
> {noformat}
> Java snippet that has details of number of threads.
> {noformat}
> ExecutorService executor = Executors.newFixedThreadPool(16);
>         try {
>             for (int i = 1; i <= 8; i++) {
>                 executor.submit(new ConcurrentQuery(conn));
>             }
>         } catch (Exception e) {
>             System.out.println(e.getMessage());
>             e.printStackTrace();
>         }
> {noformat}
> There are 24 cores on each of the three CentOS nodes



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to