[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates
[ https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317762#comment-15317762 ]

Siddharth Seth commented on HIVE-13599:
---------------------------------------

Thanks for the review. Test failures are unrelated. Committing.

> LLAP: Incorrect handling of the preemption queue on finishable state updates
>
>
>                 Key: HIVE-13599
>                 URL: https://issues.apache.org/jira/browse/HIVE-13599
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: HIVE-13599.01.patch, HIVE-13599.01.patch, HIVE-13599.02.patch
>
>
> When running some tests with pre-emption enabled, got the following exception
> Looks like a race condition when removing items from pre-emption queue.
> {code}
> 16/04/23 23:32:00 [Wait-Queue-Scheduler-0[]] ERROR impl.TaskExecutorService : Wait queue scheduler worker exited with failure!
> java.util.NoSuchElementException
>         at java.util.AbstractQueue.remove(AbstractQueue.java:117) ~[?:1.7.0_55]
>         at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.removeAndGetFromPreemptionQueue(TaskExecutorService.java:568) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.handleScheduleAttemptedRejection(TaskExecutorService.java:493) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.access$1100(TaskExecutorService.java:81) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$WaitQueueWorker.run(TaskExecutorService.java:285) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_55]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_55]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_55]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_55]
>         at java.lang.Thread.run(Thread.java:745) [?:1.7.0_55]
> 16/04/23 23:32:00 [Wait-Queue-Scheduler-0[]] INFO impl.LlapDaemon : UncaughtExceptionHandler invoked
> 16/04/23 23:32:00 [Wait-Queue-Scheduler-0[]] ERROR impl.LlapDaemon : Thread Thread[Wait-Queue-Scheduler-0,5,main] threw an Exception. Shutting down now...
> java.util.NoSuchElementException
>         at java.util.AbstractQueue.remove(AbstractQueue.java:117) ~[?:1.7.0_55]
>         at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.removeAndGetFromPreemptionQueue(TaskExecutorService.java:568) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.handleScheduleAttemptedRejection(TaskExecutorService.java:493) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.access$1100(TaskExecutorService.java:81) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$WaitQueueWorker.run(TaskExecutorService.java:285) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_55]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_55]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_55]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_55]
>         at java.lang.Thread.run(Thread.java:745) [?:1.7.0_55]
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
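The stack trace quoted in the issue description ends in `java.util.AbstractQueue.remove()`, which throws `NoSuchElementException` when the queue is empty, whereas `poll()` returns `null` in the same situation. The following is a minimal sketch of that difference, not code from the patch: the class name, the method names, and the use of a `PriorityBlockingQueue<String>` as a stand-in for the pre-emption queue are all assumptions made for illustration.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical illustration only; the real queue and its element type live in
// TaskExecutorService and are not shown in this thread.
public class PreemptionQueueSketch {

    // Strict removal: inherited from AbstractQueue, throws if the queue is empty.
    public static String removeStrict(PriorityBlockingQueue<String> queue) {
        return queue.remove();
    }

    // Tolerant removal: returns null instead of throwing if the queue is empty.
    public static String removeTolerant(PriorityBlockingQueue<String> queue) {
        return queue.poll();
    }

    public static void main(String[] args) {
        PriorityBlockingQueue<String> queue = new PriorityBlockingQueue<>();
        queue.add("task-1");

        // Simulate another thread taking the only element between the
        // scheduler's decision to pre-empt and its call to remove().
        queue.remove("task-1");

        try {
            removeStrict(queue);
        } catch (NoSuchElementException e) {
            System.out.println("remove() on an empty queue threw: " + e);
        }
        System.out.println("poll() on an empty queue returned: " + removeTolerant(queue));
    }
}
```

If the queue can be emptied concurrently, a caller using the tolerant form has to handle the `null` result explicitly; whether the committed patch takes this route is not stated in this thread.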
[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates
[ https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316920#comment-15316920 ]

Prasanth Jayachandran commented on HIVE-13599:
----------------------------------------------

LGTM, +1
[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates
[ https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315314#comment-15315314 ]

Hive QA commented on HIVE-13599:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12807581/HIVE-13599.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10219 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
{noformat}

Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/515/testReport
Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/515/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-515/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12807581 - PreCommit-HIVE-MASTER-Build
[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates
[ https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311478#comment-15311478 ]

Hive QA commented on HIVE-13599:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12807222/HIVE-13599.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/490/testReport
Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/490/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-490/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export PATH=/usr/java/jdk1.8.0_25/bin/:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ PATH=/usr/java/jdk1.8.0_25/bin/:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-490/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   a8cc702..b36c735  branch-2.1 -> origin/branch-2.1
   28f6015..eb9cea3  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 28f6015 HIVE-13894: Fix more json related JDK8 test failures Part 2 (Mohit Sabharwal, reviewed by Sergio Pena)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at eb9cea3 HIVE-13448 : LLAP: check ZK acls for ZKSM and fail if they are too permissive (Sergey Shelukhin, reviewed by Prasanth Jayachandran)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12807222 - PreCommit-HIVE-MASTER-Build
[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates
[ https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310757#comment-15310757 ]

Siddharth Seth commented on HIVE-13599:
---------------------------------------

Pending review and a jenkins run. If the RC is created before this goes in - we'll move this to 2.1.1
[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates
[ https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308464#comment-15308464 ]

Jesus Camacho Rodriguez commented on HIVE-13599:
------------------------------------------------

[~sseth], what is the status on this one? I plan to create the first 2.1.0 RC tomorrow and this is marked as Critical. Should it go in, or can it be deferred? Thanks
[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates
[ https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305018#comment-15305018 ]

Siddharth Seth commented on HIVE-13599:
---------------------------------------

There's also a race, and locking around that was intentionally left out. Have added a bunch of comments around this. In a subsequent jira - we may want to change this to include the preemption state updates within the main scheduler lock - the issue there being everything becomes single threaded, including task completions.
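The comment above mentions the option of moving pre-emption state updates under the main scheduler lock, at the cost of serializing everything. The following is a rough sketch of what that trade-off could look like, not the approach taken in the attached patches: `LockedPreemptionSketch`, `Task`, `schedulerLock`, and the ordering rule (non-finishable tasks are picked for pre-emption first) are all hypothetical names and assumptions introduced for illustration.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch: every access to the pre-emption queue goes through one lock.
public class LockedPreemptionSketch {

    // Hypothetical task record; equality is by identity.
    public static final class Task {
        final String id;
        volatile boolean finishable;

        public Task(String id, boolean finishable) {
            this.id = id;
            this.finishable = finishable;
        }
    }

    private final Object schedulerLock = new Object();

    // Assumed ordering: non-finishable tasks (false) sort ahead of finishable ones (true).
    private final PriorityBlockingQueue<Task> preemptionQueue =
        new PriorityBlockingQueue<>(11, Comparator.comparing((Task t) -> t.finishable));

    public void add(Task task) {
        synchronized (schedulerLock) {
            preemptionQueue.add(task);
        }
    }

    // A finishable state update changes the sort key, so the task is removed and
    // re-added. Doing both steps under the lock means no other thread can observe
    // the queue while the task is temporarily missing from it.
    public void updateFinishable(Task task, boolean finishable) {
        synchronized (schedulerLock) {
            boolean wasQueued = preemptionQueue.remove(task);
            task.finishable = finishable;
            if (wasQueued) {
                preemptionQueue.add(task);
            }
        }
    }

    // Returns the next pre-emption candidate, or null if there is none.
    public Task pollVictim() {
        synchronized (schedulerLock) {
            return preemptionQueue.poll();
        }
    }
}
```

With this shape, state updates, task completions, and scheduling decisions would all contend on `schedulerLock`, which is the single-threading concern raised in the comment.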