[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates

2016-06-06 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317762#comment-15317762
 ] 

Siddharth Seth commented on HIVE-13599:
---

Thanks for the review. Test failures are unrelated. Committing.

> LLAP: Incorrect handling of the preemption queue on finishable state updates
> 
>
> Key: HIVE-13599
> URL: https://issues.apache.org/jira/browse/HIVE-13599
> Project: Hive
> Issue Type: Bug
> Components: llap
> Affects Versions: 2.1.0
> Reporter: Prasanth Jayachandran
> Assignee: Siddharth Seth
> Priority: Critical
> Attachments: HIVE-13599.01.patch, HIVE-13599.01.patch, HIVE-13599.02.patch
>
>
> When running some tests with pre-emption enabled, I got the following exception.
> It looks like a race condition when removing items from the pre-emption queue.
> {code}
> 16/04/23 23:32:00 [Wait-Queue-Scheduler-0[]] ERROR impl.TaskExecutorService : Wait queue scheduler worker exited with failure!
> java.util.NoSuchElementException
>     at java.util.AbstractQueue.remove(AbstractQueue.java:117) ~[?:1.7.0_55]
>     at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.removeAndGetFromPreemptionQueue(TaskExecutorService.java:568) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>     at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.handleScheduleAttemptedRejection(TaskExecutorService.java:493) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>     at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.access$1100(TaskExecutorService.java:81) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>     at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$WaitQueueWorker.run(TaskExecutorService.java:285) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_55]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_55]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_55]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_55]
>     at java.lang.Thread.run(Thread.java:745) [?:1.7.0_55]
> 16/04/23 23:32:00 [Wait-Queue-Scheduler-0[]] INFO impl.LlapDaemon : UncaughtExceptionHandler invoked
> 16/04/23 23:32:00 [Wait-Queue-Scheduler-0[]] ERROR impl.LlapDaemon : Thread Thread[Wait-Queue-Scheduler-0,5,main] threw an Exception. Shutting down now...
> java.util.NoSuchElementException
>     at java.util.AbstractQueue.remove(AbstractQueue.java:117) ~[?:1.7.0_55]
>     at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.removeAndGetFromPreemptionQueue(TaskExecutorService.java:568) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>     at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.handleScheduleAttemptedRejection(TaskExecutorService.java:493) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>     at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.access$1100(TaskExecutorService.java:81) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>     at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$WaitQueueWorker.run(TaskExecutorService.java:285) ~[hive-llap-server-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_55]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_55]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_55]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_55]
>     at java.lang.Thread.run(Thread.java:745) [?:1.7.0_55]
> {code}
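
For readers following the trace: `java.util.AbstractQueue.remove()` throws `NoSuchElementException` when the queue is empty, whereas `poll()` returns `null`. The sketch below shows how a check-then-remove sequence can hit that exception if another thread empties the queue in between. It is an illustration under assumptions, not the code in `TaskExecutorService`; the class name, method names, and the use of `String` task ids are hypothetical.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.PriorityBlockingQueue;

// Illustrative sketch only; names here are hypothetical, not taken from the Hive source.
class PreemptionQueueSketch {
    // Stand-in for the preemption queue; String task ids keep the example small.
    static final PriorityBlockingQueue<String> queue = new PriorityBlockingQueue<>();

    // Check-then-act: another thread may empty the queue between isEmpty() and
    // remove(), in which case remove() throws NoSuchElementException.
    static String removeCheckThenAct() {
        if (queue.isEmpty()) {
            return null;
        }
        return queue.remove();
    }

    // poll() is a single operation that returns the head, or null when empty.
    static String removeWithPoll() {
        return queue.poll();
    }

    public static void main(String[] args) {
        queue.clear();
        System.out.println("poll on empty queue: " + removeWithPoll());
        try {
            queue.remove();
        } catch (NoSuchElementException e) {
            System.out.println("remove on empty queue threw NoSuchElementException");
        }
    }
}
```

Whether the actual fix uses `poll()`, additional locking, or something else is not stated in this thread.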



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates

2016-06-06 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316920#comment-15316920
 ] 

Prasanth Jayachandran commented on HIVE-13599:
--

LGTM, +1



[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates

2016-06-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315314#comment-15315314
 ] 

Hive QA commented on HIVE-13599:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12807581/HIVE-13599.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10219 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
{noformat}

Test results: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/515/testReport
Console output: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/515/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-515/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12807581 - PreCommit-HIVE-MASTER-Build


[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates

2016-06-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311478#comment-15311478
 ] 

Hive QA commented on HIVE-13599:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12807222/HIVE-13599.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/490/testReport
Console output: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/490/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-490/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-490/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   a8cc702..b36c735  branch-2.1 -> origin/branch-2.1
   28f6015..eb9cea3  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 28f6015 HIVE-13894: Fix more json related JDK8 test failures 
Part 2 (Mohit Sabharwal, reviewed by Sergio Pena)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at eb9cea3 HIVE-13448 : LLAP: check ZK acls for ZKSM and fail if 
they are too permissive (Sergey Shelukhin, reviewed by Prasanth Jayachandran)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12807222 - PreCommit-HIVE-MASTER-Build


[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates

2016-06-01 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310757#comment-15310757
 ] 

Siddharth Seth commented on HIVE-13599:
---

Pending review and a Jenkins run. If the RC is created before this goes in, 
we'll move this to 2.1.1.



[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates

2016-05-31 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308464#comment-15308464
 ] 

Jesus Camacho Rodriguez commented on HIVE-13599:


[~sseth], what is the status on this one? I plan to create the first 2.1.0 RC 
tomorrow and this is marked as Critical. Should it go in, or can it be 
deferred? Thanks



[jira] [Commented] (HIVE-13599) LLAP: Incorrect handling of the preemption queue on finishable state updates

2016-05-27 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305018#comment-15305018
 ] 

Siddharth Seth commented on HIVE-13599:
---

There's also a race, and locking around that was intentionally left out. I have 
added a bunch of comments around this. In a subsequent jira, we may want to 
change this to include the preemption state updates within the main scheduler 
lock. The issue there is that everything becomes single threaded, including 
task completions.
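
To make the trade-off concrete, here is a rough sketch of what "preemption state updates within the main scheduler lock" could look like. It is not the LLAP implementation: the class, its methods, and the rule that non-finishable tasks are the preemption candidates are all assumptions made for illustration.

```java
import java.util.PriorityQueue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: one lock guards every touch of the preemption queue.
class SingleLockSchedulerSketch {
    private final ReentrantLock schedulerLock = new ReentrantLock();
    // Plain PriorityQueue is sufficient because all access happens under schedulerLock.
    private final PriorityQueue<String> preemptionQueue = new PriorityQueue<>();

    // Finishable state update: assumed here to move a task in or out of the queue.
    void onFinishableStateChange(String taskId, boolean finishable) {
        schedulerLock.lock();
        try {
            // remove(Object) returns false if the task is absent; it does not throw.
            preemptionQueue.remove(taskId);
            if (!finishable) {
                preemptionQueue.add(taskId);
            }
        } finally {
            schedulerLock.unlock();
        }
    }

    // Task completion has to take the same lock, which is the serialization cost.
    void onTaskCompleted(String taskId) {
        schedulerLock.lock();
        try {
            preemptionQueue.remove(taskId);
        } finally {
            schedulerLock.unlock();
        }
    }

    // Scheduler side: returns a preemption candidate, or null if there is none.
    String pickTaskToPreempt() {
        schedulerLock.lock();
        try {
            return preemptionQueue.poll();
        } finally {
            schedulerLock.unlock();
        }
    }
}
```

With a single lock the race goes away, but state updates, scheduling decisions, and task completions all queue up behind one another, which is the single-threading concern raised above.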
