[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-09-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742773#comment-14742773
 ] 

Karthik Kambatla commented on YARN-3697:


+1

> FairScheduler: ContinuousSchedulingThread can't be shutdown after stop 
> sometimes. 
> --
>
> Key: YARN-3697
> URL: https://issues.apache.org/jira/browse/YARN-3697
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Attachments: YARN-3697.000.patch, YARN-3697.001.patch
>
>
> FairScheduler: ContinuousSchedulingThread can't be shutdown after stop 
> sometimes. 
> The reason is that the InterruptedException is swallowed by the catch-all 
> handler in continuousSchedulingAttempt:
> {code}
>   try {
>     if (node != null && Resources.fitsIn(minimumAllocation,
>         node.getAvailableResource())) {
>       attemptScheduling(node);
>     }
>   } catch (Throwable ex) {
>     LOG.error("Error while attempting scheduling for node " + node +
>         ": " + ex.toString(), ex);
>   }
> {code}
> I saw the following exception after stop:
> {code}
> 2015-05-17 23:30:43,065 WARN  [FairSchedulerContinuousScheduling] event.AsyncDispatcher (AsyncDispatcher.java:handle(247)) - AsyncDispatcher thread interrupted
> java.lang.InterruptedException
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
>   at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
>   at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:244)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$ContainerStartedTransition.transition(RMContainerImpl.java:467)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$ContainerStartedTransition.transition(RMContainerImpl.java:462)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:387)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:58)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.allocate(FSAppAttempt.java:357)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:516)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:649)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:803)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:334)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:173)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1082)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1014)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:285)
> 2015-05-17 23:30:43,066 ERROR [FairSchedulerContinuousScheduling] fair.FairScheduler (FairScheduler.java:continuousSchedulingAttempt(1017)) - Error while attempting scheduling for node host: 127.0.0.2:2 #containers=1 available=<memory:7168, vCores:7> used=<memory:1024, vCores:1>: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:249)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$ContainerStartedTransition.transition(RMContainerImpl.java:467)
>   at 
> 
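The failure mode in the issue description can be illustrated with a small standalone sketch. It is not the YARN-3697 patch: the class `InterruptAwareLoop` and its methods are hypothetical names, and a plain `RuntimeException` stands in for `YarnRuntimeException`. The idea is that a catch-all handler should recognise a wrapped `InterruptedException`, restore the interrupt flag, and leave the loop instead of only logging.

```java
final class InterruptAwareLoop {

    /** Returns true if t is an InterruptedException or has one in its cause chain. */
    static boolean wrapsInterrupt(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof InterruptedException) {
                return true;
            }
        }
        return false;
    }

    /** Runs attempt repeatedly until the current thread is interrupted. */
    static void runLoop(Runnable attempt) {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                attempt.run();
            } catch (Throwable ex) {
                if (wrapsInterrupt(ex)) {
                    // The flag was cleared when InterruptedException was thrown;
                    // restore it and stop instead of swallowing the interrupt.
                    Thread.currentThread().interrupt();
                    return;
                }
                System.err.println("Error while attempting scheduling: " + ex);
            }
        }
    }

    /** Starts a worker whose attempt always fails with a wrapped interrupt. */
    static boolean exitsOnWrappedInterrupt() {
        Thread worker = new Thread(() -> runLoop(() -> {
            throw new RuntimeException(new InterruptedException());
        }));
        worker.setDaemon(true);
        worker.start();
        try {
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker exited: " + exitsOnWrappedInterrupt());
    }
}
```

Without the `wrapsInterrupt` branch, the loop condition would never observe the interrupt, because the flag is already cleared by the time the wrapping exception reaches the catch block.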

[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-08-23 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708310#comment-14708310
 ] 

zhihai xu commented on YARN-3697:
-

Hi [~kasha], thanks for the review! That is a good suggestion. I attached a new 
patch, YARN-3697.001.patch, which addresses your comment and adds two tests. 
Please review it. Thanks again!


[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-08-22 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707921#comment-14707921
 ] 

Karthik Kambatla commented on YARN-3697:


The patch itself looks good. Can we add a unit test to catch any regressions 
here? For instance, in the future, AsyncDispatcher might stop wrapping 
InterruptedException in a YarnRuntimeException. 
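A regression check along these lines could cover both behaviours: the handler wrapping the `InterruptedException`, and the handler restoring the interrupt flag instead. The following is only a sketch under stated assumptions, not the tests from YARN-3697.001.patch: `StopRegressionSketch`, `handle`, and `stopsAfterInterrupt` are hypothetical names, a full `LinkedBlockingQueue` stands in for the dispatcher's event queue, and `RuntimeException` stands in for `YarnRuntimeException`.

```java
import java.util.concurrent.LinkedBlockingQueue;

final class StopRegressionSketch {

    /** Hypothetical stand-in for an event handler that blocks on a full queue. */
    static void handle(LinkedBlockingQueue<Integer> queue, boolean wrap) {
        try {
            queue.put(1);
        } catch (InterruptedException e) {
            if (wrap) {
                // Wrapping behaviour: the interrupt flag stays cleared.
                throw new RuntimeException(e);
            }
            // Alternative behaviour: restore the flag and return normally.
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if the worker thread terminates after being interrupted. */
    static boolean stopsAfterInterrupt(boolean wrap) {
        LinkedBlockingQueue<Integer> full = new LinkedBlockingQueue<>(1);
        full.offer(0); // fill the queue so that put() blocks
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    handle(full, wrap);
                } catch (RuntimeException ex) {
                    if (ex.getCause() instanceof InterruptedException) {
                        return; // treat a wrapped interrupt as a stop request
                    }
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
        try {
            Thread.sleep(100); // give the worker time to block in put()
            worker.interrupt();
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("wrapped:  " + stopsAfterInterrupt(true));
        System.out.println("restored: " + stopsAfterInterrupt(false));
    }
}
```

Exercising both modes means the check keeps passing whichever way the handler reports the interrupt, which is the regression scenario raised above.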


[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653042#comment-14653042
 ] 

Hadoop QA commented on YARN-3697:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 31s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 51s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 49s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 47s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 23s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests | 53m 4s | Tests failed in hadoop-yarn-server-resourcemanager. |
| | | 91m 56s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12748585/YARN-3697.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / c3364ca |
| hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8761/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8761/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8761/console |


This message was automatically generated.


[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578588#comment-14578588
 ] 

Hadoop QA commented on YARN-3697:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 15m 47s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 30s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 45s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 25s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests | 51m 7s | Tests failed in hadoop-yarn-server-resourcemanager. |
| | | 88m 41s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12738526/YARN-3697.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8d0ef31 |
| hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8225/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8225/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8225/console |


This message was automatically generated.


[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-06-09 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578677#comment-14578677
 ] 

Rohith commented on YARN-3697:
--

Hi [~zxu], 
To understand the problem: does it occur when RM shutdown is called and tries 
to stop the FS service? Does it cause the RM to hang during shutdown?


[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-06-09 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578449#comment-14578449
 ] 

zhihai xu commented on YARN-3697:
-

Updated the patch based on the latest code.


[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-06-09 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579428#comment-14579428
 ] 

zhihai xu commented on YARN-3697:
-

Hi [~rohithsharma], thanks for the review.
The ContinuousSchedulingThread is a daemon thread, so it won't cause the RM to 
hang during shutdown.
It may cause an issue for tests when they try to stop the scheduler.
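
To illustrate the distinction above, here is a minimal Java sketch, not taken from the Hadoop source. The class and method names (DaemonThreadStop, startStubbornDaemon, interruptAndJoin) are hypothetical. It models a daemon thread whose loop swallows interrupts: the JVM can still exit because the thread is a daemon, but a caller that interrupts it and waits for it to finish, as a test stopping the scheduler might, will find it still alive.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class DaemonThreadStop {

  /** Starts a daemon thread whose loop swallows interrupts (models the problem). */
  static Thread startStubbornDaemon(AtomicBoolean keepRunning) {
    Thread t = new Thread(() -> {
      while (keepRunning.get()) {
        try {
          Thread.sleep(5);
        } catch (InterruptedException ignored) {
          // Swallowed: the interrupt is lost and the loop keeps going.
        }
      }
    }, "stubborn-daemon");
    t.setDaemon(true); // a daemon thread does not keep the JVM alive
    t.start();
    return t;
  }

  /** Interrupts the thread and waits up to timeoutMs; true if it terminated. */
  static boolean interruptAndJoin(Thread t, long timeoutMs) {
    t.interrupt();
    try {
      t.join(timeoutMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !t.isAlive();
  }

  public static void main(String[] args) {
    AtomicBoolean keepRunning = new AtomicBoolean(true);
    Thread t = startStubbornDaemon(keepRunning);
    // The interrupt alone does not stop the loop.
    System.out.println("stopped by interrupt: " + interruptAndJoin(t, 200));
    // main returns here; the JVM still exits because the thread is a daemon.
  }
}
```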

 FairScheduler: ContinuousSchedulingThread can't be shutdown after stop 
 sometimes. 
 --

 Key: YARN-3697
 URL: https://issues.apache.org/jira/browse/YARN-3697
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-3697.000.patch


 FairScheduler: ContinuousSchedulingThread can't be shutdown after stop 
 sometimes. 
 The reason is that the InterruptedException is swallowed by the catch block in 
 continuousSchedulingAttempt:
 {code}
   try {
     if (node != null && Resources.fitsIn(minimumAllocation,
         node.getAvailableResource())) {
       attemptScheduling(node);
     }
   } catch (Throwable ex) {
     LOG.error("Error while attempting scheduling for node " + node +
         ": " + ex.toString(), ex);
   }
 {code}
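
As a hedged illustration of why a catch-all like the one above can hide a shutdown request, the following Java sketch walks an exception's cause chain looking for an InterruptedException and stops the loop when it finds one. It is not the YARN-3697 patch; InterruptCauseCheck, causedByInterrupt and runAttempts are hypothetical names, and the Runnable stands in for a scheduling attempt that may throw an InterruptedException wrapped in a runtime exception.

```java
final class InterruptCauseCheck {

  /** Returns true if ex or any exception in its cause chain is an InterruptedException. */
  static boolean causedByInterrupt(Throwable ex) {
    for (Throwable t = ex; t != null; t = t.getCause()) {
      if (t instanceof InterruptedException) {
        return true;
      }
    }
    return false;
  }

  /**
   * Runs attempt up to maxAttempts times. Ordinary failures are logged and the
   * loop continues; an interrupt (even a wrapped one) ends the loop early.
   * Returns the number of attempts that completed or failed for other reasons.
   */
  static int runAttempts(Runnable attempt, int maxAttempts) {
    int done = 0;
    while (done < maxAttempts) {
      try {
        attempt.run();
      } catch (Throwable ex) {
        if (causedByInterrupt(ex)) {
          // Restore the interrupt status so callers can also see it, then stop.
          Thread.currentThread().interrupt();
          break;
        }
        System.err.println("Error while attempting scheduling: " + ex);
      }
      done++;
    }
    return done;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    int done = runAttempts(() -> {
      if (++calls[0] == 3) {
        throw new RuntimeException(new InterruptedException());
      }
    }, 10);
    System.out.println("attempts before interrupt: " + done);
  }
}
```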
 I saw the following exception after stop:
 {code}
 2015-05-17 23:30:43,065 WARN  [FairSchedulerContinuousScheduling] event.AsyncDispatcher (AsyncDispatcher.java:handle(247)) - AsyncDispatcher thread interrupted
 java.lang.InterruptedException
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
   at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
   at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)
   at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:244)
   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$ContainerStartedTransition.transition(RMContainerImpl.java:467)
   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$ContainerStartedTransition.transition(RMContainerImpl.java:462)
   at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
   at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:387)
   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:58)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.allocate(FSAppAttempt.java:357)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:516)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:649)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:803)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:334)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:173)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1082)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1014)
   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:285)
 2015-05-17 23:30:43,066 ERROR [FairSchedulerContinuousScheduling] fair.FairScheduler (FairScheduler.java:continuousSchedulingAttempt(1017)) - Error while attempting scheduling for node host: 127.0.0.2:2 #containers=1 available=memory:7168, vCores:7 used=memory:1024, vCores:1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException
   at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:249)
   at
 

[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555126#comment-14555126
 ] 

Hadoop QA commented on YARN-3697:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 41s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 46s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 16s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  49m 59s | Tests passed in hadoop-yarn-server-resourcemanager. |
| | |  86m 23s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12734631/YARN-3697.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 05e04f3 |
| hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8047/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8047/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8047/console |


This message was automatically generated.

[jira] [Commented] (YARN-3697) FairScheduler: ContinuousSchedulingThread can't be shutdown after stop sometimes.

2015-05-21 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554951#comment-14554951
 ] 

zhihai xu commented on YARN-3697:
-

I uploaded a patch, YARN-3697.000.patch, for review. I found a good article 
about handling InterruptedException at 
http://www.ibm.com/developerworks/library/j-jtp05236/
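
A common recommendation for code that catches InterruptedException but cannot rethrow it is to restore the thread's interrupt status so that an enclosing loop can notice the request and exit. The sketch below shows that pattern in plain Java. It is only an illustration under stated assumptions, not the contents of the attached patch; InterruptAwareLoop, startWorker and stopWorker are hypothetical names, and Thread.sleep stands in for one scheduling pass.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class InterruptAwareLoop {

  /** Starts a worker that loops until its interrupt status is set. */
  static Thread startWorker(AtomicInteger attempts) {
    Thread t = new Thread(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        try {
          attempts.incrementAndGet();
          Thread.sleep(10); // stand-in for one scheduling pass
        } catch (InterruptedException e) {
          // sleep() cleared the interrupt status when it threw; set it again
          // so the loop condition sees the interrupt and the thread exits.
          Thread.currentThread().interrupt();
        }
      }
    }, "interrupt-aware-worker");
    t.setDaemon(true);
    t.start();
    return t;
  }

  /** Interrupts the worker and waits up to timeoutMs; true if it terminated. */
  static boolean stopWorker(Thread t, long timeoutMs) {
    t.interrupt();
    try {
      t.join(timeoutMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !t.isAlive();
  }

  public static void main(String[] args) {
    AtomicInteger attempts = new AtomicInteger();
    Thread worker = startWorker(attempts);
    System.out.println("worker stopped: " + stopWorker(worker, 2000));
  }
}
```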


 FairScheduler: ContinuousSchedulingThread can't be shutdown after stop 
 sometimes. 
 --

 Key: YARN-3697
 URL: https://issues.apache.org/jira/browse/YARN-3697
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3697.000.patch

