[jira] [Commented] (YARN-1572) Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal

2015-05-01 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523861#comment-14523861
 ] 

Jian He commented on YARN-1572:
---

Hi [~gujilangzi], the patch you uploaded is a branch-2 patch. Could you please 
work on a trunk patch?

 Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal
 --

 Key: YARN-1572
 URL: https://issues.apache.org/jira/browse/YARN-1572
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.3.0
Reporter: Wenwu Peng
Assignee: Wenwu Peng
 Attachments: YARN-1572-branch-2.3.0.001.patch, YARN-1572-log.tar.gz, 
 conf.tar.gz, log.tar.gz


 We occasionally hit an NPE in allocateNodeLocal when running a benchmark (4 
 hits in 20 runs).
 {code}
 2014-07-31 04:18:19,653 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1406794589275_0001_01_21 of capacity memory:1024, vCores:1 on host datanode10:57281, which has 6 containers, memory:6144, vCores:6 used and memory:2048, vCores:2 available after allocation
 2014-07-31 04:18:19,654 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
 java.lang.NullPointerException
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:311)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:268)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:136)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:683)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:602)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:560)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:488)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:729)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:774)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:101)
     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:599)
     at java.lang.Thread.run(Thread.java:662)
 2014-07-31 04:18:19,655 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1572) Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal

2015-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391976#comment-14391976
 ] 

Hadoop QA commented on YARN-1572:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12708878/YARN-1572-branch-2.3.0.001.patch
  against trunk revision f383fd9.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7197//console

This message is automatically generated.



[jira] [Commented] (YARN-1572) Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal

2015-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391710#comment-14391710
 ] 

Hadoop QA commented on YARN-1572:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12708801/0001-Fix-for-YARN-1572.patch
  against trunk revision 3c7adaa.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7195//console

This message is automatically generated.



[jira] [Commented] (YARN-1572) Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal

2015-04-01 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391406#comment-14391406
 ] 

Wangda Tan commented on YARN-1572:
--

This is a bad bug. Adding a null check seems sufficient to me. It could be 
caused by a user who calls ApplicationMasterProtocol directly instead of going 
through AMRMClient and adds a node-label request without a matching rack-local 
request. [~gujilangzi], are you still working on this?



[jira] [Commented] (YARN-1572) Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal

2015-04-01 Thread Kareem El Gebaly (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391363#comment-14391363
 ] 

Kareem El Gebaly commented on YARN-1572:


I agree with the proposed solution.
This also affects version 2.3.0. Has a patch or solution been found yet?



[jira] [Commented] (YARN-1572) Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal

2014-07-31 Thread Wenwu Peng (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080540#comment-14080540
 ] 

Wenwu Peng commented on YARN-1572:
--

I am not sure that rackLocalRequest is the cause of the NPE, but it would be 
better to check whether rackLocalRequest is null before calling 
rackLocalRequest.setNumContainers:
{code}
ResourceRequest rackLocalRequest = 
requests.get(priority).get(node.getRackName());
rackLocalRequest.setNumContainers(rackLocalRequest.getNumContainers() - 1);
{code}
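
For illustration only, here is a minimal, self-contained sketch of the kind of null check being suggested. It is not the actual patch and does not use the Hadoop classes: Request is a hypothetical stand-in for ResourceRequest, decrementIfPresent is a hypothetical helper, and the rack name "/rack1" is made up. It assumes the failure mode is a lookup that returns null because no request was registered under the rack name.

```java
import java.util.HashMap;
import java.util.Map;

public class NullCheckSketch {
    /** Hypothetical stand-in for ResourceRequest; only the container count is modeled. */
    static final class Request {
        private int numContainers;

        Request(int numContainers) {
            this.numContainers = numContainers;
        }

        int getNumContainers() {
            return numContainers;
        }

        void setNumContainers(int numContainers) {
            this.numContainers = numContainers;
        }
    }

    /**
     * Decrements the outstanding container count for resourceName if a request
     * was registered under that name. Returns false, instead of throwing a
     * NullPointerException, when the lookup finds nothing.
     */
    static boolean decrementIfPresent(Map<String, Request> requests, String resourceName) {
        Request request = requests.get(resourceName);
        if (request == null) {
            return false; // no request under this name, so there is nothing to decrement
        }
        request.setNumContainers(request.getNumContainers() - 1);
        return true;
    }

    public static void main(String[] args) {
        // Only a node-level request is registered; the rack-level entry is missing.
        Map<String, Request> requests = new HashMap<>();
        requests.put("datanode10", new Request(2));

        System.out.println(decrementIfPresent(requests, "datanode10"));
        System.out.println(decrementIfPresent(requests, "/rack1"));
        System.out.println(requests.get("datanode10").getNumContainers());
    }
}
```

Whether silently skipping the decrement is the right behavior for the scheduler, rather than logging or rejecting the malformed request earlier, is a separate design question that this sketch does not answer.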

 Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal
 --

 Key: YARN-1572
 URL: https://issues.apache.org/jira/browse/YARN-1572
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.2.0
Reporter: Wenwu Peng
Assignee: Junping Du
 Attachments: conf.tar.gz, log.tar.gz


 We occasionally hit an NPE in allocateNodeLocal when running a benchmark (4 
 hits in 20 runs).
 Steps:
 1. Set up a Hadoop 2.2.0 environment.
 2. Run:
    for i in {1..10}; do /hadoop/hadoop-smoke/bin/hadoop jar /hadoop/hadoop-smoke/share/hadoop/mapreduce/hadoop-mapreduce-client-common-*.jar org.apache.hadoop.fs.TestDFSIO -write -nrFiles 30 -fileSize 64MB; sleep 10; done
 2014-01-08 03:56:14,082 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
 java.lang.NullPointerException
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:291)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:252)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:294)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:614)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:524)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:482)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:419)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:658)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:687)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:95)
     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440)
     at java.lang.Thread.run(Thread.java:662)
 I will attach the log and configuration files later.
 Note: 
 My topology file:
 10.111.89.230   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.231   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.232   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.239   /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com
 10.111.89.233   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
 10.111.89.234   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
 10.111.89.240   /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com
 10.111.89.236   /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com
 10.111.89.241   /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com
 10.111.89.238   /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com
 10.111.89.242   /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com



--
This message was sent by Atlassian JIRA
(v6.2#6252)