[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2018-11-11 Thread Paul Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16683225#comment-16683225
 ] 

Paul Lin commented on YARN-2823:


[~imstefanlee] Hi, I'm facing the same issue with Flink applications. I tried 
explicitly setting `KeepContainersAcrossApplicationAttempts` to false, but it 
did not help. How did you eventually solve the problem? Also, could you please 
point me to the code where the default value of 
KeepContainersAcrossApplicationAttempts is set to true? Thanks a lot!

> NullPointerException in RM HA enabled 3-node cluster
> 
>
> Key: YARN-2823
> URL: https://issues.apache.org/jira/browse/YARN-2823
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Gour Saha
>Assignee: Jian He
>Priority: Critical
> Fix For: 2.6.0
>
> Attachments: YARN-2823.1.patch, logs_with_NPE_in_RM.zip
>
>
> Branch:
> 2.6.0
> Environment: 
> A 3-node cluster with RM HA enabled. The HA setup went pretty smoothly (used 
> Ambari), and then we installed HBase using Slider. After some time the RMs went 
> down and would not come back up. The following is the NPE we see in both 
> RM logs.
> {noformat}
> 2014-09-16 01:36:28,037 FATAL resourcemanager.ResourceManager 
> (ResourceManager.java:run(612)) - Error in handling event type 
> APP_ATTEMPT_ADDED to the scheduler
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.transferStateFromPreviousAttempt(SchedulerApplicationAttempt.java:530)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.addApplicationAttempt(CapacityScheduler.java:678)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1015)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:98)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:603)
> at java.lang.Thread.run(Thread.java:744)
> 2014-09-16 01:36:28,042 INFO  resourcemanager.ResourceManager 
> (ResourceManager.java:run(616)) - Exiting, bbye..
> {noformat}
> All the logs for this 3-node cluster have been uploaded.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2017-08-03 Thread stefanlee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113767#comment-16113767
 ] 

stefanlee commented on YARN-2823:
-

IMO, the NPE happens when *transferStateFromPreviousAttempt* is *true*, and the 
value of *transferStateFromPreviousAttempt* depends on 
*KeepContainersAcrossApplicationAttempts* in *ApplicationSubmissionContext*. I 
hit this NPE because there is a *FLINK* type application running in my cluster, 
and I saw that the default value of *KeepContainersAcrossApplicationAttempts* 
in the Flink code is *true*. So I want to know: if 
*KeepContainersAcrossApplicationAttempts* is *false*, can this NPE no longer 
happen? [~jianhe] thanks




[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2014-11-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203389#comment-14203389
 ] 

Hudson commented on YARN-2823:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #737 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/737/])
YARN-2823. Fixed ResourceManager app-attempt state machine to inform schedulers 
about previous finished attempts of a running application to avoid expectation 
mismatch w.r.t transferred containers. Contributed by Jian He. (vinodkv: rev 
a5657182a7accebe08cd86e46b4cdeb163d4d1f2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java




[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2014-11-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203434#comment-14203434
 ] 

Hudson commented on YARN-2823:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1927 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1927/])
YARN-2823. Fixed ResourceManager app-attempt state machine to inform schedulers 
about previous finished attempts of a running application to avoid expectation 
mismatch w.r.t transferred containers. Contributed by Jian He. (vinodkv: rev 
a5657182a7accebe08cd86e46b4cdeb163d4d1f2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java




[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2014-11-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203459#comment-14203459
 ] 

Hudson commented on YARN-2823:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1951 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1951/])
YARN-2823. Fixed ResourceManager app-attempt state machine to inform schedulers 
about previous finished attempts of a running application to avoid expectation 
mismatch w.r.t transferred containers. Contributed by Jian He. (vinodkv: rev 
a5657182a7accebe08cd86e46b4cdeb163d4d1f2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* hadoop-yarn-project/CHANGES.txt




[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202315#comment-14202315
 ] 

Hudson commented on YARN-2823:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6479 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6479/])
YARN-2823. Fixed ResourceManager app-attempt state machine to inform schedulers 
about previous finished attempts of a running application to avoid expectation 
mismatch w.r.t transferred containers. Contributed by Jian He. (vinodkv: rev 
a5657182a7accebe08cd86e46b4cdeb163d4d1f2)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java




[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2014-11-07 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202322#comment-14202322
 ] 

Vinod Kumar Vavilapalli commented on YARN-2823:
---

bq. I think there is more that we can and should do but in the near future. In 
the non-restart control flow, AMs cannot register till the RM knows about the 
attempt (obviously), this condition is invalidated after restart. Will file a 
ticket.
Filed YARN-2829.



[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2014-11-06 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201040#comment-14201040
 ] 

Jian He commented on YARN-2823:
---

The problem is in recovery: if the previous attempt has already finished, we do 
not add it to the scheduler. When the scheduler then calls 
transferStateFromPreviousAttempt for a work-preserving AM restart, it throws an NPE.
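
To illustrate the failure mode, here is a minimal, self-contained sketch. It is 
not the ResourceManager code: SchedulerApp, AttemptState and addAttempt are 
hypothetical stand-ins, and the only assumption is that the scheduler 
dereferences the previous attempt without a null check when asked to transfer 
state.

```java
import java.util.HashMap;
import java.util.Map;

public class RecoveryNpeSketch {
    // Hypothetical stand-in for the scheduler's per-attempt state.
    static final class AttemptState {
        final Map<String, String> liveContainers = new HashMap<>();
    }

    // Hypothetical stand-in for the scheduler's record of one application.
    static final class SchedulerApp {
        AttemptState currentAttempt; // stays null if no attempt was ever added

        // Adds a new attempt and optionally carries over the previous attempt's containers.
        void addAttempt(AttemptState next, boolean transferStateFromPreviousAttempt) {
            if (transferStateFromPreviousAttempt) {
                // No null check: fails if the previous attempt was never registered.
                next.liveContainers.putAll(currentAttempt.liveContainers);
            }
            currentAttempt = next;
        }
    }

    // Models recovery where the finished attempt was skipped and the
    // next attempt asks to keep containers across attempts.
    static boolean reproducesNpe() {
        SchedulerApp app = new SchedulerApp();
        try {
            app.addAttempt(new AttemptState(), true);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE reproduced: " + reproducesNpe());
    }
}
```

In this model, passing false for the transfer flag skips the dereference 
entirely, which matches the observation that the NPE is only reachable on the 
work-preserving path.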



[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2014-11-06 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201041#comment-14201041
 ] 

Jian He commented on YARN-2823:
---

Uploaded a patch that adds the previously finished attempt to the scheduler.
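
The idea behind the patch can be sketched roughly as follows. This is a toy 
model, not the actual change to RMAppAttemptImpl: SchedulerApp, AttemptState 
and recoverWithFinishedAttempt are hypothetical names, and the sketch only 
assumes that recovery registers the finished attempt before the new attempt 
asks for its state.

```java
import java.util.HashMap;
import java.util.Map;

public class RecoveryFixSketch {
    // Hypothetical stand-in for the scheduler's per-attempt state.
    static final class AttemptState {
        final Map<String, String> liveContainers = new HashMap<>();
    }

    // Hypothetical stand-in for the scheduler's record of one application.
    static final class SchedulerApp {
        AttemptState currentAttempt;

        // Adds a new attempt and optionally carries over the previous attempt's containers.
        void addAttempt(AttemptState next, boolean transferStateFromPreviousAttempt) {
            if (transferStateFromPreviousAttempt) {
                next.liveContainers.putAll(currentAttempt.liveContainers);
            }
            currentAttempt = next;
        }
    }

    // Recovery that re-registers the finished attempt first, then adds the new one.
    static int recoverWithFinishedAttempt() {
        SchedulerApp app = new SchedulerApp();

        AttemptState finished = new AttemptState();
        finished.liveContainers.put("container_01", "node-a"); // made-up container
        app.addAttempt(finished, false); // first attempt: nothing to transfer

        AttemptState restarted = new AttemptState();
        app.addAttempt(restarted, true); // previous attempt is now known to the scheduler
        return app.currentAttempt.liveContainers.size();
    }

    public static void main(String[] args) {
        System.out.println("containers carried over: " + recoverWithFinishedAttempt());
    }
}
```

Because the finished attempt is known to the scheduler by the time the next 
attempt is added, the transfer step has something to read from and no longer 
dereferences null.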



[jira] [Commented] (YARN-2823) NullPointerException in RM HA enabled 3-node cluster

2014-11-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201193#comment-14201193
 ] 

Hadoop QA commented on YARN-2823:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12679967/YARN-2823.1.patch
  against trunk revision 75b820c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5760//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5760//console

This message is automatically generated.
