
[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-23 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022933#comment-15022933
 ] 

Jason Lowe commented on YARN-4344:
--

+1 for branch-2.6 patch as well, committing this.

> NMs reconnecting with changed capabilities can lead to wrong cluster resource 
> calculations
> --
>
> Key: YARN-4344
> URL: https://issues.apache.org/jira/browse/YARN-4344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.1, 2.6.2
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
>Priority: Critical
> Attachments: YARN-4344-branch-2.6.001.patch, YARN-4344.001.patch, 
> YARN-4344.002.patch
>
>
> After YARN-3802, if an NM re-connects to the RM with changed capabilities, 
> there can arise situations where the overall cluster resource calculation for 
> the cluster will be incorrect leading to inconsistencies in scheduling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023219#comment-15023219
 ] 

Hudson commented on YARN-4344:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2647 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2647/])
YARN-4344. NMs reconnecting with changed capabilities can lead to wrong (jlowe: 
rev d36b6e045f317c94e97cb41a163aa974d161a404)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java


> NMs reconnecting with changed capabilities can lead to wrong cluster resource 
> calculations
> --
>
> Key: YARN-4344
> URL: https://issues.apache.org/jira/browse/YARN-4344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.1, 2.6.2
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
>Priority: Critical
> Fix For: 2.6.3, 2.7.3
>
> Attachments: YARN-4344-branch-2.6.001.patch, YARN-4344.001.patch, 
> YARN-4344.002.patch
>
>
> After YARN-3802, if an NM re-connects to the RM with changed capabilities, 
> there can arise situations where the overall cluster resource calculation for 
> the cluster will be incorrect leading to inconsistencies in scheduling.




[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023799#comment-15023799
 ] 

Hudson commented on YARN-4344:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2572 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2572/])
YARN-4344. NMs reconnecting with changed capabilities can lead to wrong (jlowe: 
rev d36b6e045f317c94e97cb41a163aa974d161a404)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java



[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023487#comment-15023487
 ] 

Hudson commented on YARN-4344:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #633 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/633/])
YARN-4344. NMs reconnecting with changed capabilities can lead to wrong (jlowe: 
rev d36b6e045f317c94e97cb41a163aa974d161a404)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* hadoop-yarn-project/CHANGES.txt



[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023375#comment-15023375
 ] 

Hudson commented on YARN-4344:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #717 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/717/])
YARN-4344. NMs reconnecting with changed capabilities can lead to wrong (jlowe: 
rev d36b6e045f317c94e97cb41a163aa974d161a404)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java



[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022963#comment-15022963
 ] 

Hudson commented on YARN-4344:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8864 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8864/])
YARN-4344. NMs reconnecting with changed capabilities can lead to wrong (jlowe: 
rev d36b6e045f317c94e97cb41a163aa974d161a404)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java



[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023127#comment-15023127
 ] 

Hudson commented on YARN-4344:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1439 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1439/])
YARN-4344. NMs reconnecting with changed capabilities can lead to wrong (jlowe: 
rev d36b6e045f317c94e97cb41a163aa974d161a404)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java



[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023142#comment-15023142
 ] 

Hudson commented on YARN-4344:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #706 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/706/])
YARN-4344. NMs reconnecting with changed capabilities can lead to wrong (jlowe: 
rev d36b6e045f317c94e97cb41a163aa974d161a404)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java



[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-18 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012250#comment-15012250
 ] 

Jason Lowe commented on YARN-4344:
--

+1 lgtm.  [~vvasudev] could you also put up a patch for 2.6?  It doesn't apply 
there.


[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-13 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15004455#comment-15004455
 ] 

Wangda Tan commented on YARN-4344:
--

Good catch, [~vvasudev]! The fix looks good to me. As [~zxu] commented, we 
should decouple RMNode status from the scheduler's view.


[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-13 Thread Varun Vasudev (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15004024#comment-15004024
 ] 

Varun Vasudev commented on YARN-4344:
-

The test failures are unrelated to the patch.


[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15003979#comment-15003979
 ] 

Hadoop QA commented on YARN-4344:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 7s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 3m 4s | trunk passed |
| +1 | compile | 0m 21s | trunk passed with JDK v1.8.0_60 |
| +1 | compile | 0m 23s | trunk passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 12s | trunk passed |
| +1 | mvnsite | 0m 30s | trunk passed |
| +1 | mvneclipse | 0m 15s | trunk passed |
| +1 | findbugs | 1m 8s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed with JDK v1.8.0_60 |
| +1 | javadoc | 0m 25s | trunk passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 27s | the patch passed |
| +1 | compile | 0m 21s | the patch passed with JDK v1.8.0_60 |
| +1 | javac | 0m 21s | the patch passed |
| +1 | compile | 0m 24s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 24s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | mvnsite | 0m 30s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 15s | the patch passed |
| +1 | javadoc | 0m 21s | the patch passed with JDK v1.8.0_60 |
| +1 | javadoc | 0m 27s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 65m 6s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_60. |
| -1 | unit | 65m 27s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 22s | Patch does not generate ASF License warnings. |
| | | 142m 52s | |

|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_79 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |

|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-13 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12772165/YARN-4344.002.patch |
| JIRA Issue | YARN-4344 |
| Optional Tests |  asflicense  compile  javac  javadoc

[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-12 Thread Naganarasimha G R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001824#comment-15001824
 ] 

Naganarasimha G R commented on YARN-4344:
-

Hi [~zxu], 
The JIRA number seems to be wrong, since YARN-3286 is closed as Won't Fix. Are 
you referring to another of [~rohithsharma]'s JIRAs?




[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-12 Thread Sunil G (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001844#comment-15001844
 ] 

Sunil G commented on YARN-4344:
---

I think it's the correct JIRA id. There was a discussion there about removing 
a node and adding it back again when we get a {{ReconnectNodeTransition}}, but 
that change affects the existing behavior of killing all containers when a 
node is removed.


[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-12 Thread Naganarasimha G R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001867#comment-15001867
 ] 

Naganarasimha G R commented on YARN-4344:
-

Thanks for the clarification; I had misinterpreted [~zxu]'s comment earlier.


[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-12 Thread zhihai xu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15003181#comment-15003181
 ] 

zhihai xu commented on YARN-4344:
-

+1 for Jason Lowe's suggestion to fix the issue on the scheduler side. Using 
{{SchedulerNode.getTotalResource()}} instead of {{RMNode.getTotalCapability()}} 
inside the scheduler better decouples the scheduler from the RMNodeImpl state 
machine. It may also fix some other potential issues. For example, 
{{CapacityScheduler#addNode}} uses {{nodeManager.getTotalCapability()}} after 
creating the {{FiCaSchedulerNode}}; if {{nodeManager.totalCapability}} is 
changed by the RMNodeImpl state machine right after the {{FiCaSchedulerNode}} 
is created, a similar issue may occur.
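
The accounting problem described above can be illustrated with a minimal sketch in plain Java. The classes {{MutableNode}}, {{TrackedNode}} and {{ClusterTracker}} are hypothetical stand-ins for RMNode, SchedulerNode and the scheduler; they are not the Hadoop types, and memory is reduced to a single int for brevity. The sketch only assumes that the node's capability can change between the time the scheduler adds the node and the time it removes it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for RMNode: its capability may change at any time,
// for example when the NM re-registers with different resources.
class MutableNode {
    final String id;
    volatile int totalMemoryMb;

    MutableNode(String id, int totalMemoryMb) {
        this.id = id;
        this.totalMemoryMb = totalMemoryMb;
    }
}

// Hypothetical stand-in for SchedulerNode: snapshots the capacity once.
class TrackedNode {
    final int totalMemoryMb;

    TrackedNode(MutableNode node) {
        this.totalMemoryMb = node.totalMemoryMb;
    }
}

// Hypothetical scheduler-side bookkeeping of the cluster total.
class ClusterTracker {
    private final Map<String, TrackedNode> nodes = new HashMap<>();
    private int clusterMemoryMb = 0;

    void addNode(MutableNode node) {
        TrackedNode tracked = new TrackedNode(node);
        nodes.put(node.id, tracked);
        // Add the snapshot, not node.totalMemoryMb, which may already differ.
        clusterMemoryMb += tracked.totalMemoryMb;
    }

    // Problematic variant: subtracts whatever the mutable node reports now.
    void removeNodeUsingLiveValue(MutableNode node) {
        if (nodes.remove(node.id) != null) {
            clusterMemoryMb -= node.totalMemoryMb;
        }
    }

    // Safer variant: subtracts exactly the amount that was added.
    void removeNode(String nodeId) {
        TrackedNode tracked = nodes.remove(nodeId);
        if (tracked != null) {
            clusterMemoryMb -= tracked.totalMemoryMb;
        }
    }

    int getClusterMemoryMb() {
        return clusterMemoryMb;
    }
}
```

With a node added at 8192 MB and then changed to 4096 MB before removal, {{removeNodeUsingLiveValue}} leaves 4096 MB of phantom capacity in the total, while {{removeNode}} returns the total to 0.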


[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-12 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15002587#comment-15002587
 ] 

Jason Lowe commented on YARN-4344:
--

Ah yes, the non-work-preserving NM restart case.  The code is assuming that an 
NM registering without any active apps might be a non-work-preserving NM 
reconnecting, so we need to explicitly remove the node and add it back in so 
the scheduler will release any containers that were being tracked on that node.

At first I thought YARN-3802 had an inherent race in it where it assumes that 
the node event will be processed before the capability is updated.  That turns 
out to be true for the CapacityScheduler, but I think that's a bug in the 
CapacityScheduler.  Note that the node update path appears to have the same issue 
-- RMNodeImpl updates the node's capability _before_ sending the scheduler node 
updated event.  So how can it work in that case?  It works because the 
CapacityScheduler for node update isn't looking at what the resource was in the 
RMNode passed in the event.  Instead it's looking up the scheduler node based 
on the RMNodeId and then referencing the total capability tracked there.  Seems 
to me the bug here is that the scheduler is relying on the RMNode in the event 
directly rather than the SchedulerNode to handle the capability calculation.  
We probably should have limited a lot of these scheduler events to just having 
RMNodeId rather than the full RMNode to avoid the temptation to directly 
examine the RMNode when handling the event.  As seen here, the RMNode can be 
"moving" while the scheduler is trying to examine it.
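
As a rough illustration of the last point, here is a minimal sketch of a scheduler whose events carry only a node id plus immutable data. All names ({{NodeResourceUpdate}}, {{IdOnlyScheduler}}) are hypothetical and memory is reduced to a single int; this is not the Hadoop event API. The handler can only look up its own tracked value by id, so it cannot observe a node object that is still changing.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical event: an id and an immutable value, no reference to a live node.
final class NodeResourceUpdate {
    final String nodeId;
    final int newMemoryMb;

    NodeResourceUpdate(String nodeId, int newMemoryMb) {
        this.nodeId = nodeId;
        this.newMemoryMb = newMemoryMb;
    }
}

// Hypothetical scheduler that processes queued events later than they are sent.
class IdOnlyScheduler {
    private final Map<String, Integer> trackedMemoryMb = new HashMap<>();
    private final Queue<NodeResourceUpdate> pending = new ArrayDeque<>();
    private int clusterMemoryMb = 0;

    void addNode(String nodeId, int memoryMb) {
        trackedMemoryMb.put(nodeId, memoryMb);
        clusterMemoryMb += memoryMb;
    }

    void enqueue(NodeResourceUpdate event) {
        pending.add(event);
    }

    // Apply every queued update against the scheduler's own tracked value.
    void drain() {
        NodeResourceUpdate event;
        while ((event = pending.poll()) != null) {
            Integer old = trackedMemoryMb.get(event.nodeId);
            if (old == null) {
                continue; // node unknown to the scheduler, ignore
            }
            clusterMemoryMb += event.newMemoryMb - old;
            trackedMemoryMb.put(event.nodeId, event.newMemoryMb);
        }
    }

    int getClusterMemoryMb() {
        return clusterMemoryMb;
    }
}
```

Because each update is applied as a delta against the value the scheduler itself recorded, the cluster total stays consistent no matter how far the sender's state has moved on by the time the event is handled.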


[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-11 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15000521#comment-15000521
 ] 

Jason Lowe commented on YARN-4344:
--

Thanks for the patch, Varun!  I think the change will fix the reported issue, 
but I'm a bit skeptical of the vastly different handling of the event based on 
whether apps are running or not.  For example, if the http port is changing 
when the node re-registers, why are we treating it as a node removal then 
addition if there aren't any apps running but not if there are apps running?  
Seems like that should be consistent.

Comments on the patch itself:

The comment about sending the node removal event at the start of the main block 
in the transition is no longer very accurate.
 
Please don't put large sleeps (on the order of seconds) in tests.  These extra 
sleep seconds quickly add up to a significant amount of time over many tests.  
If we need to sleep for polling reasons the sleep should be much shorter, like 
on the order of 10ms.  Better than sleep-polling is flushing the event 
dispatcher and then checking since we can avoid polling entirely.
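
As a rough sketch of the "flush, then check" idea, the example below uses a single-threaded executor as a stand-in for the event dispatcher. The class and method names are hypothetical, and this is not the Hadoop test API. Because the executor runs tasks in submission order, waiting on a no-op task submitted last guarantees that every earlier event has been handled, with no sleep or polling loop.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for draining an async dispatcher in a test. */
final class DrainInsteadOfSleepDemo {

  /** Queues the given number of events, drains the queue, and returns how many were handled. */
  static int handleThenDrain(int events) {
    ExecutorService dispatcher = Executors.newSingleThreadExecutor();
    AtomicInteger handled = new AtomicInteger();
    try {
      for (int i = 0; i < events; i++) {
        // Each "event" just bumps a counter.
        dispatcher.execute(handled::incrementAndGet);
      }
      // Drain: a no-op submitted last finishes only after all earlier tasks.
      dispatcher.submit(() -> { }).get(5, TimeUnit.SECONDS);
      return handled.get();
    } catch (Exception e) {
      throw new IllegalStateException("dispatcher did not drain", e);
    } finally {
      dispatcher.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(handleThenDrain(100)); // prints 100
  }
}
```

The timeout on the drain is only a safety net so that a stuck handler fails the test instead of hanging it.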

Nit: isCapabilityChanged init can be simplified to the following, similar to 
the noRunningApps boolean init above it:
{code}
boolean isCapabilityChanged =
    !rmNode.getTotalCapability().equals(newNode.getTotalCapability());
{code}

Nit: is this conditional check even necessary?  We can just update the total 
capability with no semantic effect if it hasn't changed.  Since this is just 
updating a reference with another precomputed one, it's not like we're avoiding 
some expensive code. ;-)
{code}
if (isCapabilityChanged) {
  rmNode.totalCapability = newNode.getTotalCapability();
}
{code}


[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-11 Thread Varun Vasudev (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15000229#comment-15000229
 ] 

Varun Vasudev commented on YARN-4344:
-

An example of a situation is shown below -
{code}
2015-11-09 10:43:51,784 INFO  resourcemanager.ResourceTrackerService 
(ResourceTrackerService.java:registerNodeManager(345)) - NodeManager from node 
10.0.0.64(cmPort: 30050 httpPort: 30060) registered with capability: 
, assigned nodeId 10.0.0.64:30050
2015-11-09 10:43:51,786 INFO  rmnode.RMNodeImpl (RMNodeImpl.java:handle(434)) - 
10.0.0.64:30050 Node Transitioned from NEW to RUNNING
2015-11-09 10:43:51,814 INFO  capacity.CapacityScheduler 
(CapacityScheduler.java:addNode(1193)) - Added node 10.0.0.64:30050 
clusterResource: 
2015-11-09 10:44:37,878 INFO  util.RackResolver 
(RackResolver.java:coreResolve(109)) - Resolved 10.0.0.63 to /default-rack
2015-11-09 10:44:37,879 INFO  resourcemanager.ResourceTrackerService 
(ResourceTrackerService.java:registerNodeManager(345)) - NodeManager from node 
10.0.0.63(cmPort: 30050 httpPort: 30060) registered with capability: 
, assigned nodeId 10.0.0.63:30050
2015-11-09 10:44:37,879 INFO  rmnode.RMNodeImpl (RMNodeImpl.java:handle(434)) - 
10.0.0.63:30050 Node Transitioned from NEW to RUNNING
2015-11-09 10:44:37,882 INFO  capacity.CapacityScheduler 
(CapacityScheduler.java:addNode(1193)) - Added node 10.0.0.63:30050 
clusterResource: 
2015-11-09 10:44:39,307 INFO  util.RackResolver 
(RackResolver.java:coreResolve(109)) - Resolved 10.0.0.64 to /default-rack
2015-11-09 10:44:39,309 INFO  resourcemanager.ResourceTrackerService 
(ResourceTrackerService.java:registerNodeManager(313)) - Reconnect from the 
node at: 10.0.0.64
2015-11-09 10:44:39,312 INFO  resourcemanager.ResourceTrackerService 
(ResourceTrackerService.java:registerNodeManager(345)) - NodeManager from node 
10.0.0.64(cmPort: 30050 httpPort: 30060) registered with capability: 
, assigned nodeId 10.0.0.64:30050
2015-11-09 10:44:39,314 INFO  capacity.CapacityScheduler 
(CapacityScheduler.java:removeNode(1247)) - Removed node 10.0.0.64:30050 
clusterResource: 
2015-11-09 10:44:39,315 INFO  capacity.CapacityScheduler 
(CapacityScheduler.java:addNode(1193)) - Added node 10.0.0.64:30050 
clusterResource: 
{code}

In this case, the NMs from 10.0.0.64 and 10.0.0.63 registered, leading to a total 
cluster resource of clusterResource: . After that, 
10.0.0.64 re-connected with changed capabilities (from  
to ). This should have led to the cluster resources 
becoming  but instead it is calculated to be 
.

The root cause is this piece of code from RMNodeImpl -
{code}
rmNode.context.getDispatcher().getEventHandler().handle(
    new NodeRemovedSchedulerEvent(rmNode));

if (!rmNode.getTotalCapability().equals(
    newNode.getTotalCapability())) {
  rmNode.totalCapability = newNode.getTotalCapability();
}
{code}

If the dispatcher is delayed in processing the event, then by the time the 
node removal is handled, rmNode.totalCapability = 
newNode.getTotalCapability() has already executed. The scheduler therefore 
subtracts the node's new capability from the cluster total rather than its 
old one.
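
The ordering problem can be reproduced in a few lines. The sketch below is only an illustration under stated assumptions: StaleEventDemo, Node, Scheduler, and the memory values are hypothetical and are not the real RMNodeImpl or CapacityScheduler code. A queued removal holds a reference to the live node, the node is then mutated, and only afterwards does the "dispatcher" run the queued events.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-ins for RMNode and the scheduler; not the real YARN classes. */
final class StaleEventDemo {

  /** Mutable node whose capability can change after an event is queued. */
  static final class Node {
    int memoryMb;

    Node(int memoryMb) {
      this.memoryMb = memoryMb;
    }
  }

  /** Scheduler that reads the capability from the node at handling time. */
  static final class Scheduler {
    int clusterMemoryMb;

    void addNode(Node n) {
      clusterMemoryMb += n.memoryMb;
    }

    void removeNode(Node n) {
      clusterMemoryMb -= n.memoryMb;
    }
  }

  /** Replays a reconnect with a delayed dispatcher; returns the resulting cluster total. */
  static int reconnectWithDelayedDispatch(int oldMb, int newMb, int otherNodeMb) {
    Scheduler scheduler = new Scheduler();
    Node node = new Node(oldMb);
    scheduler.addNode(node);
    scheduler.addNode(new Node(otherNodeMb));

    Queue<Runnable> dispatcher = new ArrayDeque<>();
    // Queue the removal; the event keeps a reference to the live node.
    dispatcher.add(() -> scheduler.removeNode(node));
    // The capability is updated before the removal has been processed.
    node.memoryMb = newMb;
    dispatcher.add(() -> scheduler.addNode(node));

    // The delayed dispatcher now runs: the removal subtracts newMb, not oldMb.
    while (!dispatcher.isEmpty()) {
      dispatcher.poll().run();
    }
    return scheduler.clusterMemoryMb;
  }

  public static void main(String[] args) {
    // Illustrative values only; the correct total would be 4096 + 8192 = 12288.
    System.out.println(reconnectWithDelayedDispatch(8192, 4096, 8192)); // prints 16384
  }
}
```

With these made-up numbers the cluster total stays at 16384 even though the node shrank, because the removal and the re-add both see the new value.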




[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15000361#comment-15000361
 ] 

Hadoop QA commented on YARN-4344:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 6s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 3m 6s | trunk passed |
| +1 | compile | 0m 21s | trunk passed with JDK v1.8.0_60 |
| +1 | compile | 0m 23s | trunk passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 11s | trunk passed |
| +1 | mvneclipse | 0m 15s | trunk passed |
| +1 | findbugs | 1m 6s | trunk passed |
| +1 | javadoc | 0m 21s | trunk passed with JDK v1.8.0_60 |
| +1 | javadoc | 0m 25s | trunk passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 27s | the patch passed |
| +1 | compile | 0m 21s | the patch passed with JDK v1.8.0_60 |
| +1 | javac | 0m 21s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 17s | the patch passed |
| +1 | javadoc | 0m 21s | the patch passed with JDK v1.8.0_60 |
| +1 | javadoc | 0m 25s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 59m 49s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_60. |
| -1 | unit | 61m 1s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
|   |   | 132m 2s |   |

|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| JDK v1.7.0_79 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |

|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-11 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12771722/YARN-4344.001.patch |
| JIRA Issue | YARN-4344 |
| Optional Tests | asflicense javac javadoc mvninstall unit findbugs checkstyle compile |
| uname | Linux a9687b820f5f 3.13.0-36-lowlatency #63-Ubuntu

[jira] [Commented] (YARN-4344) NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations

2015-11-11 Thread zhihai xu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001800#comment-15001800
 ] 

zhihai xu commented on YARN-4344:
-

Thanks for reporting this issue [~vvasudev]! Thanks for the review [~Jason 
Lowe]! 
[~rohithsharma] tried to clean up the code at YARN-3286. Based on the following 
comment from [~jianhe] at YARN-3286,
{code}
I think this has changed the behavior that without any RM/NM restart features 
enabled, earlier restarting a node will trigger RM to kill all the containers 
on this node, but now it won't ?
{code}
The patch may cause a compatibility issue. Maybe we can merge the case 
{{rmNode.getHttpPort() == newNode.getHttpPort()}} with the case 
{{rmNode.getHttpPort() != newNode.getHttpPort()}} when noRunningApps is true.
Thoughts?

