[jira] [Updated] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-02 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-3565:

Attachment: YARN-3565-20150502-1.patch

Hi [~Wangd], attaching a patch with the modifications to support NodeLabel 
instead of String in the NM HB/Register requests.

 NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
 instead of String
 -

 Key: YARN-3565
 URL: https://issues.apache.org/jira/browse/YARN-3565
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
Priority: Blocker
 Attachments: YARN-3565-20150502-1.patch


 Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
 want to support specifying NodeLabel attributes such as exclusivity/constraints, 
 etc. We need to make sure rolling upgrade works.
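 As a rough illustration of the proposed change (a sketch only, not the attached 
 patch; the way the request carries labels is simplified, and the 
 NodeLabel.newInstance(name, exclusivity) factory is assumed from 
 org.apache.hadoop.yarn.api.records):
 {code}
 import java.util.Collections;
 import java.util.Set;
 import org.apache.hadoop.yarn.api.records.NodeLabel;

 public class NodeLabelPayloadSketch {
   public static void main(String[] args) {
     // Current shape: only the label names travel in the NM register/heartbeat.
     Set<String> asStrings = Collections.singleton("GPU");

     // Proposed shape: the full NodeLabel record travels, so per-label
     // attributes such as exclusivity (and future constraints) can be added
     // without changing the wire type again.
     Set<NodeLabel> asObjects =
         Collections.singleton(NodeLabel.newInstance("GPU", false));

     System.out.println(asStrings + " -> " + asObjects);
   }
 }
 {code}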



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3381) A typographical error in InvalidStateTransitonException

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525103#comment-14525103
 ] 

Hadoop QA commented on YARN-3381:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 32s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 32s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 42s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   5m 19s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | mapreduce tests |   8m 50s | Tests passed in 
hadoop-mapreduce-client-app. |
| {color:green}+1{color} | yarn tests |   6m 47s | Tests passed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   1m 55s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   5m 50s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| {color:green}+1{color} | yarn tests |  52m 13s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | | 117m 46s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729924/YARN-3381-003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| hadoop-mapreduce-client-app test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7630/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7630/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7630/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7630/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7630/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7630/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7630/console |


This message was automatically generated.

 A typographical error in InvalidStateTransitonException
 -

 Key: YARN-3381
 URL: https://issues.apache.org/jira/browse/YARN-3381
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.6.0
Reporter: Xiaoshuang LU
Assignee: Brahma Reddy Battula
 Attachments: YARN-3381-002.patch, YARN-3381-003.patch, YARN-3381.patch


 It appears that InvalidStateTransitonException should be 
 InvalidStateTransitionException. "Transition" was misspelled.
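 For context, a rename like this is typically done in two steps so downstream 
 code keeps compiling; a minimal two-file sketch (class bodies simplified, and 
 the actual patch may differ):
 {code}
 // File: InvalidStateTransitionException.java -- the correctly spelled class.
 public class InvalidStateTransitionException extends RuntimeException {
   public InvalidStateTransitionException(String currentState, String event) {
     super("Invalid event: " + event + " at " + currentState);
   }
 }

 // File: InvalidStateTransitonException.java -- the old misspelled name kept
 // as a deprecated alias so existing catch blocks keep working until callers
 // migrate to the new name.
 @Deprecated
 public class InvalidStateTransitonException extends InvalidStateTransitionException {
   public InvalidStateTransitonException(String currentState, String event) {
     super(currentState, event);
   }
 }
 {code}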



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3554) Default value for maximum nodemanager connect wait time is too high

2015-05-02 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525143#comment-14525143
 ] 

sandflee commented on YARN-3554:


Maybe this was set to a bigger value based on network-partition considerations, 
not only for NM restart.

 Default value for maximum nodemanager connect wait time is too high
 ---

 Key: YARN-3554
 URL: https://issues.apache.org/jira/browse/YARN-3554
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Naganarasimha G R
  Labels: newbie
 Attachments: YARN-3554-20150429-2.patch, YARN-3554.20150429-1.patch


 The default value for yarn.client.nodemanager-connect.max-wait-ms is 900000 
 msec, or 15 minutes, which is way too high. The default container expiry time 
 from the RM and the default task timeout in MapReduce are both only 10 
 minutes.
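 Until the default changes, the wait can be lowered per client; a minimal 
 sketch (the 3-minute value is only an illustration, not a value agreed on in 
 this JIRA):
 {code}
 import org.apache.hadoop.yarn.conf.YarnConfiguration;

 public class NmConnectWaitSketch {
   public static void main(String[] args) {
     YarnConfiguration conf = new YarnConfiguration();
     // Default is 900000 ms (15 min); cap it well below the 10-minute
     // container-expiry / MapReduce task-timeout window mentioned above.
     conf.setLong("yarn.client.nodemanager-connect.max-wait-ms", 3 * 60 * 1000L);
     System.out.println(
         conf.getLong("yarn.client.nodemanager-connect.max-wait-ms", -1));
   }
 }
 {code}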



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3513) Remove unused variables in ContainersMonitorImpl

2015-05-02 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525252#comment-14525252
 ] 

Naganarasimha G R commented on YARN-3513:
-

Thanks for commenting on this, [~gtCarrera]. I found this while going through 
the YARN-3334 (YARN-2928 sub-JIRA) patch modifications after it was committed. 
As per the code in the 2928 branch, {{vmemStillInUsage}} and 
{{pmemStillInUsage}} are not used, while the other variables 
{{currentPmemUsage}} and {{cpuUsageTotalCoresPercentage}} are used to 
publish the container metrics to ATS.

 Remove unused variables in ContainersMonitorImpl
 

 Key: YARN-3513
 URL: https://issues.apache.org/jira/browse/YARN-3513
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
Priority: Trivial
  Labels: newbie
 Fix For: 2.8.0

 Attachments: YARN-3513.20150421-1.patch


 Class members ({{private final Context context;}}) and some local variables 
 in MonitoringThread.run() ({{vmemStillInUsage}} and {{pmemStillInUsage}}) are 
 not used, only updated.
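 The pattern being removed is a plain dead store; a simplified stand-in for 
 what the cleanup looks like (illustrative only, not the actual 
 ContainersMonitorImpl code):
 {code}
 public class DeadStoreSketch {
   public static void main(String[] args) {
     long pmemStillInUsage = 0;  // only ever written below -> dead store, remove
     long currentPmemUsage = 0;  // actually read (e.g. published as a metric) -> keep
     for (long usage : new long[] {100, 200}) {
       pmemStillInUsage += usage; // dead store
       currentPmemUsage = usage;  // live value
     }
     System.out.println("currentPmemUsage=" + currentPmemUsage);
   }
 }
 {code}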



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525299#comment-14525299
 ] 

Hudson commented on YARN-3363:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #182 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/182/])
YARN-3363. add localization and container launch time to ContainerMetrics at NM 
to show these timing information for each active container. (zxu via rkanter) 
(rkanter: rev ac7d152901e29b1f444507fe4e421eb6e1402b5a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerStartMonitoringEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* hadoop-yarn-project/CHANGES.txt


 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability
 Fix For: 2.8.0

 Attachments: YARN-3363.000.patch, YARN-3363.001.patch


 Add localization and container launch time to ContainerMetrics at the NM to 
 show this timing information for each active container.
 Currently ContainerMetrics has the container's actual memory usage (YARN-2984), 
 actual CPU usage (YARN-3122), resource and pid (YARN-3022). It would be better 
 to have localization and container launch time in ContainerMetrics for each 
 active container.
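 A sketch of what recording the two durations could look like with Hadoop's 
 metrics2 mutables (the class, method, and metric names here are illustrative 
 assumptions, not the committed ContainerMetrics API):
 {code}
 import org.apache.hadoop.metrics2.lib.MetricsRegistry;
 import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

 public class ContainerTimingSketch {
   private final MetricsRegistry registry = new MetricsRegistry("ContainerResource");

   // Hypothetical gauges for the two timings this JIRA adds.
   private final MutableGaugeLong localizationDurationMs =
       registry.newGauge("localizationDurationMs", "Time spent localizing", 0L);
   private final MutableGaugeLong launchDurationMs =
       registry.newGauge("launchDurationMs", "Time from localized to launched", 0L);

   public void recordLocalization(long startMs, long endMs) {
     localizationDurationMs.set(endMs - startMs);
   }

   public void recordLaunch(long startMs, long endMs) {
     launchDurationMs.set(endMs - startMs);
   }
 }
 {code}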



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525296#comment-14525296
 ] 

Hudson commented on YARN-2893:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #182 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/182/])
YARN-2893. AMLaucher: sporadic job failures due to EOFException in 
readTokenStorageStream. (Zhihai Xu via gera) (gera: rev 
f8204e241d9271497defd4d42646fb89c61cefe3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* hadoop-yarn-project/CHANGES.txt


 AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
 --

 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
 YARN-2893.002.patch, YARN-2893.003.patch, YARN-2893.004.patch, 
 YARN-2893.005.patch


 MapReduce jobs on our clusters experience sporadic failures due to corrupt 
 tokens in the AM launch context.
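 For reference, the parsing that fails here follows the usual Credentials 
 pattern; a minimal sketch of reading the serialized tokens from a launch 
 context (the surrounding validation/error handling is an illustration, not 
 the committed fix):
 {code}
 import java.io.IOException;
 import java.nio.ByteBuffer;
 import org.apache.hadoop.io.DataInputByteBuffer;
 import org.apache.hadoop.security.Credentials;

 public class TokenParseSketch {
   // Parses the serialized tokens of a container launch context. A truncated
   // or corrupt buffer surfaces as an EOFException (an IOException subclass)
   // from readTokenStorageStream, which is what this issue reports.
   public static Credentials parse(ByteBuffer tokens) throws IOException {
     DataInputByteBuffer in = new DataInputByteBuffer();
     tokens.rewind();
     in.reset(tokens);
     Credentials credentials = new Credentials();
     credentials.readTokenStorageStream(in);
     return credentials;
   }
 }
 {code}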



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3006) Improve the error message when attempting manual failover with auto-failover enabled

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525298#comment-14525298
 ] 

Hudson commented on YARN-3006:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #182 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/182/])
YARN-3006. Improve the error message when attempting manual failover with 
auto-failover enabled. (Akira AJISAKA via wangda) (wangda: rev 
7d46a806e71de6692cd230e64e7de18a8252019d)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java


 Improve the error message when attempting manual failover with auto-failover 
 enabled
 

 Key: YARN-3006
 URL: https://issues.apache.org/jira/browse/YARN-3006
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-3006.001.patch


 When executing manual failover with automatic failover enabled, 
 UnsupportedOperationException is thrown.
 {code}
 # yarn rmadmin -failover rm1 rm2
 Exception in thread main java.lang.UnsupportedOperationException: 
 RMHAServiceTarget doesn't have a corresponding ZKFC address
   at 
 org.apache.hadoop.yarn.client.RMHAServiceTarget.getZKFCAddress(RMHAServiceTarget.java:51)
   at 
 org.apache.hadoop.ha.HAServiceTarget.getZKFCProxy(HAServiceTarget.java:94)
   at 
 org.apache.hadoop.ha.HAAdmin.gracefulFailoverThroughZKFCs(HAAdmin.java:311)
   at org.apache.hadoop.ha.HAAdmin.failover(HAAdmin.java:282)
   at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:449)
   at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:378)
   at org.apache.hadoop.yarn.client.cli.RMAdminCLI.run(RMAdminCLI.java:482)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
   at 
 org.apache.hadoop.yarn.client.cli.RMAdminCLI.main(RMAdminCLI.java:622)
 {code}
 I'm thinking the above message is confusing to users. (Users may wonder 
 whether ZKFC is configured correctly...) The command should output an error 
 message to stderr instead of throwing an exception.
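 A minimal sketch of the suggested behaviour: catch the condition in the CLI 
 path and print a plain message to stderr instead of letting the exception 
 escape (the message text and exit code are assumptions, not the committed 
 change):
 {code}
 public class FailoverCliSketch {
   public static void main(String[] args) {
     try {
       // Stand-in for the existing graceful-failover call, which asks the HA
       // target for a ZKFC address that RM HA does not provide.
       throw new UnsupportedOperationException(
           "RMHAServiceTarget doesn't have a corresponding ZKFC address");
     } catch (UnsupportedOperationException e) {
       System.err.println("failover: automatic failover is enabled for this "
           + "ResourceManager; manual failover via rmadmin is not supported.");
       System.exit(-1); // fail cleanly instead of dumping a stack trace
     }
   }
 }
 {code}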



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525215#comment-14525215
 ] 

Hudson commented on YARN-2893:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #915 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/915/])
YARN-2893. AMLaucher: sporadic job failures due to EOFException in 
readTokenStorageStream. (Zhihai Xu via gera) (gera: rev 
f8204e241d9271497defd4d42646fb89c61cefe3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java


 AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
 --

 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
 YARN-2893.002.patch, YARN-2893.003.patch, YARN-2893.004.patch, 
 YARN-2893.005.patch


 MapReduce jobs on our clusters experience sporadic failures due to corrupt 
 tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3006) Improve the error message when attempting manual failover with auto-failover enabled

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525217#comment-14525217
 ] 

Hudson commented on YARN-3006:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #915 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/915/])
YARN-3006. Improve the error message when attempting manual failover with 
auto-failover enabled. (Akira AJISAKA via wangda) (wangda: rev 
7d46a806e71de6692cd230e64e7de18a8252019d)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java


 Improve the error message when attempting manual failover with auto-failover 
 enabled
 

 Key: YARN-3006
 URL: https://issues.apache.org/jira/browse/YARN-3006
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-3006.001.patch


 When executing manual failover with automatic failover enabled, 
 UnsupportedOperationException is thrown.
 {code}
 # yarn rmadmin -failover rm1 rm2
 Exception in thread main java.lang.UnsupportedOperationException: 
 RMHAServiceTarget doesn't have a corresponding ZKFC address
   at 
 org.apache.hadoop.yarn.client.RMHAServiceTarget.getZKFCAddress(RMHAServiceTarget.java:51)
   at 
 org.apache.hadoop.ha.HAServiceTarget.getZKFCProxy(HAServiceTarget.java:94)
   at 
 org.apache.hadoop.ha.HAAdmin.gracefulFailoverThroughZKFCs(HAAdmin.java:311)
   at org.apache.hadoop.ha.HAAdmin.failover(HAAdmin.java:282)
   at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:449)
   at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:378)
   at org.apache.hadoop.yarn.client.cli.RMAdminCLI.run(RMAdminCLI.java:482)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
   at 
 org.apache.hadoop.yarn.client.cli.RMAdminCLI.main(RMAdminCLI.java:622)
 {code}
 I'm thinking the above message is confusing to users. (Users may wonder 
 whether ZKFC is configured correctly...) The command should output an error 
 message to stderr instead of throwing an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525212#comment-14525212
 ] 

Hadoop QA commented on YARN-3565:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 41s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   3m 23s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  3s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   5m 51s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 21s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 55s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-server-common. |
| {color:green}+1{color} | yarn tests |   5m 49s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| {color:green}+1{color} | yarn tests |  52m 19s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | | 104m 29s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729929/YARN-3565-20150502-1.patch
 |
| Optional Tests | javac unit findbugs checkstyle javadoc |
| git revision | trunk / f1a152c |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7655/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7655/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7655/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7655/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7655/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7655/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7655/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7655/console |


This message was automatically generated.

 NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
 instead of String
 -

 Key: YARN-3565
 URL: https://issues.apache.org/jira/browse/YARN-3565
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
Priority: Blocker
 Attachments: YARN-3565-20150502-1.patch


 Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
 want to support specifying NodeLabel attributes such as exclusivity/constraints, 
 etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525218#comment-14525218
 ] 

Hudson commented on YARN-3363:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #915 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/915/])
YARN-3363. add localization and container launch time to ContainerMetrics at NM 
to show these timing information for each active container. (zxu via rkanter) 
(rkanter: rev ac7d152901e29b1f444507fe4e421eb6e1402b5a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerStartMonitoringEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java


 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability
 Fix For: 2.8.0

 Attachments: YARN-3363.000.patch, YARN-3363.001.patch


 Add localization and container launch time to ContainerMetrics at the NM to 
 show this timing information for each active container.
 Currently ContainerMetrics has the container's actual memory usage (YARN-2984), 
 actual CPU usage (YARN-3122), resource and pid (YARN-3022). It would be better 
 to have localization and container launch time in ContainerMetrics for each 
 active container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525283#comment-14525283
 ] 

Hudson commented on YARN-2893:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2113 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2113/])
YARN-2893. AMLaucher: sporadic job failures due to EOFException in 
readTokenStorageStream. (Zhihai Xu via gera) (gera: rev 
f8204e241d9271497defd4d42646fb89c61cefe3)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java


 AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
 --

 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
 YARN-2893.002.patch, YARN-2893.003.patch, YARN-2893.004.patch, 
 YARN-2893.005.patch


 MapReduce jobs on our clusters experience sporadic failures due to corrupt 
 tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3006) Improve the error message when attempting manual failover with auto-failover enabled

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525285#comment-14525285
 ] 

Hudson commented on YARN-3006:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2113 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2113/])
YARN-3006. Improve the error message when attempting manual failover with 
auto-failover enabled. (Akira AJISAKA via wangda) (wangda: rev 
7d46a806e71de6692cd230e64e7de18a8252019d)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java


 Improve the error message when attempting manual failover with auto-failover 
 enabled
 

 Key: YARN-3006
 URL: https://issues.apache.org/jira/browse/YARN-3006
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-3006.001.patch


 When executing manual failover with automatic failover enabled, 
 UnsupportedOperationException is thrown.
 {code}
 # yarn rmadmin -failover rm1 rm2
 Exception in thread main java.lang.UnsupportedOperationException: 
 RMHAServiceTarget doesn't have a corresponding ZKFC address
   at 
 org.apache.hadoop.yarn.client.RMHAServiceTarget.getZKFCAddress(RMHAServiceTarget.java:51)
   at 
 org.apache.hadoop.ha.HAServiceTarget.getZKFCProxy(HAServiceTarget.java:94)
   at 
 org.apache.hadoop.ha.HAAdmin.gracefulFailoverThroughZKFCs(HAAdmin.java:311)
   at org.apache.hadoop.ha.HAAdmin.failover(HAAdmin.java:282)
   at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:449)
   at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:378)
   at org.apache.hadoop.yarn.client.cli.RMAdminCLI.run(RMAdminCLI.java:482)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
   at 
 org.apache.hadoop.yarn.client.cli.RMAdminCLI.main(RMAdminCLI.java:622)
 {code}
 I'm thinking the above message is confusing to users. (Users may wonder 
 whether ZKFC is configured correctly...) The command should output an error 
 message to stderr instead of throwing an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525286#comment-14525286
 ] 

Hudson commented on YARN-3363:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2113 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2113/])
YARN-3363. add localization and container launch time to ContainerMetrics at NM 
to show these timing information for each active container. (zxu via rkanter) 
(rkanter: rev ac7d152901e29b1f444507fe4e421eb6e1402b5a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerStartMonitoringEvent.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java


 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability
 Fix For: 2.8.0

 Attachments: YARN-3363.000.patch, YARN-3363.001.patch


 Add localization and container launch time to ContainerMetrics at the NM to 
 show this timing information for each active container.
 Currently ContainerMetrics has the container's actual memory usage (YARN-2984), 
 actual CPU usage (YARN-3122), resource and pid (YARN-3022). It would be better 
 to have localization and container launch time in ContainerMetrics for each 
 active container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525198#comment-14525198
 ] 

Hudson commented on YARN-2893:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #181 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/181/])
YARN-2893. AMLaucher: sporadic job failures due to EOFException in 
readTokenStorageStream. (Zhihai Xu via gera) (gera: rev 
f8204e241d9271497defd4d42646fb89c61cefe3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java


 AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
 --

 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
 YARN-2893.002.patch, YARN-2893.003.patch, YARN-2893.004.patch, 
 YARN-2893.005.patch


 MapReduce jobs on our clusters experience sporadic failures due to corrupt 
 tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525201#comment-14525201
 ] 

Hudson commented on YARN-3363:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #181 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/181/])
YARN-3363. add localization and container launch time to ContainerMetrics at NM 
to show these timing information for each active container. (zxu via rkanter) 
(rkanter: rev ac7d152901e29b1f444507fe4e421eb6e1402b5a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerStartMonitoringEvent.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java


 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability
 Fix For: 2.8.0

 Attachments: YARN-3363.000.patch, YARN-3363.001.patch


 Add localization and container launch time to ContainerMetrics at the NM to 
 show this timing information for each active container.
 Currently ContainerMetrics has the container's actual memory usage (YARN-2984), 
 actual CPU usage (YARN-3122), resource and pid (YARN-3022). It would be better 
 to have localization and container launch time in ContainerMetrics for each 
 active container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3006) Improve the error message when attempting manual failover with auto-failover enabled

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525200#comment-14525200
 ] 

Hudson commented on YARN-3006:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #181 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/181/])
YARN-3006. Improve the error message when attempting manual failover with 
auto-failover enabled. (Akira AJISAKA via wangda) (wangda: rev 
7d46a806e71de6692cd230e64e7de18a8252019d)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java
* hadoop-yarn-project/CHANGES.txt


 Improve the error message when attempting manual failover with auto-failover 
 enabled
 

 Key: YARN-3006
 URL: https://issues.apache.org/jira/browse/YARN-3006
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-3006.001.patch


 When executing manual failover with automatic failover enabled, 
 UnsupportedOperationException is thrown.
 {code}
 # yarn rmadmin -failover rm1 rm2
 Exception in thread main java.lang.UnsupportedOperationException: 
 RMHAServiceTarget doesn't have a corresponding ZKFC address
   at 
 org.apache.hadoop.yarn.client.RMHAServiceTarget.getZKFCAddress(RMHAServiceTarget.java:51)
   at 
 org.apache.hadoop.ha.HAServiceTarget.getZKFCProxy(HAServiceTarget.java:94)
   at 
 org.apache.hadoop.ha.HAAdmin.gracefulFailoverThroughZKFCs(HAAdmin.java:311)
   at org.apache.hadoop.ha.HAAdmin.failover(HAAdmin.java:282)
   at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:449)
   at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:378)
   at org.apache.hadoop.yarn.client.cli.RMAdminCLI.run(RMAdminCLI.java:482)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
   at 
 org.apache.hadoop.yarn.client.cli.RMAdminCLI.main(RMAdminCLI.java:622)
 {code}
 I'm thinking the above message is confusing to users. (Users may wonder 
 whether ZKFC is configured correctly...) The command should output an error 
 message to stderr instead of throwing an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (YARN-2764) counters.LimitExceededException shouldn't abort AsyncDispatcher

2015-05-02 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened YARN-2764:
--

 counters.LimitExceededException shouldn't abort AsyncDispatcher
 ---

 Key: YARN-2764
 URL: https://issues.apache.org/jira/browse/YARN-2764
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.5.1
Reporter: Ted Yu
  Labels: counters

 I saw the following in container log:
 {code}
 2014-10-25 10:28:55,052 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with 
 attemptattempt_1414221548789_0023_r_03_0
 2014-10-25 10:28:55,052 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
 task_1414221548789_0023_r_03 Task Transitioned from RUNNING to SUCCEEDED
 2014-10-25 10:28:55,052 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 24
 2014-10-25 10:28:55,053 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: 
 job_1414221548789_0023Job Transitioned from RUNNING to COMMITTING
 2014-10-25 10:28:55,054 INFO [CommitterEvent Processor #1] 
 org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing 
 the event EventType: JOB_COMMIT
 2014-10-25 10:28:55,177 FATAL [AsyncDispatcher event handler] 
 org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
 org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many 
 counters: 121 max=120
   at 
 org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:101)
   at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:108)
   at 
 org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:78)
   at 
 org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:95)
   at 
 org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:106)
   at 
 org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:203)
   at 
 org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:348)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.constructFinalFullcounters(JobImpl.java:1754)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.mayBeConstructFinalFullCounters(JobImpl.java:1737)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.createJobFinishedEvent(JobImpl.java:1718)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.logJobHistoryFinishedEvent(JobImpl.java:1089)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$CommitSucceededTransition.transition(JobImpl.java:2049)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$CommitSucceededTransition.transition(JobImpl.java:2045)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1289)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1285)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
   at java.lang.Thread.run(Thread.java:745)
 2014-10-25 10:28:55,185 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
 {code}
 Counter limit was exceeded when JobFinishedEvent was created.
 Better handling of LimitExceededException should be provided so that 
 AsyncDispatcher can continue functioning.
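 One way to read the request, sketched below: the final-counter merge (or the 
 dispatcher around it) should tolerate the limit being hit instead of dying. 
 This is a simplified illustration, not the actual JobImpl code:
 {code}
 import org.apache.hadoop.mapreduce.Counters;
 import org.apache.hadoop.mapreduce.counters.LimitExceededException;

 public class CounterMergeSketch {
   // Merges task counters into the job-level total but degrades gracefully
   // when the configured counter limit is exceeded, so the AsyncDispatcher
   // thread is not killed by the unchecked exception.
   public static Counters mergeSafely(Counters total, Counters taskCounters) {
     try {
       total.incrAllCounters(taskCounters);
     } catch (LimitExceededException e) {
       // Log and keep the partially merged counters instead of propagating.
       System.err.println("Counter limit exceeded, job counters truncated: " + e);
     }
     return total;
   }
 }
 {code}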



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2454) Fix compareTo of variable UNBOUNDED in o.a.h.y.util.resource.Resources.

2015-05-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-2454:
-
Summary: Fix compareTo of variable UNBOUNDED in 
o.a.h.y.util.resource.Resources.  (was: The function compareTo of variable 
UNBOUNDED in org.apache.hadoop.yarn.util.resource.Resources is definited wrong.)

 Fix compareTo of variable UNBOUNDED in o.a.h.y.util.resource.Resources.
 ---

 Key: YARN-2454
 URL: https://issues.apache.org/jira/browse/YARN-2454
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0, 2.5.0, 2.4.1
Reporter: Xu Yang
Assignee: Xu Yang
 Attachments: YARN-2454 -v2.patch, YARN-2454-patch.diff, 
 YARN-2454.patch


 The variable UNBOUNDED implements the abstract class Resources and overrides 
 the function compareTo, but there is something wrong in this function: it 
 should not compare resources against zero the same way the variable NONE does; 
 we should change 0 to Integer.MAX_VALUE.
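 A sketch of the intended comparison semantics (simplified; the real UNBOUNDED 
 is an anonymous Resource subclass inside o.a.h.y.util.resource.Resources, so 
 this helper only illustrates the ordering being asked for):
 {code}
 import org.apache.hadoop.yarn.api.records.Resource;

 public class UnboundedCompareSketch {
   // UNBOUNDED should compare as if it held the maximum possible memory and
   // vcores (Integer.MAX_VALUE), not 0, which would make it order like NONE.
   public static int compareToUnbounded(Resource other) {
     int diff = Integer.MAX_VALUE - other.getMemory();
     if (diff == 0) {
       diff = Integer.MAX_VALUE - other.getVirtualCores();
     }
     return diff;
   }
 }
 {code}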



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1418) Add Tracing to YARN

2015-05-02 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525259#comment-14525259
 ] 

Masatake Iwasaki commented on YARN-1418:


Hi [~djp], we don't have a proposal yet, but I will write one and attach it in 
a few days.


 Add Tracing to YARN
 ---

 Key: YARN-1418
 URL: https://issues.apache.org/jira/browse/YARN-1418
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api, nodemanager, resourcemanager
Reporter: Masatake Iwasaki
Assignee: Yi Liu

 Adding tracing using HTrace in the same way as HBASE-6449 and HDFS-5274.
 Most of the changes needed for the basics, such as RPC, seem to be almost 
 ready in HDFS-5274.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1564) add some basic workflow YARN services

2015-05-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525243#comment-14525243
 ] 

Steve Loughran commented on YARN-1564:
--

I should look at the tests for the execution service; we have a fork of these 
in Slider, and they were failing on Windows unless the installation had the 
right path set up with all the Cygwin binaries (ls and the like).

 add some basic workflow YARN services
 -

 Key: YARN-1564
 URL: https://issues.apache.org/jira/browse/YARN-1564
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: api
Affects Versions: 2.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Attachments: YARN-1564-001.patch

   Original Estimate: 24h
  Time Spent: 48h
  Remaining Estimate: 0h

 I've been using some alternative composite services to help build workflows 
 of process execution in a YARN AM.
 They and their tests could be moved into YARN for use by others; this would 
 make it easier to build aggregate services in an AM.
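 For a feel of what such a workflow service looks like, here is a minimal 
 sketch on top of the existing org.apache.hadoop.service API (the "run children 
 as ordered stages" idea is an illustration of the proposal, not the attached 
 patch):
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.service.AbstractService;
 import org.apache.hadoop.service.CompositeService;
 import org.apache.hadoop.service.Service;

 public class WorkflowSequenceSketch extends CompositeService {
   public WorkflowSequenceSketch() {
     super("WorkflowSequenceSketch");
   }

   // Children added here are inited/started/stopped with the parent; a real
   // workflow variant would additionally chain them so the next stage starts
   // only when the previous one completes.
   public void addStage(Service stage) {
     addService(stage);
   }

   public static void main(String[] args) {
     WorkflowSequenceSketch wf = new WorkflowSequenceSketch();
     wf.addStage(new AbstractService("stage-1") { });
     wf.addStage(new AbstractService("stage-2") { });
     wf.init(new Configuration());
     wf.start();
     wf.stop();
   }
 }
 {code}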



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525315#comment-14525315
 ] 

Hudson commented on YARN-2893:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2131 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2131/])
YARN-2893. AMLaucher: sporadic job failures due to EOFException in 
readTokenStorageStream. (Zhihai Xu via gera) (gera: rev 
f8204e241d9271497defd4d42646fb89c61cefe3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java


 AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
 --

 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
 YARN-2893.002.patch, YARN-2893.003.patch, YARN-2893.004.patch, 
 YARN-2893.005.patch


 MapReduce jobs on our clusters experience sporadic failures due to corrupt 
 tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3006) Improve the error message when attempting manual failover with auto-failover enabled

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525317#comment-14525317
 ] 

Hudson commented on YARN-3006:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2131 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2131/])
YARN-3006. Improve the error message when attempting manual failover with 
auto-failover enabled. (Akira AJISAKA via wangda) (wangda: rev 
7d46a806e71de6692cd230e64e7de18a8252019d)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java


 Improve the error message when attempting manual failover with auto-failover 
 enabled
 

 Key: YARN-3006
 URL: https://issues.apache.org/jira/browse/YARN-3006
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-3006.001.patch


 When executing manual failover with automatic failover enabled, 
 UnsupportedOperationException is thrown.
 {code}
 # yarn rmadmin -failover rm1 rm2
 Exception in thread main java.lang.UnsupportedOperationException: 
 RMHAServiceTarget doesn't have a corresponding ZKFC address
   at 
 org.apache.hadoop.yarn.client.RMHAServiceTarget.getZKFCAddress(RMHAServiceTarget.java:51)
   at 
 org.apache.hadoop.ha.HAServiceTarget.getZKFCProxy(HAServiceTarget.java:94)
   at 
 org.apache.hadoop.ha.HAAdmin.gracefulFailoverThroughZKFCs(HAAdmin.java:311)
   at org.apache.hadoop.ha.HAAdmin.failover(HAAdmin.java:282)
   at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:449)
   at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:378)
   at org.apache.hadoop.yarn.client.cli.RMAdminCLI.run(RMAdminCLI.java:482)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
   at 
 org.apache.hadoop.yarn.client.cli.RMAdminCLI.main(RMAdminCLI.java:622)
 {code}
 I'm thinking the above message is confusing to users. (Users may wonder 
 whether ZKFC is configured correctly...) The command should output an error 
 message to stderr instead of throwing an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525318#comment-14525318
 ] 

Hudson commented on YARN-3363:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2131 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2131/])
YARN-3363. add localization and container launch time to ContainerMetrics at NM 
to show these timing information for each active container. (zxu via rkanter) 
(rkanter: rev ac7d152901e29b1f444507fe4e421eb6e1402b5a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerStartMonitoringEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java


 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability
 Fix For: 2.8.0

 Attachments: YARN-3363.000.patch, YARN-3363.001.patch


 Add localization and container launch time to ContainerMetrics at the NM to 
 show this timing information for each active container.
 Currently ContainerMetrics has the container's actual memory usage (YARN-2984), 
 actual CPU usage (YARN-3122), resource and pid (YARN-3022). It would be better 
 to have localization and container launch time in ContainerMetrics for each 
 active container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-02 Thread Jun Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Gong updated YARN-3480:
---
Description: 
When RM HA is enabled and running containers are kept across attempts, apps are 
more likely to finish successfully with more retries (attempts), so it will be 
better to set 'yarn.resourcemanager.am.max-attempts' larger. However, this will 
make the RMStateStore (FileSystem/HDFS/ZK) store more attempts and make the RM 
recovery process much slower. It might be better to make the maximum number of 
attempts stored in the RMStateStore configurable.

BTW: when 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to a 
small value, the number of retried attempts might be very large, so we need to 
delete some of the attempts stored in RMAppImpl and the RMStateStore.

  was:When RM HA is enabled and running containers are kept across attempts, 
apps are more likely to finish successfully with more retries(attempts), so it 
will be better to set 'yarn.resourcemanager.am.max-attempts' larger. However it 
will make RMStateStore(FileSystem/HDFS/ZK) store more attempts, and make RM 
recover process much slower. It might be better to set max attempts to be 
stored in RMStateStore.


 Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable
 

 Key: YARN-3480
 URL: https://issues.apache.org/jira/browse/YARN-3480
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Attachments: YARN-3480.01.patch, YARN-3480.02.patch


 When RM HA is enabled and running containers are kept across attempts, apps 
 are more likely to finish successfully with more retries (attempts), so it 
 will be better to set 'yarn.resourcemanager.am.max-attempts' larger. However, 
 it will make RMStateStore (FileSystem/HDFS/ZK) store more attempts, and make 
 the RM recovery process much slower. It might be better to make the max number 
 of attempts stored in RMStateStore configurable.
 BTW: when 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to 
 a small value, the number of retried attempts might be very large, so we need 
 to delete some of the attempts stored in RMStateStore and RMAppImpl.
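
To make the idea concrete, here is a minimal, hypothetical sketch of keeping 
only the most recent N attempts while still allowing a large retry budget; the 
class and method names below are illustrative only and are not taken from the 
attached patches.
{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model: retain at most maxAttemptsToStore attempt ids so that
// RM recovery does not have to replay every historical attempt.
public class BoundedAttemptHistory {
  private final int maxAttemptsToStore;
  private final Deque<String> storedAttemptIds = new ArrayDeque<String>();

  public BoundedAttemptHistory(int maxAttemptsToStore) {
    this.maxAttemptsToStore = maxAttemptsToStore;
  }

  public void recordAttempt(String attemptId) {
    storedAttemptIds.addLast(attemptId);
    if (storedAttemptIds.size() > maxAttemptsToStore) {
      // In the real RM this would also remove the oldest attempt from the store.
      storedAttemptIds.removeFirst();
    }
  }

  public int storedAttempts() {
    return storedAttemptIds.size();
  }
}
{code}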



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3554) Default value for maximum nodemanager connect wait time is too high

2015-05-02 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525245#comment-14525245
 ] 

Naganarasimha G R commented on YARN-3554:
-

Hi [~gtCarrera9],
Thanks for commenting on this jira, but I did not completely get the intention: 
are you expecting me to merge the changes required for YARN-3518 here?
If so, I had a few questions:
1. YARN-3518 tries to modify the default value of 
yarn.resourcemanager.connect.max-wait.ms from 90 to 60, which impacts not only 
the AM -> RM timeout but also NM -> RM and client (CLI, web, application report, 
etc.) -> RM. Is that ok? (I am ok with it, but just wanted to point it out.)
2. Given the current high-availability support, is it still valid to wait for 
10 mins to detect that the RM has failed, or shall I decrease that too, to 3 
mins?

If you confirm, I can merge the changes of YARN-3518 and also update 
yarn-default.xml, which is missing in YARN-3518.

 Default value for maximum nodemanager connect wait time is too high
 ---

 Key: YARN-3554
 URL: https://issues.apache.org/jira/browse/YARN-3554
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Naganarasimha G R
  Labels: newbie
 Attachments: YARN-3554-20150429-2.patch, YARN-3554.20150429-1.patch


 The default value for yarn.client.nodemanager-connect.max-wait-ms is 900000 
 msec, or 15 minutes, which is way too high.  The default container expiry time 
 from the RM and the default task timeout in MapReduce are both only 10 
 minutes.
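
As a hedged illustration only, the properties discussed here can be overridden 
programmatically through the Hadoop Configuration API; the 180000 ms (3 minute) 
value below is just an example, not a default proposed by any attached patch.
{code}
import org.apache.hadoop.conf.Configuration;

public class ConnectWaitOverride {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Example values only: lower both waits to 3 minutes.
    conf.setLong("yarn.client.nodemanager-connect.max-wait-ms", 180000L);
    conf.setLong("yarn.resourcemanager.connect.max-wait.ms", 180000L);
    System.out.println(conf.get("yarn.client.nodemanager-connect.max-wait-ms"));
  }
}
{code}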



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-02 Thread Jun Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Gong updated YARN-3480:
---
Attachment: YARN-3480.03.patch

Updated patch. Fixed the javac warning, checkstyle issues, and test case errors.

 Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable
 

 Key: YARN-3480
 URL: https://issues.apache.org/jira/browse/YARN-3480
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Attachments: YARN-3480.01.patch, YARN-3480.02.patch, 
 YARN-3480.03.patch


 When RM HA is enabled and running containers are kept across attempts, apps 
 are more likely to finish successfully with more retries (attempts), so it 
 will be better to set 'yarn.resourcemanager.am.max-attempts' larger. However, 
 it will make RMStateStore (FileSystem/HDFS/ZK) store more attempts, and make 
 the RM recovery process much slower. It might be better to make the max number 
 of attempts stored in RMStateStore configurable.
 BTW: when 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to 
 a small value, the number of retried attempts might be very large, so we need 
 to delete some of the attempts stored in RMStateStore and RMAppImpl.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1946) need Public interface for WebAppUtils.getProxyHostAndPort

2015-05-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525242#comment-14525242
 ] 

Steve Loughran commented on YARN-1946:
--

This has got more complex with the HA proxy stuff: there's more than one 
(host, port) pair, and the caller needs to handle failover.

It's still something that could be made public for anyone wanting to do their 
own AmIPFilter; they'd just need to add the failover logic themselves.

 need Public interface for WebAppUtils.getProxyHostAndPort
 -

 Key: YARN-1946
 URL: https://issues.apache.org/jira/browse/YARN-1946
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, webapp
Affects Versions: 2.4.0
Reporter: Thomas Graves
Priority: Critical

 ApplicationMasters are supposed to go through the ResourceManager web app 
 proxy if they have web UIs, so they are properly secured.  There is currently 
 no public interface for Application Masters to conveniently get the proxy 
 host and port.  There is a function in WebAppUtils, but that class is 
 private.  
 We should provide this as a utility since any properly written AM will need 
 to do this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3554) Default value for maximum nodemanager connect wait time is too high

2015-05-02 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525264#comment-14525264
 ] 

sandflee commented on YARN-3554:


Hi [~Naganarasimha], 3 mins seems dangerous. If the RM fails over and the 
recovery takes several minutes, the NM may kill all containers; in a production 
env that's not expected.

 Default value for maximum nodemanager connect wait time is too high
 ---

 Key: YARN-3554
 URL: https://issues.apache.org/jira/browse/YARN-3554
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Naganarasimha G R
  Labels: newbie
 Attachments: YARN-3554-20150429-2.patch, YARN-3554.20150429-1.patch


 The default value for yarn.client.nodemanager-connect.max-wait-ms is 900000 
 msec, or 15 minutes, which is way too high.  The default container expiry time 
 from the RM and the default task timeout in MapReduce are both only 10 
 minutes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3006) Improve the error message when attempting manual failover with auto-failover enabled

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525272#comment-14525272
 ] 

Hudson commented on YARN-3006:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #172 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/172/])
YARN-3006. Improve the error message when attempting manual failover with 
auto-failover enabled. (Akira AJISAKA via wangda) (wangda: rev 
7d46a806e71de6692cd230e64e7de18a8252019d)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java
* hadoop-yarn-project/CHANGES.txt


 Improve the error message when attempting manual failover with auto-failover 
 enabled
 

 Key: YARN-3006
 URL: https://issues.apache.org/jira/browse/YARN-3006
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-3006.001.patch


 When executing manual failover with automatic failover enabled, 
 UnsupportedOperationException is thrown.
 {code}
 # yarn rmadmin -failover rm1 rm2
 Exception in thread "main" java.lang.UnsupportedOperationException: 
 RMHAServiceTarget doesn't have a corresponding ZKFC address
   at 
 org.apache.hadoop.yarn.client.RMHAServiceTarget.getZKFCAddress(RMHAServiceTarget.java:51)
   at 
 org.apache.hadoop.ha.HAServiceTarget.getZKFCProxy(HAServiceTarget.java:94)
   at 
 org.apache.hadoop.ha.HAAdmin.gracefulFailoverThroughZKFCs(HAAdmin.java:311)
   at org.apache.hadoop.ha.HAAdmin.failover(HAAdmin.java:282)
   at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:449)
   at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:378)
   at org.apache.hadoop.yarn.client.cli.RMAdminCLI.run(RMAdminCLI.java:482)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
   at 
 org.apache.hadoop.yarn.client.cli.RMAdminCLI.main(RMAdminCLI.java:622)
 {code}
 I'm thinking the above message is confusing to users. (Users may wonder 
 whether ZKFC is configured correctly...) The command should output an error 
 message to stderr instead of throwing an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525270#comment-14525270
 ] 

Hudson commented on YARN-2893:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #172 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/172/])
YARN-2893. AMLaucher: sporadic job failures due to EOFException in 
readTokenStorageStream. (Zhihai Xu via gera) (gera: rev 
f8204e241d9271497defd4d42646fb89c61cefe3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java


 AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
 --

 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
 YARN-2893.002.patch, YARN-2893.003.patch, YARN-2893.004.patch, 
 YARN-2893.005.patch


 MapReduce jobs on our clusters experience sporadic failures due to corrupt 
 tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525273#comment-14525273
 ] 

Hudson commented on YARN-3363:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #172 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/172/])
YARN-3363. add localization and container launch time to ContainerMetrics at NM 
to show these timing information for each active container. (zxu via rkanter) 
(rkanter: rev ac7d152901e29b1f444507fe4e421eb6e1402b5a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerStartMonitoringEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
* hadoop-yarn-project/CHANGES.txt


 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability
 Fix For: 2.8.0

 Attachments: YARN-3363.000.patch, YARN-3363.001.patch


 add localization and container launch time to ContainerMetrics at NM to show 
 this timing information for each active container.
 Currently ContainerMetrics has the container's actual memory usage (YARN-2984), 
 actual CPU usage (YARN-3122), resource and pid (YARN-3022). It will be better 
 to have localization and container launch time in ContainerMetrics for each 
 active container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2123) Progress bars in Web UI always at 100% due to non-US locale

2015-05-02 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-2123:

Attachment: YARN-2123-004.patch

Thank you [~ozawa]. Attached v4 patch.

 Progress bars in Web UI always at 100% due to non-US locale
 ---

 Key: YARN-2123
 URL: https://issues.apache.org/jira/browse/YARN-2123
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 2.3.0
Reporter: Johannes Simon
Assignee: Akira AJISAKA
 Attachments: NaN_after_launching_RM.png, YARN-2123-001.patch, 
 YARN-2123-002.patch, YARN-2123-003.patch, YARN-2123-004.patch, 
 fair-scheduler-ajisaka.xml, screenshot-noPatch.png, screenshot-patch.png, 
 screenshot.png, yarn-site-ajisaka.xml


 In our cluster setup, the YARN web UI always shows progress bars at 100% (see 
 screenshot, progress of the reduce step is roughly at 32.82%). I opened the 
 HTML source code to check (also see screenshot), and it seems the problem is 
 that it uses a comma as decimal mark, where most browsers expect a dot for 
 floating-point numbers. This could possibly be due to localized number 
 formatting being used in the wrong place, which would also explain why this 
 bug is not always visible.
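
A minimal reproduction of the suspected cause, assuming the progress value is 
formatted with the JVM's default locale: under a locale such as de_DE the first 
line prints "32,82" (comma decimal mark), which browsers cannot parse as a 
floating-point number, while forcing Locale.US prints "32.82".
{code}
import java.util.Locale;

public class ProgressFormatDemo {
  public static void main(String[] args) {
    float progress = 32.82f;
    Locale.setDefault(Locale.GERMANY);
    // Default-locale formatting: "32,82" (comma decimal mark).
    System.out.println(String.format("%.2f", progress));
    // Locale-independent formatting: "32.82".
    System.out.println(String.format(Locale.US, "%.2f", progress));
  }
}
{code}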



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2892) Unable to get AMRMToken in unmanaged AM when using a secure cluster

2015-05-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525374#comment-14525374
 ] 

Junping Du commented on YARN-2892:
--

Sure, [~leftnoteasy], I agree. Please go ahead.

 Unable to get AMRMToken in unmanaged AM when using a secure cluster
 ---

 Key: YARN-2892
 URL: https://issues.apache.org/jira/browse/YARN-2892
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Sevada Abraamyan
Assignee: Sevada Abraamyan
 Attachments: YARN-2892.patch, YARN-2892.patch, YARN-2892.patch


 An AMRMToken is retrieved from the ApplicationReport by the YarnClient. 
 When the RM creates the ApplicationReport and sends it back to the client, it 
 makes a simple security check whether it should include the AMRMToken in the 
 report (see createAndGetApplicationReport in RMAppImpl). This security check 
 verifies that the user who submitted the original application is the same 
 user who is requesting the ApplicationReport. If they are indeed the same 
 user then it includes the AMRMToken, otherwise it does not include it.
 The problem arises from the fact that when an application is submitted, the 
 RM saves the short username of the user who created the application (see 
 submitApplication in ClientRMService). Afterwards, when the ApplicationReport 
 is requested, the system tries to match the full username of the requester 
 against the previously stored short username. 
 In a secure cluster using Kerberos this check fails because the Kerberos 
 realm is stripped from the principal when we derive the short username. So 
 for example the short username might be Foo whereas the full username is 
 f...@company.com
 Note: A very similar problem has been previously reported 
 ([Yarn-2232|https://issues.apache.org/jira/browse/YARN-2232])
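
A toy illustration (not the RM code) of the mismatch described above; the 
names below are made up for the example.
{code}
public class UserNameMismatchDemo {
  public static void main(String[] args) {
    String storedShortName   = "foo";              // saved by the RM at submission time
    String requesterFullName = "foo@EXAMPLE.COM";  // full principal seen at report time

    // The naive comparison the description points at fails for the same user.
    System.out.println(requesterFullName.equals(storedShortName));   // false
    // Comparing short name against short name is what needs to happen instead.
    String requesterShortName = requesterFullName.split("@", 2)[0];
    System.out.println(requesterShortName.equals(storedShortName));  // true
  }
}
{code}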



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3554) Default value for maximum nodemanager connect wait time is too high

2015-05-02 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525406#comment-14525406
 ] 

Li Lu commented on YARN-3554:
-

Hi [~Naganarasimha], I just wanted to bring that JIRA to your attention. We may 
want to share some of the discussion across both JIRAs. 

 Default value for maximum nodemanager connect wait time is too high
 ---

 Key: YARN-3554
 URL: https://issues.apache.org/jira/browse/YARN-3554
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Naganarasimha G R
  Labels: newbie
 Attachments: YARN-3554-20150429-2.patch, YARN-3554.20150429-1.patch


 The default value for yarn.client.nodemanager-connect.max-wait-ms is 900000 
 msec, or 15 minutes, which is way too high.  The default container expiry time 
 from the RM and the default task timeout in MapReduce are both only 10 
 minutes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3422) relatedentities always return empty list when primary filter is set

2015-05-02 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525410#comment-14525410
 ] 

Billie Rinaldi commented on YARN-3422:
--

Let’s say we post entity A with related entity B and primary filter C.  This 
implies a directional relationship B -> A.  The entries stored include the 
following:
{noformat}
entity entry: A (with hidden B)
related entity entry: B A
primary filter entry: C A (no B)
{noformat}
The patch submitted adds primary filter entry C B A, which is not correct for 
the existing design because C was posted as a primary filter for A, not for B.  
What we might want (that the store does not currently do) is for A to be added 
under primary filter entries for B (i.e. D B A, where D is a primary filter for 
B).  One problem with doing this is that we do not know the primary filters for 
B when we are posting entity A, and a further problem is that the primary 
filters for B could change over time and have to be kept up to date.
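
To make the lookup paths concrete, here is a simplified, hypothetical model of 
the three kinds of entries described above; it is not the leveldb timeline 
store code.
{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class TimelineIndexSketch {
  public static void main(String[] args) {
    // entity entry: A, whose related-entity field points at B
    Map<String, Set<String>> entityRelated = new HashMap<String, Set<String>>();
    entityRelated.put("A", new HashSet<String>(Arrays.asList("B")));

    // related entity entry: B -> A (reverse index)
    Map<String, Set<String>> relatedIndex = new HashMap<String, Set<String>>();
    relatedIndex.put("B", new HashSet<String>(Arrays.asList("A")));

    // primary filter entry: C -> A, with no copy of A's related entities under C
    Map<String, Set<String>> primaryFilterIndex = new HashMap<String, Set<String>>();
    primaryFilterIndex.put("C", new HashSet<String>(Arrays.asList("A")));

    // A query under primary filter C only walks the C index, so it finds A but
    // sees no related-entity information there, hence the empty relatedentities.
    System.out.println("entities under filter C: " + primaryFilterIndex.get("C"));
    System.out.println("related entities stored with A: " + entityRelated.get("A"));
  }
}
{code}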

 relatedentities always return empty list when primary filter is set
 ---

 Key: YARN-3422
 URL: https://issues.apache.org/jira/browse/YARN-3422
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Chang Li
Assignee: Chang Li
 Attachments: YARN-3422.1.patch


 When you curl for ats entities with a primary filter, the relatedentities 
 fields always return empty list



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2454) Fix compareTo of variable UNBOUNDED in o.a.h.y.util.resource.Resources.

2015-05-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525362#comment-14525362
 ] 

Junping Du commented on YARN-2454:
--

Also, congratulations to [~yxls123123] for the first patch contribution to 
Apache Hadoop! 

 Fix compareTo of variable UNBOUNDED in o.a.h.y.util.resource.Resources.
 ---

 Key: YARN-2454
 URL: https://issues.apache.org/jira/browse/YARN-2454
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0, 2.5.0, 2.4.1
Reporter: Xu Yang
Assignee: Xu Yang
 Fix For: 2.8.0

 Attachments: YARN-2454 -v2.patch, YARN-2454-patch.diff, 
 YARN-2454.patch


 The variable UNBOUNDED implements the abstract class Resource and overrides 
 the function compareTo. But there is something wrong in this function: we 
 should not compare resources against zero the same way the variable NONE 
 does; we should change 0 to Integer.MAX_VALUE.
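
A simplified model of the fix being described (not the actual Resources.java): 
NONE compares as a resource of 0, so UNBOUNDED should compare as a resource of 
Integer.MAX_VALUE.
{code}
public class UnboundedCompareSketch {
  interface Res { int getMemory(); int getVirtualCores(); }

  // Sketch of UNBOUNDED's compareTo after the change: previously both lines used 0.
  static int compareAsUnbounded(Res other) {
    int diff = Integer.MAX_VALUE - other.getMemory();
    if (diff == 0) {
      diff = Integer.MAX_VALUE - other.getVirtualCores();
    }
    return diff;
  }

  public static void main(String[] args) {
    Res small = new Res() {
      public int getMemory() { return 1024; }
      public int getVirtualCores() { return 2; }
    };
    // Positive: UNBOUNDED now compares as larger than any finite resource.
    System.out.println(compareAsUnbounded(small));
  }
}
{code}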



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2454) Fix compareTo of variable UNBOUNDED in o.a.h.y.util.resource.Resources.

2015-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525395#comment-14525395
 ] 

Hudson commented on YARN-2454:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7717 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7717/])
YARN-2454. Fix compareTo of variable UNBOUNDED in 
o.a.h.y.util.resource.Resources. Contributed by Xu Yang. (junping_du: rev 
57d9a972cbd62aae0ab010d38a0973619972edd6)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/Resources.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/resource/TestResources.java
* hadoop-yarn-project/CHANGES.txt


 Fix compareTo of variable UNBOUNDED in o.a.h.y.util.resource.Resources.
 ---

 Key: YARN-2454
 URL: https://issues.apache.org/jira/browse/YARN-2454
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0, 2.5.0, 2.4.1
Reporter: Xu Yang
Assignee: Xu Yang
 Fix For: 2.8.0

 Attachments: YARN-2454 -v2.patch, YARN-2454-patch.diff, 
 YARN-2454.patch


 The variable UNBOUNDED implements the abstract class Resource and overrides 
 the function compareTo. But there is something wrong in this function: we 
 should not compare resources against zero the same way the variable NONE 
 does; we should change 0 to Integer.MAX_VALUE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3513) Remove unused variables in ContainersMonitorImpl

2015-05-02 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525412#comment-14525412
 ] 

Li Lu commented on YARN-3513:
-

No I was talking about these lines in YARN-3334:
{code}
+try {
+  TimelineClient timelineClient = context.getApplications().get(
+  containerId.getApplicationAttemptId().getApplicationId()).
+  getTimelineClient();
+  putEntityWithoutBlocking(timelineClient, entity);
+}
{code}
which refs context and will have problems with
{code}
-  private final Context context;
{code}

This may be fine in trunk, but since YARN-2928 needs to merge back in the near 
future, we may not want to make the change on context for now. We need to 
consider code cleanup comprehensively when we're doing the branch merge. 

 Remove unused variables in ContainersMonitorImpl
 

 Key: YARN-3513
 URL: https://issues.apache.org/jira/browse/YARN-3513
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
Priority: Trivial
  Labels: newbie
 Fix For: 2.8.0

 Attachments: YARN-3513.20150421-1.patch


 The class member {{private final Context context;}}
 and some local variables in MonitoringThread.run() ({{vmemStillInUsage}} and 
 {{pmemStillInUsage}}) are not used, just updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3422) relatedentities always return empty list when primary filter is set

2015-05-02 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525417#comment-14525417
 ] 

Billie Rinaldi commented on YARN-3422:
--

In retrospect, the directional nature of the related entity relationship seems 
to make things more confusing.  Perhaps it would be better if relatedness were 
bidirectional.

 relatedentities always return empty list when primary filter is set
 ---

 Key: YARN-3422
 URL: https://issues.apache.org/jira/browse/YARN-3422
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Chang Li
Assignee: Chang Li
 Attachments: YARN-3422.1.patch


 When you curl for ats entities with a primary filter, the relatedentities 
 fields always return empty list



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2123) Progress bars in Web UI always at 100% due to non-US locale

2015-05-02 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525336#comment-14525336
 ] 

Akira AJISAKA commented on YARN-2123:
-

and thanks [~xgong] for pinging me.

 Progress bars in Web UI always at 100% due to non-US locale
 ---

 Key: YARN-2123
 URL: https://issues.apache.org/jira/browse/YARN-2123
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 2.3.0
Reporter: Johannes Simon
Assignee: Akira AJISAKA
 Attachments: NaN_after_launching_RM.png, YARN-2123-001.patch, 
 YARN-2123-002.patch, YARN-2123-003.patch, YARN-2123-004.patch, 
 fair-scheduler-ajisaka.xml, screenshot-noPatch.png, screenshot-patch.png, 
 screenshot.png, yarn-site-ajisaka.xml


 In our cluster setup, the YARN web UI always shows progress bars at 100% (see 
 screenshot, progress of the reduce step is roughly at 32.82%). I opened the 
 HTML source code to check (also see screenshot), and it seems the problem is 
 that it uses a comma as decimal mark, where most browsers expect a dot for 
 floating-point numbers. This could possibly be due to localized number 
 formatting being used in the wrong place, which would also explain why this 
 bug is not always visible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3097) Logging of resource recovery on NM restart has redundancies

2015-05-02 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-3097:
-
Attachment: YARN-3097.001.patch

 Logging of resource recovery on NM restart has redundancies
 ---

 Key: YARN-3097
 URL: https://issues.apache.org/jira/browse/YARN-3097
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Eric Payne
Priority: Minor
  Labels: newbie
 Attachments: YARN-3097.001.patch


 ResourceLocalizationService logs that it is recovering a resource with the 
 remote and local paths, but then very shortly afterwards the 
 LocalizedResource emits an INIT-LOCALIZED transition that also logs the same 
 remote and local paths.  The recovery message should be a debug message, 
 since it's not conveying any useful information that isn't already covered by 
 the resource state transition log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1418) Add Tracing to YARN

2015-05-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525381#comment-14525381
 ] 

Junping Du commented on YARN-1418:
--

No worries, [~iwasakims], just curious about this feature, as we typically have 
some writeup for an umbrella JIRA so other contributors can help.

 Add Tracing to YARN
 ---

 Key: YARN-1418
 URL: https://issues.apache.org/jira/browse/YARN-1418
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api, nodemanager, resourcemanager
Reporter: Masatake Iwasaki
Assignee: Yi Liu

 Adding tracing using HTrace in the same way as HBASE-6449 and HDFS-5274.
 Most of the changes needed for the basics, such as RPC, seem to be almost 
 ready in HDFS-5274.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3570) Non-zero exit status of master application not propagated

2015-05-02 Thread Eric O. LEBIGOT (EOL) (JIRA)
Eric O. LEBIGOT (EOL) created YARN-3570:
---

 Summary: Non-zero exit status of master application not propagated
 Key: YARN-3570
 URL: https://issues.apache.org/jira/browse/YARN-3570
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
 Environment: PySpark on AWS EMR
Reporter: Eric O. LEBIGOT (EOL)


The master of my application fails, but the "Final app status" is 0. This 
causes all sorts of problems (EMR not detecting a problem, my data pipeline 
continuing, etc.).

Here is what happens. The master fails (showing only relevant lines from 
daemons/i-…/yarn-hadoop-nodemanager-ip-….log.gz):
{quote}
2015-05-02 03:32:11,000 WARN 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor 
(ContainersLauncher #0): Exit code from container 
container_1430537363277_0001_01_01 is : 1
2015-05-02 03:32:11,001 WARN 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor 
(ContainersLauncher #0): Exception from container-launch with container ID: 
container_1430537363277_0001_01_01 and exit code: 1
2015-05-02 03:32:11,003 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch
 (ContainersLauncher #0): Container exited with a non-zero exit code 1
2015-05-02 03:32:11,004 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container 
(AsyncDispatcher event handler): Container 
container_1430537363277_0001_01_01 transitioned from RUNNING to 
EXITED_WITH_FAILURE
2015-05-02 03:32:11,032 WARN 
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger (AsyncDispatcher event 
handler): USER=hadoop   OPERATION=Container Finished - Failed   
TARGET=ContainerImpl   RESULT=FAILURE  DESCRIPTION=Container failed with 
state: EXITED_WITH_FAILURE   APPID=application_1430537363277_0001   
CONTAINERID=container_1430537363277_0001_01_01
2015-05-02 03:32:11,032 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container 
(AsyncDispatcher event handler): Container 
container_1430537363277_0001_01_01 transitioned from EXITED_WITH_FAILURE to 
DONE
{quote}
and, from ./daemons/i-…/yarn-hadoop-resourcemanager-ip-….log.gz
{quote}
2015-05-02 03:32:10,493 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl 
(AsyncDispatcher event handler): Updating application attempt 
appattempt_1430537363277_0001_01 with final state: FINISHING, and exit 
status: -1000
{quote}

Now, the whole application nonetheless strangely returns a 0 exit code, in 
./task-attempts/application_1430537363277_0001/container_1430537363277_0001_01_01/stderr.gz
:
{quote}
15/05/02 03:32:10 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, 
exitCode: 0, (reason: Shutdown hook called before final status was reported.)
{quote}

The reason for this error hiding is maybe given by that last message (the early 
shutdown hook). Now, is this a possible YARN bug, or is it more likely that 
something is happening with the AWS EMR cluster manager that I am using (maybe 
it detects a task failure before YARN does and shuts down the PySpark 
application that was running on YARN)?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3571) AM does not re-blacklist NMs after ignoring-blacklist event happens?

2015-05-02 Thread Hao Zhu (JIRA)
Hao Zhu created YARN-3571:
-

 Summary: AM does not re-blacklist NMs after ignoring-blacklist 
event happens?
 Key: YARN-3571
 URL: https://issues.apache.org/jira/browse/YARN-3571
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.5.1
Reporter: Hao Zhu


Detailed analysis is in item 3, "Will AM re-blacklist NMs after 
ignoring-blacklist event happens?", of the link below:
http://www.openkb.info/2015/05/when-will-application-master-blacklist.html

The current behavior is: if that NodeManager has ever been blacklisted 
before, then it will not be blacklisted again after ignore-blacklist happens; 
otherwise, it will be blacklisted.

 The code logic is in function containerFailedOnHost(String hostName) of 
RMContainerRequestor.java:
{code}
  protected void containerFailedOnHost(String hostName) {
    if (!nodeBlacklistingEnabled) {
      return;
    }
    if (blacklistedNodes.contains(hostName)) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Host " + hostName + " is already blacklisted.");
      }
      return; // already blacklisted
{code}

The reason for the above behavior is in item 2 above: when ignoring-blacklist 
happens, it only asks the RM to clear blacklistAdditions; however, it does not 
clear the blacklistedNodes variable.

This behavior may cause the whole job/application to fail if the previously 
blacklisted NM was released after the ignoring-blacklist event happens.
Imagine a serial murderer is released from prison just because the prison is 
33% full, and horribly he/she will never be put in prison again. Only new 
murderers will be put in prison.
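
A hypothetical sketch only, not an attached patch: one conceivable fix for the 
cause described above is to clear the AM's local blacklistedNodes set at the 
same time the ignore-blacklist event clears what is sent to the RM, so that 
later failures can re-blacklist the node.
{code}
import java.util.HashSet;
import java.util.Set;

public class IgnoreBlacklistSketch {
  // Illustrative method, not the RMContainerRequestor API.
  static void ignoreBlacklisting(Set<String> blacklistedNodes,
                                 Set<String> blacklistAdditions,
                                 Set<String> blacklistRemovals) {
    blacklistRemovals.addAll(blacklistedNodes); // ask the RM to forget these nodes
    blacklistAdditions.clear();
    blacklistedNodes.clear();                   // allow future failures to re-blacklist
  }

  public static void main(String[] args) {
    Set<String> blacklisted = new HashSet<String>();
    blacklisted.add("h4.poc.com");
    ignoreBlacklisting(blacklisted, new HashSet<String>(), new HashSet<String>());
    System.out.println("blacklistedNodes after ignore: " + blacklisted); // []
  }
}
{code}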


Example to prove it:
Test 1:
One node (h4) has an issue; the other 3 nodes are healthy.
The job failed with the AM logs below:
{code}
[root@h1 container_1430425729977_0006_01_01]# egrep -i 'failures on 
node|blacklist|FATAL' syslog
2015-05-02 18:38:41,246 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
nodeBlacklistingEnabled:true
2015-05-02 18:38:41,246 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
blacklistDisablePercent is 1
2015-05-02 18:39:07,249 FATAL [IPC Server handler 3 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_02_0 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:07,297 INFO [Thread-49] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node 
h4.poc.com
2015-05-02 18:39:07,950 FATAL [IPC Server handler 16 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_08_0 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:07,954 INFO [Thread-49] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 2 failures on node 
h4.poc.com
2015-05-02 18:39:08,148 FATAL [IPC Server handler 17 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_07_0 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:08,152 INFO [Thread-49] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 3 failures on node 
h4.poc.com
2015-05-02 18:39:08,152 INFO [Thread-49] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Blacklisted host 
h4.poc.com
2015-05-02 18:39:08,561 INFO [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Update the 
blacklist for application_1430425729977_0006: blacklistAdditions=1 
blacklistRemovals=0
2015-05-02 18:39:08,561 INFO [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Ignore blacklisting 
set to true. Known: 4, Blacklisted: 1, 25%
2015-05-02 18:39:09,563 INFO [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Update the 
blacklist for application_1430425729977_0006: blacklistAdditions=0 
blacklistRemovals=1
2015-05-02 18:39:32,912 FATAL [IPC Server handler 19 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_02_1 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:35,076 FATAL [IPC Server handler 1 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_09_0 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:35,133 FATAL [IPC Server handler 5 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_08_1 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:57,308 FATAL [IPC Server handler 17 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_02_2 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:40:00,174 FATAL [IPC Server handler 10 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_09_1 - exited : java.io.IOException: Spill 
failed
2015-05-02 

[jira] [Commented] (YARN-2123) Progress bars in Web UI always at 100% due to non-US locale

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525494#comment-14525494
 ] 

Hadoop QA commented on YARN-2123:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 37s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 49s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 24s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | mapreduce tests |   8m 42s | Tests passed in 
hadoop-mapreduce-client-app. |
| {color:green}+1{color} | yarn tests |   2m  1s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-server-common. |
| {color:red}-1{color} | yarn tests |  62m 23s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 115m  4s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729948/YARN-2123-004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| hadoop-mapreduce-client-app test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7657/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7657/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7657/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7657/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7657/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7657/console |


This message was automatically generated.

 Progress bars in Web UI always at 100% due to non-US locale
 ---

 Key: YARN-2123
 URL: https://issues.apache.org/jira/browse/YARN-2123
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 2.3.0
Reporter: Johannes Simon
Assignee: Akira AJISAKA
 Attachments: NaN_after_launching_RM.png, YARN-2123-001.patch, 
 YARN-2123-002.patch, YARN-2123-003.patch, YARN-2123-004.patch, 
 fair-scheduler-ajisaka.xml, screenshot-noPatch.png, screenshot-patch.png, 
 screenshot.png, yarn-site-ajisaka.xml


 In our cluster setup, the YARN web UI always shows progress bars at 100% (see 
 screenshot, progress of the reduce step is roughly at 32.82%). I opened the 
 HTML source code to check (also see screenshot), and it seems the problem is 
 that it uses a comma as decimal mark, where most browsers expect a dot for 
 floating-point numbers. This could possibly be due to localized number 
 formatting being used in the wrong place, which would also explain why this 
 bug is not always visible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3097) Logging of resource recovery on NM restart has redundancies

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525461#comment-14525461
 ] 

Hadoop QA commented on YARN-3097:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 36s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 37s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  2s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   5m 50s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  41m 42s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729950/YARN-3097.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7658/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7658/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7658/console |


This message was automatically generated.

 Logging of resource recovery on NM restart has redundancies
 ---

 Key: YARN-3097
 URL: https://issues.apache.org/jira/browse/YARN-3097
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Eric Payne
Priority: Minor
  Labels: newbie
 Attachments: YARN-3097.001.patch


 ResourceLocalizationService logs that it is recovering a resource with the 
 remote and local paths, but then very shortly afterwards the 
 LocalizedResource emits an INIT-LOCALIZED transition that also logs the same 
 remote and local paths.  The recovery message should be a debug message, 
 since it's not conveying any useful information that isn't already covered by 
 the resource state transition log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3385) Race condition: KeeperException$NoNodeException will cause RM shutdown during ZK node deletion.

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525477#comment-14525477
 ] 

Hadoop QA commented on YARN-3385:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  9s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  2s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 48s | The applied patch generated  1 
new checkstyle issues (total was 42, now 43). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 15s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:red}-1{color} | yarn tests |  49m 49s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  87m 25s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps |
|   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729901/YARN-3385.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7656/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7656/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7656/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7656/console |


This message was automatically generated.

 Race condition: KeeperException$NoNodeException will cause RM shutdown during 
 ZK node deletion.
 ---

 Key: YARN-3385
 URL: https://issues.apache.org/jira/browse/YARN-3385
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-3385.000.patch, YARN-3385.001.patch


 Race condition: KeeperException$NoNodeException will cause RM shutdown during 
 ZK node deletion (Op.delete).
 The race condition is similar to YARN-3023.
 Since the race condition exists for ZK node creation, it should also exist 
 for ZK node deletion.
 We see this issue with the following stack trace:
 {code}
 2015-03-17 19:18:58,958 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a 
 org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type 
 STATE_STORE_OP_FAILED. Cause:
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:945)
   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:911)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:857)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:854)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:973)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:992)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:854)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.removeApplicationStateInternal(ZKRMStateStore.java:647)
   at 
 

[jira] [Commented] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525491#comment-14525491
 ] 

Hadoop QA commented on YARN-3480:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 37s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 34s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 13s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 37s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |  52m 17s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  91m 30s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729945/YARN-3480.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7659/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7659/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7659/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7659/console |


This message was automatically generated.

 Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable
 

 Key: YARN-3480
 URL: https://issues.apache.org/jira/browse/YARN-3480
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Attachments: YARN-3480.01.patch, YARN-3480.02.patch, 
 YARN-3480.03.patch


 When RM HA is enabled and running containers are kept across attempts, apps 
 are more likely to finish successfully with more retries (attempts), so it 
 will be better to set 'yarn.resourcemanager.am.max-attempts' larger. However, 
 it will make RMStateStore (FileSystem/HDFS/ZK) store more attempts, and make 
 the RM recovery process much slower. It might be better to make the max number 
 of attempts stored in RMStateStore configurable.
 BTW: when 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to 
 a small value, the number of retried attempts might be very large, so we need 
 to delete some of the attempts stored in RMStateStore and RMAppImpl.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1832) wrong MockLocalizerStatus.equals() method implementation

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525495#comment-14525495
 ] 

Hadoop QA commented on YARN-1832:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   5m 14s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 20s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 35s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  0s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   5m 47s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  22m 36s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12634678/YARN-1832.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7661/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7661/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7661/console |


This message was automatically generated.

 wrong MockLocalizerStatus.equals() method implementation
 

 Key: YARN-1832
 URL: https://issues.apache.org/jira/browse/YARN-1832
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.2.0
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Trivial
 Attachments: YARN-1832.patch


 return getLocalizerId().equals(other) && ...; should be
 return getLocalizerId().equals(other.getLocalizerId()) && ...;
 getLocalizerId() returns a String. It's expected to compare 
 this.getLocalizerId() against other.getLocalizerId().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1993) Cross-site scripting vulnerability in TextView.java

2015-05-02 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated YARN-1993:
-
Assignee: Kenji Kikushima

 Cross-site scripting vulnerability in TextView.java
 ---

 Key: YARN-1993
 URL: https://issues.apache.org/jira/browse/YARN-1993
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Ted Yu
Assignee: Kenji Kikushima
 Attachments: YARN-1993.patch


 In 
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/TextView.java
 , method echo(), e.g.:
 {code}
 for (Object s : args) {
   out.print(s);
 }
 {code}
 Printing s to an HTML page allows cross-site scripting, because it is not 
 properly sanitized for the HTML attribute name context.
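
A hedged sketch of the kind of change the report implies (not the attached 
patch): HTML-escape each value before printing it into the page. The escape 
helper below is a local stand-in, not a Hadoop API.
{code}
public class EscapingEchoSketch {
  // Minimal HTML escaping; order matters: escape '&' first.
  static String escapeHtml(String s) {
    return s.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\"", "&quot;");
  }

  public static void main(String[] args) {
    Object[] echoArgs = { "<script>alert('x')</script>" };
    for (Object s : echoArgs) {
      // Printed as inert text rather than executable markup.
      System.out.print(escapeHtml(String.valueOf(s)));
    }
  }
}
{code}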



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-679) add an entry point that can start any Yarn service

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525543#comment-14525543
 ] 

Hadoop QA commented on YARN-679:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 44s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 31 new or modified test files. |
| {color:red}-1{color} | javac |   7m 32s | The applied patch generated  130  
additional warning messages. |
| {color:red}-1{color} | javadoc |   9m 33s | The applied patch generated  3  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m  6s | The applied patch generated  
150 new checkstyle issues (total was 140, now 287). |
| {color:red}-1{color} | whitespace |   0m  7s | The patch has 5  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 40s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:red}-1{color} | common tests |  22m 21s | Tests failed in 
hadoop-common. |
| | |  59m 38s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.service.launcher.TestServiceLaunchNoArgsAllowed |
|   | hadoop.service.launcher.TestServiceLaunchedRunning |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12653051/YARN-679-003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| javac | 
https://builds.apache.org/job/PreCommit-YARN-Build/7664/artifact/patchprocess/diffJavacWarnings.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/7664/artifact/patchprocess/diffJavadocWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7664/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7664/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7664/artifact/patchprocess/testrun_hadoop-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7664/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7664/console |


This message was automatically generated.

 add an entry point that can start any Yarn service
 --

 Key: YARN-679
 URL: https://issues.apache.org/jira/browse/YARN-679
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Affects Versions: 2.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran
 Attachments: YARN-679-001.patch, YARN-679-002.patch, 
 YARN-679-002.patch, YARN-679-003.patch, org.apache.hadoop.servic...mon 
 3.0.0-SNAPSHOT API).pdf

  Time Spent: 72h
  Remaining Estimate: 0h

 There's no need to write separate .main classes for every Yarn service, given 
 that the startup mechanism should be identical: create, init, start, wait for 
 stopped - with an interrupt handler to trigger a clean shutdown on a control-c 
 interrupt.
 Provide one that takes any classname, and a list of config files/options.
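 A rough sketch of the idea (an illustration under stated assumptions, not the
 attached patch): instantiate any org.apache.hadoop.service.Service subclass by
 name, run init/start, block until it stops, and hook control-c to a clean stop.
 The Service method names are assumed to match the Hadoop version in use;
 GenericServiceMain is a hypothetical class name.
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.service.Service;

 public class GenericServiceMain {
   public static void main(String[] args) throws Exception {
     String className = args[0];                 // service implementation to launch
     Configuration conf = new Configuration();
     for (int i = 1; i < args.length; i++) {
       conf.addResource(new Path(args[i]));      // extra config files from the command line
     }
     final Service service = (Service) Class.forName(className).newInstance();
     // Clean shutdown on control-c / JVM exit.
     Runtime.getRuntime().addShutdownHook(new Thread() {
       @Override
       public void run() {
         service.stop();
       }
     });
     service.init(conf);
     service.start();
     // Block until the service reports it has stopped (0 = no timeout).
     service.waitForServiceToStop(0);
   }
 }
 {code}
 Usage would then be along the lines of {{java GenericServiceMain org.example.MyService my-site.xml}}
 (class and file names hypothetical).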



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1993) Cross-site scripting vulnerability in TextView.java

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525526#comment-14525526
 ] 

Hadoop QA commented on YARN-1993:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 13s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:red}-1{color} | javac |   7m 47s | The applied patch generated  173  
additional warning messages. |
| {color:red}-1{color} | javadoc |  10m  4s | The applied patch generated  14  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 53s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 24s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   1m 58s | Tests passed in 
hadoop-yarn-common. |
| | |  39m 51s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12644792/YARN-1993.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| javac | 
https://builds.apache.org/job/PreCommit-YARN-Build/7663/artifact/patchprocess/diffJavacWarnings.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/7663/artifact/patchprocess/diffJavadocWarnings.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7663/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7663/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7663/console |


This message was automatically generated.

 Cross-site scripting vulnerability in TextView.java
 ---

 Key: YARN-1993
 URL: https://issues.apache.org/jira/browse/YARN-1993
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Ted Yu
 Attachments: YARN-1993.patch


 In 
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/TextView.java
 , method echo(), e.g.:
 {code}
 for (Object s : args) {
   out.print(s);
 }
 {code}
 Printing s to an HTML page allows cross-site scripting, because it was not 
 properly sanitized for the HTML attribute-name context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1878) Yarn standby RM taking long to transition to active

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525527#comment-14525527
 ] 

Hadoop QA commented on YARN-1878:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 36s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 31s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 52s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 42s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 22s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| | |  41m  4s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12636856/YARN-1878.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7662/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7662/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7662/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7662/console |


This message was automatically generated.

 Yarn standby RM taking long to transition to active
 ---

 Key: YARN-1878
 URL: https://issues.apache.org/jira/browse/YARN-1878
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Xuan Gong
 Attachments: YARN-1878.1.patch


 In our HA tests we are noticing that some times it can take upto 10s for the 
 standby RM to transition to active.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1993) Cross-site scripting vulnerability in TextView.java

2015-05-02 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525542#comment-14525542
 ] 

Tsuyoshi Ozawa commented on YARN-1993:
--

+1, committing this shortly.

 Cross-site scripting vulnerability in TextView.java
 ---

 Key: YARN-1993
 URL: https://issues.apache.org/jira/browse/YARN-1993
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Ted Yu
Assignee: Kenji Kikushima
 Attachments: YARN-1993.patch


 In 
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/TextView.java
 , method echo(), e.g.:
 {code}
 for (Object s : args) {
   out.print(s);
 }
 {code}
 Printing s to an HTML page allows cross-site scripting, because it was not 
 properly sanitized for the HTML attribute-name context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1106) The RM should point the tracking url to the RM app page if its empty

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1452#comment-1452
 ] 

Hadoop QA commented on YARN-1106:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 33s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   3m 18s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12637253/YARN-1106.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7665/console |


This message was automatically generated.

 The RM should point the tracking url to the RM app page if its empty
 

 Key: YARN-1106
 URL: https://issues.apache.org/jira/browse/YARN-1106
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.0-beta, 0.23.9
Reporter: Thomas Graves
Assignee: Thomas Graves
 Attachments: YARN-1106.patch, YARN-1106.patch


 It would be nice if the ResourceManager set the tracking url to the RM app 
 page if the application master doesn't pass one or passes the empty string.
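 An illustrative fragment of the intended fallback (the names amTrackingUrl,
 rmWebAppAddress and appId are hypothetical, not the actual RM code):
 {code}
 String trackingUrl = amTrackingUrl;
 if (trackingUrl == null || trackingUrl.trim().isEmpty()) {
   // AM gave no usable URL: point clients at the RM's own app page instead.
   trackingUrl = "http://" + rmWebAppAddress + "/cluster/app/" + appId;
 }
 {code}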



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525660#comment-14525660
 ] 

Hadoop QA commented on YARN-2821:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 44s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 22s | The applied patch generated  3 
new checkstyle issues (total was 46, now 49). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 36s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   6m 58s | Tests passed in 
hadoop-yarn-applications-distributedshell. |
| | |  42m 18s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12680098/apache-yarn-2821.1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e8d0ee5 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7669/artifact/patchprocess/diffcheckstylehadoop-yarn-applications-distributedshell.txt
 |
| hadoop-yarn-applications-distributedshell test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7669/artifact/patchprocess/testrun_hadoop-yarn-applications-distributedshell.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7669/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7669/console |


This message was automatically generated.

 Distributed shell app master becomes unresponsive sometimes
 ---

 Key: YARN-2821
 URL: https://issues.apache.org/jira/browse/YARN-2821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: apache-yarn-2821.0.patch, apache-yarn-2821.1.patch


 We've noticed that once in a while the distributed shell app master becomes 
 unresponsive and is eventually killed by the RM. Snippet of the logs -
 {noformat}
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
 appattempt_1415123350094_0017_01 received 0 previous attempts' running 
 containers on AM registration.
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[<memory:10, vCores:1>]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[<memory:10, vCores:1>]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[<memory:10, vCores:1>]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[<memory:10, vCores:1>]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[<memory:10, vCores:1>]Priority[0]
 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez2:45454
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_02, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 

[jira] [Commented] (YARN-2768) optimize FSAppAttempt.updateDemand by avoid clone of Resource which takes 85% of computing time of update thread

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525672#comment-14525672
 ] 

Hadoop QA commented on YARN-2768:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 38s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 14s | The applied patch generated  1 
new checkstyle issues (total was 6, now 7). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 38s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   1m 57s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |  52m  8s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  92m 33s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12677855/YARN-2768.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e8d0ee5 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7668/artifact/patchprocess/diffcheckstylehadoop-yarn-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7668/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7668/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7668/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7668/console |


This message was automatically generated.

 optimize FSAppAttempt.updateDemand by avoid clone of Resource which takes 85% 
 of computing time of update thread
 

 Key: YARN-2768
 URL: https://issues.apache.org/jira/browse/YARN-2768
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Minor
 Attachments: YARN-2768.patch, profiling_FairScheduler_update.png


 See the attached picture of profiling result. The clone of Resource object 
 within Resources.multiply() takes up **85%** (19.2 / 22.6) CPU time of the 
 function FairScheduler.update().
 The code of FSAppAttempt.updateDemand:
 {code}
 public void updateDemand() {
   demand = Resources.createResource(0);
   // Demand is current consumption plus outstanding requests
   Resources.addTo(demand, app.getCurrentConsumption());
   // Add up outstanding resource requests
   synchronized (app) {
     for (Priority p : app.getPriorities()) {
       for (ResourceRequest r : app.getResourceRequests(p).values()) {
         Resource total = Resources.multiply(r.getCapability(),
             r.getNumContainers());
         Resources.addTo(demand, total);
       }
     }
   }
 }
 {code}
 The code of Resources.multiply:
 {code}
 public static Resource multiply(Resource lhs, double by) {
   return multiplyTo(clone(lhs), by);
 }
 {code}
 The clone could be skipped by directly updating the value of this.demand.
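 A hedged sketch of that optimization (not necessarily the committed patch):
 accumulate capability times numContainers directly into demand, so no temporary
 Resource is cloned per request. It assumes the Hadoop 2.x Resource accessors
 getMemory/setMemory and getVirtualCores/setVirtualCores.
 {code}
 public void updateDemand() {
   demand = Resources.createResource(0);
   // Demand is current consumption plus outstanding requests.
   Resources.addTo(demand, app.getCurrentConsumption());
   synchronized (app) {
     for (Priority p : app.getPriorities()) {
       for (ResourceRequest r : app.getResourceRequests(p).values()) {
         Resource cap = r.getCapability();
         int n = r.getNumContainers();
         // No Resources.multiply(), hence no clone() per ResourceRequest.
         demand.setMemory(demand.getMemory() + cap.getMemory() * n);
         demand.setVirtualCores(demand.getVirtualCores() + cap.getVirtualCores() * n);
       }
     }
   }
 }
 {code}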



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3571) AM does not re-blacklist NMs after ignoring-blacklist event happens?

2015-05-02 Thread Hao Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hao Zhu updated YARN-3571:
--
Description: 
Detailed analysis is in item 3, "Will AM re-blacklist NMs after the 
ignoring-blacklist event happens?", at the link below:
http://www.openkb.info/2015/05/when-will-application-master-blacklist.html

The current behavior is: if that NodeManager has ever been blacklisted 
before, it will not be blacklisted again after ignore-blacklisting happens; 
otherwise, it will be blacklisted.
However, I think the right behavior should be: the AM can re-blacklist NMs even 
after ignore-blacklisting has happened once.

 The code logic is in function containerFailedOnHost(String hostName) of 
RMContainerRequestor.java:
{code}
  protected void containerFailedOnHost(String hostName) {
    if (!nodeBlacklistingEnabled) {
      return;
    }
    if (blacklistedNodes.contains(hostName)) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Host " + hostName + " is already blacklisted.");
      }
      return; // already blacklisted
{code}

The reason for the above behavior is in item 2 above: when ignore-blacklisting 
happens, the AM only asks the RM to clear blacklistAdditions; however, it does 
not clear the blacklistedNodes variable.
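A hedged sketch of one possible direction (not a committed fix): when
ignore-blacklisting kicks in, also clear the AM-side bookkeeping so that
containerFailedOnHost() can re-blacklist a node later. The field name follows
the snippet above; the exact hook point is an assumption.
{code}
// Inside the ignore-blacklisting handling of RMContainerRequestor (sketch):
// besides asking the RM to drop the pending additions, also forget the
// locally blacklisted hosts so they can be counted and blacklisted again.
blacklistedNodes.clear();
{code}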

This behavior may cause the whole job/application to fail if the previously 
blacklisted NM is released after the ignore-blacklisting event happens.
Imagine a serial murderer who is released from prison just because the prison is 
33% full, and horribly he/she will never be put in prison again. Only new 
murderers will be put in prison.


Example to demonstrate:
Test 1:
One node (h4) has issues; the other 3 nodes are healthy.
The job failed with the AM logs below:
{code}
[root@h1 container_1430425729977_0006_01_01]# egrep -i 'failures on 
node|blacklist|FATAL' syslog
2015-05-02 18:38:41,246 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
nodeBlacklistingEnabled:true
2015-05-02 18:38:41,246 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 
blacklistDisablePercent is 1
2015-05-02 18:39:07,249 FATAL [IPC Server handler 3 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_02_0 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:07,297 INFO [Thread-49] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node 
h4.poc.com
2015-05-02 18:39:07,950 FATAL [IPC Server handler 16 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_08_0 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:07,954 INFO [Thread-49] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 2 failures on node 
h4.poc.com
2015-05-02 18:39:08,148 FATAL [IPC Server handler 17 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_07_0 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:08,152 INFO [Thread-49] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 3 failures on node 
h4.poc.com
2015-05-02 18:39:08,152 INFO [Thread-49] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Blacklisted host 
h4.poc.com
2015-05-02 18:39:08,561 INFO [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Update the 
blacklist for application_1430425729977_0006: blacklistAdditions=1 
blacklistRemovals=0
2015-05-02 18:39:08,561 INFO [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Ignore blacklisting 
set to true. Known: 4, Blacklisted: 1, 25%
2015-05-02 18:39:09,563 INFO [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Update the 
blacklist for application_1430425729977_0006: blacklistAdditions=0 
blacklistRemovals=1
2015-05-02 18:39:32,912 FATAL [IPC Server handler 19 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_02_1 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:35,076 FATAL [IPC Server handler 1 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_09_0 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:35,133 FATAL [IPC Server handler 5 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_08_1 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:39:57,308 FATAL [IPC Server handler 17 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_02_2 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:40:00,174 FATAL [IPC Server handler 10 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1430425729977_0006_m_09_1 - exited : java.io.IOException: Spill 
failed
2015-05-02 18:40:00,227 FATAL [IPC Server handler 12 on 41696] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 

[jira] [Resolved] (YARN-556) [Umbrella] RM Restart phase 2 - Work preserving restart

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-556.
--
Resolution: Fixed
  Assignee: (was: Bikas Saha)

Makes sense. Resolved as fixed. Keeping it unassigned given multiple 
contributors. No fix-version given the tasks spanned across releases.

 [Umbrella] RM Restart phase 2 - Work preserving restart
 ---

 Key: YARN-556
 URL: https://issues.apache.org/jira/browse/YARN-556
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Bikas Saha
 Attachments: Work Preserving RM Restart.pdf, 
 WorkPreservingRestartPrototype.001.patch, YARN-1372.prelim.patch


 YARN-128 covered storing the state needed for the RM to recover critical 
 information. This umbrella jira will track changes needed to recover the 
 running state of the cluster so that work can be preserved across RM restarts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-556) [Umbrella] RM Restart phase 2 - Work preserving restart

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-556:
-
Summary: [Umbrella] RM Restart phase 2 - Work preserving restart  (was: RM 
Restart phase 2 - Work preserving restart)

 [Umbrella] RM Restart phase 2 - Work preserving restart
 ---

 Key: YARN-556
 URL: https://issues.apache.org/jira/browse/YARN-556
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: Work Preserving RM Restart.pdf, 
 WorkPreservingRestartPrototype.001.patch, YARN-1372.prelim.patch


 YARN-128 covered storing the state needed for the RM to recover critical 
 information. This umbrella jira will track changes needed to recover the 
 running state of the cluster so that work can be preserved across RM restarts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-59) Text File Busy errors launching MR tasks

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-59?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-59.
-
Resolution: Duplicate

From what I know, this is fixed via YARN-1271 + YARN-1295. Resolving as dup. 
Please reopen if you disagree.

 Text File Busy errors launching MR tasks
 --

 Key: YARN-59
 URL: https://issues.apache.org/jira/browse/YARN-59
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Todd Lipcon
Assignee: Andy Isaacson

 Some very small percentage of tasks fail with a "Text file busy" error.
 The following was the original diagnosis:
 {quote}
 Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
 class swallows all IO exceptions. We're not currently checking for errors, 
 which I'm seeing result in occasional task failures with the message "Text 
 file busy" - assumedly because the close() call is failing silently for some 
 reason.
 {quote}
 .. but turned out to be another issue as well (see below)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3513) Remove unused variables in ContainersMonitorImpl

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3513:
--
Fix Version/s: (was: 2.8.0)

Removing fix-version, use the target-version field for specifying your intent.

 Remove unused variables in ContainersMonitorImpl
 

 Key: YARN-3513
 URL: https://issues.apache.org/jira/browse/YARN-3513
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
Priority: Trivial
  Labels: newbie
 Attachments: YARN-3513.20150421-1.patch


 Class member {{private final Context context;}}
 and some local variables in MonitoringThread.run() ({{vmemStillInUsage}} and 
 {{pmemStillInUsage}}) are not used, only updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1993) Cross-site scripting vulnerability in TextView.java

2015-05-02 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525592#comment-14525592
 ] 

Tsuyoshi Ozawa commented on YARN-1993:
--

Warnings by javac and javadoc are not related to the patch.

 Cross-site scripting vulnerability in TextView.java
 ---

 Key: YARN-1993
 URL: https://issues.apache.org/jira/browse/YARN-1993
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Ted Yu
Assignee: Kenji Kikushima
 Attachments: YARN-1993.patch


 In 
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/TextView.java
 , method echo(), e.g.:
 {code}
 for (Object s : args) {
   out.print(s);
 }
 {code}
 Printing s to an HTML page allows cross-site scripting, because it was not 
 properly sanitized for the HTML attribute-name context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-148) CapacityScheduler shouldn't explicitly need YarnConfiguration

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-148.
--
Resolution: Invalid

I don't see this issue anymore, seems like it got resolved along the way.

Resolving this old ticket.

 CapacityScheduler shouldn't explicitly need YarnConfiguration
 -

 Key: YARN-148
 URL: https://issues.apache.org/jira/browse/YARN-148
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

 This was done in MAPREDUCE-3773. None of our service APIs warrant 
 YarnConfiguration. We affect the proper loading of yarn-site.xml by 
 explicitly creating YarnConfiguration in all the main classes - 
 ResourceManager, NodeManager etc.
 Due to this extra dependency, tests are failing, see 
 https://builds.apache.org/job/PreCommit-YARN-Build/74//testReport/org.apache.hadoop.yarn.client/TestYarnClient/testClientStop/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-128) [Umbrella] RM Restart Phase 1: State storage and non-work-preserving recovery

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-128.
--
Resolution: Fixed

Resolving this umbrella JIRA. RM recovery has been largely complete/stable in 
YARN since this ticket was opened, including its ultimate usage for 
rolling upgrades (YARN-666).
- As new issues come in, we can open new tickets.
- Will leave the open sub-tasks as they are.
- No fix-version as this was done across releases.

 [Umbrella] RM Restart Phase 1: State storage and non-work-preserving recovery
 -

 Key: YARN-128
 URL: https://issues.apache.org/jira/browse/YARN-128
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
 Attachments: MR-4343.1.patch, RM-recovery-initial-thoughts.txt, 
 RMRestartPhase1.pdf, YARN-128.full-code-4.patch, YARN-128.full-code.3.patch, 
 YARN-128.full-code.5.patch, YARN-128.new-code-added-4.patch, 
 YARN-128.new-code-added.3.patch, YARN-128.old-code-removed.3.patch, 
 YARN-128.old-code-removed.4.patch, YARN-128.patch, 
 restart-12-11-zkstore.patch, restart-fs-store-11-17.patch, 
 restart-zk-store-11-17.patch


 This umbrella jira tracks the work needed to preserve critical state 
 information and reload them upon RM restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2331) Distinguish shutdown during supervision vs. shutdown for rolling upgrade

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525627#comment-14525627
 ] 

Hadoop QA commented on YARN-2331:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   2m 58s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12673407/YARN-2331v2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e8d0ee5 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7666/console |


This message was automatically generated.

 Distinguish shutdown during supervision vs. shutdown for rolling upgrade
 

 Key: YARN-2331
 URL: https://issues.apache.org/jira/browse/YARN-2331
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-2331.patch, YARN-2331v2.patch


 When the NM is shutting down with restart support enabled there are scenarios 
 we'd like to distinguish and behave accordingly, as sketched after this list:
 # The NM is running under supervision.  In that case containers should be 
 preserved so the automatic restart can recover them.
 # The NM is not running under supervision and a rolling upgrade is not being 
 performed.  In that case the shutdown should kill all containers since it is 
 unlikely the NM will be restarted in a timely manner to recover them.
 # The NM is not running under supervision and a rolling upgrade is being 
 performed.  In that case the shutdown should not kill all containers since a 
 restart is imminent due to the rolling upgrade and the containers will be 
 recovered.
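 A small sketch of that decision (the flags underSupervision and
 rollingUpgradeInProgress are assumptions for illustration, not actual NM
 configuration keys):
 {code}
 boolean preserveContainers =
     underSupervision              // case 1: supervisor restarts the NM and recovers them
     || rollingUpgradeInProgress;  // case 3: a restart is imminent as part of the upgrade
 if (!preserveContainers) {
   // Case 2: no supervision and no rolling upgrade - a timely restart is
   // unlikely, so kill containers on shutdown rather than orphan them.
   killAllContainers();            // hypothetical helper
 }
 {code}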



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2775) There is no close method in NMWebServices#getLogs()

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525641#comment-14525641
 ] 

Hadoop QA commented on YARN-2775:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 40s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 31s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 37s | The applied patch generated  1 
new checkstyle issues (total was 7, now 8). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m  3s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   5m 51s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  41m 47s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-nodemanager |
|  |  Nullcheck of NMWebServices$1.val$fis at line 251 of value previously 
dereferenced in 
org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebServices$1.write(OutputStream)
  At NMWebServices.java:251 of value previously dereferenced in 
org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebServices$1.write(OutputStream)
  At NMWebServices.java:[line 247] |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12678151/YARN-2775_001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e8d0ee5 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7667/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7667/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7667/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7667/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7667/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7667/console |


This message was automatically generated.

 There is no close method in NMWebServices#getLogs()
 ---

 Key: YARN-2775
 URL: https://issues.apache.org/jira/browse/YARN-2775
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: skrho
Priority: Minor
 Attachments: YARN-2775_001.patch


 If the getLogs method is called, fileInputStream objects accumulate in 
 memory, because the fileInputStream object is never closed.
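 A hedged sketch of the general fix pattern (not the exact NMWebServices code;
 logFile and os stand in for the requested log file and the response output
 stream): make sure the stream is closed even when writing the response fails.
 {code}
 FileInputStream fis = new FileInputStream(logFile);
 try {
   byte[] buf = new byte[65536];
   int len;
   while ((len = fis.read(buf)) != -1) {
     os.write(buf, 0, len);
   }
   os.flush();
 } finally {
   fis.close();   // previously missing: without this the stream object leaks
 }
 {code}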



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1426) YARN Components need to unregister their beans upon shutdown

2015-05-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525560#comment-14525560
 ] 

Hadoop QA commented on YARN-1426:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 13s | The applied patch generated  1 
new checkstyle issues (total was 76, now 72). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 56s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | mapreduce tests | 108m 26s | Tests passed in 
hadoop-mapreduce-client-jobclient. |
| {color:green}+1{color} | yarn tests |  52m 13s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | | 198m  7s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12637242/YARN-1426.patch |
| Optional Tests | javac unit findbugs checkstyle javadoc |
| git revision | trunk / 6ae2a0d |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7660/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7660/artifact/patchprocess/whitespace.txt
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7660/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7660/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7660/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7660/console |


This message was automatically generated.

 YARN Components need to unregister their beans upon shutdown
 

 Key: YARN-1426
 URL: https://issues.apache.org/jira/browse/YARN-1426
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.3.0
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Attachments: YARN-1426.patch, YARN-1426.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-47) [Umbrella] Security issues in YARN

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-47:

Assignee: (was: Vinod Kumar Vavilapalli)
 Summary: [Umbrella] Security issues in YARN  (was:  Security issues in 
YARN)

Resolving this very old umbrella JIRA. Security (auth + authz) has been largely 
complete/stable in YARN since this ticket was opened. And as new 
requirements come in, we can open new tickets.
 - Will leave the open sub-tasks as they are.
 - Unassigning from me given multiple contributors on the tasks.
 - No fix-version as this was done across releases.

 [Umbrella] Security issues in YARN
 --

 Key: YARN-47
 URL: https://issues.apache.org/jira/browse/YARN-47
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli

 JIRA tracking YARN related security issues.
 Moving over YARN only stuff from MAPREDUCE-3101.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-47) [Umbrella] Security issues in YARN

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-47.
-
Resolution: Fixed

 [Umbrella] Security issues in YARN
 --

 Key: YARN-47
 URL: https://issues.apache.org/jira/browse/YARN-47
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli

 JIRA tracking YARN related security issues.
 Moving over YARN only stuff from MAPREDUCE-3101.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-100) container-executor should deal with stdout, stderr better

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-100.
--
Resolution: Later

LOGFILE and ERRORFILE were always this way, and it has worked out for long 
enough.

I don't see requests to change them to point them to other files, going to 
close it as later for now. Please revert back if you disagree.

 container-executor should deal with stdout, stderr better
 -

 Key: YARN-100
 URL: https://issues.apache.org/jira/browse/YARN-100
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Priority: Minor

 container-executor.c contains the following code:
 {code}
   fclose(stdin);
   fflush(LOGFILE);
   if (LOGFILE != stdout) {
     fclose(stdout);
   }
   if (ERRORFILE != stderr) {
     fclose(stderr);
   }
   if (chdir(primary_app_dir) != 0) {
     fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
     return -1;
   }
   execvp(args[0], args);
 {code}
 Whenever you open a new file descriptor, its number is the lowest available 
 number.  So if {{stdout}} (fd number 1) has been closed, and you do 
 open("/my/important/file"), you'll get assigned file descriptor 1.  This 
 means that any printf statements in the program will now be printing to 
 /my/important/file.  Oops!
 The correct way to get rid of stdin, stdout, or stderr is not to close them, 
 but to make them point to /dev/null.  {{dup2}} can be used for this purpose.
 It looks like LOGFILE and ERRORFILE are always set to stdout and stderr at 
 the moment.  However, this is a latent bug that should be fixed in case these 
 are ever made configurable (which seems to have been the intent).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-149) [Umbrella] ResourceManager (RM) Fail-over

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-149:
-
Summary: [Umbrella] ResourceManager (RM) Fail-over  (was: ResourceManager 
(RM) High-Availability (HA))

 [Umbrella] ResourceManager (RM) Fail-over
 -

 Key: YARN-149
 URL: https://issues.apache.org/jira/browse/YARN-149
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Harsh J
  Labels: patch
 Attachments: YARN ResourceManager Automatic 
 Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic 
 Failover-rev-08-04-13.pdf, rm-ha-phase1-approach-draft1.pdf, 
 rm-ha-phase1-draft2.pdf


 This jira tracks work needed to be done to support one RM instance failing 
 over to another RM instance so that we can have RM HA. Work includes leader 
 election, transfer of control to leader and client re-direction to new leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-149) [Umbrella] ResourceManager (RM) Fail-over

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-149.
--
Resolution: Fixed

Resolving this umbrella JIRA. RM failover has largely been complete/stable in 
YARN since this ticket was opened. And as new requirements/bugs come in, we can 
open new tickets.
- Will leave the open sub-tasks as they are.
- No fix-version as this was done across releases.

 [Umbrella] ResourceManager (RM) Fail-over
 -

 Key: YARN-149
 URL: https://issues.apache.org/jira/browse/YARN-149
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Harsh J
  Labels: patch
 Attachments: YARN ResourceManager Automatic 
 Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic 
 Failover-rev-08-04-13.pdf, rm-ha-phase1-approach-draft1.pdf, 
 rm-ha-phase1-draft2.pdf


 This jira tracks work needed to be done to support one RM instance failing 
 over to another RM instance so that we can have RM HA. Work includes leader 
 election, transfer of control to leader and client re-direction to new leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-149) [Umbrella] ResourceManager (RM) Fail-over

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525595#comment-14525595
 ] 

Vinod Kumar Vavilapalli commented on YARN-149:
--

And thanks to [~kasha] and [~xgong] for the bulk of the work here.

 [Umbrella] ResourceManager (RM) Fail-over
 -

 Key: YARN-149
 URL: https://issues.apache.org/jira/browse/YARN-149
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Harsh J
  Labels: patch
 Attachments: YARN ResourceManager Automatic 
 Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic 
 Failover-rev-08-04-13.pdf, rm-ha-phase1-approach-draft1.pdf, 
 rm-ha-phase1-draft2.pdf


 This jira tracks work needed to be done to support one RM instance failing 
 over to another RM instance so that we can have RM HA. Work includes leader 
 election, transfer of control to leader and client re-direction to new leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-156) WebAppProxyServlet does not support http methods other than GET

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-156.
--
Resolution: Duplicate

Seems like there is more movement on this issue at YARN-2031. Given this, I am 
closing this as dup even though this was the earlier ticket to be created. 
Please revert back if you disagree.

 WebAppProxyServlet does not support http methods other than GET
 ---

 Key: YARN-156
 URL: https://issues.apache.org/jira/browse/YARN-156
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Thomas Weise

 Should support all methods so that applications can use it for full web 
 service access to master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-501) Application Master getting killed randomly reporting excess usage of memory

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-501.
--
Resolution: Not A Problem

Haven't gotten a response on my last comment in a while. IAC, it is unlikely 
YARN can do much in this situation. Closing this again as not-a-problem.

 Application Master getting killed randomly reporting excess usage of memory
 ---

 Key: YARN-501
 URL: https://issues.apache.org/jira/browse/YARN-501
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell, nodemanager
Affects Versions: 2.0.3-alpha
Reporter: Krishna Kishore Bonagiri
Assignee: Omkar Vinit Joshi

 I am running a date command using the Distributed Shell example in a loop of 
 500 times. It ran successfully all the times except one time where it gave 
 the following error.
 2013-03-22 04:33:25,280 INFO  [main] distributedshell.Client 
 (Client.java:monitorApplication(605)) - Got application report from ASM for, 
 appId=222, clientToken=null, appDiagnostics=Application 
 application_1363938200742_0222 failed 1 times due to AM Container for 
 appattempt_1363938200742_0222_01 exited with  exitCode: 143 due to: 
 Container [pid=21141,containerID=container_1363938200742_0222_01_01] is 
 running beyond virtual memory limits. Current usage: 47.3 Mb of 128 Mb 
 physical memory used; 611.6 Mb of 268.8 Mb virtual memory used. Killing 
 container.
 Dump of the process-tree for container_1363938200742_0222_01_01 :
 |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
 |- 21147 21141 21141 21141 (java) 244 12 532643840 11802 
 /home_/dsadm/yarn/jdk//bin/java -Xmx128m 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
 --container_memory 10 --num_containers 2 --priority 0 --shell_command date
 |- 21141 8433 21141 21141 (bash) 0 0 108642304 298 /bin/bash -c 
 /home_/dsadm/yarn/jdk//bin/java -Xmx128m 
 org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
 --container_memory 10 --num_containers 2 --priority 0 --shell_command date 
 1>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_01/AppMaster.stdout
  
 2>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_01/AppMaster.stderr



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-526) [Umbrella] Improve test coverage in YARN

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-526.
--
Resolution: Fixed
  Assignee: Andrey Klochkov

Assigning the umbrella also to [~aklochkov] and closing it as fixed as all 
sub-tasks currently present are done.

 [Umbrella] Improve test coverage in YARN
 

 Key: YARN-526
 URL: https://issues.apache.org/jira/browse/YARN-526
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Vinod Kumar Vavilapalli
Assignee: Andrey Klochkov





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-526) [Umbrella] Improve test coverage in YARN

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525607#comment-14525607
 ] 

Vinod Kumar Vavilapalli commented on YARN-526:
--

And no fix-versions as sub-tasks spanned releases.

 [Umbrella] Improve test coverage in YARN
 

 Key: YARN-526
 URL: https://issues.apache.org/jira/browse/YARN-526
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Vinod Kumar Vavilapalli
Assignee: Andrey Klochkov





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-543) [Umbrella] NodeManager localization related issues

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-543.
--
Resolution: Fixed

Resolving this very old umbrella JIRA. Most of the originally identified issues 
are resolved. And as new bugs come in, we can open new tickets.
- Will leave the open sub-tasks as they are.
- No fix-version as this was done across releases.

 [Umbrella] NodeManager localization related issues
 --

 Key: YARN-543
 URL: https://issues.apache.org/jira/browse/YARN-543
 Project: Hadoop YARN
  Issue Type: Task
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli

 Seeing a bunch of localization related issues being worked on, this is the 
 tracking ticket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-128) [Umbrella] RM Restart Phase 1: State storage and non-work-preserving recovery

2015-05-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-128:
-
Summary: [Umbrella] RM Restart Phase 1: State storage and 
non-work-preserving recovery  (was: RM Restart)

 [Umbrella] RM Restart Phase 1: State storage and non-work-preserving recovery
 -

 Key: YARN-128
 URL: https://issues.apache.org/jira/browse/YARN-128
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
 Attachments: MR-4343.1.patch, RM-recovery-initial-thoughts.txt, 
 RMRestartPhase1.pdf, YARN-128.full-code-4.patch, YARN-128.full-code.3.patch, 
 YARN-128.full-code.5.patch, YARN-128.new-code-added-4.patch, 
 YARN-128.new-code-added.3.patch, YARN-128.old-code-removed.3.patch, 
 YARN-128.old-code-removed.4.patch, YARN-128.patch, 
 restart-12-11-zkstore.patch, restart-fs-store-11-17.patch, 
 restart-zk-store-11-17.patch


 This umbrella jira tracks the work needed to preserve critical state 
 information and reload them upon RM restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)