[jira] [Updated] (YARN-1366) AM should implement Resync with the ApplicationMasterService instead of shutting down
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-1366: - Attachment: YARN-1366.7.patch AM should implement Resync with the ApplicationMasterService instead of shutting down - Key: YARN-1366 URL: https://issues.apache.org/jira/browse/YARN-1366 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Rohith Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.3.patch, YARN-1366.4.patch, YARN-1366.5.patch, YARN-1366.6.patch, YARN-1366.7.patch, YARN-1366.patch, YARN-1366.prototype.patch, YARN-1366.prototype.patch The ApplicationMasterService currently sends a resync response to which the AM responds by shutting down. The AM behavior is expected to change to resyncing with the RM instead. Resync means resetting the allocate RPC sequence number to 0, and the AM should resend its entire outstanding request to the RM. Note that if the AM is making its first allocate call to the RM then things should proceed as normal without needing a resync. The RM will return all containers that have completed since the RM last synced with the AM. Some container completions may be reported more than once. -- This message was sent by Atlassian JIRA (v6.2#6252)
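The resync contract described above is easy to get subtly wrong on the AM side, so here is a minimal, self-contained sketch of the intended behaviour. Everything in it is a plain-Java stand-in (the RmStub class and the use of IllegalStateException as the resync signal are illustrative inventions, not the AMRMClient API or the attached patch): on resync the AM resets its allocate sequence number to 0, resends every outstanding ask, and tolerates duplicate container-completion reports.
{code}
import java.util.ArrayList;
import java.util.List;

public class ResyncSketch {
  /** Stand-in for the RM side of allocate(); rejects an out-of-sync responseId. */
  static class RmStub {
    int expectedResponseId = 0;
    List<String> allocate(int responseId, List<String> asks) {
      if (responseId != expectedResponseId) {
        throw new IllegalStateException("resync");   // stands in for the RESYNC signal
      }
      expectedResponseId++;
      return new ArrayList<String>();                // completed containers (may repeat entries)
    }
  }

  public static void main(String[] args) {
    RmStub rm = new RmStub();                        // freshly restarted RM: expects responseId 0
    List<String> outstanding = new ArrayList<String>();
    outstanding.add("ask-container-1");
    outstanding.add("ask-container-2");
    int responseId = 5;                              // the AM believes it is at heartbeat 5
    try {
      rm.allocate(responseId, outstanding);
    } catch (IllegalStateException resync) {
      responseId = 0;                                // reset the allocate sequence number to 0
      List<String> completed = rm.allocate(responseId, outstanding); // resend ALL outstanding asks
      System.out.println("resynced, resent " + outstanding.size()
          + " asks, got " + completed.size() + " completions (duplicates possible)");
    }
  }
}
{code}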
[jira] [Commented] (YARN-1366) AM should implement Resync with the ApplicationMasterService instead of shutting down
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047072#comment-14047072 ] Hadoop QA commented on YARN-1366: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12653045/YARN-1366.7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4132//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4132//console This message is automatically generated. AM should implement Resync with the ApplicationMasterService instead of shutting down - Key: YARN-1366 URL: https://issues.apache.org/jira/browse/YARN-1366 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Rohith Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.3.patch, YARN-1366.4.patch, YARN-1366.5.patch, YARN-1366.6.patch, YARN-1366.7.patch, YARN-1366.patch, YARN-1366.prototype.patch, YARN-1366.prototype.patch The ApplicationMasterService currently sends a resync response to which the AM responds by shutting down. The AM behavior is expected to change to calling resyncing with the RM. Resync means resetting the allocate RPC sequence number to 0 and the AM should send its entire outstanding request to the RM. Note that if the AM is making its first allocate call to the RM then things should proceed like normal without needing a resync. The RM will return all containers that have completed since the RM last synced with the AM. Some container completions may be reported more than once. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2231) Provide feature to limit MRJob's stdout/stderr size
huozhanfeng created YARN-2231: - Summary: Provide feature to limit MRJob's stdout/stderr size Key: YARN-2231 URL: https://issues.apache.org/jira/browse/YARN-2231 Project: Hadoop YARN Issue Type: Improvement Components: log-aggregation, nodemanager Affects Versions: 2.3.0 Environment: CentOS release 5.8 (Final) Reporter: huozhanfeng When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild 10.106.24.108 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect. Then, when I use export YARN_NODEMANAGER_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y to debug the NodeManager, I find that if I set breakpoints at org.apache.hadoop.util.Shell (line 450: process = builder.start()) and org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch (line 161: List<String> newCmds = new ArrayList<String>(command.size())), the command works. I suspect a concurrency problem is causing the piped shell command not to work properly. It matters, and I need your help. my email: huozhanf...@gmail.com thanks -- This message was sent by Atlassian JIRA (v6.2#6252)
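The core idea of the request is the shell pattern above: piping the task's output through tail -c <limit> bounds the size of the on-disk file, and exit $PIPESTATUS preserves the task's own exit code rather than tail's. A small hedged demo of that pattern (assuming a POSIX shell is available; the seq command, file name and limit are stand-ins, not YARN code):
{code}
import java.io.File;
import java.io.IOException;

public class TailCapDemo {
  public static void main(String[] args) throws IOException, InterruptedException {
    long limit = 10240;                               // mirrors yarn.app.container.log.filesize
    File stdout = new File("stdout.capped");          // stand-in for the container stdout file
    String cmd = "( seq 1 100000 ) | tail -c " + limit + " > " + stdout.getAbsolutePath()
        + " ; exit $PIPESTATUS";                      // propagate the task's exit code, not tail's
    Process p = new ProcessBuilder("/bin/bash", "-c", cmd).inheritIO().start();
    int rc = p.waitFor();
    System.out.println("exit=" + rc + ", capped size=" + stdout.length()
        + " bytes (limit " + limit + ")");
  }
}
{code}
Running this produces a file no larger than the limit even though the piped command prints far more, which is exactly the capping behaviour the reporter wants for container logs.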
[jira] [Updated] (YARN-2231) Provide feature to limit MRJob's stdout/stderr size
[ https://issues.apache.org/jira/browse/YARN-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huozhanfeng updated YARN-2231: -- Description: When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild 10.106.24.108 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect. Then, when I use export YARN_NODEMANAGER_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y to debug the NodeManager, I find that if I set breakpoints at org.apache.hadoop.util.Shell (line 450: process = builder.start()) and org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch (line 161: List<String> newCmds = new ArrayList<String>(command.size())), the command works. I suspect a concurrency problem is causing the piped shell command not to work properly. It matters, and I need your help. my email: huozhanf...@gmail.com thanks was: When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild 10.106.24.108 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect. Then, when I use export YARN_NODEMANAGER_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y to debug the NodeManager, I find that if I set breakpoints at org.apache.hadoop.util.Shell (line 450: process = builder.start()) and org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch (line 161: List<String> newCmds = new ArrayList<String>(command.size())), the command works. I suspect a concurrency problem is causing the piped shell command not to work properly. It matters, and I need your help.
my email: huozhanf...@gmail.com thanks Provide feature to limit MRJob's stdout/stderr size Key: YARN-2231 URL: https://issues.apache.org/jira/browse/YARN-2231 Project: Hadoop YARN Issue Type: Improvement Components: log-aggregation, nodemanager Affects Versions: 2.3.0 Environment: CentOS release 5.8 (Final) Reporter: huozhanfeng Labels: features When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild 10.106.24.108 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take
[jira] [Commented] (YARN-2231) Provide feature to limit MRJob's stdout/stderr size
[ https://issues.apache.org/jira/browse/YARN-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047081#comment-14047081 ] huozhanfeng commented on YARN-2231: ---
Index: MapReduceChildJVM.java
===================================================================
--- MapReduceChildJVM.java (revision 1387)
+++ MapReduceChildJVM.java (revision 1388)
@@ -37,6 +37,7 @@
 @SuppressWarnings("deprecation")
 public class MapReduceChildJVM {
+  private static final String tailCommand = "tail";
   private static String getTaskLogFile(LogName filter) {
     return ApplicationConstants.LOG_DIR_EXPANSION_VAR + Path.SEPARATOR +
         filter.toString();
@@ -161,9 +162,12 @@
     TaskAttemptID attemptID = task.getTaskID();
     JobConf conf = task.conf;
-
+    long logSize = TaskLog.getTaskLogLength(conf);
+
     Vector<String> vargs = new Vector<String>(8);
-
+    if (logSize > 0) {
+      vargs.add("(");
+    }
     vargs.add(Environment.JAVA_HOME.$() + "/bin/java");
     // Add child (task) java-vm options.
@@ -206,7 +210,6 @@
     vargs.add("-Djava.io.tmpdir=" + childTmpDir);
     // Setup the log4j prop
-    long logSize = TaskLog.getTaskLogLength(conf);
     setupLog4jProperties(task, vargs, logSize);
     if (conf.getProfileEnabled()) {
@@ -229,8 +232,22 @@
     // Finally add the jvmID
     vargs.add(String.valueOf(jvmID.getId()));
-    vargs.add("1>" + getTaskLogFile(TaskLog.LogName.STDOUT));
-    vargs.add("2>" + getTaskLogFile(TaskLog.LogName.STDERR));
+    if (logSize > 0) {
+      vargs.add("|");
+      vargs.add(tailCommand);
+      vargs.add("-c");
+      vargs.add(String.valueOf(logSize));
+      vargs.add(">" + getTaskLogFile(TaskLog.LogName.STDOUT));
+      vargs.add("; exit $PIPESTATUS ) 2>&1 |");
+      vargs.add(tailCommand);
+      vargs.add("-c");
+      vargs.add(String.valueOf(logSize));
+      vargs.add(">" + getTaskLogFile(TaskLog.LogName.STDERR));
+      vargs.add("; exit $PIPESTATUS");
+    } else {
+      vargs.add("1>" + getTaskLogFile(TaskLog.LogName.STDOUT));
+      vargs.add("2>" + getTaskLogFile(TaskLog.LogName.STDERR));
+    }
     // Final commmand
     StringBuilder mergedCommand = new StringBuilder();
Provide feature to limit MRJob's stdout/stderr size Key: YARN-2231 URL: https://issues.apache.org/jira/browse/YARN-2231 Project: Hadoop YARN Issue Type: Improvement Components: log-aggregation, nodemanager Affects Versions: 2.3.0 Environment: CentOS release 5.8 (Final) Reporter: huozhanfeng Labels: features When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild 10.106.24.108 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect. Then, when I use export YARN_NODEMANAGER_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y to debug the NodeManager, I find that if I set breakpoints at org.apache.hadoop.util.Shell (line 450: process = builder.start()) and org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch (line 161: List<String> newCmds = new ArrayList<String>(command.size())), the command works. I suspect a concurrency problem is causing the piped shell command not to work properly. It matters, and I need your help. my email: huozhanf...@gmail.com thanks -- This message was sent by Atlassian JIRA (v6.2#6252)
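The diff above builds the pipe and redirection as individual tokens and then merges them into one string. A simplified stand-in (not the Hadoop code) of that final merge step, which also illustrates why the result only caps the logs if the merged string is later handed to a shell:
{code}
import java.util.Arrays;
import java.util.List;

public class MergedCommandDemo {
  public static void main(String[] args) {
    // Illustrative tokens only; the real vargs hold the full java command line.
    List<String> vargs = Arrays.asList(
        "echo", "hello-from-task", "2",                 // program, args, jvm id
        "|", "tail", "-c", "10240", ">", "stdout",      // cap stdout via tail
        "; exit $PIPESTATUS");
    StringBuilder mergedCommand = new StringBuilder();  // mirrors the diff's mergedCommand
    for (String v : vargs) {
      mergedCommand.append(v).append(" ");
    }
    // "|" and ">" only act as pipe/redirection if this single string is executed
    // through a shell (e.g. /bin/bash -c "<mergedCommand>"); passed directly to
    // exec(), they are just ordinary arguments and the capping has no effect.
    System.out.println(mergedCommand.toString().trim());
  }
}
{code}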
[jira] [Updated] (YARN-2231) Provide feature to limit MRJob's stdout/stderr size
[ https://issues.apache.org/jira/browse/YARN-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huozhanfeng updated YARN-2231: -- Description: When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild ${test_IP} 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect. Then, when I use export YARN_NODEMANAGER_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y to debug the NodeManager, I find that if I set breakpoints at org.apache.hadoop.util.Shell (line 450: process = builder.start()) and org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch (line 161: List<String> newCmds = new ArrayList<String>(command.size())), the command works. I suspect a concurrency problem is causing the piped shell command not to work properly. It matters, and I need your help. my email: huozhanf...@gmail.com thanks was: When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild 10.106.24.108 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect. Then, when I use export YARN_NODEMANAGER_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y to debug the NodeManager, I find that if I set breakpoints at org.apache.hadoop.util.Shell (line 450: process = builder.start()) and org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch (line 161: List<String> newCmds = new ArrayList<String>(command.size())), the command works. I suspect a concurrency problem is causing the piped shell command not to work properly. It matters, and I need your help.
my email: huozhanf...@gmail.com thanks Provide feature to limit MRJob's stdout/stderr size Key: YARN-2231 URL: https://issues.apache.org/jira/browse/YARN-2231 Project: Hadoop YARN Issue Type: Improvement Components: log-aggregation, nodemanager Affects Versions: 2.3.0 Environment: CentOS release 5.8 (Final) Reporter: huozhanfeng Labels: features When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild ${test_IP} 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect.
[jira] [Updated] (YARN-2231) Provide feature to limit MRJob's stdout/stderr size
[ https://issues.apache.org/jira/browse/YARN-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huozhanfeng updated YARN-2231: -- Description: When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild $test_IP 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect. Then, when I use export YARN_NODEMANAGER_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y to debug the NodeManager, I find that if I set breakpoints at org.apache.hadoop.util.Shell (line 450: process = builder.start()) and org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch (line 161: List<String> newCmds = new ArrayList<String>(command.size())), the command works. I suspect a concurrency problem is causing the piped shell command not to work properly. It matters, and I need your help. my email: huozhanf...@gmail.com thanks was: When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild ${test_IP} 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect. Then, when I use export YARN_NODEMANAGER_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y to debug the NodeManager, I find that if I set breakpoints at org.apache.hadoop.util.Shell (line 450: process = builder.start()) and org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch (line 161: List<String> newCmds = new ArrayList<String>(command.size())), the command works. I suspect a concurrency problem is causing the piped shell command not to work properly. It matters, and I need your help.
my email: huozhanf...@gmail.com thanks Provide feature to limit MRJob's stdout/stderr size Key: YARN-2231 URL: https://issues.apache.org/jira/browse/YARN-2231 Project: Hadoop YARN Issue Type: Improvement Components: log-aggregation, nodemanager Affects Versions: 2.3.0 Environment: CentOS release 5.8 (Final) Reporter: huozhanfeng Labels: features When an MR job prints too much stdout or stderr log, the disk fills up, which now affects our platform management. I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (adapting code from org.apache.hadoop.mapred.TaskLog) to generate the launch command as follows: exec /bin/bash -c ( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02 -Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild $test_IP 53911 attempt_1403930653208_0003_m_00_0 2 | tail -c 102 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stdout ; exit $PIPESTATUS ) 2>&1 | tail -c 10240 > /logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_02/stderr ; exit $PIPESTATUS But it doesn't take effect. And
[jira] [Commented] (YARN-614) Separate AM failures from hardware failure or YARN error and do not count them to AM retry count
[ https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047096#comment-14047096 ] Hudson commented on YARN-614: - FAILURE: Integrated in Hadoop-Yarn-trunk #598 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/598/]) YARN-614. Changed ResourceManager to not count disk failure, node loss and RM restart towards app failures. Contributed by Xuan Gong (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1606407) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java Separate AM failures from hardware failure or YARN error and do not count them to AM retry count Key: YARN-614 URL: https://issues.apache.org/jira/browse/YARN-614 Project: Hadoop YARN Issue Type: Improvement Reporter: Bikas Saha Assignee: Xuan Gong Fix For: 2.5.0 Attachments: YARN-614-0.patch, YARN-614-1.patch, YARN-614-2.patch, YARN-614-3.patch, YARN-614-4.patch, YARN-614-5.patch, YARN-614-6.patch, YARN-614.10.patch, YARN-614.11.patch, YARN-614.12.patch, YARN-614.13.patch, YARN-614.7.patch, YARN-614.8.patch, YARN-614.9.patch Attempts can fail due to a large number of user errors and they should not be retried unnecessarily. The only reason YARN should retry an attempt is when the hardware fails or YARN has an error. NM failing, lost NM and NM disk errors are the hardware errors that come to mind. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-679) add an entry point that can start any Yarn service
[ https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-679: Attachment: YARN-679-003.patch Revised patch # uses reflection to load the HDFS and YARN configurations if present -so forcing in their resources # uses {{GenericOptionsParser}} to parse the options -so the command line is now consistent with ToolRunner. (There's one extra constraint -that all configs resolve to valid paths in the filesystem) # {{GenericOptionsParser}} adds a flag to indicate whether or not the parse worked...until now it looks like an invalid set of generic options could still get handed down to the tool add an entry point that can start any Yarn service -- Key: YARN-679 URL: https://issues.apache.org/jira/browse/YARN-679 Project: Hadoop YARN Issue Type: Sub-task Components: api Affects Versions: 2.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: YARN-679-001.patch, YARN-679-002.patch, YARN-679-002.patch, YARN-679-003.patch, org.apache.hadoop.servic...mon 3.0.0-SNAPSHOT API).pdf Time Spent: 72h Remaining Estimate: 0h There's no need to write separate .main classes for every Yarn service, given that the startup mechanism should be identical: create, init, start, wait for stopped -with an interrupt handler to trigger a clean shutdown on a control-c interrupt. Provide one that takes any classname, and a list of config files/options -- This message was sent by Atlassian JIRA (v6.2#6252)
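The entry-point idea above (create, init, start, wait for stopped, with a Ctrl-C handler for clean shutdown) can be sketched in a few lines. This is a minimal stand-in only: the SimpleService interface and class layout are inventions for illustration, not org.apache.hadoop.service.Service, GenericOptionsParser, or the attached patch.
{code}
import java.util.concurrent.CountDownLatch;

interface SimpleService {
  void init(String[] confArgs);
  void start();
  void stop();
}

public class ServiceLauncher {
  public static void main(String[] args) throws Exception {
    if (args.length < 1) {
      System.err.println("usage: ServiceLauncher <service-classname> [config options...]");
      System.exit(1);
    }
    // Reflectively load whatever service class was named on the command line.
    SimpleService svc = (SimpleService) Class.forName(args[0])
        .getDeclaredConstructor().newInstance();
    String[] rest = new String[args.length - 1];
    System.arraycopy(args, 1, rest, 0, rest.length);

    CountDownLatch stopped = new CountDownLatch(1);
    // Interrupt handler: Ctrl-C triggers a clean stop before the JVM exits.
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
      svc.stop();
      stopped.countDown();
    }));

    svc.init(rest);   // create/init with the remaining config files/options
    svc.start();      // start
    stopped.await();  // wait for stopped
  }
}
{code}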
[jira] [Commented] (YARN-2065) AM cannot create new containers after restart-NM token from previous attempt used
[ https://issues.apache.org/jira/browse/YARN-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047111#comment-14047111 ] Steve Loughran commented on YARN-2065: -- I'll try to run my code against this patch this week AM cannot create new containers after restart-NM token from previous attempt used - Key: YARN-2065 URL: https://issues.apache.org/jira/browse/YARN-2065 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Steve Loughran Assignee: Jian He Attachments: YARN-2065.1.patch Slider AM Restart failing (SLIDER-34). The AM comes back up, but it cannot create new containers. The Slider minicluster test {{TestKilledAM}} can replicate this reliably -it kills the AM, then kills a container while the AM is down, which triggers a reallocation of a container, leading to this failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-679) add an entry point that can start any Yarn service
[ https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047118#comment-14047118 ] Hadoop QA commented on YARN-679: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12653051/YARN-679-003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 31 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1267 javac compiler warnings (more than the trunk's current 1258 warnings). {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 5 warning messages. See https://builds.apache.org/job/PreCommit-YARN-Build/4133//artifact/trunk/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.service.launcher.TestServiceLaunchedRunning org.apache.hadoop.service.launcher.TestServiceLaunchNoArgsAllowed {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4133//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/4133//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4133//console This message is automatically generated. add an entry point that can start any Yarn service -- Key: YARN-679 URL: https://issues.apache.org/jira/browse/YARN-679 Project: Hadoop YARN Issue Type: Sub-task Components: api Affects Versions: 2.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: YARN-679-001.patch, YARN-679-002.patch, YARN-679-002.patch, YARN-679-003.patch, org.apache.hadoop.servic...mon 3.0.0-SNAPSHOT API).pdf Time Spent: 72h Remaining Estimate: 0h There's no need to write separate .main classes for every Yarn service, given that the startup mechanism should be identical: create, init, start, wait for stopped -with an interrupt handler to trigger a clean shutdown on a control-c interrrupt. Provide one that takes any classname, and a list of config files/options -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-614) Separate AM failures from hardware failure or YARN error and do not count them to AM retry count
[ https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047127#comment-14047127 ] Hudson commented on YARN-614: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1789 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1789/]) YARN-614. Changed ResourceManager to not count disk failure, node loss and RM restart towards app failures. Contributed by Xuan Gong (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1606407) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java Separate AM failures from hardware failure or YARN error and do not count them to AM retry count Key: YARN-614 URL: https://issues.apache.org/jira/browse/YARN-614 Project: Hadoop YARN Issue Type: Improvement Reporter: Bikas Saha Assignee: Xuan Gong Fix For: 2.5.0 Attachments: YARN-614-0.patch, YARN-614-1.patch, YARN-614-2.patch, YARN-614-3.patch, YARN-614-4.patch, YARN-614-5.patch, YARN-614-6.patch, YARN-614.10.patch, YARN-614.11.patch, YARN-614.12.patch, YARN-614.13.patch, YARN-614.7.patch, YARN-614.8.patch, YARN-614.9.patch Attempts can fail due to a large number of user errors and they should not be retried unnecessarily. The only reason YARN should retry an attempt is when the hardware fails or YARN has an error. NM failing, lost NM and NM disk errors are the hardware errors that come to mind. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-614) Separate AM failures from hardware failure or YARN error and do not count them to AM retry count
[ https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047137#comment-14047137 ] Hudson commented on YARN-614: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1816 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1816/]) YARN-614. Changed ResourceManager to not count disk failure, node loss and RM restart towards app failures. Contributed by Xuan Gong (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1606407) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java Separate AM failures from hardware failure or YARN error and do not count them to AM retry count Key: YARN-614 URL: https://issues.apache.org/jira/browse/YARN-614 Project: Hadoop YARN Issue Type: Improvement Reporter: Bikas Saha Assignee: Xuan Gong Fix For: 2.5.0 Attachments: YARN-614-0.patch, YARN-614-1.patch, YARN-614-2.patch, YARN-614-3.patch, YARN-614-4.patch, YARN-614-5.patch, YARN-614-6.patch, YARN-614.10.patch, YARN-614.11.patch, YARN-614.12.patch, YARN-614.13.patch, YARN-614.7.patch, YARN-614.8.patch, YARN-614.9.patch Attempts can fail due to a large number of user errors and they should not be retried unnecessarily. The only reason YARN should retry an attempt is when the hardware fails or YARN has an error. NM failing, lost NM and NM disk errors are the hardware errors that come to mind. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2065) AM cannot create new containers after restart-NM token from previous attempt used
[ https://issues.apache.org/jira/browse/YARN-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-2065: - Attachment: YARN-2065-002.patch This is the previous patch, in sync with trunk AM cannot create new containers after restart-NM token from previous attempt used - Key: YARN-2065 URL: https://issues.apache.org/jira/browse/YARN-2065 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Steve Loughran Assignee: Jian He Attachments: YARN-2065-002.patch, YARN-2065.1.patch Slider AM Restart failing (SLIDER-34). The AM comes back up, but it cannot create new containers. The Slider minicluster test {{TestKilledAM}} can replicate this reliably -it kills the AM, then kills a container while the AM is down, which triggers a reallocation of a container, leading to this failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2065) AM cannot create new containers after restart-NM token from previous attempt used
[ https://issues.apache.org/jira/browse/YARN-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047171#comment-14047171 ] Hadoop QA commented on YARN-2065: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12653066/YARN-2065-002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4134//console This message is automatically generated. AM cannot create new containers after restart-NM token from previous attempt used - Key: YARN-2065 URL: https://issues.apache.org/jira/browse/YARN-2065 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Steve Loughran Assignee: Jian He Attachments: YARN-2065-002.patch, YARN-2065.1.patch Slider AM Restart failing (SLIDER-34). The AM comes back up, but it cannot create new containers. The Slider minicluster test {{TestKilledAM}} can replicate this reliably -it kills the AM, then kills a container while the AM is down, which triggers a reallocation of a container, leading to this failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken
[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047202#comment-14047202 ] Hudson commented on YARN-2052: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5799 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5799/]) YARN-2052. Embedded an epoch number in container id to ensure the uniqueness of container id after RM restarts. Contributed by Tsuyoshi OZAWA (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1606557) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/Epoch.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/impl/pb/EpochPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerApplicationAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSSchedulerApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestMaxRunningAppsEnforcer.java ContainerId creation after work preserving restart is broken Key: YARN-2052 URL: https://issues.apache.org/jira/browse/YARN-2052 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA
[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken
[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047207#comment-14047207 ] Vinod Kumar Vavilapalli commented on YARN-2052: --- Shouldn't epoch have a default value in the proto file? What is the default if it isn't provided? Thinking from backwards compatibility point of view.. ContainerId creation after work preserving restart is broken Key: YARN-2052 URL: https://issues.apache.org/jira/browse/YARN-2052 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.5.0 Attachments: YARN-2052.1.patch, YARN-2052.10.patch, YARN-2052.11.patch, YARN-2052.12.patch, YARN-2052.2.patch, YARN-2052.3.patch, YARN-2052.4.patch, YARN-2052.5.patch, YARN-2052.6.patch, YARN-2052.7.patch, YARN-2052.8.patch, YARN-2052.9.patch, YARN-2052.9.patch Container ids are made unique by using the app identifier and appending a monotonically increasing sequence number to it. Since container creation is a high churn activity the RM does not store the sequence number per app. So after restart it does not know what the new sequence number should be for new allocations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken
[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047210#comment-14047210 ] Jian He commented on YARN-2052: --- bq. For numeric types, the default value is zero. Copied from the protocol buffers guide. In this case, it should be fine; we can explicitly add it too if needed. ContainerId creation after work preserving restart is broken Key: YARN-2052 URL: https://issues.apache.org/jira/browse/YARN-2052 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.5.0 Attachments: YARN-2052.1.patch, YARN-2052.10.patch, YARN-2052.11.patch, YARN-2052.12.patch, YARN-2052.2.patch, YARN-2052.3.patch, YARN-2052.4.patch, YARN-2052.5.patch, YARN-2052.6.patch, YARN-2052.7.patch, YARN-2052.8.patch, YARN-2052.9.patch, YARN-2052.9.patch Container ids are made unique by using the app identifier and appending a monotonically increasing sequence number to it. Since container creation is a high-churn activity, the RM does not store the sequence number per app. So after restart it does not know what the new sequence number should be for new allocations. -- This message was sent by Atlassian JIRA (v6.2#6252)
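A quick illustration of the backward-compatibility point being discussed: a missing numeric proto field reads back as 0, so ids minted before the epoch field existed decode unchanged, while a bumped epoch keeps post-restart ids from colliding. The bit split below is purely illustrative and not YARN's actual container-id encoding.
{code}
public class EpochIdDemo {
  // Fold a restart epoch into the high bits of an id (illustrative 40/24 split).
  static long containerId(long epoch, long sequence) {
    return (epoch << 40) | sequence;
  }

  public static void main(String[] args) {
    long beforeFeature = containerId(0, 7);   // field absent -> proto default 0, id unchanged
    long afterRestart  = containerId(1, 7);   // RM restarted once, same sequence number
    System.out.println("epoch 0: " + beforeFeature);  // equals the pre-epoch id 7
    System.out.println("epoch 1: " + afterRestart);   // distinct, so no collision after restart
  }
}
{code}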
[jira] [Updated] (YARN-2142) Add one service to check the nodes' TRUST status
[ https://issues.apache.org/jira/browse/YARN-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anders updated YARN-2142: - Attachment: trust002.patch Modified the XML file for testing. Add one service to check the nodes' TRUST status - Key: YARN-2142 URL: https://issues.apache.org/jira/browse/YARN-2142 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager, scheduler Affects Versions: 2.2.0 Environment: OS:Ubuntu 13.04; JAVA:OpenJDK 7u51-2.4.4-0 Reporter: anders Priority: Minor Labels: patch Fix For: 2.2.0 Attachments: test.patch, trust.patch, trust.patch, trust.patch, trust001.patch, trust002.patch Original Estimate: 1m Remaining Estimate: 1m Because of our critical computing environment, we must test every node's TRUST status in the cluster (we can get the TRUST status from the API of the OAT server), so I added this feature to Hadoop's scheduling. Through the TRUST check service, a node can get its own TRUST status and then, through the heartbeat, send the TRUST status to the resource manager for scheduling. In the scheduling step, if a node's TRUST status is 'false', it will be abandoned until its TRUST status turns to 'true'. ***The logic of this feature is similar to the node's health check service. -- This message was sent by Atlassian JIRA (v6.2#6252)
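For readers unfamiliar with the proposal, the shape of such a checker resembles the existing node health-check service: poll a trust source periodically, cache the result, and let the heartbeat report it. The sketch below is a stand-in only (the Supplier stands in for the OAT server API call, which is assumed here; this is not the attached patch).
{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class TrustStatusChecker {
  private volatile boolean trusted = true;
  private final Supplier<Boolean> oatQuery;   // stand-in for the OAT server API call
  private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

  public TrustStatusChecker(Supplier<Boolean> oatQuery) {
    this.oatQuery = oatQuery;
  }

  public void start(long intervalSeconds) {
    // Refresh the cached TRUST status periodically.
    timer.scheduleAtFixedRate(() -> trusted = oatQuery.get(), 0, intervalSeconds, TimeUnit.SECONDS);
  }

  // Read at heartbeat time; a scheduler would skip the node while this is false.
  public boolean isTrusted() {
    return trusted;
  }

  public void stop() {
    timer.shutdownNow();
  }

  public static void main(String[] args) throws InterruptedException {
    TrustStatusChecker checker = new TrustStatusChecker(() -> Math.random() > 0.5); // fake OAT query
    checker.start(1);
    Thread.sleep(3000);
    System.out.println("node trusted: " + checker.isTrusted());
    checker.stop();
  }
}
{code}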
[jira] [Commented] (YARN-2142) Add one service to check the nodes' TRUST status
[ https://issues.apache.org/jira/browse/YARN-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047348#comment-14047348 ] Hadoop QA commented on YARN-2142: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12653086/trust002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The applied patch generated 1266 javac compiler warnings (more than the trunk's current 1258 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4135//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/4135//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4135//console This message is automatically generated. Add one service to check the nodes' TRUST status - Key: YARN-2142 URL: https://issues.apache.org/jira/browse/YARN-2142 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager, scheduler Affects Versions: 2.2.0 Environment: OS:Ubuntu 13.04; JAVA:OpenJDK 7u51-2.4.4-0 Reporter: anders Priority: Minor Labels: patch Fix For: 2.2.0 Attachments: test.patch, trust.patch, trust.patch, trust.patch, trust001.patch, trust002.patch Original Estimate: 1m Remaining Estimate: 1m Because of critical computing environment ,we must test every node's TRUST status in the cluster (We can get the TRUST status by the API of OAT sever),So I add this feature into hadoop's schedule . By the TRUST check service ,node can get the TRUST status of itself, then through the heartbeat ,send the TRUST status to resource manager for scheduling. In the scheduling step,if the node's TRUST status is 'false', it will be abandoned until it's TRUST status turn to 'true'. ***The logic of this feature is similar to node's health checkservice. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2181) Add preemption info to RM Web UI
[ https://issues.apache.org/jira/browse/YARN-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2181: - Description: We need add preemption info to RM web page to make administrator/user get more understanding about preemption happened on app, etc. (was: We need add preemption info to RM web page to make administrator/user get more understanding about preemption happened on app/queue, etc. ) Add preemption info to RM Web UI Key: YARN-2181 URL: https://issues.apache.org/jira/browse/YARN-2181 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Affects Versions: 2.4.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, application page.png, queue page.png We need add preemption info to RM web page to make administrator/user get more understanding about preemption happened on app, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2181) Add preemption info to RM Web UI
[ https://issues.apache.org/jira/browse/YARN-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047353#comment-14047353 ] Wangda Tan commented on YARN-2181: -- The previous comment, bq. We can address preemption info in separated JIRA. should be We can address preemption info *of queues* in separated JIRA. Add preemption info to RM Web UI Key: YARN-2181 URL: https://issues.apache.org/jira/browse/YARN-2181 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Affects Versions: 2.4.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, application page.png, queue page.png We need add preemption info to RM web page to make administrator/user get more understanding about preemption happened on app, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2181) Add preemption info to RM Web UI
[ https://issues.apache.org/jira/browse/YARN-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2181: - Attachment: YARN-2181.patch Offline discussed with [~jianhe], we decided to remove queue metrics from RM web UI because we cannot make metrics info consistent on queue page / app page, (it is possible that sum of preempted resource from apps under a queue is not equal to preempted resource in a queue). We can address preemption info in separated JIRA. Add preemption info to RM Web UI Key: YARN-2181 URL: https://issues.apache.org/jira/browse/YARN-2181 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Affects Versions: 2.4.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, application page.png, queue page.png We need add preemption info to RM web page to make administrator/user get more understanding about preemption happened on app, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2181) Add preemption info to RM Web UI
[ https://issues.apache.org/jira/browse/YARN-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047354#comment-14047354 ] Hadoop QA commented on YARN-2181: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12653089/YARN-2181.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4136//console This message is automatically generated. Add preemption info to RM Web UI Key: YARN-2181 URL: https://issues.apache.org/jira/browse/YARN-2181 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Affects Versions: 2.4.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, application page.png, queue page.png We need add preemption info to RM web page to make administrator/user get more understanding about preemption happened on app, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2181) Add preemption info to RM Web UI
[ https://issues.apache.org/jira/browse/YARN-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2181: - Attachment: YARN-2181.patch Rebased against trunk Add preemption info to RM Web UI Key: YARN-2181 URL: https://issues.apache.org/jira/browse/YARN-2181 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Affects Versions: 2.4.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, application page.png, queue page.png We need add preemption info to RM web page to make administrator/user get more understanding about preemption happened on app, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2209) Replace allocate#resync command with ApplicationMasterNotRegisteredException to indicate AM to re-register on RM restart
[ https://issues.apache.org/jira/browse/YARN-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-2209: -- Attachment: YARN-2209.1.patch Patch to replace the AM_RESYNC command with ApplicationMasterNotRegisteredException, and the AM_SHUTDOWN command with ApplicationNotFoundException. Replace allocate#resync command with ApplicationMasterNotRegisteredException to indicate AM to re-register on RM restart Key: YARN-2209 URL: https://issues.apache.org/jira/browse/YARN-2209 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Jian He Attachments: YARN-2209.1.patch YARN-1365 introduced an ApplicationMasterNotRegisteredException to indicate application to re-register on RM restart. we should do the same for AMS#allocate call also. -- This message was sent by Atlassian JIRA (v6.2#6252)
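For illustration only, a minimal sketch of how an AM-side allocate call might react once the RM signals these conditions with exceptions instead of the AM_RESYNC/AM_SHUTDOWN commands. This is not taken from YARN-2209.1.patch; the ResyncAwareAllocator class and its reRegisterWithRM(), resendOutstandingRequests(), and shutDown() hooks are hypothetical placeholders for whatever the AM already does on resync/shutdown.
{code}
import org.apache.hadoop.yarn.api.ApplicationMasterProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.exceptions.ApplicationMasterNotRegisteredException;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;

/** Sketch only; not the YARN-2209 patch. */
public abstract class ResyncAwareAllocator {
  private final ApplicationMasterProtocol rmClient;

  protected ResyncAwareAllocator(ApplicationMasterProtocol rmClient) {
    this.rmClient = rmClient;
  }

  public AllocateResponse allocateWithRecovery(AllocateRequest request) throws Exception {
    try {
      return rmClient.allocate(request);
    } catch (ApplicationMasterNotRegisteredException e) {
      // RM restarted and no longer knows this AM's registration:
      // re-register, resend all outstanding requests, then retry.
      reRegisterWithRM();
      resendOutstandingRequests();
      return rmClient.allocate(request);
    } catch (ApplicationNotFoundException e) {
      // RM does not know the application at all: shut the AM down.
      shutDown();
      throw e;
    }
  }

  // Hypothetical hooks implemented by the concrete AM.
  protected abstract void reRegisterWithRM() throws Exception;
  protected abstract void resendOutstandingRequests() throws Exception;
  protected abstract void shutDown();
}
{code}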
[jira] [Commented] (YARN-1366) AM should implement Resync with the ApplicationMasterService instead of shutting down
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047366#comment-14047366 ] Jian He commented on YARN-1366: --- Thanks for working on the patch! Some comments: - isApplicationMasterRegistered is actually not an argument; maybe throw ApplicationMasterNotRegisteredException in this case? {code} Preconditions.checkArgument(isApplicationMasterRegistered, "Application Master is trying to unregister before registering."); {code} - pom.xml format: use spaces instead of tabs {code} + <dependency> +   <groupId>org.apache.hadoop</groupId> +   <artifactId>hadoop-yarn-common</artifactId> +   <type>test-jar</type> +   <scope>test</scope> + </dependency> {code} - testAMRMClientResendsRequestsOnRMRestart does not seem to test re-sending pendingReleases across RM restart, because the pending releases appear to have already been decremented to zero before the restart happens. - Not related to this JIRA: the current ApplicationMasterService does not allow multiple registers. An application may want to update its tracking URL etc. Should we make AMS accept multiple registers? {code} Preconditions.checkArgument(!isApplicationMasterRegistered, "ApplicationMaster is already registered"); {code} AM should implement Resync with the ApplicationMasterService instead of shutting down - Key: YARN-1366 URL: https://issues.apache.org/jira/browse/YARN-1366 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Rohith Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.3.patch, YARN-1366.4.patch, YARN-1366.5.patch, YARN-1366.6.patch, YARN-1366.7.patch, YARN-1366.patch, YARN-1366.prototype.patch, YARN-1366.prototype.patch The ApplicationMasterService currently sends a resync response to which the AM responds by shutting down. The AM behavior is expected to change to calling resyncing with the RM. Resync means resetting the allocate RPC sequence number to 0 and the AM should send its entire outstanding request to the RM. Note that if the AM is making its first allocate call to the RM then things should proceed like normal without needing a resync. The RM will return all containers that have completed since the RM last synced with the AM. Some container completions may be reported more than once. -- This message was sent by Atlassian JIRA (v6.2#6252)
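A minimal sketch of the first review suggestion above (illustrative only, not the committed change): since isApplicationMasterRegistered is service state rather than a caller-supplied argument, throw the YARN exception instead of using Preconditions.checkArgument. The wrapper class and the checkRegisteredBeforeUnregister name are hypothetical.
{code}
import org.apache.hadoop.yarn.exceptions.ApplicationMasterNotRegisteredException;

final class UnregisterPrecondition {
  /**
   * Fail the unregister request with a YARN exception (rather than an
   * IllegalArgumentException from Preconditions) when the AM has not
   * registered yet.
   */
  static void checkRegisteredBeforeUnregister(boolean isApplicationMasterRegistered)
      throws ApplicationMasterNotRegisteredException {
    if (!isApplicationMasterRegistered) {
      throw new ApplicationMasterNotRegisteredException(
          "Application Master is trying to unregister before registering.");
    }
  }
}
{code}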
[jira] [Updated] (YARN-2142) Add one service to check the nodes' TRUST status
[ https://issues.apache.org/jira/browse/YARN-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anders updated YARN-2142: - Component/s: webapp Add one service to check the nodes' TRUST status - Key: YARN-2142 URL: https://issues.apache.org/jira/browse/YARN-2142 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager, scheduler, webapp Affects Versions: 2.2.0 Environment: OS:Ubuntu 13.04; JAVA:OpenJDK 7u51-2.4.4-0 Reporter: anders Priority: Minor Labels: patch Fix For: 2.2.0 Attachments: test.patch, trust.patch, trust.patch, trust.patch, trust001.patch, trust002.patch Original Estimate: 1m Remaining Estimate: 1m Because of the critical computing environment, we must check every node's TRUST status in the cluster (we can get the TRUST status from the OAT server's API), so I added this feature to Hadoop's scheduling. Through the TRUST check service, a node can get its own TRUST status and then send it to the ResourceManager via the heartbeat for scheduling. In the scheduling step, if a node's TRUST status is 'false', the node is skipped until its TRUST status turns to 'true'. The logic of this feature is similar to the node health check service. -- This message was sent by Atlassian JIRA (v6.2#6252)
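As a rough, standalone illustration of the flow described above (this is not the attached patch): a NodeManager-side checker periodically asks an OAT server for the node's TRUST status, and the scheduler skips the node while the value is false. OatClient, its getTrustStatus() method, and the heartbeat wiring are hypothetical placeholders for the actual OAT server API and NM/RM integration.
{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class TrustStatusChecker {
  private final AtomicBoolean trusted = new AtomicBoolean(true);
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  /** Hypothetical client for the OAT server's TRUST status API. */
  public interface OatClient {
    boolean getTrustStatus();
  }

  public void start(OatClient oatClient, long intervalSeconds) {
    // Periodically refresh the node's TRUST status from the OAT server,
    // analogous to the periodic node health check.
    scheduler.scheduleAtFixedRate(
        () -> trusted.set(oatClient.getTrustStatus()),
        0, intervalSeconds, TimeUnit.SECONDS);
  }

  /** Value to piggyback on the NM heartbeat; the scheduler skips the node when false. */
  public boolean isTrusted() {
    return trusted.get();
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}
{code}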
[jira] [Commented] (YARN-2181) Add preemption info to RM Web UI
[ https://issues.apache.org/jira/browse/YARN-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047384#comment-14047384 ] Hadoop QA commented on YARN-2181: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12653092/YARN-2181.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4137//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4137//console This message is automatically generated. Add preemption info to RM Web UI Key: YARN-2181 URL: https://issues.apache.org/jira/browse/YARN-2181 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Affects Versions: 2.4.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, YARN-2181.patch, application page.png, queue page.png We need add preemption info to RM web page to make administrator/user get more understanding about preemption happened on app, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2209) Replace allocate#resync command with ApplicationMasterNotRegisteredException to indicate AM to re-register on RM restart
[ https://issues.apache.org/jira/browse/YARN-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047402#comment-14047402 ] Hadoop QA commented on YARN-2209: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12653095/YARN-2209.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1261 javac compiler warnings (more than the trunk's current 1258 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4138//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/4138//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4138//console This message is automatically generated. Replace allocate#resync command with ApplicationMasterNotRegisteredException to indicate AM to re-register on RM restart Key: YARN-2209 URL: https://issues.apache.org/jira/browse/YARN-2209 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Jian He Attachments: YARN-2209.1.patch YARN-1365 introduced an ApplicationMasterNotRegisteredException to indicate application to re-register on RM restart. we should do the same for AMS#allocate call also. -- This message was sent by Atlassian JIRA (v6.2#6252)