[jira] [Updated] (YARN-1232) Configuration to support multiple RMs
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1232: --- Attachment: yarn-1232-7.patch Configuration to support multiple RMs - Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch, yarn-1232-6.patch, yarn-1232-7.patch We should augment the configuration to allow users to specify two RMs and the individual RPC addresses for each.
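For context, a minimal sketch of how such a configuration might be consumed from Java. The property names (yarn.resourcemanager.ha.rm-ids and the per-RM address keys) are illustrative assumptions, not necessarily what this patch introduces:
{code}
import org.apache.hadoop.conf.Configuration;

public class RMHAConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Hypothetical keys: a list of logical RM ids, plus one RPC address per id.
    conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
    conf.set("yarn.resourcemanager.address.rm1", "rm1.example.com:8032");
    conf.set("yarn.resourcemanager.address.rm2", "rm2.example.com:8032");

    // Clients could then resolve the address of whichever RM id is active.
    for (String id : conf.getStrings("yarn.resourcemanager.ha.rm-ids")) {
      System.out.println(id + " -> " + conf.get("yarn.resourcemanager.address." + id));
    }
  }
}
{code}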
[jira] [Commented] (YARN-1232) Configuration to support multiple RMs
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785929#comment-13785929 ] Karthik Kambatla commented on YARN-1232: s/tertiary/ternary/ in the above comment. Configuration to support multiple RMs - Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch, yarn-1232-6.patch, yarn-1232-7.patch We should augment the configuration to allow users to specify two RMs and the individual RPC addresses for each.
[jira] [Commented] (YARN-1219) FSDownload changes file suffix making FileUtil.unTar() throw exception
[ https://issues.apache.org/jira/browse/YARN-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785928#comment-13785928 ] Hudson commented on YARN-1219: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4537 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4537/]) YARN-1219. FSDownload changes file suffix making FileUtil.unTar() throw exception. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529084) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java FSDownload changes file suffix making FileUtil.unTar() throw exception -- Key: YARN-1219 URL: https://issues.apache.org/jira/browse/YARN-1219 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.1.1-beta, 2.1.2-beta Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 2.1.2-beta Attachments: YARN-1219.patch While running a Hive join operation on YARN, I saw the exception described below. This is caused by FSDownload copying the files into a temp file and changing the suffix to .tmp before unpacking. In unpack(), it uses FileUtil.unTar(), which determines whether the file is gzipped by looking at the file suffix: {code} boolean gzipped = inFile.toString().endsWith("gz"); {code} To fix this problem, we can remove the .tmp from the temp file name. Here is the detailed exception: org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:240) at org.apache.hadoop.fs.FileUtil.unTarUsingJava(FileUtil.java:676) at org.apache.hadoop.fs.FileUtil.unTar(FileUtil.java:625) at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:203) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:287) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722)
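To illustrate why the .tmp suffix breaks untarring, here is a self-contained sketch of the suffix check quoted above. The real logic lives in FileUtil.unTar(); this is only a rough reconstruction under that assumption:
{code}
public class SuffixCheckSketch {
  // Mirrors the quoted check: gzip detection is purely suffix-based.
  static boolean looksGzipped(String path) {
    return path.endsWith("gz");
  }

  public static void main(String[] args) {
    // The localized resource really is a gzipped tar...
    System.out.println(looksGzipped("/tmp/foo.tar.gz"));      // true
    // ...but FSDownload renamed it with a .tmp suffix before unpacking,
    // so unTar() treats it as an uncompressed tar and fails.
    System.out.println(looksGzipped("/tmp/foo.tar.gz.tmp"));  // false
  }
}
{code}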
[jira] [Commented] (YARN-1232) Configuration to support multiple RMs
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785931#comment-13785931 ] Hadoop QA commented on YARN-1232: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606732/yarn-1232-7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2092//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2092//console This message is automatically generated. Configuration to support multiple RMs - Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch, yarn-1232-6.patch, yarn-1232-7.patch We should augment the configuration to allow users to specify two RMs and the individual RPC addresses for each.
[jira] [Resolved] (YARN-1219) FSDownload changes file suffix making FileUtil.unTar() throw exception
[ https://issues.apache.org/jira/browse/YARN-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved YARN-1219. - Resolution: Fixed I've committed this to trunk, branch-2, and branch-2.1-beta. Shanyu, thank you for the patch. Omkar, thank you for help with the code review. FSDownload changes file suffix making FileUtil.unTar() throw exception -- Key: YARN-1219 URL: https://issues.apache.org/jira/browse/YARN-1219 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.1.1-beta, 2.1.2-beta Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 2.1.2-beta Attachments: YARN-1219.patch While running a Hive join operation on YARN, I saw the exception described below. This is caused by FSDownload copying the files into a temp file and changing the suffix to .tmp before unpacking. In unpack(), it uses FileUtil.unTar(), which determines whether the file is gzipped by looking at the file suffix: {code} boolean gzipped = inFile.toString().endsWith("gz"); {code} To fix this problem, we can remove the .tmp from the temp file name. Here is the detailed exception: org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:240) at org.apache.hadoop.fs.FileUtil.unTarUsingJava(FileUtil.java:676) at org.apache.hadoop.fs.FileUtil.unTar(FileUtil.java:625) at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:203) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:287) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722)
[jira] [Updated] (YARN-7) Add support for DistributedShell to ask for CPUs along with memory
[ https://issues.apache.org/jira/browse/YARN-7?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-7: -- Attachment: YARN-7-v3.patch Synced up the patch and removed unnecessary whitespace/tab changes. Add support for DistributedShell to ask for CPUs along with memory -- Key: YARN-7 URL: https://issues.apache.org/jira/browse/YARN-7 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.3-alpha Reporter: Arun C Murthy Assignee: Junping Du Labels: patch Attachments: YARN-7.patch, YARN-7-v2.patch, YARN-7-v3.patch
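As a rough sketch of what asking for CPUs alongside memory looks like against the public YARN records API; the exact wiring inside DistributedShell may differ from this:
{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class CpuRequestSketch {
  public static ContainerRequest buildRequest(int memoryMb, int vCores) {
    // The capability now carries both dimensions instead of memory alone.
    Resource capability = Resource.newInstance(memoryMb, vCores);
    Priority priority = Priority.newInstance(0);
    // nodes/racks left null: no locality constraint.
    return new ContainerRequest(capability, null, null, priority);
  }
}
{code}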
[jira] [Commented] (YARN-7) Add support for DistributedShell to ask for CPUs along with memory
[ https://issues.apache.org/jira/browse/YARN-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785988#comment-13785988 ] Hadoop QA commented on YARN-7: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606739/YARN-7-v3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2094//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2094//console This message is automatically generated. Add support for DistributedShell to ask for CPUs along with memory -- Key: YARN-7 URL: https://issues.apache.org/jira/browse/YARN-7 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.3-alpha Reporter: Arun C Murthy Assignee: Junping Du Labels: patch Attachments: YARN-7.patch, YARN-7-v2.patch, YARN-7-v3.patch
[jira] [Updated] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1167: Attachment: YARN-1167.5.patch Got lucky to pass the test on my local machine. --- T E S T S --- Running org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 32.464 sec - in org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell Results : Tests run: 2, Failures: 0, Errors: 0, Skipped: 0 Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch Submit a distributed shell application. Once the application reaches the RUNNING state, the app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED,
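For reference, the appMasterHost in the application report comes from AM registration. A minimal sketch, assuming the standard registerApplicationMaster call is where an empty host string would originate:
{code}
import java.net.InetAddress;
import org.apache.hadoop.yarn.client.api.AMRMClient;

public class AmRegisterSketch {
  public static void register(AMRMClient<AMRMClient.ContainerRequest> client)
      throws Exception {
    // If the AM registers with an empty host string, the RM has nothing
    // better to report, and appMasterHost shows up empty in the client.
    String host = InetAddress.getLocalHost().getHostName();
    client.registerApplicationMaster(host, /* rpcPort */ 0, /* trackingUrl */ "");
  }
}
{code}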
[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1166: -- Attachment: YARN-1166.4.patch [~ajisakaa], thanks! I've uploaded a new patch to do the following two things: 1. Change appsFailed to be a counter 2. Expose RMContext to AppSchedulerInfo, such that QueueMetrics can use the app info to determine whether it is the last attempt or not. The counter only increases at the last attempt. Modified the test cases to verify the logic. It's a compromise to do the trick here. I considered correcting the logic to only increment on the last attempt's failure, but that turns out to require a lot of changes along the path of the APP_REMOVE event from RMApp/RMAppAttempt to QueueMetrics. IMHO, I'm conservative about making that kind of change when release 2.2.0 is coming. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge', which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter', meaning Ganglia will use slope to provide deltas between time-points. To be consistent, the AppsFailed metric should also be of type 'counter'.
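The gauge-versus-counter distinction in Hadoop's metrics2 library, as a hedged sketch; the field names are illustrative and QueueMetrics' actual declarations may differ:
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterInt;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

@Metrics(context = "yarn")
public class QueueMetricsSketch {
  // A counter is monotonically increasing, so tools like Ganglia can
  // derive rates/deltas between time points.
  @Metric("# of failed apps") MutableCounterInt appsFailed;

  // A gauge reports an exact current value and may go up or down;
  // using it for a cumulative metric is what this JIRA fixes.
  @Metric("# of pending apps") MutableGaugeInt appsPending;

  public void onAppFailed(boolean isLastAttempt) {
    // Per the comment above: count the app once, on its last attempt only.
    if (isLastAttempt) {
      appsFailed.incr();
    }
  }
}
{code}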
[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786015#comment-13786015 ] Hadoop QA commented on YARN-1167: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606760/YARN-1167.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2096//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2096//console This message is automatically generated. Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch Submit a distributed shell application. Once the application reaches the RUNNING state, the app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED,
[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786020#comment-13786020 ] Hadoop QA commented on YARN-1166: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606761/YARN-1166.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2095//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2095//console This message is automatically generated. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge', which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter', meaning Ganglia will use slope to provide deltas between time-points. To be consistent, the AppsFailed metric should also be of type 'counter'.
[jira] [Assigned] (YARN-1197) Support changing resources of an allocated container
[ https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-1197: Assignee: Wangda Tan Support changing resources of an allocated container Key: YARN-1197 URL: https://issues.apache.org/jira/browse/YARN-1197 Project: Hadoop YARN Issue Type: Task Components: api, nodemanager, resourcemanager Affects Versions: 2.1.0-beta Reporter: Wangda Tan Assignee: Wangda Tan Attachments: yarn-1197.pdf, yarn-1197-v2.pdf Currently, YARN cannot merge several containers on one node into a bigger container, which would let us incrementally ask for resources, merge them into a bigger one, and launch our processes. The user scenario is described in the comments.
[jira] [Updated] (YARN-1197) Support changing resources of an allocated container
[ https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-1197: - Attachment: yarn-1197-v2.pdf Guys, I attached an updated doc based on our discussion, mainly focused on the workflow diagram and detailed API changes; many thanks to [~bikassaha], [~sandyr] and [~tucu00]. Hope to get your feedback. I'll start working on it. Support changing resources of an allocated container Key: YARN-1197 URL: https://issues.apache.org/jira/browse/YARN-1197 Project: Hadoop YARN Issue Type: Task Components: api, nodemanager, resourcemanager Affects Versions: 2.1.0-beta Reporter: Wangda Tan Attachments: yarn-1197.pdf, yarn-1197-v2.pdf Currently, YARN cannot merge several containers on one node into a bigger container, which would let us incrementally ask for resources, merge them into a bigger one, and launch our processes. The user scenario is described in the comments.
[jira] [Commented] (YARN-621) RM triggers web auth failure before first job
[ https://issues.apache.org/jira/browse/YARN-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786060#comment-13786060 ] Hudson commented on YARN-621: - FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) YARN-621. Changed YARN web app to not add paths that can cause duplicate additions of authenticated filters, thereby causing kerberos replay errors. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529030) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java RM triggers web auth failure before first job - Key: YARN-621 URL: https://issues.apache.org/jira/browse/YARN-621 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.4-alpha Reporter: Allen Wittenauer Assignee: Omkar Vinit Joshi Priority: Critical Fix For: 2.1.2-beta Attachments: YARN-621.20131001.1.patch On a secure YARN setup, before the first job is executed, going to the web interface of the resource manager triggers authentication errors.
[jira] [Commented] (YARN-677) Increase coverage to FairScheduler
[ https://issues.apache.org/jira/browse/YARN-677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786057#comment-13786057 ] Hudson commented on YARN-677: - FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) Revert YARN-677. Increase coverage to FairScheduler (Vadim Bondarev and Dennis Y via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528914) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java Increase coverage to FairScheduler -- Key: YARN-677 URL: https://issues.apache.org/jira/browse/YARN-677 Project: Hadoop YARN Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Assignee: Andrey Klochkov Attachments: HADOOP-4536-branch-2-a.patch, HADOOP-4536-branch-2c.patch, HADOOP-4536-trunk-a.patch, HADOOP-4536-trunk-c.patch, HDFS-4536-branch-2--N7.patch, HDFS-4536-branch-2--N8.patch, HDFS-4536-branch-2-N9.patch, HDFS-4536-trunk--N6.patch, HDFS-4536-trunk--N7.patch, HDFS-4536-trunk--N8.patch, HDFS-4536-trunk-N9.patch
[jira] [Commented] (YARN-1236) FairScheduler setting queue name in RMApp is not working
[ https://issues.apache.org/jira/browse/YARN-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786061#comment-13786061 ] Hudson commented on YARN-1236: -- FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) YARN-1236. FairScheduler setting queue name in RMApp is not working. (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529034) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java FairScheduler setting queue name in RMApp is not working - Key: YARN-1236 URL: https://issues.apache.org/jira/browse/YARN-1236 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.2-beta Attachments: YARN-1236.patch The fair scheduler sometimes picks a different queue than the one an application was submitted to, such as when user-as-default-queue is turned on. It needs to update the queue name in the RMApp so that this choice will be reflected in the UI. This isn't working because the scheduler is looking up the RMApp by application attempt id instead of app id and failing to find it.
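The root cause is a map keyed by ApplicationId being probed with an ApplicationAttemptId. A hedged sketch of the lookup and the likely shape of the fix (the map here is a stand-in for the RMContext's app table):
{code}
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ApplicationId;

public class QueueLookupSketch {
  // rmApps is keyed by ApplicationId, so probing it with an attempt id
  // (or a string built from one) never matches.
  public static <V> V lookup(Map<ApplicationId, V> rmApps,
                             ApplicationAttemptId attemptId) {
    // Fix: derive the application id from the attempt id first.
    ApplicationId appId = attemptId.getApplicationId();
    return rmApps.get(appId);
  }
}
{code}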
[jira] [Commented] (YARN-1199) Make NM/RM Versions Available
[ https://issues.apache.org/jira/browse/YARN-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786055#comment-13786055 ] Hudson commented on YARN-1199: -- FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) YARN-1199. Make NM/RM Versions Available (Mit Desai via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529003) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestRMNMInfo.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMNMInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodeInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java Make NM/RM Versions Available - Key: YARN-1199 URL: https://issues.apache.org/jira/browse/YARN-1199 Project: Hadoop YARN Issue Type: Improvement Reporter: Mit Desai Assignee: Mit Desai Fix For: 3.0.0, 2.3.0 Attachments: YARN-1199.patch, YARN-1199.patch, YARN-1199.patch, YARN-1199.patch Now that we have the NM and RM versions available, we can display the YARN version of the nodes running in the cluster.
[jira] [Commented] (YARN-1219) FSDownload changes file suffix making FileUtil.unTar() throw exception
[ https://issues.apache.org/jira/browse/YARN-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786059#comment-13786059 ] Hudson commented on YARN-1219: -- FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) YARN-1219. FSDownload changes file suffix making FileUtil.unTar() throw exception. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529084) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java FSDownload changes file suffix making FileUtil.unTar() throw exception -- Key: YARN-1219 URL: https://issues.apache.org/jira/browse/YARN-1219 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.1.1-beta, 2.1.2-beta Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 2.1.2-beta Attachments: YARN-1219.patch While running a Hive join operation on YARN, I saw the exception described below. This is caused by FSDownload copying the files into a temp file and changing the suffix to .tmp before unpacking. In unpack(), it uses FileUtil.unTar(), which determines whether the file is gzipped by looking at the file suffix: {code} boolean gzipped = inFile.toString().endsWith("gz"); {code} To fix this problem, we can remove the .tmp from the temp file name. Here is the detailed exception: org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:240) at org.apache.hadoop.fs.FileUtil.unTarUsingJava(FileUtil.java:676) at org.apache.hadoop.fs.FileUtil.unTar(FileUtil.java:625) at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:203) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:287) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722)
[jira] [Commented] (YARN-890) The roundup for memory values on resource manager UI is misleading
[ https://issues.apache.org/jira/browse/YARN-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786064#comment-13786064 ] Hudson commented on YARN-890: - FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) YARN-890. Ensure CapacityScheduler doesn't round-up metric for available resources. Contributed by Xuan Gong & Hitesh Shah. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529015) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java The roundup for memory values on resource manager UI is misleading -- Key: YARN-890 URL: https://issues.apache.org/jira/browse/YARN-890 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Trupti Dhavle Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: Screen Shot 2013-07-10 at 10.43.34 AM.png, YARN-890.1.patch, YARN-890.2.patch From the yarn-site.xml, I see the following values:
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4192</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4192</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
However the resourcemanager UI shows total memory as 5MB
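A worked example of the misleading round-up, assuming the UI rounds node memory up to whole gigabytes; 4192 MB becomes 5, which the page then renders with the wrong unit:
{code}
public class RoundUpSketch {
  public static void main(String[] args) {
    int nodeMemoryMb = 4192;           // yarn.nodemanager.resource.memory-mb
    // Ceiling division: 4192 MB / 1024 rounds up to 5 "GB".
    int roundedGb = (nodeMemoryMb + 1024 - 1) / 1024;
    System.out.println(roundedGb);     // prints 5
  }
}
{code}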
[jira] [Commented] (YARN-1256) NM silently ignores non-existent service in StartContainerRequest
[ https://issues.apache.org/jira/browse/YARN-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786066#comment-13786066 ] Hudson commented on YARN-1256: -- FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) Addendum for missing file YARN-1256. NM silently ignores non-existent service in StartContainerRequest (Xuan Gong via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529048) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/InvalidAuxServiceException.java YARN-1256. NM silently ignores non-existent service in StartContainerRequest (Xuan Gong via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529039) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/AuxiliaryServiceHelper.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java NM silently ignores non-existent service in StartContainerRequest - Key: YARN-1256 URL: https://issues.apache.org/jira/browse/YARN-1256 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.1-beta Reporter: Bikas Saha Assignee: Xuan Gong Priority: Critical Fix For: 2.1.2-beta Attachments: YARN-1256.1.patch, YARN-1256.2.patch, YARN-1256.3.patch, YARN-1256.4.patch, YARN-1256.5.patch A container can set token service metadata for a service, say shuffle_service. If that service does not exist, then the error is silently ignored. Later, when the next container wants to access data written to shuffle_service by the first task, it fails because the service does not have the token that was supposed to be set by the first task.
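For context, service data is attached to containers through the public ContainerLaunchContext API; the service name below matches the example in the description, and the failure behavior noted in the comment is what the committed InvalidAuxServiceException enables. A hedged sketch:
{code}
import java.nio.ByteBuffer;
import java.util.Collections;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

public class ServiceDataSketch {
  public static void attachShuffleToken(ContainerLaunchContext ctx,
                                        ByteBuffer serviceToken) {
    // If "shuffle_service" is not configured as an aux service on the NM,
    // the NM used to drop this entry silently; after this fix the
    // StartContainerRequest should fail instead.
    ctx.setServiceData(Collections.singletonMap("shuffle_service", serviceToken));
  }
}
{code}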
[jira] [Commented] (YARN-1131) $yarn logs command should return an appropriate error message if YARN application is still running
[ https://issues.apache.org/jira/browse/YARN-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786058#comment-13786058 ] Hudson commented on YARN-1131: -- FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) YARN-1131. logs command should return an appropriate error message if YARN application is still running. Contributed by Siddharth Seth. (hitesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529068) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/tools/CLI.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/LogsCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestLogsCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogDumper.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestLogDumper.java $yarn logs command should return an appropriate error message if YARN application is still running -- Key: YARN-1131 URL: https://issues.apache.org/jira/browse/YARN-1131 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.1.1-beta Reporter: Tassapol Athiapinya Assignee: Siddharth Seth Priority: Minor Fix For: 2.1.2-beta Attachments: YARN-1131.1.txt, YARN-1131.2.txt, YARN-1131.3.txt In the case when log aggregation is enabled, if a user submits a MapReduce job and runs $ yarn logs -applicationId <app ID> while the YARN application is running, the command will return no message and return the user back to the shell. It would be nice to tell the user that log aggregation is in progress. {code} -bash-4.1$ /usr/bin/yarn logs -applicationId application_1377900193583_0002 -bash-4.1$ {code} At the same time, if an invalid application ID is given, the YARN CLI should say that the application ID is incorrect rather than throwing NoSuchElementException. {code} $ /usr/bin/yarn logs -applicationId application_0 Exception in thread "main" java.util.NoSuchElementException at com.google.common.base.AbstractIterator.next(AbstractIterator.java:75) at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:124) at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:119) at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:110) at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:255) {code}
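The NoSuchElementException comes from parsing an application id with too few underscore-separated fields. A defensive sketch of the validation the CLI could perform; ConverterUtils.toApplicationId is the real parser, while the wrapper here is illustrative:
{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class AppIdParseSketch {
  public static ApplicationId parseOrExplain(String arg) {
    try {
      return ConverterUtils.toApplicationId(arg);
    } catch (Exception e) {
      // e.g. "application_0" has too few fields and used to surface as a
      // raw NoSuchElementException from the iterator inside the parser.
      throw new IllegalArgumentException("Invalid ApplicationId: " + arg, e);
    }
  }
}
{code}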
[jira] [Commented] (YARN-1271) Text file busy errors launching containers again
[ https://issues.apache.org/jira/browse/YARN-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786062#comment-13786062 ] Hudson commented on YARN-1271: -- FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) YARN-1271. Text file busy errors launching containers again (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529058) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java Text file busy errors launching containers again -- Key: YARN-1271 URL: https://issues.apache.org/jira/browse/YARN-1271 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.2-beta Attachments: YARN-1271.patch The error is shown below in the comments. MAPREDUCE-2374 fixed this by removing -c when running the container launch script. It looks like the -c got brought back during the windows branch merge, so we should remove it again.
[jira] [Commented] (YARN-1149) NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING
[ https://issues.apache.org/jira/browse/YARN-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786065#comment-13786065 ] Hudson commented on YARN-1149: -- FAILURE: Integrated in Hadoop-Yarn-trunk #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/352/]) YARN-1149. NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING. Contributed by Xuan Gong. (hitesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529043) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/CMgrCompletedAppsEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/CMgrCompletedContainersEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerReboot.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/TestApplication.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING - Key: YARN-1149 URL: https://issues.apache.org/jira/browse/YARN-1149 Project: Hadoop YARN Issue Type: Bug Reporter: Ramya Sunil Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1149.1.patch, YARN-1149.2.patch, YARN-1149.3.patch, YARN-1149.4.patch, YARN-1149.5.patch, YARN-1149.6.patch, YARN-1149.7.patch, YARN-1149.8.patch, YARN-1149.9.patch, YARN-1149_branch-2.1-beta.1.patch When the nodemanager receives a kill signal after an application has finished execution but log aggregation has not kicked in, InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING is thrown {noformat} 2013-08-25 20:45:00,875 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(254)) - Application just finished : application_1377459190746_0118 2013-08-25 20:45:00,876 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:uploadLogsForContainer(105)) - Starting aggregate log-file for app application_1377459190746_0118 at /app-logs/foo/logs/application_1377459190746_0118/host_45454.tmp 2013-08-25 20:45:00,876 INFO
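Hadoop's NM state machines are built with StateMachineFactory, and the usual fix for an InvalidStateTransitonException is to register the missing (state, event) pair. A hedged sketch with simplified enum names, not the actual ApplicationImpl code:
{code}
import org.apache.hadoop.yarn.state.SingleArcTransition;
import org.apache.hadoop.yarn.state.StateMachineFactory;

public class AppStateSketch {
  enum AppState { RUNNING, FINISHED }
  enum AppEventType { APPLICATION_LOG_HANDLING_FINISHED }
  static class AppEvent { }
  static class App { }

  // Without a transition for (RUNNING, APPLICATION_LOG_HANDLING_FINISHED),
  // delivering that event at RUNNING throws InvalidStateTransitonException.
  static final StateMachineFactory<App, AppState, AppEventType, AppEvent>
      FACTORY = new StateMachineFactory<App, AppState, AppEventType, AppEvent>(
          AppState.RUNNING)
          .addTransition(AppState.RUNNING, AppState.RUNNING,
              AppEventType.APPLICATION_LOG_HANDLING_FINISHED,
              new SingleArcTransition<App, AppEvent>() {
                @Override
                public void transition(App app, AppEvent event) {
                  // Tolerate the event, e.g. when a kill races log aggregation.
                }
              })
          .installTopology();
}
{code}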
[jira] [Commented] (YARN-677) Increase coverage to FairScheduler
[ https://issues.apache.org/jira/browse/YARN-677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786135#comment-13786135 ] Hudson commented on YARN-677: - SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) Revert YARN-677. Increase coverage to FairScheduler (Vadim Bondarev and Dennis Y via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528914) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java Increase coverage to FairScheduler -- Key: YARN-677 URL: https://issues.apache.org/jira/browse/YARN-677 Project: Hadoop YARN Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Assignee: Andrey Klochkov Attachments: HADOOP-4536-branch-2-a.patch, HADOOP-4536-branch-2c.patch, HADOOP-4536-trunk-a.patch, HADOOP-4536-trunk-c.patch, HDFS-4536-branch-2--N7.patch, HDFS-4536-branch-2--N8.patch, HDFS-4536-branch-2-N9.patch, HDFS-4536-trunk--N6.patch, HDFS-4536-trunk--N7.patch, HDFS-4536-trunk--N8.patch, HDFS-4536-trunk-N9.patch
[jira] [Commented] (YARN-1131) $yarn logs command should return an appropriate error message if YARN application is still running
[ https://issues.apache.org/jira/browse/YARN-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786136#comment-13786136 ] Hudson commented on YARN-1131: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) YARN-1131. logs command should return an appropriate error message if YARN application is still running. Contributed by Siddharth Seth. (hitesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529068) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/tools/CLI.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/LogsCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestLogsCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogDumper.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestLogDumper.java $yarn logs command should return an appropriate error message if YARN application is still running -- Key: YARN-1131 URL: https://issues.apache.org/jira/browse/YARN-1131 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.1.1-beta Reporter: Tassapol Athiapinya Assignee: Siddharth Seth Priority: Minor Fix For: 2.1.2-beta Attachments: YARN-1131.1.txt, YARN-1131.2.txt, YARN-1131.3.txt In the case when log aggregation is enabled, if a user submits a MapReduce job and runs $ yarn logs -applicationId <app ID> while the YARN application is running, the command will return no message and return the user back to the shell. It would be nice to tell the user that log aggregation is in progress. {code} -bash-4.1$ /usr/bin/yarn logs -applicationId application_1377900193583_0002 -bash-4.1$ {code} At the same time, if an invalid application ID is given, the YARN CLI should say that the application ID is incorrect rather than throwing NoSuchElementException. {code} $ /usr/bin/yarn logs -applicationId application_0 Exception in thread "main" java.util.NoSuchElementException at com.google.common.base.AbstractIterator.next(AbstractIterator.java:75) at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:124) at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:119) at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:110) at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:255) {code}
[jira] [Commented] (YARN-621) RM triggers web auth failure before first job
[ https://issues.apache.org/jira/browse/YARN-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786138#comment-13786138 ] Hudson commented on YARN-621: - SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) YARN-621. Changed YARN web app to not add paths that can cause duplicate additions of authenticated filters, thereby causing kerberos replay errors. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529030) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java RM triggers web auth failure before first job - Key: YARN-621 URL: https://issues.apache.org/jira/browse/YARN-621 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.4-alpha Reporter: Allen Wittenauer Assignee: Omkar Vinit Joshi Priority: Critical Fix For: 2.1.2-beta Attachments: YARN-621.20131001.1.patch On a secure YARN setup, before the first job is executed, going to the web interface of the resource manager triggers authentication errors.
[jira] [Commented] (YARN-1219) FSDownload changes file suffix making FileUtil.unTar() throw exception
[ https://issues.apache.org/jira/browse/YARN-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786137#comment-13786137 ] Hudson commented on YARN-1219: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) YARN-1219. FSDownload changes file suffix making FileUtil.unTar() throw exception. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529084) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java FSDownload changes file suffix making FileUtil.unTar() throw exception -- Key: YARN-1219 URL: https://issues.apache.org/jira/browse/YARN-1219 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.1.1-beta, 2.1.2-beta Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 2.1.2-beta Attachments: YARN-1219.patch While running a Hive join operation on YARN, I saw the exception described below. This is caused by FSDownload copying the files into a temp file and changing the suffix to .tmp before unpacking. In unpack(), it uses FileUtil.unTar(), which determines whether the file is gzipped by looking at the file suffix: {code} boolean gzipped = inFile.toString().endsWith("gz"); {code} To fix this problem, we can remove the .tmp from the temp file name. Here is the detailed exception: org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:240) at org.apache.hadoop.fs.FileUtil.unTarUsingJava(FileUtil.java:676) at org.apache.hadoop.fs.FileUtil.unTar(FileUtil.java:625) at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:203) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:287) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722)
[jira] [Commented] (YARN-1256) NM silently ignores non-existent service in StartContainerRequest
[ https://issues.apache.org/jira/browse/YARN-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786144#comment-13786144 ] Hudson commented on YARN-1256: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) Addendum for missing file YARN-1256. NM silently ignores non-existent service in StartContainerRequest (Xuan Gong via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529048) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/InvalidAuxServiceException.java YARN-1256. NM silently ignores non-existent service in StartContainerRequest (Xuan Gong via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529039) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/AuxiliaryServiceHelper.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java NM silently ignores non-existent service in StartContainerRequest - Key: YARN-1256 URL: https://issues.apache.org/jira/browse/YARN-1256 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.1-beta Reporter: Bikas Saha Assignee: Xuan Gong Priority: Critical Fix For: 2.1.2-beta Attachments: YARN-1256.1.patch, YARN-1256.2.patch, YARN-1256.3.patch, YARN-1256.4.patch, YARN-1256.5.patch A container can set token service metadata for a service, say shuffle_service. If that service does not exist, then the error is silently ignored. Later, when the next container wants to access data written to shuffle_service by the first task, it fails because the service does not have the token that was supposed to be set by the first task.
[jira] [Commented] (YARN-890) The roundup for memory values on resource manager UI is misleading
[ https://issues.apache.org/jira/browse/YARN-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786142#comment-13786142 ] Hudson commented on YARN-890: - SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) YARN-890. Ensure CapacityScheduler doesn't round-up metric for available resources. Contributed by Xuan Gong & Hitesh Shah. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529015) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java The roundup for memory values on resource manager UI is misleading -- Key: YARN-890 URL: https://issues.apache.org/jira/browse/YARN-890 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Trupti Dhavle Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: Screen Shot 2013-07-10 at 10.43.34 AM.png, YARN-890.1.patch, YARN-890.2.patch From the yarn-site.xml, I see the following values: {code} <property> <name>yarn.nodemanager.resource.memory-mb</name> <value>4192</value> </property> <property> <name>yarn.scheduler.maximum-allocation-mb</name> <value>4192</value> </property> <property> <name>yarn.scheduler.minimum-allocation-mb</name> <value>1024</value> </property> {code} However the resourcemanager UI shows total memory as 5MB -- This message was sent by Atlassian JIRA (v6.1#6144)
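The numbers in the report line up with the scheduler normalizing node memory up to a multiple of the minimum allocation before aggregating. A back-of-the-envelope sketch of that arithmetic; the round-up rule is an assumption for illustration, and "5 GB" is presumably what the description's "5MB" intends:
{code}
// Illustrative arithmetic only; RoundUpDemo is not YARN code.
public class RoundUpDemo {
    // Ceiling of value to the nearest multiple of step.
    static int roundUp(int value, int step) {
        return ((value + step - 1) / step) * step;
    }

    public static void main(String[] args) {
        int nodeMemoryMb = 4192; // yarn.nodemanager.resource.memory-mb
        int minAllocMb = 1024;   // yarn.scheduler.minimum-allocation-mb
        // 4192 MB rounds up to 5120 MB, which a UI renders as "5 GB":
        // more memory than the node actually has, hence "misleading".
        System.out.println(roundUp(nodeMemoryMb, minAllocMb) + " MB");
    }
}
{code}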
[jira] [Commented] (YARN-1149) NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING
[ https://issues.apache.org/jira/browse/YARN-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786143#comment-13786143 ] Hudson commented on YARN-1149: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) YARN-1149. NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING. Contributed by Xuan Gong. (hitesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1529043) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/CMgrCompletedAppsEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/CMgrCompletedContainersEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerReboot.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/TestApplication.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java NM throws InvalidStateTransitonException: Invalid event: 
APPLICATION_LOG_HANDLING_FINISHED at RUNNING - Key: YARN-1149 URL: https://issues.apache.org/jira/browse/YARN-1149 Project: Hadoop YARN Issue Type: Bug Reporter: Ramya Sunil Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1149.1.patch, YARN-1149.2.patch, YARN-1149.3.patch, YARN-1149.4.patch, YARN-1149.5.patch, YARN-1149.6.patch, YARN-1149.7.patch, YARN-1149.8.patch, YARN-1149.9.patch, YARN-1149_branch-2.1-beta.1.patch When nodemanager receives a kill signal when an application has finished execution but log aggregation has not kicked in, InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING is thrown {noformat} 2013-08-25 20:45:00,875 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(254)) - Application just finished : application_1377459190746_0118 2013-08-25 20:45:00,876 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:uploadLogsForContainer(105)) - Starting aggregate log-file for app application_1377459190746_0118 at /app-logs/foo/logs/application_1377459190746_0118/host_45454.tmp 2013-08-25 20:45:00,876 INFO
[jira] [Updated] (YARN-1251) TestDistributedShell#TestDSShell failed with timeout
[ https://issues.apache.org/jira/browse/YARN-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1251: - Attachment: YARN-1225-kickOffTestDS.patch The test fails on Jenkins. Attaching a test patch to kick off the test and reproduce the failure, although I cannot reproduce it locally. TestDistributedShell#TestDSShell failed with timeout Key: YARN-1251 URL: https://issues.apache.org/jira/browse/YARN-1251 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Junping Du Attachments: YARN-1225-kickOffTestDS.patch The stacktrace: {code} java.lang.Exception: test timed out after 90000 milliseconds at com.google.protobuf.LiteralByteString.<init>(LiteralByteString.java:234) at com.google.protobuf.ByteString.copyFromUtf8(ByteString.java:255) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getMethodNameBytes(ProtobufRpcEngineProtos.java:286) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getSerializedSize(ProtobufRpcEngineProtos.java:462) at com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:84) at org.apache.hadoop.ipc.ProtobufRpcEngine$RpcMessageWithHeader.write(ProtobufRpcEngine.java:302) at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:989) at org.apache.hadoop.ipc.Client.call(Client.java:1377) at org.apache.hadoop.ipc.Client.call(Client.java:1357) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at $Proxy70.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137) at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at $Proxy71.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195) at org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:622) at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:597) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:125) {code} For details, please refer to: https://builds.apache.org/job/PreCommit-YARN-Build/2039//testReport/ -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1236) FairScheduler setting queue name in RMApp is not working
[ https://issues.apache.org/jira/browse/YARN-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786139#comment-13786139 ] Hudson commented on YARN-1236: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) YARN-1236. FairScheduler setting queue name in RMApp is not working. (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1529034) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java FairScheduler setting queue name in RMApp is not working - Key: YARN-1236 URL: https://issues.apache.org/jira/browse/YARN-1236 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.2-beta Attachments: YARN-1236.patch The fair scheduler sometimes picks a different queue than the one an application was submitted to, such as when user-as-default-queue is turned on. It needs to update the queue name in the RMApp so that this choice will be reflected in the UI. This isn't working because the scheduler is looking up the RMApp by application attempt id instead of app id and failing to find it. -- This message was sent by Atlassian JIRA (v6.1#6144)
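The bug described above boils down to querying a map with the wrong kind of identifier. An illustrative sketch, with plain strings standing in for ApplicationId and ApplicationAttemptId; this is not the FairScheduler code itself:
{code}
import java.util.HashMap;
import java.util.Map;

// Sketch of the lookup mismatch: the RMApp table is keyed by application id,
// so a query by attempt id always misses and the queue name is never updated.
public class RmAppLookupDemo {
    public static void main(String[] args) {
        Map<String, String> rmApps = new HashMap<>(); // appId -> queue name
        rmApps.put("application_1380000000000_0001", "default");

        String attemptId = "appattempt_1380000000000_0001_000001";
        String appId = "application_1380000000000_0001";

        System.out.println(rmApps.get(attemptId)); // null -> the reported bug
        System.out.println(rmApps.get(appId));     // "default" -> intended lookup
    }
}
{code}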
[jira] [Commented] (YARN-1199) Make NM/RM Versions Available
[ https://issues.apache.org/jira/browse/YARN-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786133#comment-13786133 ] Hudson commented on YARN-1199: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) YARN-1199. Make NM/RM Versions Available (Mit Desai via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529003) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestRMNMInfo.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMNMInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodeInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java Make NM/RM Versions Available - Key: YARN-1199 URL: https://issues.apache.org/jira/browse/YARN-1199 Project: Hadoop YARN Issue Type: Improvement Reporter: Mit Desai Assignee: Mit Desai Fix For: 3.0.0, 2.3.0 Attachments: YARN-1199.patch, YARN-1199.patch, YARN-1199.patch, YARN-1199.patch Now that we have the NM and RM versions available, we can display the YARN version of the nodes running in the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1271) Text file busy errors launching containers again
[ https://issues.apache.org/jira/browse/YARN-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786140#comment-13786140 ] Hudson commented on YARN-1271: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1542 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1542/]) YARN-1271. Text file busy errors launching containers again (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1529058) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java Text file busy errors launching containers again -- Key: YARN-1271 URL: https://issues.apache.org/jira/browse/YARN-1271 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.2-beta Attachments: YARN-1271.patch The error is shown below in the comments. MAPREDUCE-2374 fixed this by removing -c when running the container launch script. It looks like the -c got brought back during the windows branch merge, so we should remove it again. -- This message was sent by Atlassian JIRA (v6.1#6144)
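For readers unfamiliar with the "-c" distinction the description relies on: with "bash -c <path>" the shell treats the path as a command and exec()s the script file itself, which can fail with ETXTBSY ("text file busy") if the NodeManager still has the file open for writing; "bash <path>" merely opens and interprets it. A sketch of the two argument vectors, with a hypothetical script path; nothing is executed here:
{code}
import java.util.Arrays;

// Contrast of the two launch forms discussed above.
public class LaunchFormDemo {
    public static void main(String[] args) {
        String script = "/tmp/launch_container.sh"; // illustrative path
        // Problematic form: the script file is exec()'d as a program.
        System.out.println(Arrays.asList("bash", "-c", script));
        // Fixed form: the shell reads the file; no exec of the file itself.
        System.out.println(Arrays.asList("bash", script));
    }
}
{code}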
[jira] [Commented] (YARN-7) Add support for DistributedShell to ask for CPUs along with memory
[ https://issues.apache.org/jira/browse/YARN-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786146#comment-13786146 ] Junping Du commented on YARN-7: --- The test failure is unrelated, as YARN-1251 shows. Add support for DistributedShell to ask for CPUs along with memory -- Key: YARN-7 URL: https://issues.apache.org/jira/browse/YARN-7 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.3-alpha Reporter: Arun C Murthy Assignee: Junping Du Labels: patch Attachments: YARN-7.patch, YARN-7-v2.patch, YARN-7-v3.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1251) TestDistributedShell#TestDSShell failed with timeout
[ https://issues.apache.org/jira/browse/YARN-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1251: - Description: TestDistributedShell#TestDSShell has been failing consistently on trunk Jenkins recently. The stacktrace is: {code} java.lang.Exception: test timed out after 90000 milliseconds at com.google.protobuf.LiteralByteString.<init>(LiteralByteString.java:234) at com.google.protobuf.ByteString.copyFromUtf8(ByteString.java:255) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getMethodNameBytes(ProtobufRpcEngineProtos.java:286) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getSerializedSize(ProtobufRpcEngineProtos.java:462) at com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:84) at org.apache.hadoop.ipc.ProtobufRpcEngine$RpcMessageWithHeader.write(ProtobufRpcEngine.java:302) at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:989) at org.apache.hadoop.ipc.Client.call(Client.java:1377) at org.apache.hadoop.ipc.Client.call(Client.java:1357) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at $Proxy70.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137) at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at $Proxy71.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195) at org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:622) at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:597) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:125) {code} For details, please refer to: https://builds.apache.org/job/PreCommit-YARN-Build/2039//testReport/ was: The stacktrace: {code} java.lang.Exception: test timed out after 90000 milliseconds at com.google.protobuf.LiteralByteString.<init>(LiteralByteString.java:234) at com.google.protobuf.ByteString.copyFromUtf8(ByteString.java:255) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getMethodNameBytes(ProtobufRpcEngineProtos.java:286) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getSerializedSize(ProtobufRpcEngineProtos.java:462) at com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:84) at org.apache.hadoop.ipc.ProtobufRpcEngine$RpcMessageWithHeader.write(ProtobufRpcEngine.java:302) at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:989) at org.apache.hadoop.ipc.Client.call(Client.java:1377) at org.apache.hadoop.ipc.Client.call(Client.java:1357) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at $Proxy70.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137) at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at $Proxy71.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195) at org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:622) at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:597) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:125) {code} For details, please refer to: https://builds.apache.org/job/PreCommit-YARN-Build/2039//testReport/ TestDistributedShell#TestDSShell failed with timeout
[jira] [Commented] (YARN-1251) TestDistributedShell#TestDSShell failed with timeout
[ https://issues.apache.org/jira/browse/YARN-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786159#comment-13786159 ] Hadoop QA commented on YARN-1251: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606782/YARN-1225-kickOffTestDS.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2097//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2097//console This message is automatically generated. TestDistributedShell#TestDSShell failed with timeout Key: YARN-1251 URL: https://issues.apache.org/jira/browse/YARN-1251 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Junping Du Attachments: YARN-1225-kickOffTestDS.patch TestDistributedShell#TestDSShell on trunk Jenkins are failed consistently recently. 
The stacktrace is: {code} java.lang.Exception: test timed out after 90000 milliseconds at com.google.protobuf.LiteralByteString.<init>(LiteralByteString.java:234) at com.google.protobuf.ByteString.copyFromUtf8(ByteString.java:255) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getMethodNameBytes(ProtobufRpcEngineProtos.java:286) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getSerializedSize(ProtobufRpcEngineProtos.java:462) at com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:84) at org.apache.hadoop.ipc.ProtobufRpcEngine$RpcMessageWithHeader.write(ProtobufRpcEngine.java:302) at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:989) at org.apache.hadoop.ipc.Client.call(Client.java:1377) at org.apache.hadoop.ipc.Client.call(Client.java:1357) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at $Proxy70.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137) at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at $Proxy71.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195) at org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:622) at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:597) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:125) {code} For details, please refer to: https://builds.apache.org/job/PreCommit-YARN-Build/2039//testReport/ -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1225) FinishApplicationMasterRequest should also have a final IPC/RPC address.
[ https://issues.apache.org/jira/browse/YARN-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786155#comment-13786155 ] Junping Du commented on YARN-1225: -- Hi [~vinodkv], would you help review the patch here? I guess any protocol changes had best happen before the branch-2 GA, no? The test failure on TestDistributedShell seems to be unrelated (it also appears on previous JIRAs like YARN-49). Thanks! FinishApplicationMasterRequest should also have a final IPC/RPC address. Key: YARN-1225 URL: https://issues.apache.org/jira/browse/YARN-1225 Project: Hadoop YARN Issue Type: Sub-task Components: api Reporter: Vinod Kumar Vavilapalli Assignee: Junping Du Attachments: YARN-1225-kickOffTestDS.patch, YARN-1225-v1.patch, YARN-1225-v2.patch, YARN-1225-v3.patch AMs can already report a final HTTP URL via FinishApplicationMasterRequest, but there is no field to report an IPC/RPC address. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-7) Add support for DistributedShell to ask for CPUs along with memory
[ https://issues.apache.org/jira/browse/YARN-7?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-7: -- Target Version/s: 2.1.2-beta (was: 2.1.0-beta, 2.0.4-alpha) Affects Version/s: (was: 2.0.3-alpha) 2.1.1-beta Add support for DistributedShell to ask for CPUs along with memory -- Key: YARN-7 URL: https://issues.apache.org/jira/browse/YARN-7 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.1-beta Reporter: Arun C Murthy Assignee: Junping Du Labels: patch Attachments: YARN-7.patch, YARN-7-v2.patch, YARN-7-v3.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-890) The roundup for memory values on resource manager UI is misleading
[ https://issues.apache.org/jira/browse/YARN-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786182#comment-13786182 ] Hudson commented on YARN-890: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) YARN-890. Ensure CapacityScheduler doesn't round-up metric for available resources. Contributed by Xuan Gong & Hitesh Shah. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529015) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java The roundup for memory values on resource manager UI is misleading -- Key: YARN-890 URL: https://issues.apache.org/jira/browse/YARN-890 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Trupti Dhavle Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: Screen Shot 2013-07-10 at 10.43.34 AM.png, YARN-890.1.patch, YARN-890.2.patch From the yarn-site.xml, I see the following values: {code} <property> <name>yarn.nodemanager.resource.memory-mb</name> <value>4192</value> </property> <property> <name>yarn.scheduler.maximum-allocation-mb</name> <value>4192</value> </property> <property> <name>yarn.scheduler.minimum-allocation-mb</name> <value>1024</value> </property> {code} However the resourcemanager UI shows total memory as 5MB -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1271) Text file busy errors launching containers again
[ https://issues.apache.org/jira/browse/YARN-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786180#comment-13786180 ] Hudson commented on YARN-1271: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) YARN-1271. Text file busy errors launching containers again (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1529058) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java Text file busy errors launching containers again -- Key: YARN-1271 URL: https://issues.apache.org/jira/browse/YARN-1271 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.2-beta Attachments: YARN-1271.patch The error is shown below in the comments. MAPREDUCE-2374 fixed this by removing -c when running the container launch script. It looks like the -c got brought back during the windows branch merge, so we should remove it again. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1219) FSDownload changes file suffix making FileUtil.unTar() throw exception
[ https://issues.apache.org/jira/browse/YARN-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786177#comment-13786177 ] Hudson commented on YARN-1219: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) YARN-1219. FSDownload changes file suffix making FileUtil.unTar() throw exception. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529084) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java FSDownload changes file suffix making FileUtil.unTar() throw exception -- Key: YARN-1219 URL: https://issues.apache.org/jira/browse/YARN-1219 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.1.1-beta, 2.1.2-beta Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 2.1.2-beta Attachments: YARN-1219.patch While running a Hive join operation on YARN, I saw the exception described below. This is caused by FSDownload copying the files into a temp file and changing the suffix to .tmp before unpacking it. In unpack(), it uses FileUtil.unTar(), which determines whether the file is gzipped by looking at the file suffix: {code} boolean gzipped = inFile.toString().endsWith("gz"); {code} To fix this problem, we can remove the .tmp from the temp file name. Here is the detailed exception: org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:240) at org.apache.hadoop.fs.FileUtil.unTarUsingJava(FileUtil.java:676) at org.apache.hadoop.fs.FileUtil.unTar(FileUtil.java:625) at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:203) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:287) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-621) RM triggers web auth failure before first job
[ https://issues.apache.org/jira/browse/YARN-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786178#comment-13786178 ] Hudson commented on YARN-621: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) YARN-621. Changed YARN web app to not add paths that can cause duplicate additions of authenticated filters, thereby causing kerberos replay errors. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529030) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java RM triggers web auth failure before first job - Key: YARN-621 URL: https://issues.apache.org/jira/browse/YARN-621 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.4-alpha Reporter: Allen Wittenauer Assignee: Omkar Vinit Joshi Priority: Critical Fix For: 2.1.2-beta Attachments: YARN-621.20131001.1.patch On a secure YARN setup, before the first job is executed, going to the web interface of the resource manager triggers authentication errors. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1236) FairScheduler setting queue name in RMApp is not working
[ https://issues.apache.org/jira/browse/YARN-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786179#comment-13786179 ] Hudson commented on YARN-1236: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) YARN-1236. FairScheduler setting queue name in RMApp is not working. (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1529034) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java FairScheduler setting queue name in RMApp is not working - Key: YARN-1236 URL: https://issues.apache.org/jira/browse/YARN-1236 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.2-beta Attachments: YARN-1236.patch The fair scheduler sometimes picks a different queue than the one an application was submitted to, such as when user-as-default-queue is turned on. It needs to update the queue name in the RMApp so that this choice will be reflected in the UI. This isn't working because the scheduler is looking up the RMApp by application attempt id instead of app id and failing to find it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-677) Increase coverage to FairScheduler
[ https://issues.apache.org/jira/browse/YARN-677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786175#comment-13786175 ] Hudson commented on YARN-677: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) Revert YARN-677. Increase coverage to FairScheduler (Vadim Bondarev and Dennis Y via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1528914) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java Increase coverage to FairScheduler -- Key: YARN-677 URL: https://issues.apache.org/jira/browse/YARN-677 Project: Hadoop YARN Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Assignee: Andrey Klochkov Attachments: HADOOP-4536-branch-2-a.patch, HADOOP-4536-branch-2c.patch, HADOOP-4536-trunk-a.patch, HADOOP-4536-trunk-c.patch, HDFS-4536-branch-2--N7.patch, HDFS-4536-branch-2--N8.patch, HDFS-4536-branch-2-N9.patch, HDFS-4536-trunk--N6.patch, HDFS-4536-trunk--N7.patch, HDFS-4536-trunk--N8.patch, HDFS-4536-trunk-N9.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1256) NM silently ignores non-existent service in StartContainerRequest
[ https://issues.apache.org/jira/browse/YARN-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786184#comment-13786184 ] Hudson commented on YARN-1256: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) Addendum for missing file YARN-1256. NM silently ignores non-existent service in StartContainerRequest (Xuan Gong via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529048) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/InvalidAuxServiceException.java YARN-1256. NM silently ignores non-existent service in StartContainerRequest (Xuan Gong via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529039) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/AuxiliaryServiceHelper.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java NM silently ignores non-existent service in StartContainerRequest - Key: YARN-1256 URL: https://issues.apache.org/jira/browse/YARN-1256 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.1-beta Reporter: Bikas Saha Assignee: Xuan Gong Priority: Critical Fix For: 2.1.2-beta Attachments: YARN-1256.1.patch, YARN-1256.2.patch, YARN-1256.3.patch, YARN-1256.4.patch, YARN-1256.5.patch A container can set token service metadata for a service, say shuffle_service. If that service does not exist, then the error is silently ignored. Later, when the next container wants to access data written to shuffle_service by the first task, it fails because the service does not have the token that was supposed to be set by the first task. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1131) $yarn logs command should return an appropriate error message if YARN application is still running
[ https://issues.apache.org/jira/browse/YARN-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786176#comment-13786176 ] Hudson commented on YARN-1131: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) YARN-1131. logs command should return an appropriate error message if YARN application is still running. Contributed by Siddharth Seth. (hitesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529068) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/tools/CLI.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/LogsCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestLogsCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogDumper.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestLogDumper.java $yarn logs command should return an appropriate error message if YARN application is still running -- Key: YARN-1131 URL: https://issues.apache.org/jira/browse/YARN-1131 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.1.1-beta Reporter: Tassapol Athiapinya Assignee: Siddharth Seth Priority: Minor Fix For: 2.1.2-beta Attachments: YARN-1131.1.txt, YARN-1131.2.txt, YARN-1131.3.txt In the case when log aggregation is enabled, if a user submits a MapReduce job and runs $ yarn logs -applicationId <app ID> while the YARN application is still running, the command returns no message and drops the user back to the shell. It would be nice to tell the user that log aggregation is in progress. {code} -bash-4.1$ /usr/bin/yarn logs -applicationId application_1377900193583_0002 -bash-4.1$ {code} At the same time, if an invalid application ID is given, the YARN CLI should say that the application ID is incorrect rather than throwing a NoSuchElementException. {code} $ /usr/bin/yarn logs -applicationId application_0 Exception in thread "main" java.util.NoSuchElementException at com.google.common.base.AbstractIterator.next(AbstractIterator.java:75) at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:124) at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:119) at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:110) at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:255) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
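A sketch of the CLI-side hardening the second half of the report asks for: validate the id and print a readable message instead of letting NoSuchElementException escape. The parsing below is deliberately simplified and illustrative, not the actual LogsCLI or ConverterUtils code:
{code}
// Hypothetical validation sketch; real ids are parsed by ConverterUtils.
public class AppIdParseDemo {
    static void parseOrExplain(String arg) {
        // Valid ids look like application_<clusterTimestamp>_<sequenceNumber>.
        String[] parts = arg.split("_");
        if (parts.length != 3 || !"application".equals(parts[0])) {
            System.err.println("Invalid ApplicationId: " + arg);
            return;
        }
        System.out.println("clusterTimestamp=" + Long.parseLong(parts[1])
            + " id=" + Integer.parseInt(parts[2]));
    }

    public static void main(String[] args) {
        parseOrExplain("application_1377900193583_0002"); // parses cleanly
        parseOrExplain("application_0"); // friendly error, no stack trace
    }
}
{code}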
[jira] [Commented] (YARN-1199) Make NM/RM Versions Available
[ https://issues.apache.org/jira/browse/YARN-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786173#comment-13786173 ] Hudson commented on YARN-1199: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) YARN-1199. Make NM/RM Versions Available (Mit Desai via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529003) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestRMNMInfo.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMNMInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodeInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java Make NM/RM Versions Available - Key: YARN-1199 URL: https://issues.apache.org/jira/browse/YARN-1199 Project: Hadoop YARN Issue Type: Improvement Reporter: Mit Desai Assignee: Mit Desai Fix For: 3.0.0, 2.3.0 Attachments: YARN-1199.patch, YARN-1199.patch, YARN-1199.patch, YARN-1199.patch Now that we have the NM and RM versions available, we can display the YARN version of the nodes running in the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1149) NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING
[ https://issues.apache.org/jira/browse/YARN-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786183#comment-13786183 ] Hudson commented on YARN-1149: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1568 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1568/]) YARN-1149. NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING. Contributed by Xuan Gong. (hitesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1529043) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/CMgrCompletedAppsEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/CMgrCompletedContainersEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerReboot.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/TestApplication.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java NM throws InvalidStateTransitonException: Invalid 
event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING - Key: YARN-1149 URL: https://issues.apache.org/jira/browse/YARN-1149 Project: Hadoop YARN Issue Type: Bug Reporter: Ramya Sunil Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1149.1.patch, YARN-1149.2.patch, YARN-1149.3.patch, YARN-1149.4.patch, YARN-1149.5.patch, YARN-1149.6.patch, YARN-1149.7.patch, YARN-1149.8.patch, YARN-1149.9.patch, YARN-1149_branch-2.1-beta.1.patch When nodemanager receives a kill signal when an application has finished execution but log aggregation has not kicked in, InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING is thrown {noformat} 2013-08-25 20:45:00,875 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(254)) - Application just finished : application_1377459190746_0118 2013-08-25 20:45:00,876 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:uploadLogsForContainer(105)) - Starting aggregate log-file for app application_1377459190746_0118 at /app-logs/foo/logs/application_1377459190746_0118/host_45454.tmp 2013-08-25 20:45:00,876 INFO
[jira] [Assigned] (YARN-913) Add a way to register long-lived services in a YARN cluster
[ https://issues.apache.org/jira/browse/YARN-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans reassigned YARN-913: Assignee: Robert Joseph Evans Add a way to register long-lived services in a YARN cluster --- Key: YARN-913 URL: https://issues.apache.org/jira/browse/YARN-913 Project: Hadoop YARN Issue Type: New Feature Components: api Affects Versions: 3.0.0 Reporter: Steve Loughran Assignee: Robert Joseph Evans In a YARN cluster you can't predict where services will come up -or on what ports. The services need to work those things out as they come up and then publish them somewhere. Applications need to be able to find the service instance they are to bond to -and not any others in the cluster. Some kind of service registry -in the RM, in ZK, could do this. If the RM held the write access to the ZK nodes, it would be more secure than having apps register with ZK themselves. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1270) TestSLSRunner test is failing
[ https://issues.apache.org/jira/browse/YARN-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-1270: Description: Added in the YARN-1021 patch, the test TestSLSRunner is now failing. (was: Added in the YARn-1021 patch, the test TestSLSRunner is now failing.) TestSLSRunner test is failing - Key: YARN-1270 URL: https://issues.apache.org/jira/browse/YARN-1270 Project: Hadoop YARN Issue Type: Bug Reporter: Mit Desai Added in the YARN-1021 patch, the test TestSLSRunner is now failing. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-913) Add a way to register long-lived services in a YARN cluster
[ https://issues.apache.org/jira/browse/YARN-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786229#comment-13786229 ] Steve Loughran commented on YARN-913: - what I'm doing right now is just enumerating all instances of my app's type and verifying that the (username, instance-name) pair is unique: [https://github.com/hortonworks/hoya/blob/master/hoya-core/src/main/java/org/apache/hadoop/hoya/yarn/client/HoyaClient.java#L841] That's got a race condition built into it. Add a way to register long-lived services in a YARN cluster --- Key: YARN-913 URL: https://issues.apache.org/jira/browse/YARN-913 Project: Hadoop YARN Issue Type: New Feature Components: api Affects Versions: 3.0.0 Reporter: Steve Loughran Assignee: Robert Joseph Evans In a YARN cluster you can't predict where services will come up -or on what ports. The services need to work those things out as they come up and then publish them somewhere. Applications need to be able to find the service instance they are to bond to -and not any others in the cluster. Some kind of service registry -in the RM, in ZK, could do this. If the RM held the write access to the ZK nodes, it would be more secure than having apps register with ZK themselves. -- This message was sent by Atlassian JIRA (v6.1#6144)
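The race Steve describes is the classic check-then-act window: two clients can both observe "no instance with this name" before either becomes visible. A self-contained sketch of that window, with illustrative names only:
{code}
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Two launchers racing through the same enumerate-then-register sequence.
public class CheckThenActRace {
    public static void main(String[] args) throws InterruptedException {
        Set<String> registered =
            Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
        Runnable launch = () -> {
            String key = "user/instance-1";
            if (!registered.contains(key)) { // check: "name is free"
                // Window: the other launcher can pass the same check here.
                registered.add(key);         // act: both may claim the name
                System.out.println(Thread.currentThread().getName()
                    + " claimed " + key);
            }
        };
        Thread a = new Thread(launch), b = new Thread(launch);
        a.start(); b.start();
        a.join(); b.join();
        // Both threads may print "claimed": uniqueness was never enforced.
    }
}
{code}
Only an atomic registration primitive (e.g. a single create-if-absent operation in the registry) closes the window; checking first and registering second cannot.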
[jira] [Commented] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786288#comment-13786288 ] Alejandro Abdelnur commented on YARN-867: - Patch6 does not look good to me; the try/catch is not correct, as an exception in ANY auxiliary service will halt delivery to the other auxiliary services. The try/catch should be done around each call to the auxiliary service interface methods, as done in patch4. Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.4.patch, YARN-867.5.patch, YARN-867.6.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit, as the service's INIT APP event is not handled properly. -- This message was sent by Atlassian JIRA (v6.1#6144)
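A sketch of the per-call isolation the review asks for: each service call gets its own try/catch, so one throwing service is logged and skipped rather than halting delivery to the rest. The interface and names are illustrative, not the actual AuxServices code:
{code}
import java.util.Arrays;
import java.util.List;

// Per-call isolation: a failure in one aux service is confined to it.
public class AuxDispatchDemo {
    interface AuxService {
        String name();
        void initApp(String appId); // may throw anything, not just IOException
    }

    static void dispatchInitApp(List<AuxService> services, String appId) {
        for (AuxService s : services) {
            try {
                s.initApp(appId);
            } catch (Throwable t) {
                // The loop continues; the dispatcher never sees the exception.
                System.err.println("Aux service " + s.name() + " failed: " + t);
            }
        }
    }

    public static void main(String[] args) {
        AuxService ok = new AuxService() {
            public String name() { return "ok_service"; }
            public void initApp(String appId) { System.out.println("init " + appId); }
        };
        AuxService bad = new AuxService() {
            public String name() { return "bad_service"; }
            public void initApp(String appId) { throw new IllegalStateException("boom"); }
        };
        dispatchInitApp(Arrays.asList(bad, ok), "application_1_0001"); // ok still runs
    }
}
{code}
Wrapping the whole loop in one try/catch instead (the patch6 shape being objected to) would abort the loop at the first failing service.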
[jira] [Commented] (YARN-913) Add a way to register long-lived services in a YARN cluster
[ https://issues.apache.org/jira/browse/YARN-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786299#comment-13786299 ] Robert Joseph Evans commented on YARN-913: -- Yes it does have plenty of races. I'll try to get some detailed designs up shortly, but at a high level the general idea is to have a RESTful web service. For the most common use case there just need to be two interfaces: - Register/Monitor a Service - Query for Services Part of the reason we need the service registry is to securely verify that a client is talking to the real service, and that no one has grabbed the service's port after it registered. To do that I want to have the concept of a verified service. For that we would need an admin interface for adding, updating, and removing verified services. The registry would provide a number of pluggable ways for services to authenticate. Part of adding a verified service would include indicating which authentication models the service can use to register and which users are allowed to register that service. The registry could also act like a trusted Certificate Authority. Another part of adding in a verified service would include indicating how clients could verify they are talking to the true service. This could include just publishing an application id so the client can go to the RM and get a delegation token. Another option would be having the service generate a public/private key pair. When the service registers it would get the private key, and the public key would be available through the discovery interface. The plan is to also have the registry monitor the service, similar to ZK. The service would heartbeat in to the registry periodically (could be on the order of minutes, depending on the service); after a certain period of inactivity the service would be removed from the registry. Perhaps we should add in an explicit unregister as well. I want to make sure that the data model is generic enough that we could support something like a web service on the grid where each server can register itself and all of them would show up in the registry, so a service could have one or more servers that are a part of it, and each server could have some separate metadata about it. I also want to have a plug-in interface for discovery, so we could potentially make the registry look like a DNS server or an SSL Certificate Authority, which would make compatibility with existing applications and clients a lot simpler. Add a way to register long-lived services in a YARN cluster --- Key: YARN-913 URL: https://issues.apache.org/jira/browse/YARN-913 Project: Hadoop YARN Issue Type: New Feature Components: api Affects Versions: 3.0.0 Reporter: Steve Loughran Assignee: Robert Joseph Evans In a YARN cluster you can't predict where services will come up -or on what ports. The services need to work those things out as they come up and then publish them somewhere. Applications need to be able to find the service instance they are to bond to -and not any others in the cluster. Some kind of service registry -in the RM, in ZK, could do this. If the RM held the write access to the ZK nodes, it would be more secure than having apps register with ZK themselves. -- This message was sent by Atlassian JIRA (v6.1#6144)
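Reading the comment as an API, the two common-case interfaces might look like the following. Every name here is hypothetical, sketched only to make the register/heartbeat/query split concrete; it is not a proposed design:
{code}
import java.util.List;
import java.util.Map;

// Purely hypothetical registry surface distilled from the comment above.
public interface ServiceRegistrySketch {
    // "Register/Monitor a Service": returns a token used for later heartbeats.
    String register(String serviceName, String host, int port,
                    Map<String, String> metadata);

    // Liveness: instances that stop heartbeating are expired by the registry.
    void heartbeat(String registrationToken);

    // The explicit unregister the comment suggests adding.
    void unregister(String registrationToken);

    // "Query for Services": all live servers that are part of the named service.
    List<Map<String, String>> lookup(String serviceName);
}
{code}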
[jira] [Updated] (YARN-1268) TestFairScheduler.testContinuousScheduling is flaky
[ https://issues.apache.org/jira/browse/YARN-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1268: - Attachment: YARN-1268.patch TestFairScheduler.testContinuousScheduling is flaky --- Key: YARN-1268 URL: https://issues.apache.org/jira/browse/YARN-1268 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Wei Yan Attachments: YARN-1268.patch It looks like there's a timeout in it that's causing it to be flaky. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1166: -- Attachment: YARN-1166.5.patch Fix the test failure YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.5.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter' - meaning Ganglia will use slope to provide deltas between time-points. To be consistent, AppsFailed metric should also be of type 'counter'. -- This message was sent by Atlassian JIRA (v6.1#6144)
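In Hadoop metrics2 terms, the fix amounts to declaring the metric as a counter instead of a gauge; the class and field below are a hedged sketch of the idea, not the actual QueueMetrics change:
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.lib.MutableCounterInt;

public class QueueMetricsSketch {
  // Declared as a counter rather than a gauge: counters are monotonically
  // increasing, so tools like Ganglia can derive deltas (slope) between
  // time points, consistent with appsSubmitted/appsCompleted/appsKilled.
  @Metric("# of apps failed") MutableCounterInt appsFailed;

  public void failApp() {
    appsFailed.incr();
  }
}
{code}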
[jira] [Commented] (YARN-1271) Text file busy errors launching containers again
[ https://issues.apache.org/jira/browse/YARN-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786315#comment-13786315 ] Sandy Ryza commented on YARN-1271: -- Posted branch-2 addendum Text file busy errors launching containers again -- Key: YARN-1271 URL: https://issues.apache.org/jira/browse/YARN-1271 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.2-beta Attachments: YARN-1271-branch-2.patch, YARN-1271.patch The error is shown below in the comments. MAPREDUCE-2374 fixed this by removing -c when running the container launch script. It looks like the -c got brought back during the windows branch merge, so we should remove it again. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1271) Text file busy errors launching containers again
[ https://issues.apache.org/jira/browse/YARN-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1271: - Attachment: YARN-1271-branch-2.patch Text file busy errors launching containers again -- Key: YARN-1271 URL: https://issues.apache.org/jira/browse/YARN-1271 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.2-beta Attachments: YARN-1271-branch-2.patch, YARN-1271.patch The error is shown below in the comments. MAPREDUCE-2374 fixed this by removing -c when running the container launch script. It looks like the -c got brought back during the windows branch merge, so we should remove it again. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786317#comment-13786317 ] Bikas Saha commented on YARN-867: - tucu, your comments were addressed in YARN-1256. This jira is now targeted for more elaborate changes. Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.4.patch, YARN-867.5.patch, YARN-867.6.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit as the service's INIT APP event is not handled properly. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-445) Ability to signal containers
[ https://issues.apache.org/jira/browse/YARN-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786318#comment-13786318 ] Steve Loughran commented on YARN-445: - ctrl-break is special in that it can talk to the whole process group: [http://msdn.microsoft.com/en-us/library/windows/desktop/ms683155(v=vs.85).aspx] process-group signalling should be good (make it an option from the sender?) so that I can send a signal to a process started by its own bash script (e.g. bin/hbase-java). However, we do need to remember that some recent ubuntu versions (mistakenly) require a -- between signal and process group id. This is quite a significant patch -and it adds a feature that many will find useful - but it is going to need careful review by the YARN experts (of which I am not). Some quick points: # I wouldn't mark the interface/methods as stable yet # some of the diffs in the tests look bigger than they should be -reformatting/refactoring? It just makes it harder to distinguish changes. Ideally all the existing tests should be left alone (that way we can be confident that they will catch regressions), with new tests underneath or in their own class. Ability to signal containers Key: YARN-445 URL: https://issues.apache.org/jira/browse/YARN-445 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Jason Lowe Attachments: YARN-445--n2.patch, YARN-445.patch It would be nice if an ApplicationMaster could send signals to containers such as SIGQUIT, SIGUSR1, etc. For example, in order to replicate the jstack-on-task-timeout feature implemented by MAPREDUCE-1119 in Hadoop 0.21 the NodeManager needs an interface for sending SIGQUIT to a container. For that specific feature we could implement it as an additional field in the StopContainerRequest. However that would not address other potential features like the ability for an AM to trigger jstacks on arbitrary tasks *without* killing them. The latter feature would be a very useful debugging tool for users who do not have shell access to the nodes. -- This message was sent by Atlassian JIRA (v6.1#6144)
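On the Linux side, the process-group signalling Steve mentions comes down to signalling a negative pid, and the "--" separator is what keeps kill from parsing that pid as an option. A rough sketch from Java, with an invented process-group id; this is an illustration of the mechanism, not code from the patch:
{code}
// Sketch: signal a whole process group from Java by exec-ing kill(1).
// A negative pid addresses the group; "--" stops option parsing so the
// leading '-' of the pgid is not read as a flag (required on some
// recent Ubuntu releases, as noted above).
int pgid = 12345;  // illustrative process-group id
Process p = new ProcessBuilder("kill", "-SIGQUIT", "--", "-" + pgid)
    .redirectErrorStream(true)
    .start();
int rc = p.waitFor();  // 0 on success
{code}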
[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786348#comment-13786348 ] Hadoop QA commented on YARN-1166: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606822/YARN-1166.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2099//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2099//console This message is automatically generated. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.5.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter' - meaning Ganglia will use slope to provide deltas between time-points. To be consistent, AppsFailed metric should also be of type 'counter'. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1251) TestDistributedShell#TestDSShell failed with timeout
[ https://issues.apache.org/jira/browse/YARN-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1251: Attachment: error.log attaching thread dump TestDistributedShell#TestDSShell failed with timeout Key: YARN-1251 URL: https://issues.apache.org/jira/browse/YARN-1251 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Junping Du Attachments: error.log, YARN-1225-kickOffTestDS.patch TestDistributedShell#TestDSShell on trunk Jenkins are failed consistently recently. The Stacktrace is: {code} java.lang.Exception: test timed out after 9 milliseconds at com.google.protobuf.LiteralByteString.init(LiteralByteString.java:234) at com.google.protobuf.ByteString.copyFromUtf8(ByteString.java:255) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getMethodNameBytes(ProtobufRpcEngineProtos.java:286) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getSerializedSize(ProtobufRpcEngineProtos.java:462) at com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:84) at org.apache.hadoop.ipc.ProtobufRpcEngine$RpcMessageWithHeader.write(ProtobufRpcEngine.java:302) at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:989) at org.apache.hadoop.ipc.Client.call(Client.java:1377) at org.apache.hadoop.ipc.Client.call(Client.java:1357) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at $Proxy70.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137) at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at $Proxy71.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195) at org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:622) at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:597) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:125) {code} For details, please refer: https://builds.apache.org/job/PreCommit-YARN-Build/2039//testReport/ -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1251) TestDistributedShell#TestDSShell failed with timeout
[ https://issues.apache.org/jira/browse/YARN-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786377#comment-13786377 ] Hadoop QA commented on YARN-1251: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606834/error.log against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2100//console This message is automatically generated. TestDistributedShell#TestDSShell failed with timeout Key: YARN-1251 URL: https://issues.apache.org/jira/browse/YARN-1251 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Junping Du Attachments: error.log, YARN-1225-kickOffTestDS.patch TestDistributedShell#TestDSShell on trunk Jenkins are failed consistently recently. The Stacktrace is: {code} java.lang.Exception: test timed out after 9 milliseconds at com.google.protobuf.LiteralByteString.init(LiteralByteString.java:234) at com.google.protobuf.ByteString.copyFromUtf8(ByteString.java:255) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getMethodNameBytes(ProtobufRpcEngineProtos.java:286) at org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getSerializedSize(ProtobufRpcEngineProtos.java:462) at com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:84) at org.apache.hadoop.ipc.ProtobufRpcEngine$RpcMessageWithHeader.write(ProtobufRpcEngine.java:302) at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:989) at org.apache.hadoop.ipc.Client.call(Client.java:1377) at org.apache.hadoop.ipc.Client.call(Client.java:1357) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at $Proxy70.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137) at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at $Proxy71.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195) at org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:622) at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:597) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:125) {code} For details, please refer: https://builds.apache.org/job/PreCommit-YARN-Build/2039//testReport/ -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated YARN-867: - Target Version/s: 2.3.0 Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.4.patch, YARN-867.5.patch, YARN-867.6.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit as the service's INIT APP event is not handled properly. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786402#comment-13786402 ] Alejandro Abdelnur commented on YARN-867: - [~bikassaha] got it, missed that it was moved to another jira, thx Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.4.patch, YARN-867.5.patch, YARN-867.6.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit as the service's INIT APP event is not handled properly. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (YARN-1270) TestSLSRunner test is failing
[ https://issues.apache.org/jira/browse/YARN-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan reassigned YARN-1270: - Assignee: Wei Yan TestSLSRunner test is failing - Key: YARN-1270 URL: https://issues.apache.org/jira/browse/YARN-1270 Project: Hadoop YARN Issue Type: Bug Reporter: Mit Desai Assignee: Wei Yan Added in the YARN-1021 patch, the test TestSLSRunner is now failing. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786407#comment-13786407 ] Wei Yan commented on YARN-1021: --- Thanks, [~mitdesai]. I'll look into it. Yarn Scheduler Load Simulator - Key: YARN-1021 URL: https://issues.apache.org/jira/browse/YARN-1021 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Reporter: Wei Yan Assignee: Wei Yan Fix For: 2.3.0 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf The Yarn Scheduler is a fertile area of interest with different implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, several optimizations are also made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features, and drives scheduling decisions by many factors, such as fairness, capacity guarantee, resource availability, etc. It is very important to evaluate a scheduler algorithm well before we deploy it in a production cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling algorithm. Evaluating in a real cluster is always time and cost consuming, and it is also very hard to find a large-enough cluster. Hence, a simulator which can predict how well a scheduler algorithm works for some specific workload would be quite useful. We want to build a Scheduler Load Simulator to simulate large-scale Yarn clusters and application loads in a single machine. This would be invaluable in furthering Yarn by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation. The simulator will exercise the real Yarn ResourceManager, removing the network factor by simulating NodeManagers and ApplicationMasters via handling and dispatching NM/AMs heartbeat events from within the same JVM. To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler. The simulator will produce real time metrics while executing, including: * Resource usage for the whole cluster and each queue, which can be utilized to configure cluster and queue capacity. * The detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand/validate the scheduler behavior (individual jobs' turnaround time, throughput, fairness, capacity guarantee, etc). * Several key metrics of the scheduler algorithm, such as time cost of each scheduler operation (allocate, handle, etc), which can be utilized by Hadoop developers to find the code hot spots and scalability limits. The simulator will provide real time charts showing the behavior of the scheduler and its performance. A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1167: Attachment: YARN-1167.6.patch New patch contains a modified test case. Passing the TestDistributedShell test locally. Check if Jenkins likes it. Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch Submit distributed shell application. Once the application turns to be RUNNING state, app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786434#comment-13786434 ] Omkar Vinit Joshi commented on YARN-1167: - [~xgong] can you please verify why we had to reduce memory arguments and container nos? Is that because we don't have memory or some race condition? Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch Submit distributed shell application. Once the application turns to be RUNNING state, app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1167: Attachment: YARN-1167.7.patch set AMRPCPort to -1 and restore all parameters for the MinYarnCluster Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch Submit distributed shell application. Once the application turns to be RUNNING state, app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786453#comment-13786453 ] Hadoop QA commented on YARN-1167: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606848/YARN-1167.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2101//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2101//console This message is automatically generated. Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch Submit distributed shell application. Once the application turns to be RUNNING state, app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1232) Configuration to support multiple RMs
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786459#comment-13786459 ] Bikas Saha commented on YARN-1232: -- Looks good. +1. Thanks for being patient with the reviews! Configuration to support multiple RMs - Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch, yarn-1232-6.patch, yarn-1232-7.patch, yarn-1232-7.patch We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786487#comment-13786487 ] Hadoop QA commented on YARN-1167: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606854/YARN-1167.7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2102//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2102//console This message is automatically generated. Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch Submit distributed shell application. Once the application turns to be RUNNING state, app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786491#comment-13786491 ] Omkar Vinit Joshi commented on YARN-1167: - bq. + private int appMasterRpcPort = -1; why? Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch Submit distributed shell application. Once the application turns to be RUNNING state, app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786497#comment-13786497 ] Vinod Kumar Vavilapalli commented on YARN-1167: --- Latest patch looks good. Can you debug why TestDistributedShell works with patch #6 but not with #7 ? Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch Submit distributed shell application. Once the application turns to be RUNNING state, app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-445) Ability to signal containers
[ https://issues.apache.org/jira/browse/YARN-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786506#comment-13786506 ] Andrey Klochkov commented on YARN-445: -- The large diffs in the tests are not due to reformatting but because of refactoring needed to implement an additional test without lots of copy/paste. Ability to signal containers Key: YARN-445 URL: https://issues.apache.org/jira/browse/YARN-445 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Jason Lowe Attachments: YARN-445--n2.patch, YARN-445.patch It would be nice if an ApplicationMaster could send signals to containers such as SIGQUIT, SIGUSR1, etc. For example, in order to replicate the jstack-on-task-timeout feature implemented by MAPREDUCE-1119 in Hadoop 0.21 the NodeManager needs an interface for sending SIGQUIT to a container. For that specific feature we could implement it as an additional field in the StopContainerRequest. However that would not address other potential features like the ability for an AM to trigger jstacks on arbitrary tasks *without* killing them. The latter feature would be a very useful debugging tool for users who do not have shell access to the nodes. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1272) Add a link to cluster/application page on node manager's list of application page
Paul Han created YARN-1272: -- Summary: Add a link to cluster/application page on node manager's list of application page Key: YARN-1272 URL: https://issues.apache.org/jira/browse/YARN-1272 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Paul Han On node manager's application/application page, the content is significantly less than the content on the resource manager's application page /cluster/application. Adding a link from nodemanager's application page to resourcemanager's application page will help users get info faster and more efficiently. Please see the screenshot for the benefit. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy
[ https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786532#comment-13786532 ] Ravi Prakash commented on YARN-465: --- Thanks for the updates Andrey! Here are some questions I had and comments. - Why did you remove the proxy.join() from startup? - If you removed proxy.join(), you didn't need to create a new method (startServer). Just call main() on the class. - In the test file in start(), did you mean to log the originalPort instead of port? port would always be 0. - Why did you have to create a core-default.xml file? Could you not have hardcoded the port inside the test file? Also, could you please tell me where hadoop.common.configuration.version is being used? I wasn't able to find it. - Nit: Can you setName("proxy") -> setName("Proxy for test"); - Nit: Could you please put a more detailed message in WebAppProxyForTest.start() when the proxy server starts up? - In testWebAppProxyServlet(), what is the significance of proxyConn.setRequestProperty("Cookie", "checked_application_0_=true"); ? The test passes after commenting out that line too. - What is testWebAppProxyServer testing that testWebAppProxyServlet()'s first test isn't? - What is testWebAppProxyServerMainMethod actually testing? counter is set to 0 on the very first successful try. Shouldn't that be the expected behavior? If we are having to start the proxy server more than 1 time for the test to pass, that is bad, and should make the test fail. fix coverage org.apache.hadoop.yarn.server.webproxy Key: YARN-465 URL: https://issues.apache.org/jira/browse/YARN-465 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: YARN-465-branch-0.23-a.patch, YARN-465-branch-0.23.patch, YARN-465-branch-2-a.patch, YARN-465-branch-2--n3.patch, YARN-465-branch-2.patch, YARN-465-trunk-a.patch, YARN-465-trunk--n3.patch, YARN-465-trunk.patch fix coverage org.apache.hadoop.yarn.server.webproxy patch YARN-465-trunk.patch for trunk patch YARN-465-branch-2.patch for branch-2 patch YARN-465-branch-0.23.patch for branch-0.23 There is an issue in branch-0.23: the patch does not create the .keep file. To fix it, run these commands: mkdir yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy touch yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1254) NM is polluting container's credentials
[ https://issues.apache.org/jira/browse/YARN-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786539#comment-13786539 ] Vinod Kumar Vavilapalli commented on YARN-1254: --- Don't think this patch is correct. The fundamental problem is that ResourceLocalizationService.writeCredentials() is polluting container's credentials by adding LocalizerToken. We should just clone container's credentials before writing the token file for the localizer. NM is polluting container's credentials --- Key: YARN-1254 URL: https://issues.apache.org/jira/browse/YARN-1254 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Attachments: YARN-1254.20131030.1.patch Before launching the container, NM is using the same credential object and so is polluting what container should see. We should fix this. -- This message was sent by Atlassian JIRA (v6.1#6144)
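A minimal sketch of that approach, assuming stand-in variables (containerCreds, alias, localizerToken, tokenPath, conf) for the actual ResourceLocalizationService fields:
{code}
// Sketch: copy the container's credentials instead of mutating them, so
// the localizer token never leaks into the credentials the container sees.
Credentials localizerCreds = new Credentials();
localizerCreds.addAll(containerCreds);          // container's copy stays clean
localizerCreds.addToken(alias, localizerToken); // only the localizer sees this
localizerCreds.writeTokenStorageFile(tokenPath, conf);
{code}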
[jira] [Commented] (YARN-1232) Configuration to support multiple RMs
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786549#comment-13786549 ] Hudson commented on YARN-1232: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4539 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4539/]) YARN-1232. Configuration to support multiple RMs (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1529251) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestHAUtil.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ServerRMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java Configuration to support multiple RMs - Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch, yarn-1232-6.patch, yarn-1232-7.patch, yarn-1232-7.patch We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. -- This message was sent by Atlassian JIRA (v6.1#6144)
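For readers following along, the shape of the committed configuration can be sketched as below; the property keys are recalled from the yarn-default.xml change in this commit and should be checked against the committed file:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmHaConfSketch {
  public static Configuration rmHaConf() {
    Configuration conf = new YarnConfiguration();
    conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
    // Enumerate the RMs by logical id...
    conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
    // ...then give each RM its own RPC address under an id-suffixed key.
    conf.set("yarn.resourcemanager.address.rm1", "rm1.example.com:8032");
    conf.set("yarn.resourcemanager.address.rm2", "rm2.example.com:8032");
    return conf;
  }
}
{code}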
[jira] [Updated] (YARN-1268) TestFairScheduler.testContinuousScheduling is flaky
[ https://issues.apache.org/jira/browse/YARN-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1268: - Attachment: YARN-1268-1.patch TestFairScheduler.testContinuousScheduling is flaky --- Key: YARN-1268 URL: https://issues.apache.org/jira/browse/YARN-1268 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1268-1.patch, YARN-1268.patch It looks like there's a timeout in it that's causing it to be flaky. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-445) Ability to signal containers
[ https://issues.apache.org/jira/browse/YARN-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786572#comment-13786572 ] Andrey Klochkov commented on YARN-445: -- Steve, the current implementation will send the signal to the java started with bin/hbase as it sends it to all processes in the job object, e.g. all processes of the main container process. It can be replaced with sending the signal to all processes in the group instead, and I think the behavior will be the same. BTW I don't know how to do the opposite - i.e. how to avoid sending the signal to all processes of the container, on Windows (so the behavior on Linux is different as bin/hbase will receive the signal). I think this is fine as long as this difference is documented. In case of hbase the shell script can create a custom hook for SIGTERM and do whatever is needed in that case (e.g. send SIGTERM to the java process it started). There is one caveat in ctrl+break handling in case of a batch file starting a java process: 1. the batch file starts the java process 2. user sends ctrl+break to all processes in the group (or job object). java process prints thread dump. batch file doesn't react yet. 3. the java process completes successfully 4. the batch file will not exit, it will print "Terminate batch job? (Y/N)" as it received the ctrl+break signal earlier. The only way I see to overcome this problem with batch file processes is to identify them somehow (by executable name?) when walking through the processes in the job object, and do not send them the signal. Sending ctrl+break to batch file processes doesn't make sense anyway as in newer Windows there's no way to disable or customize ctrl+break handling in batch files. Ability to signal containers Key: YARN-445 URL: https://issues.apache.org/jira/browse/YARN-445 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Jason Lowe Attachments: YARN-445--n2.patch, YARN-445.patch It would be nice if an ApplicationMaster could send signals to containers such as SIGQUIT, SIGUSR1, etc. For example, in order to replicate the jstack-on-task-timeout feature implemented by MAPREDUCE-1119 in Hadoop 0.21 the NodeManager needs an interface for sending SIGQUIT to a container. For that specific feature we could implement it as an additional field in the StopContainerRequest. However that would not address other potential features like the ability for an AM to trigger jstacks on arbitrary tasks *without* killing them. The latter feature would be a very useful debugging tool for users who do not have shell access to the nodes. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-415) Capture memory utilization at the app-level for chargeback
[ https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786577#comment-13786577 ] Jason Lowe commented on YARN-415: - The latest patch no longer applies to trunk. Could you please refresh it? Some review comments: General: * Nit: the extra plurality of VirtualCoresSeconds sounds a bit odd, wondering if it should be VirtualCoreSeconds or VcoreSeconds in the various places it appears. ApplicationCLI: * UI wording: In the code it's vcore-seconds but the UI says CPU-seconds. I'm wondering if users are going to interpret CPU to be a hardware core, and I'm not sure a vcore will map to a hardware core in the typical case. The configuration properties refer to vcores, so we should probably use vcore-seconds here for consistency. Curious what others think about this, as I could be convinced to leave it as CPU. RMAppAttempt has just a spurious whitespace change RMAppAttemptImpl: * Nit: containerAllocated and containerFinished are private and always called from transitions, so acquiring the write lock is unnecessary. * ContainerFinishedTransition.transition does not call containerFinished when it's the AM container. We leak the AM container and consider it always running if an AM crashes. RMContainerEvent: * Nit: whitespace between the constructor definitions would be nice. TestRMAppAttemptTransitions: * Nit: it would be cleaner and easier to read if we add a new allocateApplicationAttemptAtTime method and have the existing allocateApplicationAttempt method simply call it with -1 rather than change all those places to pass -1. Speaking of leaking containers, is there something we can do to audit/assert that applications that have completed don't have running containers? If we lose track of a container finished event, the consumed resources are going to keep increasing indefinitely. It's a bug in the RM either way but wondering if there's some warning/sanity checking we can do to keep the metric from becoming utterly useless when it occurs. Capping it at the end of the application would at least prevent it from growing beyond the application lifetime. Then again, letting it grow continuously at least is more indicative something went terribly wrong with the accounting and therefore the metric can't be trusted. Just thinking out loud, not sure what the best solution is. Capture memory utilization at the app-level for chargeback -- Key: YARN-415 URL: https://issues.apache.org/jira/browse/YARN-415 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Andrey Klochkov Attachments: YARN-415--n2.patch, YARN-415--n3.patch, YARN-415.patch For the purpose of chargeback, I'd like to be able to compute the cost of an application in terms of cluster resource usage. To start out, I'd like to get the memory utilization of an application. The unit should be MB-seconds or something similar and, from a chargeback perspective, the memory amount should be the memory reserved for the application, as even if the app didn't use all that memory, no one else was able to use it. (reserved ram for container 1 * lifetime of container 1) + (reserved ram for container 2 * lifetime of container 2) + ... + (reserved ram for container n * lifetime of container n) It'd be nice to have this at the app level instead of the job level because: 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't appear on the job history server). 2. 
We'd be able to get memory usage for future non-MR jobs (e.g. Storm). This new metric should be available both through the RM UI and RM Web Services REST API. -- This message was sent by Atlassian JIRA (v6.1#6144)
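As a worked example of the description's (reserved ram * lifetime) formula, with invented container sizes and lifetimes:
{code}
// Worked example of the MB-seconds formula above, with invented values.
long[][] containers = {
    {2048, 600},   // container 1: 2048 MB reserved for 600 seconds
    {1024, 1200},  // container 2: 1024 MB reserved for 1200 seconds
};
long mbSeconds = 0;
for (long[] c : containers) {
  mbSeconds += c[0] * c[1];  // reserved MB * lifetime in seconds
}
// 2048*600 + 1024*1200 = 2457600 MB-seconds
System.out.println(mbSeconds + " MB-seconds");
{code}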
[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786596#comment-13786596 ] Xuan Gong commented on YARN-1167: - [~vinodkv] We can set appMasterRpcPort as -1. Because there is the parameter check for AMRMClientImpl. {code} Preconditions.checkArgument(appHostPort >= 0, "Port number of the host should not be negative"); {code} Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch Submit distributed shell application. Once the application turns to be RUNNING state, app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786597#comment-13786597 ] Xuan Gong commented on YARN-1167: - Can not set the port as -1 Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch Submit distributed shell application. Once the application turns to be RUNNING state, app master host should not be empty. In reality, it is empty. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
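In other words, going through the client library, a registration like the one below would trip that check; amRMClient stands for an initialized AMRMClient instance and the host name is illustrative:
{code}
// Sketch: with the precondition quoted above, an AM that has no RPC
// endpoint cannot simply pass -1 through the client library; this call
// throws IllegalArgumentException("Port number of the host should not
// be negative").
amRMClient.registerApplicationMaster("am-host.example.com", -1, "");
{code}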
[jira] [Commented] (YARN-1268) TestFairScheduler.testContinuousScheduling is flaky
[ https://issues.apache.org/jira/browse/YARN-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786621#comment-13786621 ] Hadoop QA commented on YARN-1268: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606878/YARN-1268-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2103//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2103//console This message is automatically generated. TestFairScheduler.testContinuousScheduling is flaky --- Key: YARN-1268 URL: https://issues.apache.org/jira/browse/YARN-1268 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1268-1.patch, YARN-1268.patch It looks like there's a timeout in it that's causing it to be flaky. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (YARN-1183) MiniYARNCluster shutdown takes several minutes intermittently
[ https://issues.apache.org/jira/browse/YARN-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov reassigned YARN-1183: - Assignee: Andrey Klochkov MiniYARNCluster shutdown takes several minutes intermittently - Key: YARN-1183 URL: https://issues.apache.org/jira/browse/YARN-1183 Project: Hadoop YARN Issue Type: Bug Reporter: Andrey Klochkov Assignee: Andrey Klochkov Attachments: YARN-1183--n2.patch, YARN-1183--n3.patch, YARN-1183--n4.patch, YARN-1183.patch As described in MAPREDUCE-5501 sometimes M/R tests leave MRAppMaster java processes living for several minutes after successful completion of the corresponding test. There is a concurrency issue in MiniYARNCluster shutdown logic which leads to this. Sometimes RM stops before an app master sends its last report, and then the app master keeps retrying for 6 minutes. In some cases it leads to failures in subsequent tests, and it affects performance of tests as app masters eat resources. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786679#comment-13786679 ] Hudson commented on YARN-1253: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4541 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4541/]) YARN-1253. Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode. (rvs via tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1529325) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/ClusterSetup.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker Fix For: 2.3.0 Attachments: YARN-1253.patch.txt When using cgroups we require LCE to be configured in the cluster to start containers. LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple of issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or, if there are NFS mounts, get to other users' data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
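A hedged sketch of turning the new behavior on; the property keys are recalled from the yarn-default.xml change in this commit and may not match exactly:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class LceNonSecureConfSketch {
  public static Configuration lceConf() {
    Configuration conf = new YarnConfiguration();
    conf.set("yarn.nodemanager.container-executor.class",
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor");
    // Run every container as one dedicated local user in non-secure mode,
    // instead of the (unauthenticated) submitting user.
    conf.set(
        "yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user",
        "nobody");
    return conf;
  }
}
{code}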
[jira] [Updated] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy
[ https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated YARN-465: - Attachment: YARN-465-trunk--n4.patch Ravi, this is not my patch, so please keep in mind that I'm digging into this code just as you are. Alexey won't be available to make fixes, so I'm taking this on myself so that the contribution isn't lost.
1-2. As far as I can see, the WebAppProxy.start() method is used in another test, which should be the reason it's not part of the main method. The join method is removed as it's no longer used.
3. I think it is meant to log port, not originalPort. The port variable is set in WebAppProxyForTest.start() to the actual port the server binds to.
4. Indeed, core-default.xml is not needed. I'm replacing it by setting this configuration in the code of the test itself.
5. It must be setName("proxy"), as this is the name of the webapp under hadoop-yarn-common/src/main/resources/webapps. Setting it to anything else would lead to a ClassNotFoundException. I made the message about the port number more detailed.
6. I added a check which verifies that the cookie is present in one case and absent in the other.
7. Yes, I don't see why testWebAppProxyServer is needed in the presence of testWebAppProxyServlet. Removing it.
8. testWebAppProxyServerMainMethod tests that the server starts successfully; the counter is used to wait for the server to start.
Attaching the updated patch for trunk. fix coverage org.apache.hadoop.yarn.server.webproxy Key: YARN-465 URL: https://issues.apache.org/jira/browse/YARN-465 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: YARN-465-branch-0.23-a.patch, YARN-465-branch-0.23.patch, YARN-465-branch-2-a.patch, YARN-465-branch-2--n3.patch, YARN-465-branch-2.patch, YARN-465-trunk-a.patch, YARN-465-trunk--n3.patch, YARN-465-trunk--n4.patch, YARN-465-trunk.patch fix coverage org.apache.hadoop.yarn.server.webproxy patch YARN-465-trunk.patch for trunk patch YARN-465-branch-2.patch for branch-2 patch YARN-465-branch-0.23.patch for branch-0.23 There is an issue in branch-0.23: the patch does not create the .keep file. To fix it, run: mkdir yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy touch yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep -- This message was sent by Atlassian JIRA (v6.1#6144)
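For point 5 above, a minimal sketch of why the name matters; WebApps.$for is the real builder in org.apache.hadoop.yarn.webapp, but the surrounding test wiring is assumed:
{code}
// The name selects the resource directory
// hadoop-yarn-common/src/main/resources/webapps/<name>, so it must be
// "proxy" here; any other value fails to resolve the webapp and the
// server will not start.
WebApp proxyApp = WebApps.$for("proxy").at(port).start();
{code}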
[jira] [Updated] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy
[ https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated YARN-465: - Attachment: YARN-465-branch-2--n4.patch Attaching the updated patch for branch-2. fix coverage org.apache.hadoop.yarn.server.webproxy Key: YARN-465 URL: https://issues.apache.org/jira/browse/YARN-465 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: YARN-465-branch-0.23-a.patch, YARN-465-branch-0.23.patch, YARN-465-branch-2-a.patch, YARN-465-branch-2--n3.patch, YARN-465-branch-2--n4.patch, YARN-465-branch-2.patch, YARN-465-trunk-a.patch, YARN-465-trunk--n3.patch, YARN-465-trunk--n4.patch, YARN-465-trunk.patch fix coverage org.apache.hadoop.yarn.server.webproxy patch YARN-465-trunk.patch for trunk patch YARN-465-branch-2.patch for branch-2 patch YARN-465-branch-0.23.patch for branch-0.23 There is an issue in branch-0.23: the patch does not create the .keep file. To fix it, run: mkdir yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy touch yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy
[ https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786720#comment-13786720 ] Hadoop QA commented on YARN-465: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606913/YARN-465-branch-2--n4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2104//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2104//console This message is automatically generated. fix coverage org.apache.hadoop.yarn.server.webproxy Key: YARN-465 URL: https://issues.apache.org/jira/browse/YARN-465 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: YARN-465-branch-0.23-a.patch, YARN-465-branch-0.23.patch, YARN-465-branch-2-a.patch, YARN-465-branch-2--n3.patch, YARN-465-branch-2--n4.patch, YARN-465-branch-2.patch, YARN-465-trunk-a.patch, YARN-465-trunk--n3.patch, YARN-465-trunk--n4.patch, YARN-465-trunk.patch fix coverage org.apache.hadoop.yarn.server.webproxy patch YARN-465-trunk.patch for trunk patch YARN-465-branch-2.patch for branch-2 patch YARN-465-branch-0.23.patch for branch-0.23 There is an issue in branch-0.23: the patch does not create the .keep file. To fix it, run: mkdir yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy touch yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1254) NM is polluting container's credentials
[ https://issues.apache.org/jira/browse/YARN-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1254: Attachment: YARN-1254.20131004.1.patch NM is polluting container's credentials --- Key: YARN-1254 URL: https://issues.apache.org/jira/browse/YARN-1254 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Attachments: YARN-1254.20131004.1.patch, YARN-1254.20131030.1.patch Before launching the container, the NM uses the same credentials object and so pollutes what the container should see. We should fix this. -- This message was sent by Atlassian JIRA (v6.1#6144)
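A minimal sketch of the kind of fix the summary above suggests; org.apache.hadoop.security.Credentials does provide a copy constructor, but where exactly the copy happens in the NM is an assumption:
{code}
// Give the container its own copy instead of sharing the NM's object, so
// tokens added on either side no longer leak to the other.
Credentials containerCredentials = new Credentials(nmCredentials);
{code}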
[jira] [Updated] (YARN-415) Capture memory utilization at the app-level for chargeback
[ https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated YARN-415: - Attachment: YARN-415--n4.patch Jason, thanks for the thorough review. Attaching the patch with fixes. I basically made all the fixes you proposed except the last one about capturing the leak. Capture memory utilization at the app-level for chargeback -- Key: YARN-415 URL: https://issues.apache.org/jira/browse/YARN-415 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Andrey Klochkov Attachments: YARN-415--n2.patch, YARN-415--n3.patch, YARN-415--n4.patch, YARN-415.patch For the purpose of chargeback, I'd like to be able to compute the cost of an application in terms of cluster resource usage. To start out, I'd like to get the memory utilization of an application. The unit should be MB-seconds or something similar and, from a chargeback perspective, the memory amount should be the memory reserved for the application, as even if the app didn't use all that memory, no one else was able to use it. (reserved ram for container 1 * lifetime of container 1) + (reserved ram for container 2 * lifetime of container 2) + ... + (reserved ram for container n * lifetime of container n) It'd be nice to have this at the app level instead of the job level because: 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't appear on the job history server). 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm). This new metric should be available through both the RM UI and the RM Web Services REST API. -- This message was sent by Atlassian JIRA (v6.1#6144)
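The proposed metric is easy to state in code. A hedged sketch of the aggregation; ContainerUsage and its getters are illustrative names, not from the patch:
{code}
// MB-seconds for one application: reserved memory times lifetime, summed
// over all of its containers, whether or not the memory was fully used.
long mbSeconds = 0;
for (ContainerUsage c : appContainers) { // ContainerUsage is hypothetical
  long lifetimeSecs = (c.getFinishTime() - c.getStartTime()) / 1000;
  mbSeconds += c.getReservedMB() * lifetimeSecs;
}
{code}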
[jira] [Commented] (YARN-1254) NM is polluting container's credentials
[ https://issues.apache.org/jira/browse/YARN-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786745#comment-13786745 ] Omkar Vinit Joshi commented on YARN-1254: - I was not able to find a good way to test container credential contamination. NM is polluting container's credentials --- Key: YARN-1254 URL: https://issues.apache.org/jira/browse/YARN-1254 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Attachments: YARN-1254.20131004.1.patch, YARN-1254.20131030.1.patch Before launching the container, the NM uses the same credentials object and so pollutes what the container should see. We should fix this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-445) Ability to signal containers
[ https://issues.apache.org/jira/browse/YARN-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated YARN-445: - Attachment: YARN-445--n3.patch Attaching the patch that marks all new interfaces/methods as unstable. Ability to signal containers Key: YARN-445 URL: https://issues.apache.org/jira/browse/YARN-445 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Jason Lowe Attachments: YARN-445--n2.patch, YARN-445--n3.patch, YARN-445.patch It would be nice if an ApplicationMaster could send signals to containers, such as SIGQUIT, SIGUSR1, etc. For example, in order to replicate the jstack-on-task-timeout feature implemented by MAPREDUCE-1119 in Hadoop 0.21, the NodeManager needs an interface for sending SIGQUIT to a container. For that specific feature we could implement it as an additional field in the StopContainerRequest. However, that would not address other potential features, like the ability for an AM to trigger jstacks on arbitrary tasks *without* killing them. The latter feature would be a very useful debugging tool for users who do not have shell access to the nodes. -- This message was sent by Atlassian JIRA (v6.1#6144)
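One possible API shape for the non-killing case, purely as a sketch; the request and command types here are hypothetical, not a committed interface:
{code}
// Hypothetical request: deliver a signal to a container without stopping it,
// e.g. SIGQUIT to make a JVM dump thread stacks for debugging.
SignalContainerRequest request = SignalContainerRequest.newInstance(
    containerId, Signal.QUIT); // both types are illustrative only
containerManager.signalContainer(request);
{code}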
[jira] [Updated] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty
[ https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1167: Attachment: YARN-1167.8.patch Submitted distributed shell application shows appMasterHost = empty --- Key: YARN-1167 URL: https://issues.apache.org/jira/browse/YARN-1167 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.2-beta Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch, YARN-1167.8.patch Submit a distributed shell application. Once the application reaches the RUNNING state, the app master host should not be empty; in reality, it is. ==console logs== distributedshell.Client: Got application report from ASM for, appId=12, clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, -- This message was sent by Atlassian JIRA (v6.1#6144)
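For context, the report's appMasterHost field is populated from what the AM passes when registering; AMRMClient.registerApplicationMaster does take these three arguments, while the surrounding variables are assumed:
{code}
// Register with the real host name instead of an empty string so the RM's
// application report carries a usable appMasterHost.
String amHost = InetAddress.getLocalHost().getHostName(); // java.net.InetAddress
RegisterApplicationMasterResponse resp =
    amRMClient.registerApplicationMaster(amHost, amRpcPort, amTrackingUrl);
{code}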
[jira] [Commented] (YARN-1254) NM is polluting container's credentials
[ https://issues.apache.org/jira/browse/YARN-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786759#comment-13786759 ] Hadoop QA commented on YARN-1254: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606924/YARN-1254.20131004.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2105//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2105//console This message is automatically generated. NM is polluting container's credentials --- Key: YARN-1254 URL: https://issues.apache.org/jira/browse/YARN-1254 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Attachments: YARN-1254.20131004.1.patch, YARN-1254.20131030.1.patch Before launching the container, the NM uses the same credentials object and so pollutes what the container should see. We should fix this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1273) Distributed shell does not account for start container failures reported asynchronously.
Hitesh Shah created YARN-1273: - Summary: Distributed shell does not account for start container failures reported asynchronously. Key: YARN-1273 URL: https://issues.apache.org/jira/browse/YARN-1273 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah 2013-10-04 22:09:15,234 ERROR [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #1] distributedshell.ApplicationMaster (ApplicationMaster.java:onStartContainerError(719)) - Failed to start Container container_1380920347574_0018_01_06 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1273) Distributed shell does not account for start container failures reported asynchronously.
[ https://issues.apache.org/jira/browse/YARN-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated YARN-1273: -- Attachment: YARN-1273.1.patch Trivial patch. No tests - manually tested by changing NMClientAsync to trigger failures. Distributed shell does not account for start container failures reported asynchronously. Key: YARN-1273 URL: https://issues.apache.org/jira/browse/YARN-1273 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah Attachments: YARN-1273.1.patch 2013-10-04 22:09:15,234 ERROR [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #1] distributedshell.ApplicationMaster (ApplicationMaster.java:onStartContainerError(719)) - Failed to start Container container_1380920347574_0018_01_06 -- This message was sent by Atlassian JIRA (v6.1#6144)
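A sketch of the accounting the fix needs in the distributed shell AM's callback; onStartContainerError is the real NMClientAsync.CallbackHandler signature, while the counter names are assumptions:
{code}
@Override
public void onStartContainerError(ContainerId containerId, Throwable t) {
  LOG.error("Failed to start Container " + containerId, t);
  // Record the asynchronous start failure so the AM does not wait forever
  // for a container that never ran, and can re-request it or fail fast.
  numCompletedContainers.incrementAndGet();
  numFailedContainers.incrementAndGet();
}
{code}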
[jira] [Updated] (YARN-1272) Add a link to cluster/application page on node manager's list of application page
[ https://issues.apache.org/jira/browse/YARN-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Han updated YARN-1272: --- Attachment: YARN-1272.patch Add a link to cluster/application page on node manager's list of application page - Key: YARN-1272 URL: https://issues.apache.org/jira/browse/YARN-1272 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Paul Han Attachments: YARN-1272.patch On the node manager's application/application page, the content is significantly less than the content on the resource manager's application page /cluster/application. Adding a link from the node manager's application page to the resource manager's application page will help users get information faster and more efficiently. Please see the screenshot for the benefit. -- This message was sent by Atlassian JIRA (v6.1#6144)
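A rough sketch of the proposed cross-link using the webapp's Hamlet API; the RM web address lookup and the exact page wiring are assumptions, not the attached patch:
{code}
// On the NM's application page, link to the richer RM view of the same app.
String rmAppUrl = "http://" + rmWebAppAddress + "/cluster/app/" + appId;
html.p()
    ._("For full details, see the ")
    .a(rmAppUrl, "application page on the ResourceManager")
    ._();
{code}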