[jira] [Commented] (YARN-1370) Fair scheduler to re-populate container allocation state

2014-08-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092029#comment-14092029
 ] 

Hadoop QA commented on YARN-1370:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12660838/YARN-1370.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4578//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4578//console

This message is automatically generated.

 Fair scheduler to re-populate container allocation state
 

 Key: YARN-1370
 URL: https://issues.apache.org/jira/browse/YARN-1370
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Anubhav Dhoot
 Attachments: YARN-1370.001.patch


 YARN-1367 and YARN-1368 enable the NM to tell the RM about currently running 
 containers, and the RM will pass this information to the schedulers along with 
 the node information. The schedulers are already informed about previously 
 running apps when the app data is recovered from the store. The scheduler is 
 expected to be able to repopulate its allocation state from these two sources 
 of information.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2302) Refactor TimelineWebServices

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092047#comment-14092047
 ] 

Hudson commented on YARN-2302:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6044 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6044/])
YARN-2302. Refactor TimelineWebServices. (Contributed by Zhijie Shen) 
(junping_du: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617055)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/TimelineDataManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java


 Refactor TimelineWebServices
 

 Key: YARN-2302
 URL: https://issues.apache.org/jira/browse/YARN-2302
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.6.0

 Attachments: YARN-2302.1.patch, YARN-2302.2.patch, YARN-2302.3.patch, 
 YARN-2302.4.patch


 Now TimelineWebServices contains non-trivial logic to process the HTTP 
 requests, manipulate the data, check the access, and interact with the 
 timeline store.
 I propose moving the data-oriented logic to a middle layer (the so-called 
 TimelineDataManager), so that TimelineWebServices only processes the requests 
 and calls TimelineDataManager to complete the remaining tasks.
 By doing this, we let the generic history module reuse TimelineDataManager 
 internally (YARN-2033), invoking the put/get methods directly. Otherwise, we 
 would have to send HTTP requests to TimelineWebServices to query the generic 
 history data, which is inefficient.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2302) Refactor TimelineWebServices

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092068#comment-14092068
 ] 

Hudson commented on YARN-2302:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #640 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/640/])
YARN-2302. Refactor TimelineWebServices. (Contributed by Zhijie Shen) 
(junping_du: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617055)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/TimelineDataManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java


 Refactor TimelineWebServices
 

 Key: YARN-2302
 URL: https://issues.apache.org/jira/browse/YARN-2302
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.6.0

 Attachments: YARN-2302.1.patch, YARN-2302.2.patch, YARN-2302.3.patch, 
 YARN-2302.4.patch


 Now TimelineWebServices contains non-trivial logic to process the HTTP 
 requests, manipulate the data, check the access, and interact with the 
 timeline store.
 I propose moving the data-oriented logic to a middle layer (the so-called 
 TimelineDataManager), so that TimelineWebServices only processes the requests 
 and calls TimelineDataManager to complete the remaining tasks.
 By doing this, we let the generic history module reuse TimelineDataManager 
 internally (YARN-2033), invoking the put/get methods directly. Otherwise, we 
 would have to send HTTP requests to TimelineWebServices to query the generic 
 history data, which is inefficient.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2400) TestAMRestart fails intermittently

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092067#comment-14092067
 ] 

Hudson commented on YARN-2400:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #640 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/640/])
YARN-2400. Fixed TestAMRestart fails intermittently. Contributed by Jian He: 
(xgong: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617028)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java


 TestAMRestart fails intermittently
 --

 Key: YARN-2400
 URL: https://issues.apache.org/jira/browse/YARN-2400
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.6.0

 Attachments: YARN-2240.2.patch, YARN-2400.1.patch


 java.lang.AssertionError: AppAttempt state is not correct (timedout) 
 expected:<ALLOCATED> but was:<SCHEDULED>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:417)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:579)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAndRegisterAM(MockRM.java:586)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart.testShouldNotCountFailureToMaxAttemptRetry(TestAMRestart.java:389)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1954) Add waitFor to AMRMClient(Async)

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092069#comment-14092069
 ] 

Hudson commented on YARN-1954:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #640 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/640/])
YARN-1954. Added waitFor to AMRMClient(Async). Contributed by Tsuyoshi Ozawa. 
(zjshen: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617002)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AMRMClient.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/async/AMRMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestAMRMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java


 Add waitFor to AMRMClient(Async)
 

 Key: YARN-1954
 URL: https://issues.apache.org/jira/browse/YARN-1954
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: client
Affects Versions: 3.0.0, 2.4.0
Reporter: Zhijie Shen
Assignee: Tsuyoshi OZAWA
 Fix For: 2.6.0

 Attachments: YARN-1954.1.patch, YARN-1954.2.patch, YARN-1954.3.patch, 
 YARN-1954.4.patch, YARN-1954.4.patch, YARN-1954.5.patch, YARN-1954.6.patch, 
 YARN-1954.7.patch, YARN-1954.8.patch


 Recently, I saw some use cases of AMRMClient(Async). The painful thing is 
 that the main non-daemon thread has to sit in a dummy loop to prevent the AM 
 process from exiting before all the tasks are done, while unregistration is 
 triggered on a separate daemon thread by callback methods (in particular 
 when using AMRMClientAsync). IMHO, it would be beneficial to add a waitFor 
 method to AMRMClient(Async) to block the AM until unregistration or a 
 user-supplied checkpoint, so that users don't need to write the loop 
 themselves.
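 For illustration, the hand-rolled loop today versus the proposed call. This 
 is a sketch only; the variable names and the exact waitFor signature are 
 assumptions, not the final API:
 {code}
 // Today: the AM's main non-daemon thread hand-rolls a keep-alive loop,
 // with allTasksDone flipped from the AMRMClientAsync callback thread.
 final AtomicBoolean allTasksDone = new AtomicBoolean(false);
 while (!allTasksDone.get()) {
   Thread.sleep(100);
 }

 // Proposed: block the AM until a user-supplied check returns true.
 amrmClientAsync.waitFor(new Supplier<Boolean>() {
   @Override
   public Boolean get() {
     return allTasksDone.get();
   }
 });
 {code}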



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2400) TestAMRestart fails intermittently

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092088#comment-14092088
 ] 

Hudson commented on YARN-2400:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1833 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1833/])
YARN-2400. Fixed TestAMRestart fails intermittently. Contributed by Jian He: 
(xgong: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617028)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java


 TestAMRestart fails intermittently
 --

 Key: YARN-2400
 URL: https://issues.apache.org/jira/browse/YARN-2400
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.6.0

 Attachments: YARN-2240.2.patch, YARN-2400.1.patch


 java.lang.AssertionError: AppAttempt state is not correct (timedout) 
 expected:<ALLOCATED> but was:<SCHEDULED>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:417)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:579)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAndRegisterAM(MockRM.java:586)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart.testShouldNotCountFailureToMaxAttemptRetry(TestAMRestart.java:389)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1954) Add waitFor to AMRMClient(Async)

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092090#comment-14092090
 ] 

Hudson commented on YARN-1954:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1833 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1833/])
YARN-1954. Added waitFor to AMRMClient(Async). Contributed by Tsuyoshi Ozawa. 
(zjshen: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617002)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AMRMClient.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/async/AMRMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestAMRMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java


 Add waitFor to AMRMClient(Async)
 

 Key: YARN-1954
 URL: https://issues.apache.org/jira/browse/YARN-1954
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: client
Affects Versions: 3.0.0, 2.4.0
Reporter: Zhijie Shen
Assignee: Tsuyoshi OZAWA
 Fix For: 2.6.0

 Attachments: YARN-1954.1.patch, YARN-1954.2.patch, YARN-1954.3.patch, 
 YARN-1954.4.patch, YARN-1954.4.patch, YARN-1954.5.patch, YARN-1954.6.patch, 
 YARN-1954.7.patch, YARN-1954.8.patch


 Recently, I saw some use cases of AMRMClient(Async). The painful thing is 
 that the main non-daemon thread has to sit in a dummy loop to prevent the AM 
 process from exiting before all the tasks are done, while unregistration is 
 triggered on a separate daemon thread by callback methods (in particular 
 when using AMRMClientAsync). IMHO, it would be beneficial to add a waitFor 
 method to AMRMClient(Async) to block the AM until unregistration or a 
 user-supplied checkpoint, so that users don't need to write the loop 
 themselves.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2302) Refactor TimelineWebServices

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092089#comment-14092089
 ] 

Hudson commented on YARN-2302:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1833 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1833/])
YARN-2302. Refactor TimelineWebServices. (Contributed by Zhijie Shen) 
(junping_du: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617055)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/TimelineDataManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java


 Refactor TimelineWebServices
 

 Key: YARN-2302
 URL: https://issues.apache.org/jira/browse/YARN-2302
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.6.0

 Attachments: YARN-2302.1.patch, YARN-2302.2.patch, YARN-2302.3.patch, 
 YARN-2302.4.patch


 Now TimelineWebServices contains non-trivial logic to process the HTTP 
 requests, manipulate the data, check the access, and interact with the 
 timeline store.
 I propose moving the data-oriented logic to a middle layer (the so-called 
 TimelineDataManager), so that TimelineWebServices only processes the requests 
 and calls TimelineDataManager to complete the remaining tasks.
 By doing this, we let the generic history module reuse TimelineDataManager 
 internally (YARN-2033), invoking the put/get methods directly. Otherwise, we 
 would have to send HTTP requests to TimelineWebServices to query the generic 
 history data, which is inefficient.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2302) Refactor TimelineWebServices

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092098#comment-14092098
 ] 

Hudson commented on YARN-2302:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1859 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1859/])
YARN-2302. Refactor TimelineWebServices. (Contributed by Zhijie Shen) 
(junping_du: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617055)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/TimelineDataManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java


 Refactor TimelineWebServices
 

 Key: YARN-2302
 URL: https://issues.apache.org/jira/browse/YARN-2302
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.6.0

 Attachments: YARN-2302.1.patch, YARN-2302.2.patch, YARN-2302.3.patch, 
 YARN-2302.4.patch


 Now TimelineWebServices contains non-trivial logic to process the HTTP 
 requests, manipulate the data, check the access, and interact with the 
 timeline store.
 I propose moving the data-oriented logic to a middle layer (the so-called 
 TimelineDataManager), so that TimelineWebServices only processes the requests 
 and calls TimelineDataManager to complete the remaining tasks.
 By doing this, we let the generic history module reuse TimelineDataManager 
 internally (YARN-2033), invoking the put/get methods directly. Otherwise, we 
 would have to send HTTP requests to TimelineWebServices to query the generic 
 history data, which is inefficient.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2400) TestAMRestart fails intermittently

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092097#comment-14092097
 ] 

Hudson commented on YARN-2400:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1859 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1859/])
YARN-2400. Fixed TestAMRestart fails intermittently. Contributed by Jian He: 
(xgong: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617028)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java


 TestAMRestart fails intermittently
 --

 Key: YARN-2400
 URL: https://issues.apache.org/jira/browse/YARN-2400
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.6.0

 Attachments: YARN-2240.2.patch, YARN-2400.1.patch


 java.lang.AssertionError: AppAttempt state is not correct (timedout) 
 expected:<ALLOCATED> but was:<SCHEDULED>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:417)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:579)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAndRegisterAM(MockRM.java:586)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart.testShouldNotCountFailureToMaxAttemptRetry(TestAMRestart.java:389)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1954) Add waitFor to AMRMClient(Async)

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092099#comment-14092099
 ] 

Hudson commented on YARN-1954:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1859 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1859/])
YARN-1954. Added waitFor to AMRMClient(Async). Contributed by Tsuyoshi Ozawa. 
(zjshen: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617002)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AMRMClient.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/async/AMRMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestAMRMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java


 Add waitFor to AMRMClient(Async)
 

 Key: YARN-1954
 URL: https://issues.apache.org/jira/browse/YARN-1954
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: client
Affects Versions: 3.0.0, 2.4.0
Reporter: Zhijie Shen
Assignee: Tsuyoshi OZAWA
 Fix For: 2.6.0

 Attachments: YARN-1954.1.patch, YARN-1954.2.patch, YARN-1954.3.patch, 
 YARN-1954.4.patch, YARN-1954.4.patch, YARN-1954.5.patch, YARN-1954.6.patch, 
 YARN-1954.7.patch, YARN-1954.8.patch


 Recently, I saw some use cases of AMRMClient(Async). The painful thing is 
 that the main non-daemon thread has to sit in a dummy loop to prevent the AM 
 process from exiting before all the tasks are done, while unregistration is 
 triggered on a separate daemon thread by callback methods (in particular 
 when using AMRMClientAsync). IMHO, it would be beneficial to add a waitFor 
 method to AMRMClient(Async) to block the AM until unregistration or a 
 user-supplied checkpoint, so that users don't need to write the loop 
 themselves.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1915) ClientToAMTokenMasterKey should be provided to AM at launch time

2014-08-10 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-1915:
-

Attachment: YARN-1915.patch

We're starting to see this as well in our rollout of 2.x.  Attaching a patch 
that works around the issue by having the AM secret manager wait around for a 
bit before trying to validate a token if the master key isn't set yet.

Another approach we could try is to have the RM not advertise to clients where 
the AM is (i.e.: hide the host, port, and tracking URL) until the RM has seen 
at least one heartbeat after the AM registered.  The approach in this patch was 
easy to implement and probably just as effective in practice.
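A rough sketch of the workaround described above, assuming hypothetical names 
(the real change lives in the client-to-AM token secret manager):

{code}
// Wait briefly for the master key to arrive from the RM before failing
// token validation (sketch only, not the actual patch).
public synchronized byte[] retrievePassword(ClientToAMTokenIdentifier id)
    throws InvalidToken {
  long waitedMs = 0;
  while (masterKey == null && waitedMs < MAX_WAIT_MS) {
    try {
      wait(RETRY_INTERVAL_MS); // setMasterKey() would notifyAll()
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      break;
    }
    waitedMs += RETRY_INTERVAL_MS;
  }
  if (masterKey == null) {
    throw new InvalidToken("Master key not yet received from the RM");
  }
  return createPassword(id.getBytes(), masterKey);
}
{code}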

 ClientToAMTokenMasterKey should be provided to AM at launch time
 

 Key: YARN-1915
 URL: https://issues.apache.org/jira/browse/YARN-1915
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Hitesh Shah
Priority: Critical
 Attachments: YARN-1915.patch


 Currently, the AM receives the key as part of registration. This introduces a 
 race where a client can connect to the AM when the AM has not received the 
 key. 
 Current Flow:
 1) AM needs to start the client listening service in order to get host:port 
 and send it to the RM as part of registration
 2) RM gets the port info in register() and transitions the app to RUNNING. 
 Responds back with client secret to AM.
 3) User asks RM for client token. Gets it and pings the AM. AM hasn't 
 received client secret from RM and so RPC itself rejects the request.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-2402) NM restart: Container recovery for Windows

2014-08-10 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-2402:


 Summary: NM restart: Container recovery for Windows
 Key: YARN-2402
 URL: https://issues.apache.org/jira/browse/YARN-2402
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: Jason Lowe


We should add container recovery for NM restart on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows

2014-08-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092185#comment-14092185
 ] 

Jason Lowe commented on YARN-2402:
--

See YARN-1337 for the changes needed to the container executors to handle this 
on UNIX/Linux.

 NM restart: Container recovery for Windows
 --

 Key: YARN-2402
 URL: https://issues.apache.org/jira/browse/YARN-2402
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: Jason Lowe

 We should add container recovery for NM restart on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1915) ClientToAMTokenMasterKey should be provided to AM at launch time

2014-08-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092196#comment-14092196
 ] 

Hadoop QA commented on YARN-1915:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12660870/YARN-1915.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4580//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/4580//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4580//console

This message is automatically generated.

 ClientToAMTokenMasterKey should be provided to AM at launch time
 

 Key: YARN-1915
 URL: https://issues.apache.org/jira/browse/YARN-1915
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Hitesh Shah
Assignee: Jason Lowe
Priority: Critical
 Attachments: YARN-1915.patch


 Currently, the AM receives the key as part of registration. This introduces a 
 race where a client can connect to the AM when the AM has not received the 
 key. 
 Current Flow:
 1) AM needs to start the client listening service in order to get host:port 
 and send it to the RM as part of registration
 2) RM gets the port info in register() and transitions the app to RUNNING. 
 Responds back with client secret to AM.
 3) User asks RM for client token. Gets it and pings the AM. AM hasn't 
 received client secret from RM and so RPC itself rejects the request.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2308) NPE happened when RM restart after CapacityScheduler queue configuration changed

2014-08-10 Thread chang li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chang li updated YARN-2308:
---

Attachment: jira2308.patch

 NPE happened when RM restart after CapacityScheduler queue configuration 
 changed 
 -

 Key: YARN-2308
 URL: https://issues.apache.org/jira/browse/YARN-2308
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, scheduler
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: chang li
Priority: Critical
 Attachments: jira2308.patch


 I encountered an NPE when the RM restarted.
 {code}
 2014-07-16 07:22:46,957 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type APP_ATTEMPT_ADDED to the scheduler
 java.lang.NullPointerException
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.addApplicationAttempt(CapacityScheduler.java:566)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:922)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:98)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:594)
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:654)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:85)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:698)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:682)
 at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
 at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 And the RM fails to restart.
 This is caused by a queue configuration change: I removed some queues and 
 added new ones. So when the RM restarts, it tries to recover past 
 applications, and when the queue of any of these applications has been 
 removed, an NPE is raised.
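 The attached patch (quoted in a later comment in this digest) guards against 
 this with a null check in CapacityScheduler.addApplicationAttempt, roughly:
 {code}
 // If the app was never added to the scheduler (e.g. its queue was removed
 // before the RM restarted), reject the attempt instead of hitting an NPE.
 SchedulerApplication application =
     applications.get(applicationAttemptId.getApplicationId());
 if (application == null) {
   LOG.error("Cannot retrieve application for attempt " + applicationAttemptId);
   return;
 }
 {code}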



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2308) NPE happened when RM restart after CapacityScheduler queue configuration changed

2014-08-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092245#comment-14092245
 ] 

Hadoop QA commented on YARN-2308:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12660878/jira2308.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4581//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4581//console

This message is automatically generated.

 NPE happened when RM restart after CapacityScheduler queue configuration 
 changed 
 -

 Key: YARN-2308
 URL: https://issues.apache.org/jira/browse/YARN-2308
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, scheduler
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: chang li
Priority: Critical
 Attachments: jira2308.patch


 I encountered an NPE when the RM restarted.
 {code}
 2014-07-16 07:22:46,957 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type APP_ATTEMPT_ADDED to the scheduler
 java.lang.NullPointerException
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.addApplicationAttempt(CapacityScheduler.java:566)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:922)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:98)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:594)
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:654)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:85)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:698)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:682)
 at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
 at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 And the RM fails to restart.
 This is caused by a queue configuration change: I removed some queues and 
 added new ones. So when the RM restarts, it tries to recover past 
 applications, and when the queue of any of these applications has been 
 removed, an NPE is raised.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2315) Should use setCurrentCapacity instead of setCapacity to configure used resource capacity for FairScheduler.

2014-08-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092246#comment-14092246
 ] 

Karthik Kambatla commented on YARN-2315:


Thanks for catching this, [~zxu]. Mind adding a test case or augmenting 
existing tests to demonstrate the problem? 

 Should use setCurrentCapacity instead of setCapacity to configure used 
 resource capacity for FairScheduler.
 ---

 Key: YARN-2315
 URL: https://issues.apache.org/jira/browse/YARN-2315
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2315.patch


 We should use setCurrentCapacity instead of setCapacity to configure the used 
 resource capacity for FairScheduler.
 In getQueueInfo of FSQueue.java, we call setCapacity twice with different 
 parameters, so the first call is overridden by the second call:
 {code}
 queueInfo.setCapacity((float) getFairShare().getMemory() /
     scheduler.getClusterResource().getMemory());
 queueInfo.setCapacity((float) getResourceUsage().getMemory() /
     scheduler.getClusterResource().getMemory());
 {code}
 We should change the second setCapacity call to setCurrentCapacity to 
 configure the current used capacity.
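 A minimal sketch of the proposed fix, changing only the second call (the 
 surrounding getQueueInfo context is assumed):
 {code}
 queueInfo.setCapacity((float) getFairShare().getMemory() /
     scheduler.getClusterResource().getMemory());
 // Report used capacity through the field meant for it:
 queueInfo.setCurrentCapacity((float) getResourceUsage().getMemory() /
     scheduler.getClusterResource().getMemory());
 {code}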



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2337) ResourceManager sets ClientRMService in RMContext multiple times

2014-08-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2337:
---

Priority: Trivial  (was: Minor)
 Summary: ResourceManager sets ClientRMService in RMContext multiple times  
(was: remove duplication function call (setClientRMService) in resource manage 
class)

 ResourceManager sets ClientRMService in RMContext multiple times
 

 Key: YARN-2337
 URL: https://issues.apache.org/jira/browse/YARN-2337
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Trivial
  Labels: newbie
 Attachments: YARN-2337.000.patch


 Remove the duplicated function call (setClientRMService) in the 
 ResourceManager class.
 rmContext.setClientRMService(clientRM); appears twice in serviceInit of 
 ResourceManager.
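 For context, a heavily elided sketch of the redundancy (the surrounding code 
 is hypothetical; the real serviceInit does much more):
 {code}
 protected void serviceInit(Configuration conf) throws Exception {
   // ...
   clientRM = createClientRMService();
   rmContext.setClientRMService(clientRM);
   // ...
   rmContext.setClientRMService(clientRM); // duplicate; the patch drops one
 }
 {code}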



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2337) ResourceManager sets ClientRMService in RMContext multiple times

2014-08-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2337:
---

 Target Version/s: 2.6.0
Affects Version/s: 2.5.0
   Labels: newbie  (was: )

 ResourceManager sets ClientRMService in RMContext multiple times
 

 Key: YARN-2337
 URL: https://issues.apache.org/jira/browse/YARN-2337
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Trivial
  Labels: newbie
 Attachments: YARN-2337.000.patch


 Remove the duplicated function call (setClientRMService) in the 
 ResourceManager class.
 rmContext.setClientRMService(clientRM); appears twice in serviceInit of 
 ResourceManager.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2337) ResourceManager sets ClientRMService in RMContext multiple times

2014-08-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092249#comment-14092249
 ] 

Karthik Kambatla commented on YARN-2337:


+1. Committing this. 

 ResourceManager sets ClientRMService in RMContext multiple times
 

 Key: YARN-2337
 URL: https://issues.apache.org/jira/browse/YARN-2337
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Trivial
  Labels: newbie
 Attachments: YARN-2337.000.patch


 Remove the duplicated function call (setClientRMService) in the 
 ResourceManager class.
 rmContext.setClientRMService(clientRM); appears twice in serviceInit of 
 ResourceManager.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1337) Recover containers upon nodemanager restart

2014-08-10 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-1337:
-

Attachment: YARN-1337-v2.patch

Thanks for the comments, Junping!

bq. May be LOG.warn is a better option here?

Changed to a warning.

bq. What about msecLeft == 0? the logic get quit from while loop but not throw 
exception, better to be msecLeft <= 0.

Good catch!  I changed it to msecLeft <= 0.

bq. We should open a JIRA for this?

Filed YARN-2402 to track adding container recovery support for Windows.

bq. Again, what would happen if container get removed failed (and other 
actions, i.e. store, etc.)? 

If storeContainer fails then the corresponding container start request will 
also fail.

If storeContainerLaunched fails then the container launch process will fail and 
the container will be marked as failed.

If storeContainerKilled fails then the corresponding container kill request 
will also fail.

If storeContainerDiagnostics fails then we can lose prior diagnostic strings 
for a container upon restart but the container will continue.  This seems like 
a reasonable tradeoff, but we could change it to cause the store failure to 
also kill the container if deemed more desirable.

If removeContainer fails then the container will remain in the state store but 
be removed from the internal state.  That means we'll reload the completed 
container state upon restart, but this should be safe because we'll only track 
it as a completed container that will eventually be removed from memory by the 
NodeStatusUpdaterImpl the next time it scans for old containers.

bq. We mark NM port to be 0 for identifying if delayedRpcServerStart. Does this 
sound a little tricky? May be replace it with a new configuration?

No new config necessary.  Essentially the issue is that we need to delay 
starting the RPC server if we're recovering containers because client requests 
for containers being recovered can disrupt the recovery process.  I updated the 
code to try to make this more clear.

bq. Unnecessary change?

Removed.

bq. It could cause trouble here if we allow NM’s resource get changed (when 
YARN-291 get done) during NM restart. We may just remove the killing container 
code rather than move it to else where?

Good point.  I removed the container killing code and had the node update 
itself with any new resource total and http address in case those were updated 
as part of the NM restart.  I also had to fix a bug where the CapacityScheduler 
wasn't updating queue metrics when a node's resources changed during a status 
update.
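For context, a minimal sketch of the timeout loop discussed in the msecLeft 
exchange above (hypothetical names, not the exact YARN-1337 code):

{code}
// Wait for a recovered container to complete, giving up on timeout.
long msecLeft = timeoutMs;
while (!isContainerComplete(containerId)) {
  if (msecLeft <= 0) { // treat exactly 0 as timed out as well
    throw new IOException("Timeout waiting for container " + containerId);
  }
  Thread.sleep(pollIntervalMs);
  msecLeft -= pollIntervalMs;
}
{code}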

 Recover containers upon nodemanager restart
 ---

 Key: YARN-1337
 URL: https://issues.apache.org/jira/browse/YARN-1337
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-1337-v1.patch, YARN-1337-v2.patch


 To support work-preserving NM restart we need to recover the state of the 
 containers when the nodemanager went down.  This includes informing the RM of 
 containers that have exited in the interim and a strategy for dealing with 
 the exit codes from those containers along with how to reacquire the active 
 containers and determine their exit codes when they terminate.  The state of 
 finished containers also needs to be recovered.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-2403) TestNodeManagerResync fails occasionally in trunk

2014-08-10 Thread Ted Yu (JIRA)
Ted Yu created YARN-2403:


 Summary: TestNodeManagerResync fails occasionally in trunk
 Key: YARN-2403
 URL: https://issues.apache.org/jira/browse/YARN-2403
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor


From  https://builds.apache.org/job/Hadoop-Yarn-trunk/640/ :
{code}
  
TestNodeManagerResync.testKillContainersOnResync:112-testContainerPreservationOnResyncImpl:146
 expected:<2> but was:<1>
{code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2337) ResourceManager sets ClientRMService in RMContext multiple times

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092295#comment-14092295
 ] 

Hudson commented on YARN-2337:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6046 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6046/])
YARN-2337. ResourceManager sets ClientRMService in RMContext multiple times. 
(Zhihai Xu via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617183)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


 ResourceManager sets ClientRMService in RMContext multiple times
 

 Key: YARN-2337
 URL: https://issues.apache.org/jira/browse/YARN-2337
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Trivial
  Labels: newbie
 Fix For: 2.6.0

 Attachments: YARN-2337.000.patch


 Remove the duplicated function call (setClientRMService) in the 
 ResourceManager class.
 rmContext.setClientRMService(clientRM); appears twice in serviceInit of 
 ResourceManager.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2308) NPE happened when RM restart after CapacityScheduler queue configuration changed

2014-08-10 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092301#comment-14092301
 ] 

Wangda Tan commented on YARN-2308:
--

[~lichangleo],
Thanks for working on this.
I took a quick scan at your patch; I think the general approach should be fine. 
Some minor suggestions:
1) 
{code}
+if (application==null) {
+  LOG.info("can't retireve application attempt");
+  return;
+}
{code}
Please leave a space before and after ==, and use LOG.error instead of LOG.info.

2) Test code
2.1
bq. +System.out.println("testing queue change!!!");
Remove this plz,

2.2
{code}
+conf.setBoolean(CapacitySchedulerConfiguration.ENABLE_USER_METRICS, true);
+conf.set(CapacitySchedulerConfiguration.RESOURCE_CALCULATOR_CLASS,
{code}
We may not need this either.

2.3
{code}
+// clear queue metrics
+rm1.clearQueueMetrics(app1);
{code}
Also this

2.4
It's better to wait and check for the app state transition to FAILED after it 
is rejected.

2.5
I think this isn't a work-preserving-restart-specific problem; it's better 
to place the test in TestRMRestart.

Please let me know if you have any comment on them.

Thanks,
Wangda




 NPE happened when RM restart after CapacityScheduler queue configuration 
 changed 
 -

 Key: YARN-2308
 URL: https://issues.apache.org/jira/browse/YARN-2308
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, scheduler
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: chang li
Priority: Critical
 Attachments: jira2308.patch


 I encountered an NPE when the RM restarted.
 {code}
 2014-07-16 07:22:46,957 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type APP_ATTEMPT_ADDED to the scheduler
 java.lang.NullPointerException
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.addApplicationAttempt(CapacityScheduler.java:566)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:922)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:98)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:594)
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:654)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:85)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:698)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:682)
 at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
 at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 And the RM fails to restart.
 This is caused by a queue configuration change: I removed some queues and 
 added new ones. So when the RM restarts, it tries to recover past 
 applications, and when the queue of any of these applications has been 
 removed, an NPE is raised.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-415) Capture memory utilization at the app-level for chargeback

2014-08-10 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092303#comment-14092303
 ] 

Wangda Tan commented on YARN-415:
-

[~eepayne],
bq. I created a common method that both of these call.
Thanks!

bq. I also noticed that testUsageWithMultipleContainers was doing similar 
things to testUsageAfterRMRestart, so I combined them both into 
testUsageWithMultipleContainersAndRMRestart.
Good catch,

I don't have further comments, but would you please check the test failure above? 

Thanks,
Wangda

 Capture memory utilization at the app-level for chargeback
 --

 Key: YARN-415
 URL: https://issues.apache.org/jira/browse/YARN-415
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 0.23.6
Reporter: Kendall Thrapp
Assignee: Andrey Klochkov
 Attachments: YARN-415--n10.patch, YARN-415--n2.patch, 
 YARN-415--n3.patch, YARN-415--n4.patch, YARN-415--n5.patch, 
 YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, 
 YARN-415--n9.patch, YARN-415.201405311749.txt, YARN-415.201406031616.txt, 
 YARN-415.201406262136.txt, YARN-415.201407042037.txt, 
 YARN-415.201407071542.txt, YARN-415.201407171553.txt, 
 YARN-415.201407172144.txt, YARN-415.201407232237.txt, 
 YARN-415.201407242148.txt, YARN-415.201407281816.txt, 
 YARN-415.201408062232.txt, YARN-415.201408080204.txt, 
 YARN-415.201408092006.txt, YARN-415.patch


 For the purpose of chargeback, I'd like to be able to compute the cost of an
 application in terms of cluster resource usage.  To start out, I'd like to 
 get the memory utilization of an application.  The unit should be MB-seconds 
 or something similar and, from a chargeback perspective, the memory amount 
 should be the memory reserved for the application, as even if the app didn't 
 use all that memory, no one else was able to use it.
 (reserved ram for container 1 * lifetime of container 1) + (reserved ram for
 container 2 * lifetime of container 2) + ... + (reserved ram for container n 
 * lifetime of container n)
 It'd be nice to have this at the app level instead of the job level because:
 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't 
 appear on the job history server).
 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm).
 This new metric should be available both through the RM UI and RM Web 
 Services REST API.
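 A toy illustration of the proposed MB-seconds metric (made-up numbers, no 
 YARN API involved):
 {code}
 // MB-seconds = sum over containers of reservedMB * lifetimeSeconds.
 long[][] containers = { {2048, 120}, {1024, 300} }; // {reservedMB, lifetimeSec}
 long mbSeconds = 0;
 for (long[] c : containers) {
   mbSeconds += c[0] * c[1];
 }
 // 2048*120 + 1024*300 = 552960 MB-seconds
 {code}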



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2361) remove duplicate entries (EXPIRE event) in the EnumSet of event type in RMAppAttempt state machine

2014-08-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092321#comment-14092321
 ] 

Karthik Kambatla commented on YARN-2361:


+1. Checking this in. 

 remove duplicate entries (EXPIRE event) in the EnumSet of event type in 
 RMAppAttempt state machine
 --

 Key: YARN-2361
 URL: https://issues.apache.org/jira/browse/YARN-2361
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Trivial
 Attachments: YARN-2361.000.patch


 Remove duplicate entries in the EnumSet of event types in the RMAppAttempt 
 state machine. The event RMAppAttemptEventType.EXPIRE is duplicated in the 
 following code.
 {code}
   EnumSet.of(RMAppAttemptEventType.ATTEMPT_ADDED,
   RMAppAttemptEventType.EXPIRE,
   RMAppAttemptEventType.LAUNCHED,
   RMAppAttemptEventType.LAUNCH_FAILED,
   RMAppAttemptEventType.EXPIRE,
   RMAppAttemptEventType.REGISTERED,
   RMAppAttemptEventType.CONTAINER_ALLOCATED,
   RMAppAttemptEventType.UNREGISTERED,
   RMAppAttemptEventType.KILL,
   RMAppAttemptEventType.STATUS_UPDATE))
 {code}
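
After the fix, the second EXPIRE entry is simply dropped. Note that EnumSet.of tolerates duplicate arguments at runtime (the set still contains each constant once), so the duplicate is redundant rather than harmful:

{code}
  // Deduplicated set: each event type listed exactly once.
  EnumSet.of(RMAppAttemptEventType.ATTEMPT_ADDED,
      RMAppAttemptEventType.EXPIRE,
      RMAppAttemptEventType.LAUNCHED,
      RMAppAttemptEventType.LAUNCH_FAILED,
      RMAppAttemptEventType.REGISTERED,
      RMAppAttemptEventType.CONTAINER_ALLOCATED,
      RMAppAttemptEventType.UNREGISTERED,
      RMAppAttemptEventType.KILL,
      RMAppAttemptEventType.STATUS_UPDATE)
{code}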



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2361) remove duplicate entries (EXPIRE event) in the EnumSet of event type in RMAppAttempt state machine

2014-08-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2361:
---

 Priority: Trivial  (was: Minor)
 Target Version/s: 2.6.0
Affects Version/s: 2.5.0
 Assignee: zhihai xu

 remove duplicate entries (EXPIRE event) in the EnumSet of event type in 
 RMAppAttempt state machine
 --

 Key: YARN-2361
 URL: https://issues.apache.org/jira/browse/YARN-2361
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Trivial
 Attachments: YARN-2361.000.patch


 Remove the duplicate entries in the EnumSet of event types in the RMAppAttempt 
 state machine. The event RMAppAttemptEventType.EXPIRE is duplicated in the 
 following code.
 {code}
   EnumSet.of(RMAppAttemptEventType.ATTEMPT_ADDED,
   RMAppAttemptEventType.EXPIRE,
   RMAppAttemptEventType.LAUNCHED,
   RMAppAttemptEventType.LAUNCH_FAILED,
   RMAppAttemptEventType.EXPIRE,
   RMAppAttemptEventType.REGISTERED,
   RMAppAttemptEventType.CONTAINER_ALLOCATED,
   RMAppAttemptEventType.UNREGISTERED,
   RMAppAttemptEventType.KILL,
   RMAppAttemptEventType.STATUS_UPDATE))
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2361) RMAppAttempt state machine entries for KILLED state has duplicate event entries

2014-08-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2361:
---

Summary: RMAppAttempt state machine entries for KILLED state has duplicate 
event entries  (was: remove duplicate entries (EXPIRE event) in the EnumSet of 
event type in RMAppAttempt state machine)

 RMAppAttempt state machine entries for KILLED state has duplicate event 
 entries
 ---

 Key: YARN-2361
 URL: https://issues.apache.org/jira/browse/YARN-2361
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Trivial
 Attachments: YARN-2361.000.patch


 Remove the duplicate entries in the EnumSet of event types in the RMAppAttempt 
 state machine. The event RMAppAttemptEventType.EXPIRE is duplicated in the 
 following code.
 {code}
   EnumSet.of(RMAppAttemptEventType.ATTEMPT_ADDED,
   RMAppAttemptEventType.EXPIRE,
   RMAppAttemptEventType.LAUNCHED,
   RMAppAttemptEventType.LAUNCH_FAILED,
   RMAppAttemptEventType.EXPIRE,
   RMAppAttemptEventType.REGISTERED,
   RMAppAttemptEventType.CONTAINER_ALLOCATED,
   RMAppAttemptEventType.UNREGISTERED,
   RMAppAttemptEventType.KILL,
   RMAppAttemptEventType.STATUS_UPDATE))
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2361) RMAppAttempt state machine entries for KILLED state has duplicate event entries

2014-08-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092332#comment-14092332
 ] 

Hudson commented on YARN-2361:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6047 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6047/])
YARN-2361. RMAppAttempt state machine entries for KILLED state has duplicate 
event entries. (Zhihai Xu via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617190)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java


 RMAppAttempt state machine entries for KILLED state has duplicate event 
 entries
 ---

 Key: YARN-2361
 URL: https://issues.apache.org/jira/browse/YARN-2361
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Trivial
 Fix For: 2.6.0

 Attachments: YARN-2361.000.patch


 Remove the duplicate entries in the EnumSet of event types in the RMAppAttempt 
 state machine. The event RMAppAttemptEventType.EXPIRE is duplicated in the 
 following code.
 {code}
   EnumSet.of(RMAppAttemptEventType.ATTEMPT_ADDED,
   RMAppAttemptEventType.EXPIRE,
   RMAppAttemptEventType.LAUNCHED,
   RMAppAttemptEventType.LAUNCH_FAILED,
   RMAppAttemptEventType.EXPIRE,
   RMAppAttemptEventType.REGISTERED,
   RMAppAttemptEventType.CONTAINER_ALLOCATED,
   RMAppAttemptEventType.UNREGISTERED,
   RMAppAttemptEventType.KILL,
   RMAppAttemptEventType.STATUS_UPDATE))
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2277) Add Cross-Origin support to the ATS REST API

2014-08-10 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-2277:
--

Attachment: YARN-2277-v5.patch

Addressing findbugs warning

 Add Cross-Origin support to the ATS REST API
 

 Key: YARN-2277
 URL: https://issues.apache.org/jira/browse/YARN-2277
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Attachments: YARN-2277-CORS.patch, YARN-2277-JSONP.patch, 
 YARN-2277-v2.patch, YARN-2277-v3.patch, YARN-2277-v3.patch, 
 YARN-2277-v4.patch, YARN-2277-v5.patch


 As the Application Timeline Server does not ship with a built-in UI, it may 
 make sense to add JSONP or CORS capabilities to its REST API so that a remote 
 UI can access the data directly via JavaScript without being blocked by the 
 browser's cross-site (same-origin) restrictions.
 An example client might use
 http://api.jquery.com/jQuery.getJSON/ 
 This would alleviate the need for a local proxy cache.
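
As a generic illustration (not the patch's actual filter) of what CORS support adds, a servlet filter can set the cross-origin response header so a remote JavaScript client may read the responses:

{code}
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

public class SimpleCorsFilter implements Filter {
  @Override public void init(FilterConfig conf) {}
  @Override public void destroy() {}

  @Override
  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    // Allow any origin purely for illustration; a real deployment
    // would restrict this to trusted origins.
    ((HttpServletResponse) res).setHeader("Access-Control-Allow-Origin", "*");
    chain.doFilter(req, res);
  }
}
{code}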



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1729) TimelineWebServices always passes primary and secondary filters as strings

2014-08-10 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1729:
--

Assignee: Billie Rinaldi  (was: Leitao Guo)

 TimelineWebServices always passes primary and secondary filters as strings
 --

 Key: YARN-1729
 URL: https://issues.apache.org/jira/browse/YARN-1729
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Billie Rinaldi
Assignee: Billie Rinaldi
 Fix For: 2.4.0

 Attachments: YARN-1729.1.patch, YARN-1729.2.patch, YARN-1729.3.patch, 
 YARN-1729.4.patch, YARN-1729.5.patch, YARN-1729.6.patch, YARN-1729.7.patch


 Primary filter and secondary filter values can be arbitrary JSON-compatible 
 objects. The web services should determine whether the filters specified as 
 query parameters are objects or strings before passing them to the store.
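
A sketch of the kind of check the description asks for, using Jackson purely for illustration (the actual TimelineWebServices implementation may differ):

{code}
import com.fasterxml.jackson.databind.ObjectMapper;

public class FilterValueSketch {
  private static final ObjectMapper MAPPER = new ObjectMapper();

  /** Parse the query parameter as JSON if possible; otherwise keep it as a string. */
  public static Object parseFilterValue(String param) {
    try {
      return MAPPER.readValue(param, Object.class);
    } catch (Exception e) {
      // Not valid JSON: treat the parameter as a plain string value.
      return param;
    }
  }
}
{code}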



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2138) Cleanup notifyDone* methods in RMStateStore

2014-08-10 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092413#comment-14092413
 ] 

Jian He commented on YARN-2138:
---

It seems RMAppUpdatedSavedEvent, RMAppNewSavedEvent, etc. are empty files; can 
you remove them?

 Cleanup notifyDone* methods in RMStateStore
 ---

 Key: YARN-2138
 URL: https://issues.apache.org/jira/browse/YARN-2138
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Varun Saxena
 Attachments: YARN-2138.002.patch, YARN-2138.003.patch, YARN-2138.patch


 The storedException passed into notifyDoneStoringApplication is always null, 
 and similarly for the other notifyDone* methods. We can clean up these 
 methods, as this control flow path is not used anymore.
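
A sketch of the simplification being proposed, with hypothetical before/after signatures (the real RMStateStore methods take different arguments):

{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;

interface Before {
  // storedException is always null at every call site, so the
  // error-handling branch it guards is dead code.
  void notifyDoneStoringApplication(ApplicationId appId, Exception storedException);
}

interface After {
  // The always-null parameter and the unused control-flow path are removed.
  void notifyDoneStoringApplication(ApplicationId appId);
}
{code}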



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1954) Add waitFor to AMRMClient(Async)

2014-08-10 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092425#comment-14092425
 ] 

Tsuyoshi OZAWA commented on YARN-1954:
--

Thank you for your review and comments, Zhijie!

 Add waitFor to AMRMClient(Async)
 

 Key: YARN-1954
 URL: https://issues.apache.org/jira/browse/YARN-1954
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: client
Affects Versions: 3.0.0, 2.4.0
Reporter: Zhijie Shen
Assignee: Tsuyoshi OZAWA
 Fix For: 2.6.0

 Attachments: YARN-1954.1.patch, YARN-1954.2.patch, YARN-1954.3.patch, 
 YARN-1954.4.patch, YARN-1954.4.patch, YARN-1954.5.patch, YARN-1954.6.patch, 
 YARN-1954.7.patch, YARN-1954.8.patch


 Recently, I saw some use cases of AMRMClient(Async). The painful thing is 
 that the main non-daemon thread has to sit in a dummy loop to prevent the AM 
 process from exiting before all the tasks are done, while unregistration is 
 triggered on a separate daemon thread by callback methods (in particular when 
 using AMRMClientAsync). IMHO, it would be beneficial to add a waitFor method 
 to AMRMClient(Async) that blocks the AM until unregistration or a 
 user-supplied checkpoint, so that users don't need to write the loop 
 themselves.
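
For illustration, a sketch of how such a waitFor might be used from the AM's main thread in place of the dummy loop; the Supplier-based signature here is an assumption, and the final API may differ:

{code}
import java.util.concurrent.atomic.AtomicBoolean;
import com.google.common.base.Supplier;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

public class WaitForSketch {
  // Flipped by the AM's callback handler once all tasks have finished.
  static final AtomicBoolean allDone = new AtomicBoolean(false);

  static void awaitCompletion(AMRMClientAsync<?> client) throws InterruptedException {
    // Blocks the main non-daemon thread until the check passes,
    // replacing the hand-written sleep loop.
    client.waitFor(new Supplier<Boolean>() {
      @Override
      public Boolean get() {
        return allDone.get();
      }
    });
  }
}
{code}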



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-2404) Cleanup ApplicationAttemptState and ApplicationState in RMStateStore

2014-08-10 Thread Jian He (JIRA)
Jian He created YARN-2404:
-

 Summary: Cleanup ApplicationAttemptState and ApplicationState in 
RMStateStore 
 Key: YARN-2404
 URL: https://issues.apache.org/jira/browse/YARN-2404
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He


We can remove the ApplicationState and ApplicationAttemptState classes in 
RMStateStore, given that we already have the ApplicationStateData and 
ApplicationAttemptStateData records. We can just replace ApplicationState with 
ApplicationStateData, and similarly for ApplicationAttemptState.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2404) Remove ApplicationAttemptState and ApplicationState class in RMStateStore class

2014-08-10 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2404:
--

Description: We can remove ApplicationState and ApplicationAttemptState 
class in RMStateStore, given that we already have ApplicationStateData and 
ApplicationAttemptStateData records. we may just replace ApplicationState with 
ApplicationStateData, similarly for ApplicationAttemptState.  (was: We can 
remove ApplicationState and ApplicationAttemptState class in RMStateStore, 
given that we already have ApplicationStateData and ApplicationAttemptStateData 
records. we can just replace ApplicationState with ApplicationStateData, 
similarly for ApplicationAttemptState.)
Summary: Remove ApplicationAttemptState and ApplicationState class in 
RMStateStore class   (was: Cleanup ApplicationAttemptState and ApplicationState 
in RMStateStore )

 Remove ApplicationAttemptState and ApplicationState class in RMStateStore 
 class 
 

 Key: YARN-2404
 URL: https://issues.apache.org/jira/browse/YARN-2404
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He

 We can remove the ApplicationState and ApplicationAttemptState classes in 
 RMStateStore, given that we already have the ApplicationStateData and 
 ApplicationAttemptStateData records. We may just replace ApplicationState 
 with ApplicationStateData, and similarly for ApplicationAttemptState.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2138) Cleanup notifyDone* methods in RMStateStore

2014-08-10 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092443#comment-14092443
 ] 

Varun Saxena commented on YARN-2138:


[~jianhe], I have removed these files in the patch. To verify, I applied the 
patch (YARN-2138.003.patch) to code downloaded from trunk and found the 
above-mentioned files being deleted, so the patch should work.
I used svn delete to remove the files. Let me know if something else needs to 
be done.

Can you verify the patch once more? If you are still facing issues, I will 
generate a new patch.

 Cleanup notifyDone* methods in RMStateStore
 ---

 Key: YARN-2138
 URL: https://issues.apache.org/jira/browse/YARN-2138
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Varun Saxena
 Attachments: YARN-2138.002.patch, YARN-2138.003.patch, YARN-2138.patch


 The storedException passed into notifyDoneStoringApplication is always null, 
 and similarly for the other notifyDone* methods. We can clean up these 
 methods, as this control flow path is not used anymore.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2138) Cleanup notifyDone* methods in RMStateStore

2014-08-10 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092458#comment-14092458
 ] 

Jian He commented on YARN-2138:
---

Varun, I tried to apply the patch in both a git and an svn repository with 
patch -p0; the files still remain, but they are empty (plain patch empties 
deleted files rather than removing them). Do you mind creating a new patch? 
The patch seems to conflict with trunk again.

 Cleanup notifyDone* methods in RMStateStore
 ---

 Key: YARN-2138
 URL: https://issues.apache.org/jira/browse/YARN-2138
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Varun Saxena
 Attachments: YARN-2138.002.patch, YARN-2138.003.patch, YARN-2138.patch


 The storedException passed into notifyDoneStoringApplication is always null, 
 and similarly for the other notifyDone* methods. We can clean up these 
 methods, as this control flow path is not used anymore.



--
This message was sent by Atlassian JIRA
(v6.2#6252)