[jira] [Commented] (YARN-2529) Generic history service RPC interface doesn't work when service authorization is enabled

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131147#comment-14131147
 ] 

Hadoop QA commented on YARN-2529:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668266/YARN-2529.2.patch
  against trunk revision 5633da2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4912//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4912//console

This message is automatically generated.

 Generic history service RPC interface doesn't work when service authorization 
 is enabled
 

 Key: YARN-2529
 URL: https://issues.apache.org/jira/browse/YARN-2529
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2529.1.patch, YARN-2529.2.patch


 Here's the problem shown in the log:
 {code}
 14/09/10 10:42:44 INFO ipc.Server: Connection from 10.22.2.109:55439 for 
 protocol org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB is 
 unauthorized for user zshen (auth:SIMPLE)
 14/09/10 10:42:44 INFO ipc.Server: Socket Reader #1 for port 10200: 
 readAndProcess from client 10.22.2.109 threw exception 
 [org.apache.hadoop.security.authorize.AuthorizationException: Protocol 
 interface org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB is not 
 known.]
 {code}
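
For context, here is a minimal sketch of how an RPC protocol is typically mapped to a service-authorization ACL in Hadoop via a PolicyProvider; the class name and ACL key below are illustrative assumptions, not necessarily what the attached patch does.

{code}
import org.apache.hadoop.security.authorize.PolicyProvider;
import org.apache.hadoop.security.authorize.Service;
import org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB;

// Sketch only: map ApplicationHistoryProtocolPB to an ACL key so that
// service-level authorization recognizes the protocol instead of rejecting
// it as "not known".
public class ApplicationHistoryPolicyProviderSketch extends PolicyProvider {
  @Override
  public Service[] getServices() {
    return new Service[] {
        // The ACL key name is a placeholder for illustration.
        new Service("security.applicationhistory.protocol.acl",
            ApplicationHistoryProtocolPB.class)
    };
  }
}
{code}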



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-2542:
--
Attachment: YARN-2542.1.patch

Uploaded a patch to fix the bug.

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When app is finished, there's no 
 usageReport.
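
A minimal sketch of the kind of null guard this implies, assuming the usage-report accessors added by YARN-415; this is illustrative, not the attached patch.

{code}
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport;

final class UsageReportGuard {
  // A finished app fetched from the timeline server may carry no usage
  // report, so read it defensively before printing -status output.
  static long memorySecondsOrZero(ApplicationReport report) {
    ApplicationResourceUsageReport usage =
        report.getApplicationResourceUsageReport();
    return usage == null ? 0L : usage.getMemorySeconds();
  }
}
{code}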



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-2542:
--
Attachment: (was: YARN-2542.2.patch)

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When app is finished, there's no 
 usageReport.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-2542:
--
Attachment: YARN-2542.2.patch

Uploaded a new patch:

1. Fix the 80-char line limit.
2. Update the comment.

The long-term fix is to record the attempt's resource usage as well. I will file 
a ticket to track the issue.

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When app is finished, there's no 
 usageReport.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-2542:
--
Attachment: YARN-2542.2.patch

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch, YARN-2542.2.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When app is finished, there's no 
 usageReport.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131188#comment-14131188
 ] 

Hadoop QA commented on YARN-2542:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668280/YARN-2542.1.patch
  against trunk revision 469ea3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4913//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4913//console

This message is automatically generated.

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch, YARN-2542.2.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When app is finished, there's no 
 usageReport.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2543) Resource usage should be published to the timeline server as well

2014-09-12 Thread Zhijie Shen (JIRA)
Zhijie Shen created YARN-2543:
-

 Summary: Resource usage should be published to the timeline server 
as well
 Key: YARN-2543
 URL: https://issues.apache.org/jira/browse/YARN-2543
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen


The RM includes the resource usage in the app report, but the generic history 
service doesn't, because the RM doesn't publish this data to the timeline server.
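
A minimal sketch of what publishing this data through the timeline client could look like; the entity-type and other-info key names are assumptions for illustration, not a design.

{code}
import java.io.IOException;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.exceptions.YarnException;

final class ResourceUsagePublisherSketch {
  // Sketch: attach aggregate resource usage to the app's timeline entity.
  static void publish(TimelineClient client, ApplicationId appId,
      long memorySeconds, long vcoreSeconds) throws IOException, YarnException {
    TimelineEntity entity = new TimelineEntity();
    entity.setEntityType("YARN_APPLICATION");   // assumed entity-type name
    entity.setEntityId(appId.toString());
    entity.addOtherInfo("YARN_APPLICATION_MEMORY_SECONDS", memorySeconds);  // assumed key
    entity.addOtherInfo("YARN_APPLICATION_VCORE_SECONDS", vcoreSeconds);    // assumed key
    client.putEntities(entity);
  }
}
{code}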



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131219#comment-14131219
 ] 

Hadoop QA commented on YARN-2542:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668289/YARN-2542.2.patch
  against trunk revision 469ea3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4914//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4914//console

This message is automatically generated.

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch, YARN-2542.2.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When app is finished, there's no 
 usageReport.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2452) TestRMApplicationHistoryWriter is failed for FairScheduler

2014-09-12 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2452:

Attachment: YARN-2452.002.patch

 TestRMApplicationHistoryWriter is failed for FairScheduler
 --

 Key: YARN-2452
 URL: https://issues.apache.org/jira/browse/YARN-2452
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2452.000.patch, YARN-2452.001.patch, 
 YARN-2452.002.patch


 TestRMApplicationHistoryWriter is failed for FairScheduler. The failure is 
 the following:
 T E S T S
 ---
 Running 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 69.311 sec 
  FAILURE! - in 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter
 testRMWritingMassiveHistory(org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter)
   Time elapsed: 66.261 sec   FAILURE!
 java.lang.AssertionError: expected:<1> but was:<200>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter.testRMWritingMassiveHistory(TestRMApplicationHistoryWriter.java:430)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter.testRMWritingMassiveHistory(TestRMApplicationHistoryWriter.java:391)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2452) TestRMApplicationHistoryWriter is failed for FairScheduler

2014-09-12 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131259#comment-14131259
 ] 

zhihai xu commented on YARN-2452:
-

I uploaded a new patch, YARN-2452.002.patch, which uses 
FairSchedulerConfiguration.ASSIGN_MULTIPLE and makes 
FairSchedulerConfiguration.ASSIGN_MULTIPLE public. Please review it.
Thanks.
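
For illustration, a sketch of how a test could use that constant once it is public, as the patch proposes; this is an assumption about the change, not the patch itself.

{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration;

final class FairSchedulerTestConfSketch {
  static YarnConfiguration createConf() {
    YarnConfiguration conf = new YarnConfiguration();
    // Let the FairScheduler hand out more than one container per node
    // heartbeat, so the massive-history test is not limited to one
    // assignment per beat.
    conf.setBoolean(FairSchedulerConfiguration.ASSIGN_MULTIPLE, true);
    return conf;
  }
}
{code}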


 TestRMApplicationHistoryWriter is failed for FairScheduler
 --

 Key: YARN-2452
 URL: https://issues.apache.org/jira/browse/YARN-2452
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2452.000.patch, YARN-2452.001.patch, 
 YARN-2452.002.patch


 TestRMApplicationHistoryWriter is failed for FairScheduler. The failure is 
 the following:
 T E S T S
 ---
 Running 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 69.311 sec 
  FAILURE! - in 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter
 testRMWritingMassiveHistory(org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter)
   Time elapsed: 66.261 sec   FAILURE!
 java.lang.AssertionError: expected:<1> but was:<200>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter.testRMWritingMassiveHistory(TestRMApplicationHistoryWriter.java:430)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter.testRMWritingMassiveHistory(TestRMApplicationHistoryWriter.java:391)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2452) TestRMApplicationHistoryWriter is failed for FairScheduler

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131317#comment-14131317
 ] 

Hadoop QA commented on YARN-2452:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668301/YARN-2452.002.patch
  against trunk revision 469ea3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4915//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4915//console

This message is automatically generated.

 TestRMApplicationHistoryWriter is failed for FairScheduler
 --

 Key: YARN-2452
 URL: https://issues.apache.org/jira/browse/YARN-2452
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2452.000.patch, YARN-2452.001.patch, 
 YARN-2452.002.patch


 TestRMApplicationHistoryWriter is failed for FairScheduler. The failure is 
 the following:
 T E S T S
 ---
 Running 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 69.311 sec 
  FAILURE! - in 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter
 testRMWritingMassiveHistory(org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter)
   Time elapsed: 66.261 sec   FAILURE!
 java.lang.AssertionError: expected:<1> but was:<200>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter.testRMWritingMassiveHistory(TestRMApplicationHistoryWriter.java:430)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter.testRMWritingMassiveHistory(TestRMApplicationHistoryWriter.java:391)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2544) [YARN-796] Common server side PB changes (not include user API PB changes)

2014-09-12 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-2544:


 Summary: [YARN-796] Common server side PB changes (not include 
user API PB changes)
 Key: YARN-2544
 URL: https://issues.apache.org/jira/browse/YARN-2544
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2545) RMApp should transit to FAILED when AM calls finishApplicationMaster with FAILED

2014-09-12 Thread Hong Zhiguo (JIRA)
Hong Zhiguo created YARN-2545:
-

 Summary: RMApp should transit to FAILED when AM calls 
finishApplicationMaster with FAILED
 Key: YARN-2545
 URL: https://issues.apache.org/jira/browse/YARN-2545
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Minor


If the AM calls finishApplicationMaster with getFinalApplicationStatus()==FAILED 
and then exits, the corresponding RMApp and RMAppAttempt transition to the 
FINISHED state.

I think this is wrong and confusing. On the RM web UI, this application is displayed 
as State=FINISHED, FinalStatus=FAILED, and is counted under "Apps Completed", 
not under "Apps Failed".
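
For reference, a minimal sketch of the AM-side call being described, using the public AMRMClient API; the diagnostics string is illustrative.

{code}
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

final class FailedUnregisterSketch {
  static void unregisterAsFailed(AMRMClient<ContainerRequest> client) throws Exception {
    // The AM reports a FAILED final status when it unregisters, which is the
    // call the description above refers to.
    client.unregisterApplicationMaster(
        FinalApplicationStatus.FAILED, "application failed", null);
  }
}
{code}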



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2538) Add logs when RM send new AMRMToken to ApplicationMaster

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131404#comment-14131404
 ] 

Hudson commented on YARN-2538:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #678 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/678/])
YARN-2538. Added logs when RM sends roll-overed AMRMToken to AM. Contributed by 
Xuan Gong. (zjshen: rev 469ea3dcef6e427d02fd08b859b2789cc25189f9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* hadoop-yarn-project/CHANGES.txt


 Add logs when RM send new AMRMToken to ApplicationMaster
 

 Key: YARN-2538
 URL: https://issues.apache.org/jira/browse/YARN-2538
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Xuan Gong
Assignee: Xuan Gong
 Fix For: 2.6.0

 Attachments: YARN-2538.1.patch, YARN-2538.1.patch


 This is for testing/debugging purpose 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2541) Fix ResourceManagerRest.apt.vm syntax error

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131399#comment-14131399
 ] 

Hudson commented on YARN-2541:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #678 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/678/])
YARN-2541. Fixed ResourceManagerRest.apt.vm table syntax error. Contributed by 
Jian He (jianhe: rev 5633da2a018efcfac03cc1dd65af79bce2f1a11b)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm
* hadoop-yarn-project/CHANGES.txt


 Fix ResourceManagerRest.apt.vm syntax error
 ---

 Key: YARN-2541
 URL: https://issues.apache.org/jira/browse/YARN-2541
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.6.0

 Attachments: YARN-2541.1.patch


 the incorrect table syntax somehow causes hadoop-yarn-site intermittent build 
 failure as in https://jira.codehaus.org/browse/DOXIA-453



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2033) Merging generic-history into the Timeline Store

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131403#comment-14131403
 ] 

Hudson commented on YARN-2033:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #678 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/678/])
YARN-2033. Merging generic-history into the Timeline Store (Contributed by 
Zhijie Shen) (junping_du: rev 6b8b1608e64e300e4e1d23c60476febaca29ca38)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/ContainerFinishedEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ContainerBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/MockAsm.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/WebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryManagerOnTimelineStore.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/metrics/AppAttemptMetricsConstants.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/dao/AppInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/metrics/ContainerMetricsConstants.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppsBlock.java

[jira] [Commented] (YARN-2534) FairScheduler: Potential integer overflow calculating totalMaxShare

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131394#comment-14131394
 ] 

Hudson commented on YARN-2534:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #678 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/678/])
YARN-2534. FairScheduler: Potential integer overflow calculating totalMaxShare. 
(Zhihai Xu via kasha) (kasha: rev c11ada5ea6d17321626e5a9a4152ff857d03aee2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/ComputeFairShares.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


 FairScheduler: Potential integer overflow calculating totalMaxShare
 ---

 Key: YARN-2534
 URL: https://issues.apache.org/jira/browse/YARN-2534
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.6.0

 Attachments: YARN-2534.000.patch


 FairScheduler: totalMaxShare is not calculated correctly in 
 computeSharesInternal for some cases.
 If the sum of the MAX shares of all Schedulables is more than Integer.MAX_VALUE, 
 but each individual MAX share is not equal to Integer.MAX_VALUE, then 
 totalMaxShare will be a negative value, which causes all fair shares to be 
 calculated wrongly.
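
A minimal sketch of the overflow and one way to guard it with long arithmetic; illustrative only, not the committed fix.

{code}
final class TotalMaxShareSketch {
  // Accumulate in a long and clamp, so a sum above Integer.MAX_VALUE cannot
  // wrap around to a negative int.
  static int clampedTotalMaxShare(int[] maxShares) {
    long total = 0L;
    for (int share : maxShares) {
      total += share;
    }
    return (int) Math.min(total, Integer.MAX_VALUE);
  }
}
{code}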



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2546) REST API for application creation/submission is using strings for numeric & boolean values

2014-09-12 Thread Doug Haigh (JIRA)
Doug Haigh created YARN-2546:


 Summary: REST API for application creation/submission is using 
strings for numeric & boolean values
 Key: YARN-2546
 URL: https://issues.apache.org/jira/browse/YARN-2546
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api
Affects Versions: 2.5.1
Reporter: Doug Haigh


When YARN responds with or accepts JSON, numbers & booleans are being 
represented as strings which can cause parsing problems.

Resource values look like 

{
  "application-id":"application_1404198295326_0001",
  "maximum-resource-capability":
   {
      "memory":"8192",
      "vCores":"32"
   }
}

Instead of

{
  "application-id":"application_1404198295326_0001",
  "maximum-resource-capability":
   {
      "memory":8192,
      "vCores":32
   }
}

When I POST to start a job, numeric values must be represented as strings:

  "local-resources":
  {
    "entry":
    [
      {
        "key":"AppMaster.jar",
        "value":
        {
          "resource":"hdfs://hdfs-namenode:9000/user/testuser/DistributedShell/demo-app/AppMaster.jar",
          "type":"FILE",
          "visibility":"APPLICATION",
          "size": "43004",
          "timestamp": "1405452071209"
        }
      }
    ]
  },

Instead of

  "local-resources":
  {
    "entry":
    [
      {
        "key":"AppMaster.jar",
        "value":
        {
          "resource":"hdfs://hdfs-namenode:9000/user/testuser/DistributedShell/demo-app/AppMaster.jar",
          "type":"FILE",
          "visibility":"APPLICATION",
          "size": 43004,
          "timestamp": 1405452071209
        }
      }
    ]
  },

Similarly, Boolean values are also represented as strings:

"keep-containers-across-application-attempts":"false"

Instead of 

"keep-containers-across-application-attempts":false




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-415) Capture aggregate memory allocation at the app-level for chargeback

2014-09-12 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131522#comment-14131522
 ] 

Eric Payne commented on YARN-415:
-

I would like to express my thanks to [~aklochkov], [~jianhe], [~leftnoteasy], 
[~kkambatl], [~sandyr], and [~jlowe]. It was a team effort, and I appreciate 
all of the great help you have given on this feature.

 Capture aggregate memory allocation at the app-level for chargeback
 ---

 Key: YARN-415
 URL: https://issues.apache.org/jira/browse/YARN-415
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: Kendall Thrapp
Assignee: Eric Payne
 Fix For: 2.6.0

 Attachments: YARN-415--n10.patch, YARN-415--n2.patch, 
 YARN-415--n3.patch, YARN-415--n4.patch, YARN-415--n5.patch, 
 YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, 
 YARN-415--n9.patch, YARN-415.201405311749.txt, YARN-415.201406031616.txt, 
 YARN-415.201406262136.txt, YARN-415.201407042037.txt, 
 YARN-415.201407071542.txt, YARN-415.201407171553.txt, 
 YARN-415.201407172144.txt, YARN-415.201407232237.txt, 
 YARN-415.201407242148.txt, YARN-415.201407281816.txt, 
 YARN-415.201408062232.txt, YARN-415.201408080204.txt, 
 YARN-415.201408092006.txt, YARN-415.201408132109.txt, 
 YARN-415.201408150030.txt, YARN-415.201408181938.txt, 
 YARN-415.201408181938.txt, YARN-415.201408212033.txt, 
 YARN-415.201409040036.txt, YARN-415.201409092204.txt, 
 YARN-415.201409102216.txt, YARN-415.patch


 For the purpose of chargeback, I'd like to be able to compute the cost of an
 application in terms of cluster resource usage.  To start out, I'd like to 
 get the memory utilization of an application.  The unit should be MB-seconds 
 or something similar and, from a chargeback perspective, the memory amount 
 should be the memory reserved for the application, as even if the app didn't 
 use all that memory, no one else was able to use it.
 (reserved ram for container 1 * lifetime of container 1) + (reserved ram for
 container 2 * lifetime of container 2) + ... + (reserved ram for container n 
 * lifetime of container n)
 It'd be nice to have this at the app level instead of the job level because:
 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't 
 appear on the job history server).
 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm).
 This new metric should be available both through the RM UI and RM Web 
 Services REST API.
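
A small sketch of the aggregation formula quoted above; the container record type here is a stand-in for illustration only.

{code}
final class MemorySecondsSketch {
  // Stand-in container record, for illustration only.
  static final class ContainerUsage {
    final long reservedMemoryMB, startTimeMs, finishTimeMs;
    ContainerUsage(long reservedMemoryMB, long startTimeMs, long finishTimeMs) {
      this.reservedMemoryMB = reservedMemoryMB;
      this.startTimeMs = startTimeMs;
      this.finishTimeMs = finishTimeMs;
    }
  }

  // Sum of (reserved MB * container lifetime in seconds) over all containers.
  static long memorySeconds(Iterable<ContainerUsage> containers) {
    long total = 0L;
    for (ContainerUsage c : containers) {
      total += c.reservedMemoryMB * ((c.finishTimeMs - c.startTimeMs) / 1000L);
    }
    return total;
  }
}
{code}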



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2534) FairScheduler: Potential integer overflow calculating totalMaxShare

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131531#comment-14131531
 ] 

Hudson commented on YARN-2534:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1894 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1894/])
YARN-2534. FairScheduler: Potential integer overflow calculating totalMaxShare. 
(Zhihai Xu via kasha) (kasha: rev c11ada5ea6d17321626e5a9a4152ff857d03aee2)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/ComputeFairShares.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


 FairScheduler: Potential integer overflow calculating totalMaxShare
 ---

 Key: YARN-2534
 URL: https://issues.apache.org/jira/browse/YARN-2534
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.6.0

 Attachments: YARN-2534.000.patch


 FairScheduler: totalMaxShare is not calculated correctly in 
 computeSharesInternal for some cases.
 If the sum of the MAX shares of all Schedulables is more than Integer.MAX_VALUE, 
 but each individual MAX share is not equal to Integer.MAX_VALUE, then 
 totalMaxShare will be a negative value, which causes all fair shares to be 
 calculated wrongly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2033) Merging generic-history into the Timeline Store

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131540#comment-14131540
 ] 

Hudson commented on YARN-2033:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1894 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1894/])
YARN-2033. Merging generic-history into the Timeline Store (Contributed by 
Zhijie Shen) (junping_du: rev 6b8b1608e64e300e4e1d23c60476febaca29ca38)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/ContainerFinishedEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAHSClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/metrics/ApplicationMetricsConstants.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationAttemptReport.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/metrics/ContainerMetricsConstants.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/SystemMetricsPublisher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/AppAttemptFinishedEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/ContainerCreatedEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/HtmlBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ContainerBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/ApplicationFinishedEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java

[jira] [Commented] (YARN-2541) Fix ResourceManagerRest.apt.vm syntax error

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131536#comment-14131536
 ] 

Hudson commented on YARN-2541:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1894 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1894/])
YARN-2541. Fixed ResourceManagerRest.apt.vm table syntax error. Contributed by 
Jian He (jianhe: rev 5633da2a018efcfac03cc1dd65af79bce2f1a11b)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm


 Fix ResourceManagerRest.apt.vm syntax error
 ---

 Key: YARN-2541
 URL: https://issues.apache.org/jira/browse/YARN-2541
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.6.0

 Attachments: YARN-2541.1.patch


 the incorrect table syntax somehow causes hadoop-yarn-site intermittent build 
 failure as in https://jira.codehaus.org/browse/DOXIA-453



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2538) Add logs when RM send new AMRMToken to ApplicationMaster

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131541#comment-14131541
 ] 

Hudson commented on YARN-2538:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1894 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1894/])
YARN-2538. Added logs when RM sends roll-overed AMRMToken to AM. Contributed by 
Xuan Gong. (zjshen: rev 469ea3dcef6e427d02fd08b859b2789cc25189f9)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java


 Add logs when RM send new AMRMToken to ApplicationMaster
 

 Key: YARN-2538
 URL: https://issues.apache.org/jira/browse/YARN-2538
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Xuan Gong
Assignee: Xuan Gong
 Fix For: 2.6.0

 Attachments: YARN-2538.1.patch, YARN-2538.1.patch


 This is for testing/debugging purpose 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2033) Merging generic-history into the Timeline Store

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131572#comment-14131572
 ] 

Hudson commented on YARN-2033:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1869 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1869/])
YARN-2033. Merging generic-history into the Timeline Store (Contributed by 
Zhijie Shen) (junping_du: rev 6b8b1608e64e300e4e1d23c60476febaca29ca38)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/AppAttemptFinishedEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/AppAttemptRegisteredEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/SystemMetricsEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/dao/AppInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/metrics/AppAttemptMetricsConstants.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/ApplicationCreatedEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/TestRMContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/dao/AppAttemptInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationAttemptReport.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/metrics/ApplicationMetricsConstants.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryManagerOnTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/ContainerCreatedEvent.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/WebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/RMApplicationHistoryWriter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/metrics/ContainerMetricsConstants.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ApplicationContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml

[jira] [Commented] (YARN-2541) Fix ResourceManagerRest.apt.vm syntax error

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131568#comment-14131568
 ] 

Hudson commented on YARN-2541:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1869 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1869/])
YARN-2541. Fixed ResourceManagerRest.apt.vm table syntax error. Contributed by 
Jian He (jianhe: rev 5633da2a018efcfac03cc1dd65af79bce2f1a11b)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm


 Fix ResourceManagerRest.apt.vm syntax error
 ---

 Key: YARN-2541
 URL: https://issues.apache.org/jira/browse/YARN-2541
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.6.0

 Attachments: YARN-2541.1.patch


 the incorrect table syntax somehow causes hadoop-yarn-site intermittent build 
 failure as in https://jira.codehaus.org/browse/DOXIA-453



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2534) FairScheduler: Potential integer overflow calculating totalMaxShare

2014-09-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131563#comment-14131563
 ] 

Hudson commented on YARN-2534:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1869 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1869/])
YARN-2534. FairScheduler: Potential integer overflow calculating totalMaxShare. 
(Zhihai Xu via kasha) (kasha: rev c11ada5ea6d17321626e5a9a4152ff857d03aee2)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/ComputeFairShares.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


 FairScheduler: Potential integer overflow calculating totalMaxShare
 ---

 Key: YARN-2534
 URL: https://issues.apache.org/jira/browse/YARN-2534
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.6.0

 Attachments: YARN-2534.000.patch


 FairScheduler: totalMaxShare is not calculated correctly in 
 computeSharesInternal for some cases.
 If the sum of the MAX shares of all Schedulables is more than Integer.MAX_VALUE, 
 but each individual MAX share is not equal to Integer.MAX_VALUE, then 
 totalMaxShare will be a negative value, which causes all fair shares to be 
 calculated wrongly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2484) FileSystemRMStateStore#readFile/writeFile should close FSData(In|Out)putStream in final block

2014-09-12 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131577#comment-14131577
 ] 

Jason Lowe commented on YARN-2484:
--

+1 lgtm.  Committing this.

 FileSystemRMStateStore#readFile/writeFile should close 
 FSData(In|Out)putStream in final block
 -

 Key: YARN-2484
 URL: https://issues.apache.org/jira/browse/YARN-2484
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
Priority: Trivial
 Attachments: YARN-2484.1.patch, YARN-2484.2.patch


 File descriptors can leak if exceptions are thrown in these methods.
 {code}
 private byte[] readFile(Path inputPath, long len) throws Exception {
   FSDataInputStream fsIn = fs.open(inputPath);
   // state data will not be that long
   byte[] data = new byte[(int) len];
   fsIn.readFully(data);
   fsIn.close();
   return data;
 }
 {code}
 {code}
 private void writeFile(Path outputPath, byte[] data) throws Exception {
   Path tempPath =
       new Path(outputPath.getParent(), outputPath.getName() + ".tmp");
   FSDataOutputStream fsOut = null;
   // This file will be overwritten when app/attempt finishes for saving the
   // final status.
   fsOut = fs.create(tempPath, true);
   fsOut.write(data);
   fsOut.close();
   fs.rename(tempPath, outputPath);
 }
 {code}
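
A sketch of the close-in-finally shape the summary asks for; this is an assumption about the fix's shape, not the attached patch.

{code}
// Sketch only: release the descriptor in a finally block so it is closed even
// when readFully throws. 'fs' is the class's FileSystem field, as in the
// snippet above.
private byte[] readFile(Path inputPath, long len) throws Exception {
  FSDataInputStream fsIn = null;
  try {
    fsIn = fs.open(inputPath);
    byte[] data = new byte[(int) len];
    fsIn.readFully(data);
    return data;
  } finally {
    if (fsIn != null) {
      fsIn.close();
    }
  }
}
{code}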



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created YARN-2547:
-

 Summary: Cross Origin Filter throws UnsupportedOperationException 
upon destroy
 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-09-12 Thread Krisztian Horvath (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131581#comment-14131581
 ] 

Krisztian Horvath commented on YARN-1964:
-

Will containers be able to communicate with each other? E.g., with Slider I can 
run HBase inside containers.

 Create Docker analog of the LinuxContainerExecutor in YARN
 --

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Arun C Murthy
Assignee: Abin Shahab
 Attachments: yarn-1964-branch-2.2.0-docker.patch, 
 yarn-1964-branch-2.2.0-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch


 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In context of YARN, the support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (entire Linux file system incl. custom versions of perl, python 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai reassigned YARN-2547:
---

Assignee: Mit Desai

 Cross Origin Filter throws UnsupportedOperationException upon destroy
 -

 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2494) [YARN-796] Node label manager API and storage implementations

2014-09-12 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2494:
-
Attachment: YARN-2494.patch

 [YARN-796] Node label manager API and storage implementations
 -

 Key: YARN-2494
 URL: https://issues.apache.org/jira/browse/YARN-2494
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2494.patch, YARN-2494.patch, YARN-2494.patch


 This JIRA includes the APIs and storage implementations of the node label manager.
 NodeLabelManager is an abstract class used to manage labels of nodes in the 
 cluster. It has APIs to query/modify
 - Nodes according to a given label
 - Labels according to a given hostname
 - Add/remove labels
 - Set labels of nodes in the cluster
 - Persist/recover changes of labels/labels-on-nodes to/from storage
 It has two implementations to store modifications:
 - Memory-based storage: it does not persist changes, so all labels will be 
 lost when the RM restarts
 - FileSystem-based storage: it persists/recovers to/from a FileSystem (like 
 HDFS), so all labels and labels-on-nodes will be recovered upon RM restart
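
A minimal sketch of an abstract API with the operations listed above; all names are illustrative assumptions, not the classes in the attached patch.

{code}
import java.io.IOException;
import java.util.Set;

public abstract class NodeLabelManagerSketch {
  public abstract Set<String> getNodesWithLabel(String label);
  public abstract Set<String> getLabelsOnNode(String hostname);
  public abstract void addLabels(Set<String> labels) throws IOException;
  public abstract void removeLabels(Set<String> labels) throws IOException;
  public abstract void setLabelsOnNode(String hostname, Set<String> labels)
      throws IOException;
  // Recover labels and labels-on-nodes from the backing store (a no-op for a
  // memory store, a replay from HDFS for a FileSystem store).
  public abstract void recover() throws IOException;
}
{code}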



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2498) [YARN-796] Respect labels in preemption policy of capacity scheduler

2014-09-12 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2498:
-
Attachment: YARN-2498.patch

 [YARN-796] Respect labels in preemption policy of capacity scheduler
 

 Key: YARN-2498
 URL: https://issues.apache.org/jira/browse/YARN-2498
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2498.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2496) [YARN-796] Changes for capacity scheduler to support allocate resource respect labels

2014-09-12 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2496:
-
Attachment: YARN-2496.patch

 [YARN-796] Changes for capacity scheduler to support allocate resource 
 respect labels
 -

 Key: YARN-2496
 URL: https://issues.apache.org/jira/browse/YARN-2496
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2496.patch, YARN-2496.patch, YARN-2496.patch, 
 YARN-2496.patch


 This JIRA includes:
 - Add/parse a labels option in {{capacity-scheduler.xml}}, similar to other 
 queue options like capacity/maximum-capacity, etc.
 - Include a default-label-expression option in the queue config; if an app 
 doesn't specify a label-expression, the queue's default-label-expression will be 
 used.
 - Check whether labels can be accessed by the queue when submitting an app with 
 a label-expression to the queue or updating a ResourceRequest with a label-expression
 - Check labels on the NM when trying to allocate a ResourceRequest with a 
 label-expression on that NM
 - Respect labels when calculating headroom/user-limit



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2502) [YARN-796] Changes in distributed shell to support specify labels

2014-09-12 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2502:
-
Attachment: YARN-2502.patch

 [YARN-796] Changes in distributed shell to support specify labels
 -

 Key: YARN-2502
 URL: https://issues.apache.org/jira/browse/YARN-2502
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2502.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2503) [YARN-796] Changes in RM Web UI to better show labels to end users

2014-09-12 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2503:
-
Attachment: YARN-2503.patch

 [YARN-796] Changes in RM Web UI to better show labels to end users
 --

 Key: YARN-2503
 URL: https://issues.apache.org/jira/browse/YARN-2503
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2503.patch


 Includes, but is not limited to:
 - Show labels of nodes on the RM nodes page
 - Show labels of queues on the RM scheduler page
 - Warn the user/admin if a queue's capacity cannot be guaranteed because of 
 misconfigured labels.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2502) [YARN-796] Changes in distributed shell to support specify labels

2014-09-12 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2502:
-
Attachment: (was: YARN-2502.patch)

 [YARN-796] Changes in distributed shell to support specify labels
 -

 Key: YARN-2502
 URL: https://issues.apache.org/jira/browse/YARN-2502
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2502.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2496) [YARN-796] Changes for capacity scheduler to support allocate resource respect labels

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131616#comment-14131616
 ] 

Hadoop QA commented on YARN-2496:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668345/YARN-2496.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 10 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4918//console

This message is automatically generated.

 [YARN-796] Changes for capacity scheduler to support allocate resource 
 respect labels
 -

 Key: YARN-2496
 URL: https://issues.apache.org/jira/browse/YARN-2496
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2496.patch, YARN-2496.patch, YARN-2496.patch, 
 YARN-2496.patch


 This JIRA includes:
 - Add/parse a labels option in {{capacity-scheduler.xml}}, similar to other 
 queue options like capacity/maximum-capacity, etc.
 - Include a default-label-expression option in the queue config; if an app 
 doesn't specify a label-expression, the queue's default-label-expression will be 
 used.
 - Check whether labels can be accessed by the queue when submitting an app with 
 a label-expression to the queue or updating a ResourceRequest with a label-expression
 - Check labels on the NM when trying to allocate a ResourceRequest with a 
 label-expression on that NM
 - Respect labels when calculating headroom/user-limit



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2494) [YARN-796] Node label manager API and storage implementations

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131615#comment-14131615
 ] 

Hadoop QA commented on YARN-2494:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668344/YARN-2494.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4916//console

This message is automatically generated.

 [YARN-796] Node label manager API and storage implementations
 -

 Key: YARN-2494
 URL: https://issues.apache.org/jira/browse/YARN-2494
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2494.patch, YARN-2494.patch, YARN-2494.patch


 This JIRA includes the APIs and storage implementations of the node label manager.
 NodeLabelManager is an abstract class used to manage labels of nodes in the 
 cluster. It has APIs to query/modify:
 - Nodes according to a given label
 - Labels according to a given hostname
 - Add/remove labels
 - Set labels of nodes in the cluster
 - Persist/recover changes of labels/labels-on-nodes to/from storage
 It has two implementations to store modifications:
 - Memory-based storage: it does not persist changes, so all labels are 
 lost when the RM restarts
 - FileSystem-based storage: it persists/recovers to/from a FileSystem (like 
 HDFS), and all labels and labels-on-nodes are recovered upon RM restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests

2014-09-12 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131626#comment-14131626
 ] 

Wangda Tan commented on YARN-796:
-

Split and updated all existing patches for YARN-796 against latest trunk, patch 
dependencies:

{code}
  YARN-2493;YARN-2544
  |  \
   YARN-2494   YARN-2501;YARN-2502
  |
   YARN-2500
  |
 YARN-2596
   / | \
  YARN-2598  YARN-2504 YARN-2505
   |
   YARN-2503
{code}
Please kindly review.

Thanks,
Wangda

 Allow for (admin) labels on nodes and resource-requests
 ---

 Key: YARN-796
 URL: https://issues.apache.org/jira/browse/YARN-796
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.4.1
Reporter: Arun C Murthy
Assignee: Wangda Tan
 Attachments: LabelBasedScheduling.pdf, 
 Node-labels-Requirements-Design-doc-V1.pdf, 
 Node-labels-Requirements-Design-doc-V2.pdf, YARN-796-Diagram.pdf, 
 YARN-796.node-label.consolidate.1.patch, 
 YARN-796.node-label.consolidate.2.patch, 
 YARN-796.node-label.consolidate.3.patch, 
 YARN-796.node-label.consolidate.4.patch, YARN-796.node-label.demo.patch.1, 
 YARN-796.patch, YARN-796.patch4


 It will be useful for admins to specify labels for nodes. Examples of labels 
 are OS, processor architecture etc.
 We should expose these labels and allow applications to specify labels on 
 resource-requests.
 Obviously we need to support admin operations on adding/removing node labels.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2501) [YARN-796] Changes in AMRMClient to support labels

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131630#comment-14131630
 ] 

Hadoop QA commented on YARN-2501:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668348/YARN-2501.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4922//console

This message is automatically generated.

 [YARN-796] Changes in AMRMClient to support labels
 --

 Key: YARN-2501
 URL: https://issues.apache.org/jira/browse/YARN-2501
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2501.patch


 Changes in AMRMClient to support labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2503) [YARN-796] Changes in RM Web UI to better show labels to end users

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131629#comment-14131629
 ] 

Hadoop QA commented on YARN-2503:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668350/YARN-2503.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4919//console

This message is automatically generated.

 [YARN-796] Changes in RM Web UI to better show labels to end users
 --

 Key: YARN-2503
 URL: https://issues.apache.org/jira/browse/YARN-2503
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2503.patch


 Includes, but is not limited to:
 - Show labels of nodes on the RM nodes page
 - Show labels of queues on the RM scheduler page
 - Warn the user/admin if a queue's capacity cannot be guaranteed because of 
 misconfigured labels.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2500) [YARN-796] Miscellaneous changes in ResourceManager to support labels

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131634#comment-14131634
 ] 

Hadoop QA commented on YARN-2500:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668347/YARN-2500.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 10 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4923//console

This message is automatically generated.

 [YARN-796] Miscellaneous changes in ResourceManager to support labels
 -

 Key: YARN-2500
 URL: https://issues.apache.org/jira/browse/YARN-2500
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2500.patch, YARN-2500.patch, YARN-2500.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2504) [YARN-796] Support get/add/remove/change labels in RM admin CLI

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131637#comment-14131637
 ] 

Hadoop QA commented on YARN-2504:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668354/YARN-2504.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4924//console

This message is automatically generated.

 [YARN-796] Support get/add/remove/change labels in RM admin CLI 
 

 Key: YARN-2504
 URL: https://issues.apache.org/jira/browse/YARN-2504
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2504.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-796) Allow for (admin) labels on nodes and resource-requests

2014-09-12 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-796:

Attachment: YARN-796.node-label.consolidate.5.patch

 Allow for (admin) labels on nodes and resource-requests
 ---

 Key: YARN-796
 URL: https://issues.apache.org/jira/browse/YARN-796
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.4.1
Reporter: Arun C Murthy
Assignee: Wangda Tan
 Attachments: LabelBasedScheduling.pdf, 
 Node-labels-Requirements-Design-doc-V1.pdf, 
 Node-labels-Requirements-Design-doc-V2.pdf, YARN-796-Diagram.pdf, 
 YARN-796.node-label.consolidate.1.patch, 
 YARN-796.node-label.consolidate.2.patch, 
 YARN-796.node-label.consolidate.3.patch, 
 YARN-796.node-label.consolidate.4.patch, 
 YARN-796.node-label.consolidate.5.patch, YARN-796.node-label.demo.patch.1, 
 YARN-796.patch, YARN-796.patch4


 It will be useful for admins to specify labels for nodes. Examples of labels 
 are OS, processor architecture etc.
 We should expose these labels and allow applications to specify labels on 
 resource-requests.
 Obviously we need to support admin operations on adding/removing node labels.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-09-12 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131640#comment-14131640
 ] 

Abin Shahab commented on YARN-1964:
---

We have decided to create an umbrella issue to cover the integration between 
YARN and Docker (YARN-2466). 
This task (YARN-1964) has the following scope:
1) Launch Docker containers from YARN with net=host mode. This will allow the 
container to take on the host's network, so YARN administrators 
will not need to set up special networking for Docker.
2) Allow users to provide Docker images through the job configuration (a small 
sketch of what that could look like follows below).
3) Setup and user guides.

The rest (secure Hadoop, advanced networking) will be handled in other issues 
under YARN-2466. 
Please add your feedback on this plan to this jira.
Thanks!
Abin
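
As a purely illustrative sketch of point 2, here is what providing the image through the job configuration could look like from the client side; the property name is a placeholder assumption, not a key defined by this patch:

{code}
import org.apache.hadoop.conf.Configuration;

public class DockerImageJobConfExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Placeholder property name: whatever key the patch finally settles on.
    conf.set("yarn.nodemanager.docker-container-executor.image-name",
        "example/hadoop-image:latest");
    System.out.println(
        conf.get("yarn.nodemanager.docker-container-executor.image-name"));
  }
}
{code}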



 Create Docker analog of the LinuxContainerExecutor in YARN
 --

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Arun C Murthy
Assignee: Abin Shahab
 Attachments: yarn-1964-branch-2.2.0-docker.patch, 
 yarn-1964-branch-2.2.0-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch


 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In the context of YARN, support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (an entire Linux file system, including custom versions of Perl, Python, 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2544) [YARN-796] Common server side PB changes (not include user API PB changes)

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131652#comment-14131652
 ] 

Hadoop QA commented on YARN-2544:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668355/YARN-2544.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

  {color:red}-1 javac{color}.  The applied patch generated 1329 javac 
compiler warnings (more than the trunk's current 1301 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

  org.apache.hadoop.yarn.api.TestPBImplRecords

  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

org.apache.hadoop.yarn.webapp.view.TestHtmlBlock

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4921//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/4921//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4921//console

This message is automatically generated.

 [YARN-796] Common server side PB changes (not include user API PB changes)
 --

 Key: YARN-2544
 URL: https://issues.apache.org/jira/browse/YARN-2544
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2544.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2502) [YARN-796] Changes in distributed shell to support specify labels

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131653#comment-14131653
 ] 

Hadoop QA commented on YARN-2502:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668353/YARN-2502.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

  {color:red}-1 javac{color}.  The applied patch generated 1329 javac 
compiler warnings (more than the trunk's current 1301 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

  org.apache.hadoop.yarn.api.TestPBImplRecords

  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

org.apache.hadoop.yarn.webapp.view.TestHtmlBlock

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4920//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/4920//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4920//console

This message is automatically generated.

 [YARN-796] Changes in distributed shell to support specify labels
 -

 Key: YARN-2502
 URL: https://issues.apache.org/jira/browse/YARN-2502
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2502.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2534) FairScheduler: Potential integer overflow calculating totalMaxShare

2014-09-12 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131673#comment-14131673
 ] 

zhihai xu commented on YARN-2534:
-

[~kasha], thanks for reviewing and committing the patch.

 FairScheduler: Potential integer overflow calculating totalMaxShare
 ---

 Key: YARN-2534
 URL: https://issues.apache.org/jira/browse/YARN-2534
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.6.0

 Attachments: YARN-2534.000.patch


 FairScheduler: totalMaxShare is not calculated correctly in 
 computeSharesInternal for some cases.
 If the sum of the max shares of all Schedulables is more than Integer.MAX_VALUE, 
 but no individual max share equals Integer.MAX_VALUE, then 
 totalMaxShare overflows to a negative value, which causes all fair shares to be 
 calculated incorrectly.
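
For illustration only (this is not the actual computeSharesInternal code), a standalone sketch of the overflow and of one way to guard against it by accumulating in a long and capping at Integer.MAX_VALUE:

{code}
public class TotalMaxShareOverflowDemo {
  public static void main(String[] args) {
    // Two hypothetical Schedulables whose max shares are each well below
    // Integer.MAX_VALUE but whose sum is not.
    int[] maxShares = { 2_000_000_000, 2_000_000_000 };

    int totalMaxShareInt = 0;
    for (int m : maxShares) {
      totalMaxShareInt += m;                // wraps around
    }
    System.out.println(totalMaxShareInt);   // prints -294967296

    // One way to guard: accumulate in a long and cap the result.
    long total = 0L;
    for (int m : maxShares) {
      total += m;
    }
    int totalMaxShare = (int) Math.min(total, Integer.MAX_VALUE);
    System.out.println(totalMaxShare);      // prints 2147483647
  }
}
{code}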



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-2547:

Attachment: YARN-2547.patch

Attaching the patch.

 Cross Origin Filter throws UnsupportedOperationException upon destroy
 -

 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2547.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131682#comment-14131682
 ] 

Mit Desai commented on YARN-2547:
-

Refining the patch. Will update shortly

 Cross Origin Filter throws UnsupportedOperationException upon destroy
 -

 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2547.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-2547:

Attachment: YARN-2547.patch

Updated the patch.

 Cross Origin Filter throws UnsupportedOperationException upon destroy
 -

 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2547.patch, YARN-2547.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1710) Admission Control: agents to allocate reservation

2014-09-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131700#comment-14131700
 ] 

Chris Douglas commented on YARN-1710:
-

bq. I am not memoizing findEarliestTime, as it would only save one invocation 
(the others are on diff sets, or updated version of the same set)

I'm confused. There are three invocations:
{code}
if (findEarliestTime(allocations.keySet()) > earliestStart) {
  allocations.put(new ReservationInterval(earliestStart,
  findEarliestTime(allocations.keySet())), ZERO_RES);
}
ReservationAllocation capReservation =
new InMemoryReservationAllocation(reservationId, contract, user,
plan.getQueueName(), findEarliestTime(allocations.keySet()),
findLatestTime(allocations.keySet()), allocations,
plan.getResourceCalculator(), plan.getMinimumAllocation());
{code}
Isn't the earliest time either the earliest in the set, or the start of the 
interval this just added?
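
For concreteness, a sketch (reusing the names from the snippet above, so treat it as illustrative rather than the actual patch) of what hoisting the repeated call into a local could look like:

{code}
// Compute the earliest allocation start once and reuse it, instead of calling
// findEarliestTime(allocations.keySet()) three times.
long earliestAllocation = findEarliestTime(allocations.keySet());
if (earliestAllocation > earliestStart) {
  allocations.put(new ReservationInterval(earliestStart, earliestAllocation),
      ZERO_RES);
  // the zero-resource padding interval now starts the plan
  earliestAllocation = earliestStart;
}
ReservationAllocation capReservation =
    new InMemoryReservationAllocation(reservationId, contract, user,
        plan.getQueueName(), earliestAllocation,
        findLatestTime(allocations.keySet()), allocations,
        plan.getResourceCalculator(), plan.getMinimumAllocation());
{code}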

 Admission Control: agents to allocate reservation
 -

 Key: YARN-1710
 URL: https://issues.apache.org/jira/browse/YARN-1710
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1710.1.patch, YARN-1710.2.patch, YARN-1710.patch


 This JIRA tracks the algorithms used to allocate a user ReservationRequest 
 coming in from the new reservation API (YARN-1708), in the inventory 
 subsystem (YARN-1709) maintaining the current plan for the cluster. The focus 
 of these agents is to quickly find a solution for the set of constraints 
 provided by the user, and the physical constraints of the plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131711#comment-14131711
 ] 

Jonathan Eagles commented on YARN-2547:
---

[~mitdesai], thanks for the quick fix posted. Main changes look good. A couple of 
minor things related to the test code: instead of testing for this one exception 
being thrown, which is an implementation detail, what we really want to test is that 
restart works (init - destroy - init). That way the test conveys the 
functionality of the filter we are trying to ensure. 
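
A rough sketch of such a test, assuming it sits next to CrossOriginFilter and uses Mockito to stub FilterConfig (the test name and setup are illustrative, not the attached patch):

{code}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.Collections;

import javax.servlet.FilterConfig;

import org.junit.Test;

public class TestCrossOriginFilterRestart {

  @Test
  public void testFilterCanBeReinitializedAfterDestroy() throws Exception {
    CrossOriginFilter filter = new CrossOriginFilter();
    FilterConfig config = mock(FilterConfig.class);
    when(config.getInitParameterNames())
        .thenReturn(Collections.enumeration(Collections.<String>emptyList()));

    filter.init(config);
    filter.destroy();
    // Re-initializing after destroy is the behavior we actually want to
    // guarantee; it should not throw.
    filter.init(config);
    filter.destroy();
  }
}
{code}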

 Cross Origin Filter throws UnsupportedOperationException upon destroy
 -

 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2547.patch, YARN-2547.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2475) ReservationSystem: replan upon capacity reduction

2014-09-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131713#comment-14131713
 ] 

Chris Douglas commented on YARN-2475:
-

+1, other than a couple very minor nits:
* the new cstr accepting {{Clock}} can be package-private, with the no-arg cstr 
calling {{this(new UTCClock());}} (comment unnecessary, or replace with 
{{@VisibleForTesting}}); a sketch of this arrangement follows below
* The unit test could have a more descriptive name than {{test()}}, declare 
{{PlanningException}} in its throws clause instead of calling 
{{Assert::fail()}} on catching it, and not declare {{InterruptedException}}, 
which it no longer throws

Just a minor clarification: as this iterates over each instant of the plan, are 
others allowed to modify it?
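
For illustration, a sketch of the constructor arrangement from the first bullet; the class and field names are made up, and the {{Clock}}/{{UTCClock}} import paths are assumptions rather than lines from the patch:

{code}
import com.google.common.annotations.VisibleForTesting;
import org.apache.hadoop.yarn.util.Clock;
import org.apache.hadoop.yarn.util.UTCClock;

class ReplannerLike {
  private final Clock clock;

  // Production path: the no-arg constructor delegates with a real clock.
  public ReplannerLike() {
    this(new UTCClock());
  }

  // Package-private so tests in the same package can inject a controllable clock.
  @VisibleForTesting
  ReplannerLike(Clock clock) {
    this.clock = clock;
  }

  long now() {
    return clock.getTime();
  }
}
{code}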

 ReservationSystem: replan upon capacity reduction
 -

 Key: YARN-2475
 URL: https://issues.apache.org/jira/browse/YARN-2475
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-2475.patch, YARN-2475.patch


 In the context of YARN-1051, if capacity of the cluster drops significantly 
 upon machine failures we need to trigger a reorganization of the planned 
 reservations. As reservations are absolute it is possible that they will 
 not all fit, and some need to be rejected a-posteriori.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131719#comment-14131719
 ] 

Hadoop QA commented on YARN-2547:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668374/YARN-2547.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4925//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4925//console

This message is automatically generated.

 Cross Origin Filter throws UnsupportedOperationException upon destroy
 -

 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2547.patch, YARN-2547.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1709) Admission Control: Reservation subsystem

2014-09-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131727#comment-14131727
 ] 

Chris Douglas commented on YARN-1709:
-

Thanks for the updates. Just a few minor tweaks, then I'm +1
* In checking the preconditions:
{code}
if (!readWriteLock.isWriteLockedByCurrentThread()) {
  return;
}
{code}
The intent was to {{assert}} and crash, so tests against this code can detect 
violations if the code is modified. When assertions are disabled, the check is 
elided
* Instead of two cstrs that each assign all the final fields, the no-arg cstr should call 
the other (see the sketch below)
* Instead of explicitly throwing {{ClassCastException}}, this should just 
attempt the cast. The cause is implicit, and doesn't require a custom error 
string
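
For illustration, a sketch of both suggestions on a made-up class (this is not the actual patch code):

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

class PlanLike {
  private final ReentrantReadWriteLock readWriteLock =
      new ReentrantReadWriteLock();
  private final String queueName;
  private final long step;

  // The full constructor assigns all the final fields once.
  PlanLike(String queueName, long step) {
    this.queueName = queueName;
    this.step = step;
  }

  // The no-arg constructor delegates instead of duplicating the assignments.
  PlanLike() {
    this("default", 1000L);
  }

  void addReservation() {
    // Assert-and-crash: tests running with -ea detect violations if the
    // locking discipline is broken later; with assertions disabled the
    // check is elided.
    assert readWriteLock.isWriteLockedByCurrentThread()
        : "must hold the write lock";
    // ... mutate internal state ...
  }
}
{code}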

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch, YARN-1709.patch, YARN-1709.patch, 
 YARN-1709.patch, YARN-1709.patch, YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2314) ContainerManagementProtocolProxy can create thousands of threads for a large cluster

2014-09-12 Thread Lohit Vijayarenu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131732#comment-14131732
 ] 

Lohit Vijayarenu commented on YARN-2314:


We hit the same problem on one of our large clusters with more than 2.5K nodes. As a 
workaround we ended up increasing the container size to 6G for the AM (and with a 
vmem-pmem ratio of 2:1 we give away 12G of vmem for the AM container). From an initial 
look at this, there is no way to turn this behavior off via config, other than 
patching the code, right?
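
For illustration only (a made-up class, not the actual ContainerManagementProtocolProxy code), the kind of hard bound the proxy cache presumably needs: a LinkedHashMap in access order that evicts the least recently used proxy once the configured size is exceeded. A real fix would also have to stop the evicted proxy and cope with in-flight callers.

{code}
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedProxyCache<P> {
  private final Map<String, P> cache;

  BoundedProxyCache(final int maxSize) {
    // Access-ordered map; removeEldestEntry enforces the bound on every put.
    this.cache = new LinkedHashMap<String, P>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, P> eldest) {
        return size() > maxSize;
      }
    };
  }

  synchronized P get(String nmAddress) {
    return cache.get(nmAddress);
  }

  synchronized void put(String nmAddress, P proxy) {
    cache.put(nmAddress, proxy);
  }
}
{code}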

 ContainerManagementProtocolProxy can create thousands of threads for a large 
 cluster
 

 Key: YARN-2314
 URL: https://issues.apache.org/jira/browse/YARN-2314
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Priority: Critical
 Attachments: nmproxycachefix.prototype.patch


 ContainerManagementProtocolProxy has a cache of NM proxies, and the size of 
 this cache is configurable.  However the cache can grow far beyond the 
 configured size when running on a large cluster and blow AM address/container 
 limits.  More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2540) Fair Scheduler : queue filters not working on scheduler page in RM UI

2014-09-12 Thread Ashwin Shankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashwin Shankar updated YARN-2540:
-
Attachment: YARN-2540-v1.txt

 Fair Scheduler : queue filters not working on scheduler page in RM UI
 -

 Key: YARN-2540
 URL: https://issues.apache.org/jira/browse/YARN-2540
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0, 2.5.1
Reporter: Ashwin Shankar
Assignee: Ashwin Shankar
 Attachments: YARN-2540-v1.txt


 Steps to reproduce :
 1. Run an app in default queue.
 2. While the app is running, go to the scheduler page on RM UI.
 3. You would see the app in the apptable at the bottom.
 4. Now click on default queue to filter the apptable on root.default.
 5. App disappears from apptable although it is running on default queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2542:
--
Attachment: YARN-2542.3.patch

Patch looks good; just added an N/A string in case appUsage doesn't exist.

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch, YARN-2542.2.patch, YARN-2542.3.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When app is finished, there's no 
 usageReport.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1710) Admission Control: agents to allocate reservation

2014-09-12 Thread Carlo Curino (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Curino updated YARN-1710:
---
Attachment: YARN-1710.3.patch

 Admission Control: agents to allocate reservation
 -

 Key: YARN-1710
 URL: https://issues.apache.org/jira/browse/YARN-1710
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1710.1.patch, YARN-1710.2.patch, YARN-1710.3.patch, 
 YARN-1710.patch


 This JIRA tracks the algorithms used to allocate a user ReservationRequest 
 coming in from the new reservation API (YARN-1708), in the inventory 
 subsystem (YARN-1709) maintaining the current plan for the cluster. The focus 
 of these agents is to quickly find a solution for the set of constraints 
 provided by the user, and the physical constraints of the plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1710) Admission Control: agents to allocate reservation

2014-09-12 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131772#comment-14131772
 ] 

Carlo Curino commented on YARN-1710:


I understand what you meant (after our brief chat). I addressed it in the 
version I just uploaded (v3).

 Admission Control: agents to allocate reservation
 -

 Key: YARN-1710
 URL: https://issues.apache.org/jira/browse/YARN-1710
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1710.1.patch, YARN-1710.2.patch, YARN-1710.3.patch, 
 YARN-1710.patch


 This JIRA tracks the algorithms used to allocate a user ReservationRequest 
 coming in from the new reservation API (YARN-1708), in the inventory 
 subsystem (YARN-1709) maintaining the current plan for the cluster. The focus 
 of these agents is to quickly find a solution for the set of constraints 
 provided by the user, and the physical constraints of the plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2540) Fair Scheduler : queue filters not working on scheduler page in RM UI

2014-09-12 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131773#comment-14131773
 ] 

Ashwin Shankar commented on YARN-2540:
--

The attached patch fixes this issue. The problem is that the FairScheduler represents queue 
names as fully qualified names (root.blah.blah), while the 
filtering logic in FairSchedulerPage.java filters based on only a substring of the 
queue name.
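
To make the mismatch concrete, a tiny standalone illustration (plain Java, not the UI code) of why a regex built from only the leaf queue name never matches the fully qualified name shown in the apps table:

{code}
public class QueueFilterMismatchDemo {
  public static void main(String[] args) {
    // FairScheduler reports the fully qualified queue name in the app table.
    String appQueueColumn = "root.default";

    // The scheduler page builds its filter from the last path component only.
    String clicked = "root.default";
    String leaf = clicked.substring(clicked.lastIndexOf('.') + 1); // "default"
    String regex = "^" + leaf + "$";

    // The regex matches "default" but not "root.default", so the running app
    // disappears from the filtered table.
    System.out.println(appQueueColumn.matches(regex)); // false
  }
}
{code}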

 Fair Scheduler : queue filters not working on scheduler page in RM UI
 -

 Key: YARN-2540
 URL: https://issues.apache.org/jira/browse/YARN-2540
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0, 2.5.1
Reporter: Ashwin Shankar
Assignee: Ashwin Shankar
 Attachments: YARN-2540-v1.txt


 Steps to reproduce :
 1. Run an app in default queue.
 2. While the app is running, go to the scheduler page on RM UI.
 3. You would see the app in the apptable at the bottom.
 4. Now click on default queue to filter the apptable on root.default.
 5. App disappears from apptable although it is running on default queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2104) Scheduler queue filter failed to work because index of queue column changed

2014-09-12 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131781#comment-14131781
 ] 

Ashwin Shankar commented on YARN-2104:
--

Created YARN-2540 and posted a patch to fix this issue in fair scheduler.

 Scheduler queue filter failed to work because index of queue column changed
 ---

 Key: YARN-2104
 URL: https://issues.apache.org/jira/browse/YARN-2104
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, webapp
Affects Versions: 2.4.0
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.5.0

 Attachments: YARN-2104.patch


 YARN-563 added
 {code}
 + th(".type", "Application Type").
 {code}
 to the application table, which changes the queue column's index from 3 to 4. But on the 
 scheduler page, the queue column index is hard-coded to 3 when filtering 
 applications by queue name:
 {code}
   if (q == 'root') q = '';
   else q = '^' + q.substr(q.lastIndexOf('.') + 1) + '$';
   $('#apps').dataTable().fnFilter(q, 3, true);
 {code}
 So the queue filter does not work on the application page.
 Reproduce steps (thanks to Bo Yang for pointing this out):
 {code}
 1) In default setup, there’s a default queue under root queue
 2) Run an arbitrary application, you can find it in “Applications” page
 3) Click “Default” queue in scheduler page
 4) Click “Applications”, no application will show here
 5) Click “Root” queue in scheduler page
 6) Click “Applications”, application will show again
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1710) Admission Control: agents to allocate reservation

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131797#comment-14131797
 ] 

Hadoop QA commented on YARN-1710:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668391/YARN-1710.3.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4926//console

This message is automatically generated.

 Admission Control: agents to allocate reservation
 -

 Key: YARN-1710
 URL: https://issues.apache.org/jira/browse/YARN-1710
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1710.1.patch, YARN-1710.2.patch, YARN-1710.3.patch, 
 YARN-1710.patch


 This JIRA tracks the algorithms used to allocate a user ReservationRequest 
 coming in from the new reservation API (YARN-1708), in the inventory 
 subsystem (YARN-1709) maintaining the current plan for the cluster. The focus 
 of these agents is to quickly find a solution for the set of constraints 
 provided by the user, and the physical constraints of the plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2032) Implement a scalable, available TimelineStore using HBase

2014-09-12 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-2032:

Attachment: (was: YARN-2032-091114.patch)

 Implement a scalable, available TimelineStore using HBase
 -

 Key: YARN-2032
 URL: https://issues.apache.org/jira/browse/YARN-2032
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Li Lu
 Attachments: YARN-2032-branch-2-1.patch, YARN-2032-branch2-2.patch


 As discussed on YARN-1530, we should pursue implementing a scalable, 
 available Timeline store using HBase.
 One goal is to reuse most of the code from the levelDB Based store - 
 YARN-1635.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2032) Implement a scalable, available TimelineStore using HBase

2014-09-12 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-2032:

Attachment: YARN-2032-091114.patch

Reapplied my patch to the latest trunk branch locally several times but could not 
reproduce the javac failure. Re-uploading the patch to see if this is a persistent 
failure. 

 Implement a scalable, available TimelineStore using HBase
 -

 Key: YARN-2032
 URL: https://issues.apache.org/jira/browse/YARN-2032
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Li Lu
 Attachments: YARN-2032-091114.patch, YARN-2032-branch-2-1.patch, 
 YARN-2032-branch2-2.patch


 As discussed on YARN-1530, we should pursue implementing a scalable, 
 available Timeline store using HBase.
 One goal is to reuse most of the code from the levelDB Based store - 
 YARN-1635.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2539) FairScheduler: Update the default value for maxAMShare

2014-09-12 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131818#comment-14131818
 ] 

Ashwin Shankar commented on YARN-2539:
--

Sounds good, thanks.

 FairScheduler: Update the default value for maxAMShare
 --

 Key: YARN-2539
 URL: https://issues.apache.org/jira/browse/YARN-2539
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2539-1.patch


 Currently, the maxAMShare per queue defaults to -1, which disables the AM 
 share constraint. Changing it to 0.5f would be good.
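
As a rough illustration (simplified arithmetic, not the actual FairScheduler computation), the proposed default would bound per-queue AM resources like this:

{code}
public class MaxAMShareIllustration {
  public static void main(String[] args) {
    float maxAMShare = 0.5f;            // proposed default
    long queueFairShareMb = 100 * 1024; // hypothetical fair share: 100 GB

    // A negative value (the current -1 default) leaves AMs unconstrained;
    // otherwise AMs may use at most maxAMShare of the queue's fair share.
    long maxAMResourceMb = maxAMShare < 0
        ? Long.MAX_VALUE
        : (long) (queueFairShareMb * maxAMShare);

    System.out.println(maxAMResourceMb + " MB available for AM containers");
  }
}
{code}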



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2032) Implement a scalable, available TimelineStore using HBase

2014-09-12 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-2032:

Attachment: (was: YARN-2032-091114.patch)

 Implement a scalable, available TimelineStore using HBase
 -

 Key: YARN-2032
 URL: https://issues.apache.org/jira/browse/YARN-2032
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Li Lu
 Attachments: YARN-2032-branch-2-1.patch, YARN-2032-branch2-2.patch


 As discussed on YARN-1530, we should pursue implementing a scalable, 
 available Timeline store using HBase.
 One goal is to reuse most of the code from the levelDB Based store - 
 YARN-1635.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131830#comment-14131830
 ] 

Hadoop QA commented on YARN-2542:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668390/YARN-2542.3.patch
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:

  org.apache.hadoop.yarn.client.cli.TestYarnCLI

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4927//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4927//console

This message is automatically generated.

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch, YARN-2542.2.patch, YARN-2542.3.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When app is finished, there's no 
 usageReport.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2032) Implement a scalable, available TimelineStore using HBase

2014-09-12 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-2032:

Attachment: YARN-2032-091114.patch

 Implement a scalable, available TimelineStore using HBase
 -

 Key: YARN-2032
 URL: https://issues.apache.org/jira/browse/YARN-2032
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Li Lu
 Attachments: YARN-2032-091114.patch, YARN-2032-branch-2-1.patch, 
 YARN-2032-branch2-2.patch


 As discussed on YARN-1530, we should pursue implementing a scalable, 
 available Timeline store using HBase.
 One goal is to reuse most of the code from the levelDB Based store - 
 YARN-1635.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2456) Possible livelock in CapacityScheduler when RM is recovering apps

2014-09-12 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2456:
--
Attachment: YARN-2456.2.patch

patch rebased

 Possible livelock in CapacityScheduler when RM is recovering apps
 -

 Key: YARN-2456
 URL: https://issues.apache.org/jira/browse/YARN-2456
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-2456.1.patch, YARN-2456.2.patch


 Consider this scenario:
 1. RM is configured with a single queue and only one application can be 
 active at a time.
 2. Submit App1 which uses up the queue's whole capacity
 3. Submit App2 which remains pending.
 4. Restart RM.
 5. App2 is recovered before App1, so App2 is added to the activeApplications 
 list. Now App1 remains pending (because of max-active-app limit)
 6. All containers of App1 are now recovered when NM registers, and use up the 
 whole queue capacity again.
 7. Since the queue is full, App2 cannot proceed to allocate AM container.
 8. Meanwhile, App1 cannot proceed to become active because of the 
 max-active-app limit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-2547:

Attachment: YARN-2547.patch

Thanks for the feedback, [~jeagles]. Uploading a new patch with the modified test.

 Cross Origin Filter throws UnsupportedOperationException upon destroy
 -

 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2547.patch, YARN-2547.patch, YARN-2547.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1372) Ensure all completed containers are reported to the AMs across RM restart

2014-09-12 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-1372:

Attachment: YARN-1372.005.patch

Fixed unit test failure

 Ensure all completed containers are reported to the AMs across RM restart
 -

 Key: YARN-1372
 URL: https://issues.apache.org/jira/browse/YARN-1372
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Anubhav Dhoot
 Attachments: YARN-1372.001.patch, YARN-1372.001.patch, 
 YARN-1372.002_NMHandlesCompletedApp.patch, 
 YARN-1372.002_RMHandlesCompletedApp.patch, 
 YARN-1372.002_RMHandlesCompletedApp.patch, YARN-1372.003.patch, 
 YARN-1372.004.patch, YARN-1372.005.patch, YARN-1372.prelim.patch, 
 YARN-1372.prelim2.patch


 Currently the NM informs the RM about completed containers and then removes 
 those containers from the RM notification list. The RM passes on that 
 completed container information to the AM and the AM pulls this data. If the 
 RM dies before the AM pulls this data then the AM may not be able to get this 
 information again. To fix this, NM should maintain a separate list of such 
 completed container notifications sent to the RM. After the AM has pulled the 
 containers from the RM then the RM will inform the NM about it and the NM can 
 remove the completed container from the new list. Upon re-register with the 
 RM (after RM restart) the NM should send the entire list of completed 
 containers to the RM along with any other containers that completed while the 
 RM was dead. This ensures that the RM can inform the AM's about all completed 
 containers. Some container completions may be reported more than once since 
 the AM may have pulled the container but the RM may die before notifying the 
 NM about the pull.
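
A minimal sketch, with made-up names (not the patch), of the NM-side bookkeeping described above: completed containers stay in a pending-acknowledgement set until the RM confirms the AM has pulled them, and the whole set is resent when the NM re-registers after an RM restart.

{code}
import java.util.HashSet;
import java.util.Set;

class CompletedContainerTracker {
  // Completed containers reported to the RM but not yet acknowledged as
  // pulled by the AM.
  private final Set<String> pendingAck = new HashSet<String>();

  // Called when a container completes: remember it until the RM acks the pull.
  synchronized void containerCompleted(String containerId) {
    pendingAck.add(containerId);
  }

  // Heartbeat (or re-registration) payload: everything still awaiting an ack.
  // Resending the full set after an RM restart is why some completions may be
  // reported more than once.
  synchronized Set<String> containersToReport() {
    return new HashSet<String>(pendingAck);
  }

  // Called when the RM confirms the AM has pulled these container statuses.
  synchronized void ackedByRM(Set<String> pulledContainerIds) {
    pendingAck.removeAll(pulledContainerIds);
  }
}
{code}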



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131848#comment-14131848
 ] 

Jonathan Eagles commented on YARN-2547:
---

+1 pending QA comment. Thanks, Mit.

 Cross Origin Filter throws UnsupportedOperationException upon destroy
 -

 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2547.patch, YARN-2547.patch, YARN-2547.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2456) Possible livelock in CapacityScheduler when RM is recovering apps

2014-09-12 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131850#comment-14131850
 ] 

Xuan Gong commented on YARN-2456:
-

+1 LGTM. Will commit this after Jenkins gives +1.

 Possible livelock in CapacityScheduler when RM is recovering apps
 -

 Key: YARN-2456
 URL: https://issues.apache.org/jira/browse/YARN-2456
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-2456.1.patch, YARN-2456.2.patch


 Consider this scenario:
 1. RM is configured with a single queue and only one application can be 
 active at a time.
 2. Submit App1 which uses up the queue's whole capacity
 3. Submit App2 which remains pending.
 4. Restart RM.
 5. App2 is recovered before App1, so App2 is added to the activeApplications 
 list. Now App1 remains pending (because of max-active-app limit)
 6. All containers of App1 are now recovered when NM registers, and use up the 
 whole queue capacity again.
 7. Since the queue is full, App2 cannot proceed to allocate AM container.
 8. In the meanwhile, App1 cannot proceed to become active because of the 
 max-active-app limit 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1372) Ensure all completed containers are reported to the AMs across RM restart

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131865#comment-14131865
 ] 

Hadoop QA commented on YARN-1372:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668409/YARN-1372.005.patch
  against trunk revision 3122daa.

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4931//console

This message is automatically generated.

 Ensure all completed containers are reported to the AMs across RM restart
 -

 Key: YARN-1372
 URL: https://issues.apache.org/jira/browse/YARN-1372
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Anubhav Dhoot
 Attachments: YARN-1372.001.patch, YARN-1372.001.patch, 
 YARN-1372.002_NMHandlesCompletedApp.patch, 
 YARN-1372.002_RMHandlesCompletedApp.patch, 
 YARN-1372.002_RMHandlesCompletedApp.patch, YARN-1372.003.patch, 
 YARN-1372.004.patch, YARN-1372.005.patch, YARN-1372.prelim.patch, 
 YARN-1372.prelim2.patch


 Currently the NM informs the RM about completed containers and then removes 
 those containers from the RM notification list. The RM passes on that 
 completed container information to the AM and the AM pulls this data. If the 
 RM dies before the AM pulls this data then the AM may not be able to get this 
 information again. To fix this, NM should maintain a separate list of such 
 completed container notifications sent to the RM. After the AM has pulled the 
 containers from the RM then the RM will inform the NM about it and the NM can 
 remove the completed container from the new list. Upon re-registering with the 
 RM (after RM restart) the NM should send the entire list of completed 
 containers to the RM along with any other containers that completed while the 
 RM was dead. This ensures that the RM can inform the AMs about all completed 
 containers. Some container completions may be reported more than once since 
 the AM may have pulled the container but the RM may die before notifying the 
 NM about the pull.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2547) Cross Origin Filter throws UnsupportedOperationException upon destroy

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131876#comment-14131876
 ] 

Hadoop QA commented on YARN-2547:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668407/YARN-2547.patch
  against trunk revision 3122daa.

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4932//console

This message is automatically generated.

 Cross Origin Filter throws UnsupportedOperationException upon destroy
 -

 Key: YARN-2547
 URL: https://issues.apache.org/jira/browse/YARN-2547
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2547.patch, YARN-2547.patch, YARN-2547.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2540) Fair Scheduler : queue filters not working on scheduler page in RM UI

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131878#comment-14131878
 ] 

Hadoop QA commented on YARN-2540:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668385/YARN-2540-v1.txt
  against trunk revision 78b0483.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4928//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4928//console

This message is automatically generated.

 Fair Scheduler : queue filters not working on scheduler page in RM UI
 -

 Key: YARN-2540
 URL: https://issues.apache.org/jira/browse/YARN-2540
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0, 2.5.1
Reporter: Ashwin Shankar
Assignee: Ashwin Shankar
 Attachments: YARN-2540-v1.txt


 Steps to reproduce :
 1. Run an app in default queue.
 2. While the app is running, go to the scheduler page on RM UI.
 3. You would see the app in the apptable at the bottom.
 4. Now click on default queue to filter the apptable on root.default.
 5. App disappears from apptable although it is running on default queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2032) Implement a scalable, available TimelineStore using HBase

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131881#comment-14131881
 ] 

Hadoop QA commented on YARN-2032:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12668402/YARN-2032-091114.patch
  against trunk revision 3122daa.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice:

  
org.apache.hadoop.yarn.server.timeline.TestHBaseTimelineStoreUtil

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4930//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/4930//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-applicationhistoryservice.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4930//console

This message is automatically generated.

 Implement a scalable, available TimelineStore using HBase
 -

 Key: YARN-2032
 URL: https://issues.apache.org/jira/browse/YARN-2032
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Li Lu
 Attachments: YARN-2032-091114.patch, YARN-2032-branch-2-1.patch, 
 YARN-2032-branch2-2.patch


 As discussed on YARN-1530, we should pursue implementing a scalable, 
 available Timeline store using HBase.
 One goal is to reuse most of the code from the levelDB Based store - 
 YARN-1635.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1372) Ensure all completed containers are reported to the AMs across RM restart

2014-09-12 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-1372:

Attachment: YARN-1372.005.patch

Rebased patch

 Ensure all completed containers are reported to the AMs across RM restart
 -

 Key: YARN-1372
 URL: https://issues.apache.org/jira/browse/YARN-1372
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Anubhav Dhoot
 Attachments: YARN-1372.001.patch, YARN-1372.001.patch, 
 YARN-1372.002_NMHandlesCompletedApp.patch, 
 YARN-1372.002_RMHandlesCompletedApp.patch, 
 YARN-1372.002_RMHandlesCompletedApp.patch, YARN-1372.003.patch, 
 YARN-1372.004.patch, YARN-1372.005.patch, YARN-1372.005.patch, 
 YARN-1372.prelim.patch, YARN-1372.prelim2.patch


 Currently the NM informs the RM about completed containers and then removes 
 those containers from the RM notification list. The RM passes on that 
 completed container information to the AM and the AM pulls this data. If the 
 RM dies before the AM pulls this data then the AM may not be able to get this 
 information again. To fix this, NM should maintain a separate list of such 
 completed container notifications sent to the RM. After the AM has pulled the 
 containers from the RM then the RM will inform the NM about it and the NM can 
 remove the completed container from the new list. Upon re-registering with the 
 RM (after RM restart) the NM should send the entire list of completed 
 containers to the RM along with any other containers that completed while the 
 RM was dead. This ensures that the RM can inform the AMs about all completed 
 containers. Some container completions may be reported more than once since 
 the AM may have pulled the container but the RM may die before notifying the 
 NM about the pull.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2540) Fair Scheduler : queue filters not working on scheduler page in RM UI

2014-09-12 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131886#comment-14131886
 ] 

Ashwin Shankar commented on YARN-2540:
--

Didn't add unit tests since it was a cosmetic UI change.
I verified the patch manually by running apps in multiple queues in a 2-level 
queue hierarchy and checked that clicking on parent/leaf queues resulted in the 
right filter being set.

 Fair Scheduler : queue filters not working on scheduler page in RM UI
 -

 Key: YARN-2540
 URL: https://issues.apache.org/jira/browse/YARN-2540
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0, 2.5.1
Reporter: Ashwin Shankar
Assignee: Ashwin Shankar
 Attachments: YARN-2540-v1.txt


 Steps to reproduce :
 1. Run an app in default queue.
 2. While the app is running, go to the scheduler page on RM UI.
 3. You would see the app in the apptable at the bottom.
 4. Now click on default queue to filter the apptable on root.default.
 5. App disappears from apptable although it is running on default queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-12 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131892#comment-14131892
 ] 

Xuan Gong commented on YARN-2468:
-

Did more investigation and had offline discussions. It turns out this is a really 
hard problem, so we decided to solve it step by step.

For the first step, we will stick to the original proposal: change the log 
layout by creating a directory (named after the node id of the NM), and under this 
directory, every time AppLogAggregatorImpl starts to upload container logs, it will 
create a file (named node_id + timestamp).
This method will increase the number of log files, but it will work fine for a 
small cluster.

For the next step, we need to find a better way to handle the logs more 
efficiently. We would like to aggregate all containers' logs (those containers 
belong to the same NM) in a single file. In that case, the total number of 
log files is bounded, but we need a more scalable format than TFile to do 
it. Will open a separate ticket for this.
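
A minimal sketch of the per-upload path layout described in the first step (a per-NM directory with one file per upload named node_id + timestamp); the remote root, application id, node id format, and helper class are assumptions for illustration, not the actual AppLogAggregatorImpl change.

{code}
import org.apache.hadoop.fs.Path;

// Illustrative only: builds the per-upload aggregated-log path for the layout
// sketched above: remoteRoot/<app-id>/<node-id>/<node-id>_<timestamp>
public class LrsLogPathExample {
  public static Path uploadPath(Path remoteRoot, String appId, String nodeId) {
    long timestamp = System.currentTimeMillis();
    Path nodeDir = new Path(new Path(remoteRoot, appId), nodeId);
    return new Path(nodeDir, nodeId + "_" + timestamp);
  }

  public static void main(String[] args) {
    Path p = uploadPath(new Path("/app-logs/user"),
        "application_1410000000000_0001", "nm-host-1_45454");
    // e.g. /app-logs/user/application_.../nm-host-1_45454/nm-host-1_45454_1410...
    System.out.println(p);
  }
}
{code}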


 Log handling for LRS
 

 Key: YARN-2468
 URL: https://issues.apache.org/jira/browse/YARN-2468
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: log-aggregation, nodemanager, resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-2468.1.patch


 Currently, when an application is finished, the NM will start to do the log 
 aggregation. But for long-running service applications, this is not ideal. 
 The problems we have are:
 1) LRS applications are expected to run for a long time (weeks, months).
 2) Currently, all the container logs (from one NM) will be written into a 
 single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2468) Log handling for LRS

2014-09-12 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-2468:

Attachment: YARN-2468.2.patch

 Log handling for LRS
 

 Key: YARN-2468
 URL: https://issues.apache.org/jira/browse/YARN-2468
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: log-aggregation, nodemanager, resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-2468.1.patch, YARN-2468.2.patch


 Currently, when an application is finished, the NM will start to do the log 
 aggregation. But for long-running service applications, this is not ideal. 
 The problems we have are:
 1) LRS applications are expected to run for a long time (weeks, months).
 2) Currently, all the container logs (from one NM) will be written into a 
 single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2314) ContainerManagementProtocolProxy can create thousands of threads for a large cluster

2014-09-12 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-2314:
-
Attachment: disable-cm-proxy-cache.patch

Yeah, I don't think there's a good way to fix this short of running a bigger 
container than necessary or patching the code.

Attaching a patch we've been running with recently that disables the CM proxy 
cache completely and reinstates the fix from MAPREDUCE-.  It's not an ideal 
fix but it effectively restores the behavior to what Hadoop 0.23 did which 
worked OK for us.

 ContainerManagementProtocolProxy can create thousands of threads for a large 
 cluster
 

 Key: YARN-2314
 URL: https://issues.apache.org/jira/browse/YARN-2314
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Priority: Critical
 Attachments: disable-cm-proxy-cache.patch, 
 nmproxycachefix.prototype.patch


 ContainerManagementProtocolProxy has a cache of NM proxies, and the size of 
 this cache is configurable.  However the cache can grow far beyond the 
 configured size when running on a large cluster and blow AM address/container 
 limits.  More details in the first comment.
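
For reference, the intended behavior is a proxy cache that never grows past its configured size. A generic, size-bounded LRU sketch is below; it is only an illustration of the concept, not ContainerManagementProtocolProxy's real implementation, which also has to manage proxy lifecycles and locking.

{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a size-bounded, LRU-evicting cache keyed by NM address.
// A real proxy cache must also stop/close evicted proxies safely.
public class BoundedProxyCache<P> {
  private final Map<String, P> cache;

  public BoundedProxyCache(final int maxSize) {
    this.cache = new LinkedHashMap<String, P>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, P> eldest) {
        return size() > maxSize;   // evict the least-recently-used proxy
      }
    };
  }

  public synchronized P get(String nmAddress) {
    return cache.get(nmAddress);
  }

  public synchronized void put(String nmAddress, P proxy) {
    cache.put(nmAddress, proxy);
  }
}
{code}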



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-415) Capture aggregate memory allocation at the app-level for chargeback

2014-09-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131908#comment-14131908
 ] 

Sandy Ryza commented on YARN-415:
-

Awesome to see this go in!

 Capture aggregate memory allocation at the app-level for chargeback
 ---

 Key: YARN-415
 URL: https://issues.apache.org/jira/browse/YARN-415
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: Kendall Thrapp
Assignee: Eric Payne
 Fix For: 2.6.0

 Attachments: YARN-415--n10.patch, YARN-415--n2.patch, 
 YARN-415--n3.patch, YARN-415--n4.patch, YARN-415--n5.patch, 
 YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, 
 YARN-415--n9.patch, YARN-415.201405311749.txt, YARN-415.201406031616.txt, 
 YARN-415.201406262136.txt, YARN-415.201407042037.txt, 
 YARN-415.201407071542.txt, YARN-415.201407171553.txt, 
 YARN-415.201407172144.txt, YARN-415.201407232237.txt, 
 YARN-415.201407242148.txt, YARN-415.201407281816.txt, 
 YARN-415.201408062232.txt, YARN-415.201408080204.txt, 
 YARN-415.201408092006.txt, YARN-415.201408132109.txt, 
 YARN-415.201408150030.txt, YARN-415.201408181938.txt, 
 YARN-415.201408181938.txt, YARN-415.201408212033.txt, 
 YARN-415.201409040036.txt, YARN-415.201409092204.txt, 
 YARN-415.201409102216.txt, YARN-415.patch


 For the purpose of chargeback, I'd like to be able to compute the cost of an
 application in terms of cluster resource usage.  To start out, I'd like to 
 get the memory utilization of an application.  The unit should be MB-seconds 
 or something similar and, from a chargeback perspective, the memory amount 
 should be the memory reserved for the application, as even if the app didn't 
 use all that memory, no one else was able to use it.
 (reserved ram for container 1 * lifetime of container 1) + (reserved ram for
 container 2 * lifetime of container 2) + ... + (reserved ram for container n 
 * lifetime of container n)
 It'd be nice to have this at the app level instead of the job level because:
 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't 
 appear on the job history server).
 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm).
 This new metric should be available both through the RM UI and RM Web 
 Services REST API.
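
A quick worked example of the proposed metric (the helper below is illustrative, not the API added by this patch): memory-seconds is the sum over containers of reserved memory multiplied by container lifetime.

{code}
// Illustrative only: computes aggregate memory-seconds for a finished app.
public class MemorySecondsExample {
  static long memorySeconds(long[] reservedMb, long[] lifetimeSeconds) {
    long total = 0;
    for (int i = 0; i < reservedMb.length; i++) {
      total += reservedMb[i] * lifetimeSeconds[i];   // MB * seconds per container
    }
    return total;
  }

  public static void main(String[] args) {
    // Two containers: 2048 MB for 600 s and 1024 MB for 300 s.
    long[] mb = {2048, 1024};
    long[] secs = {600, 300};
    System.out.println(memorySeconds(mb, secs));  // 2048*600 + 1024*300 = 1536000 MB-seconds
  }
}
{code}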



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2314) ContainerManagementProtocolProxy can create thousands of threads for a large cluster

2014-09-12 Thread Lohit Vijayarenu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131920#comment-14131920
 ] 

Lohit Vijayarenu commented on YARN-2314:


Thanks [~jlowe]

 ContainerManagementProtocolProxy can create thousands of threads for a large 
 cluster
 

 Key: YARN-2314
 URL: https://issues.apache.org/jira/browse/YARN-2314
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Priority: Critical
 Attachments: disable-cm-proxy-cache.patch, 
 nmproxycachefix.prototype.patch


 ContainerManagementProtocolProxy has a cache of NM proxies, and the size of 
 this cache is configurable.  However the cache can grow far beyond the 
 configured size when running on a large cluster and blow AM address/container 
 limits.  More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2542:
--
Attachment: YARN-2542.4.patch

fixed test failure

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch, YARN-2542.2.patch, YARN-2542.3.patch, 
 YARN-2542.4.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When an app is finished, there's no 
 usageReport.
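
A minimal sketch of the kind of null guard such a fix calls for, assuming the memory-seconds and vcore-seconds getters that YARN-415 added to ApplicationResourceUsageReport; this is only an illustration, not the attached patch.

{code}
import org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport;

// Illustrative only: guard against a missing usage report for finished apps
// instead of dereferencing it unconditionally.
public class UsageReportGuardExample {
  static String formatUsage(ApplicationResourceUsageReport usageReport) {
    if (usageReport == null) {
      // Finished apps retrieved from the timeline server may carry no usage report.
      return "N/A";
    }
    return usageReport.getMemorySeconds() + " MB-seconds, "
        + usageReport.getVcoreSeconds() + " vcore-seconds";
  }
}
{code}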



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2513) Host framework UIs in YARN for use with the ATS

2014-09-12 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-2513:
--
Attachment: YARN-2513-v1.patch

 Host framework UIs in YARN for use with the ATS
 ---

 Key: YARN-2513
 URL: https://issues.apache.org/jira/browse/YARN-2513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Attachments: YARN-2513-v1.patch


 Allow for pluggable UIs as described by TEZ-8. YARN can provide the 
 infrastructure to host JavaScript and possibly Java UIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2456) Possible livelock in CapacityScheduler when RM is recovering apps

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131943#comment-14131943
 ] 

Hadoop QA commented on YARN-2456:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668403/YARN-2456.2.patch
  against trunk revision 3122daa.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4929//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4929//console

This message is automatically generated.

 Possible livelock in CapacityScheduler when RM is recovering apps
 -

 Key: YARN-2456
 URL: https://issues.apache.org/jira/browse/YARN-2456
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-2456.1.patch, YARN-2456.2.patch


 Consider this scenario:
 1. RM is configured with a single queue and only one application can be 
 active at a time.
 2. Submit App1 which uses up the queue's whole capacity
 3. Submit App2 which remains pending.
 4. Restart RM.
 5. App2 is recovered before App1, so App2 is added to the activeApplications 
 list. Now App1 remains pending (because of max-active-app limit)
 6. All containers of App1 are now recovered when NM registers, and use up the 
 whole queue capacity again.
 7. Since the queue is full, App2 cannot proceed to allocate AM container.
 8. In the meanwhile, App1 cannot proceed to become active because of the 
 max-active-app limit 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2513) Host framework UIs in YARN for use with the ATS

2014-09-12 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131946#comment-14131946
 ] 

Jonathan Eagles commented on YARN-2513:
---

[~vinodkv], [~hitesh], [~zjshen], I have posted a patch that will simply allow 
the timeline server to host generic UIs that still pass through Hadoop's web 
filters. Please give some feedback.

{code}
<property>
  <name>yarn.timeline-service.ui-names</name>
  <value>tez</value>
</property>
<property>
  <name>yarn.timeline-service.ui-on-disk-path.tez</name>
  <value>/Users/jeagles/hadoop/tez-ui</value>
</property>
<property>
  <name>yarn.timeline-service.ui-web-path.tez</name>
  <value>/tez-ui-v1.0</value>
</property>
{code}

 Host framework UIs in YARN for use with the ATS
 ---

 Key: YARN-2513
 URL: https://issues.apache.org/jira/browse/YARN-2513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Attachments: YARN-2513-v1.patch


 Allow for pluggable UIs as described by TEZ-8. YARN can provide the 
 infrastructure to host JavaScript and possibly Java UIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM

2014-09-12 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131960#comment-14131960
 ] 

Zhijie Shen commented on YARN-611:
--

ControlledClock should be marked @LimitedPrivate{mapreduce, yarn}?

 Add an AM retry count reset window to YARN RM
 -

 Key: YARN-611
 URL: https://issues.apache.org/jira/browse/YARN-611
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Chris Riccomini
Assignee: Xuan Gong
 Attachments: YARN-611.1.patch, YARN-611.2.patch, YARN-611.3.patch, 
 YARN-611.4.patch, YARN-611.4.rebase.patch, YARN-611.5.patch, 
 YARN-611.6.patch, YARN-611.7.patch, YARN-611.8.patch, YARN-611.9.patch, 
 YARN-611.9.rebase.patch


 YARN currently has the following config:
 yarn.resourcemanager.am.max-retries
 This config defaults to 2, and defines how many times to retry a failed AM 
 before failing the whole YARN job. YARN counts an AM as failed if the node 
 that it was running on dies (the NM will timeout, which counts as a failure 
 for the AM), or if the AM dies.
 This configuration is insufficient for long running (or infinitely running) 
 YARN jobs, since the machine (or NM) that the AM is running on will 
 eventually need to be restarted (or the machine/NM will fail). In such an 
 event, the AM has not done anything wrong, but this is counted as a failure 
 by the RM. Since the retry count for the AM is never reset, eventually, at 
 some point, the number of machine/NM failures will result in the AM failure 
 count going above the configured value for 
 yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the 
 job as failed, and shut it down. This behavior is not ideal.
 I propose that we add a second configuration:
 yarn.resourcemanager.am.retry-count-window-ms
 This configuration would define a window of time that would define when an AM 
 is well behaved, and it's safe to reset its failure count back to zero. 
 Every time an AM fails the RmAppImpl would check the last time that the AM 
 failed. If the last failure was less than retry-count-window-ms ago, and the 
 new failure count is > max-retries, then the job should fail. If the AM has 
 never failed, the retry count is < max-retries, or if the last failure was 
 OUTSIDE the retry-count-window-ms, then the job should be restarted. 
 Additionally, if the last failure was outside the retry-count-window-ms, then 
 the failure count should be set back to 0.
 This would give developers a way to have well-behaved AMs run forever, while 
 still failing mis-behaving AMs after a short period of time.
 I think the work to be done here is to change the RmAppImpl to actually look 
 at app.attempts, and see if there have been more than max-retries failures in 
 the last retry-count-window-ms milliseconds. If there have, then the job 
 should fail, if not, then the job should go forward. Additionally, we might 
 also need to add an endTime in either RMAppAttemptImpl or 
 RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the 
 failure.
 Thoughts?
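
A minimal sketch of the windowed check proposed above, under the assumption that the RM can see the end times of previous failed attempts; the class and method names are illustrative, not RMAppImpl's actual code.

{code}
import java.util.List;

// Illustrative only: count only the AM failures that fall inside the
// retry-count window; older failures are effectively reset to zero.
public class AmRetryWindowExample {
  static boolean shouldFailApp(List<Long> failureEndTimes, long now,
      long windowMs, int maxRetries) {
    int failuresInWindow = 0;
    for (long t : failureEndTimes) {
      if (now - t < windowMs) {        // only failures inside the window count
        failuresInWindow++;
      }
    }
    return failuresInWindow > maxRetries;
  }
}
{code}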



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2540) Fair Scheduler : queue filters not working on scheduler page in RM UI

2014-09-12 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131962#comment-14131962
 ] 

Wei Yan commented on YARN-2540:
---

Verified the patch. Ran an app in queue root.wei.yan, and the patch works 
well.

 Fair Scheduler : queue filters not working on scheduler page in RM UI
 -

 Key: YARN-2540
 URL: https://issues.apache.org/jira/browse/YARN-2540
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0, 2.5.1
Reporter: Ashwin Shankar
Assignee: Ashwin Shankar
 Attachments: YARN-2540-v1.txt


 Steps to reproduce :
 1. Run an app in default queue.
 2. While the app is running, go to the scheduler page on RM UI.
 3. You would see the app in the apptable at the bottom.
 4. Now click on default queue to filter the apptable on root.default.
 5. App disappears from apptable although it is running on default queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1372) Ensure all completed containers are reported to the AMs across RM restart

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131986#comment-14131986
 ] 

Hadoop QA commented on YARN-1372:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668415/YARN-1372.005.patch
  against trunk revision 3122daa.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4933//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4933//console

This message is automatically generated.

 Ensure all completed containers are reported to the AMs across RM restart
 -

 Key: YARN-1372
 URL: https://issues.apache.org/jira/browse/YARN-1372
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Anubhav Dhoot
 Attachments: YARN-1372.001.patch, YARN-1372.001.patch, 
 YARN-1372.002_NMHandlesCompletedApp.patch, 
 YARN-1372.002_RMHandlesCompletedApp.patch, 
 YARN-1372.002_RMHandlesCompletedApp.patch, YARN-1372.003.patch, 
 YARN-1372.004.patch, YARN-1372.005.patch, YARN-1372.005.patch, 
 YARN-1372.prelim.patch, YARN-1372.prelim2.patch


 Currently the NM informs the RM about completed containers and then removes 
 those containers from the RM notification list. The RM passes on that 
 completed container information to the AM and the AM pulls this data. If the 
 RM dies before the AM pulls this data then the AM may not be able to get this 
 information again. To fix this, NM should maintain a separate list of such 
 completed container notifications sent to the RM. After the AM has pulled the 
 containers from the RM then the RM will inform the NM about it and the NM can 
 remove the completed container from the new list. Upon re-registering with the 
 RM (after RM restart) the NM should send the entire list of completed 
 containers to the RM along with any other containers that completed while the 
 RM was dead. This ensures that the RM can inform the AMs about all completed 
 containers. Some container completions may be reported more than once since 
 the AM may have pulled the container but the RM may die before notifying the 
 NM about the pull.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-611) Add an AM retry count reset window to YARN RM

2014-09-12 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-611:
---
Attachment: YARN-611.10.patch

 Add an AM retry count reset window to YARN RM
 -

 Key: YARN-611
 URL: https://issues.apache.org/jira/browse/YARN-611
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Chris Riccomini
Assignee: Xuan Gong
 Attachments: YARN-611.1.patch, YARN-611.10.patch, YARN-611.2.patch, 
 YARN-611.3.patch, YARN-611.4.patch, YARN-611.4.rebase.patch, 
 YARN-611.5.patch, YARN-611.6.patch, YARN-611.7.patch, YARN-611.8.patch, 
 YARN-611.9.patch, YARN-611.9.rebase.patch


 YARN currently has the following config:
 yarn.resourcemanager.am.max-retries
 This config defaults to 2, and defines how many times to retry a failed AM 
 before failing the whole YARN job. YARN counts an AM as failed if the node 
 that it was running on dies (the NM will timeout, which counts as a failure 
 for the AM), or if the AM dies.
 This configuration is insufficient for long running (or infinitely running) 
 YARN jobs, since the machine (or NM) that the AM is running on will 
 eventually need to be restarted (or the machine/NM will fail). In such an 
 event, the AM has not done anything wrong, but this is counted as a failure 
 by the RM. Since the retry count for the AM is never reset, eventually, at 
 some point, the number of machine/NM failures will result in the AM failure 
 count going above the configured value for 
 yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the 
 job as failed, and shut it down. This behavior is not ideal.
 I propose that we add a second configuration:
 yarn.resourcemanager.am.retry-count-window-ms
 This configuration would define a window of time that would define when an AM 
 is well behaved, and it's safe to reset its failure count back to zero. 
 Every time an AM fails the RmAppImpl would check the last time that the AM 
 failed. If the last failure was less than retry-count-window-ms ago, and the 
 new failure count is > max-retries, then the job should fail. If the AM has 
 never failed, the retry count is < max-retries, or if the last failure was 
 OUTSIDE the retry-count-window-ms, then the job should be restarted. 
 Additionally, if the last failure was outside the retry-count-window-ms, then 
 the failure count should be set back to 0.
 This would give developers a way to have well-behaved AMs run forever, while 
 still failing mis-behaving AMs after a short period of time.
 I think the work to be done here is to change the RmAppImpl to actually look 
 at app.attempts, and see if there have been more than max-retries failures in 
 the last retry-count-window-ms milliseconds. If there have, then the job 
 should fail, if not, then the job should go forward. Additionally, we might 
 also need to add an endTime in either RMAppAttemptImpl or 
 RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the 
 failure.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM

2014-09-12 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131992#comment-14131992
 ] 

Xuan Gong commented on YARN-611:


bq. ControlledClock should actually be in a test module.

Moved into the test module.

bq. ControlledClock should be marked @LimitedPrivate{mapreduce, yarn}?

Moved into the test module, so there is no need to add those.

 Add an AM retry count reset window to YARN RM
 -

 Key: YARN-611
 URL: https://issues.apache.org/jira/browse/YARN-611
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Chris Riccomini
Assignee: Xuan Gong
 Attachments: YARN-611.1.patch, YARN-611.10.patch, YARN-611.2.patch, 
 YARN-611.3.patch, YARN-611.4.patch, YARN-611.4.rebase.patch, 
 YARN-611.5.patch, YARN-611.6.patch, YARN-611.7.patch, YARN-611.8.patch, 
 YARN-611.9.patch, YARN-611.9.rebase.patch


 YARN currently has the following config:
 yarn.resourcemanager.am.max-retries
 This config defaults to 2, and defines how many times to retry a failed AM 
 before failing the whole YARN job. YARN counts an AM as failed if the node 
 that it was running on dies (the NM will timeout, which counts as a failure 
 for the AM), or if the AM dies.
 This configuration is insufficient for long running (or infinitely running) 
 YARN jobs, since the machine (or NM) that the AM is running on will 
 eventually need to be restarted (or the machine/NM will fail). In such an 
 event, the AM has not done anything wrong, but this is counted as a failure 
 by the RM. Since the retry count for the AM is never reset, eventually, at 
 some point, the number of machine/NM failures will result in the AM failure 
 count going above the configured value for 
 yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the 
 job as failed, and shut it down. This behavior is not ideal.
 I propose that we add a second configuration:
 yarn.resourcemanager.am.retry-count-window-ms
 This configuration would define a window of time that would define when an AM 
 is well behaved, and it's safe to reset its failure count back to zero. 
 Every time an AM fails the RmAppImpl would check the last time that the AM 
 failed. If the last failure was less than retry-count-window-ms ago, and the 
 new failure count is > max-retries, then the job should fail. If the AM has 
 never failed, the retry count is < max-retries, or if the last failure was 
 OUTSIDE the retry-count-window-ms, then the job should be restarted. 
 Additionally, if the last failure was outside the retry-count-window-ms, then 
 the failure count should be set back to 0.
 This would give developers a way to have well-behaved AMs run forever, while 
 still failing mis-behaving AMs after a short period of time.
 I think the work to be done here is to change the RmAppImpl to actually look 
 at app.attempts, and see if there have been more than max-retries failures in 
 the last retry-count-window-ms milliseconds. If there have, then the job 
 should fail, if not, then the job should go forward. Additionally, we might 
 also need to add an endTime in either RMAppAttemptImpl or 
 RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the 
 failure.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2542) yarn application -status appId throws NPE when retrieving the app from the timelineserver

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131998#comment-14131998
 ] 

Hadoop QA commented on YARN-2542:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668428/YARN-2542.4.patch
  against trunk revision 3122daa.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4935//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4935//console

This message is automatically generated.

 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver
 -

 Key: YARN-2542
 URL: https://issues.apache.org/jira/browse/YARN-2542
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2542.1.patch, YARN-2542.2.patch, YARN-2542.3.patch, 
 YARN-2542.4.patch


 yarn application -status appId throws NPE when retrieving the app from 
 the timelineserver. It's broken by YARN-415. When an app is finished, there's no 
 usageReport.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2468) Log handling for LRS

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14132012#comment-14132012
 ] 

Hadoop QA commented on YARN-2468:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668421/YARN-2468.2.patch
  against trunk revision 3122daa.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 4 
warning messages.
See 
https://builds.apache.org/job/PreCommit-YARN-Build/4934//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4934//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/4934//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4934//console

This message is automatically generated.

 Log handling for LRS
 

 Key: YARN-2468
 URL: https://issues.apache.org/jira/browse/YARN-2468
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: log-aggregation, nodemanager, resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-2468.1.patch, YARN-2468.2.patch


 Currently, when an application is finished, the NM will start to do the log 
 aggregation. But for long-running service applications, this is not ideal. 
 The problems we have are:
 1) LRS applications are expected to run for a long time (weeks, months).
 2) Currently, all the container logs (from one NM) will be written into a 
 single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM

2014-09-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14132015#comment-14132015
 ] 

Hadoop QA commented on YARN-611:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668436/YARN-611.10.patch
  against trunk revision 3122daa.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4936//console

This message is automatically generated.

 Add an AM retry count reset window to YARN RM
 -

 Key: YARN-611
 URL: https://issues.apache.org/jira/browse/YARN-611
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Chris Riccomini
Assignee: Xuan Gong
 Attachments: YARN-611.1.patch, YARN-611.10.patch, YARN-611.2.patch, 
 YARN-611.3.patch, YARN-611.4.patch, YARN-611.4.rebase.patch, 
 YARN-611.5.patch, YARN-611.6.patch, YARN-611.7.patch, YARN-611.8.patch, 
 YARN-611.9.patch, YARN-611.9.rebase.patch


 YARN currently has the following config:
 yarn.resourcemanager.am.max-retries
 This config defaults to 2, and defines how many times to retry a failed AM 
 before failing the whole YARN job. YARN counts an AM as failed if the node 
 that it was running on dies (the NM will timeout, which counts as a failure 
 for the AM), or if the AM dies.
 This configuration is insufficient for long running (or infinitely running) 
 YARN jobs, since the machine (or NM) that the AM is running on will 
 eventually need to be restarted (or the machine/NM will fail). In such an 
 event, the AM has not done anything wrong, but this is counted as a failure 
 by the RM. Since the retry count for the AM is never reset, eventually, at 
 some point, the number of machine/NM failures will result in the AM failure 
 count going above the configured value for 
 yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the 
 job as failed, and shut it down. This behavior is not ideal.
 I propose that we add a second configuration:
 yarn.resourcemanager.am.retry-count-window-ms
 This configuration would define a window of time that would define when an AM 
 is well behaved, and it's safe to reset its failure count back to zero. 
 Every time an AM fails the RmAppImpl would check the last time that the AM 
 failed. If the last failure was less than retry-count-window-ms ago, and the 
 new failure count is > max-retries, then the job should fail. If the AM has 
 never failed, the retry count is < max-retries, or if the last failure was 
 OUTSIDE the retry-count-window-ms, then the job should be restarted. 
 Additionally, if the last failure was outside the retry-count-window-ms, then 
 the failure count should be set back to 0.
 This would give developers a way to have well-behaved AMs run forever, while 
 still failing mis-behaving AMs after a short period of time.
 I think the work to be done here is to change the RmAppImpl to actually look 
 at app.attempts, and see if there have been more than max-retries failures in 
 the last retry-count-window-ms milliseconds. If there have, then the job 
 should fail, if not, then the job should go forward. Additionally, we might 
 also need to add an endTime in either RMAppAttemptImpl or 
 RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the 
 failure.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2548) Find a more scalable way to handle logs for long running service

2014-09-12 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-2548:
---

 Summary: Find a more scalable way to handle logs for long running 
service
 Key: YARN-2548
 URL: https://issues.apache.org/jira/browse/YARN-2548
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong


After YARN-2468, the container logs will be aggregated separately based on 
time. This will increase the total number of log files. It is fine for a small 
cluster, but for a larger cluster it will make the too-many-files problem even 
worse.

We need to find a more scalable way to handle those logs. Aggregating all 
container logs in a single file is an option, but we need a different 
format, other than TFile (which does not support append), to do it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2032) Implement a scalable, available TimelineStore using HBase

2014-09-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14132031#comment-14132031
 ] 

stack commented on YARN-2032:
-

What do you need of HBase, lads?  Our next release undoes our dependency on 
HTTPServer (the coming 1.0; a 0.99.0 developer release is imminent).  If you 
want us to change our sync method call, np, just say so; now would be a good time 
to do it before 1.0 goes out.  We are also well practiced at poking around with 
reflection looking for whatever the method that does HDFS sync'ing is called 
(smile).

 Implement a scalable, available TimelineStore using HBase
 -

 Key: YARN-2032
 URL: https://issues.apache.org/jira/browse/YARN-2032
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Li Lu
 Attachments: YARN-2032-091114.patch, YARN-2032-branch-2-1.patch, 
 YARN-2032-branch2-2.patch


 As discussed on YARN-1530, we should pursue implementing a scalable, 
 available Timeline store using HBase.
 One goal is to reuse most of the code from the levelDB Based store - 
 YARN-1635.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >