[jira] [Commented] (YARN-2008) CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure
[ https://issues.apache.org/jira/browse/YARN-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081988#comment-14081988 ] Hadoop QA commented on YARN-2008: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659068/YARN-2008.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4503//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4503//console This message is automatically generated. > CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure > > Key: YARN-2008 > URL: https://issues.apache.org/jira/browse/YARN-2008 > Project: Hadoop YARN > Issue Type: Sub-task > Affects Versions: 2.3.0 > Reporter: Chen He > Assignee: Craig Welch > Attachments: YARN-2008.1.patch, YARN-2008.2.patch, YARN-2008.3.patch, YARN-2008.4.patch, YARN-2008.5.patch > > > Suppose there are two queues, Q1 and Q2, both allowed to use 100% of the actual resources in the cluster. Q1 and Q2 each currently use 50% of the actual cluster's resources and there is no actual space available. If we use the current method to get headroom, the CapacityScheduler thinks there are still available resources for users in Q1, but those resources have already been used by Q2. > If the CapacityScheduler has a hierarchical queue structure, it may report an incorrect queueMaxCap. Here is an example: rootQueue has two children, L1ParentQueue1 (allowed to use up to 80% of its parent) and L1ParentQueue2 (allowed to use 20% in minimum of its parent); L1ParentQueue1 in turn has two children, L2LeafQueue1 (50% of its parent) and L2LeafQueue2 (50% of its parent in minimum). > When we calculate the headroom of a user in L2LeafQueue2, the current method thinks L2LeafQueue2 can use 40% (80% * 50%) of the actual rootQueue resources. However, without checking L1ParentQueue1, we cannot be sure: it is possible that L1ParentQueue2 has used 40% of the rootQueue resources right now, in which case L2LeafQueue2 can only use 30% (60% * 50%). -- This message was sent by Atlassian JIRA (v6.2#6252)
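The arithmetic in the YARN-2008 description above can be followed with a small sketch. This is not the attached patch; the variable names are purely illustrative and the numbers simply restate the example (80%, 50%, 40%).
{code}
// Sketch of the queueMaxCap arithmetic from the YARN-2008 example above.
public class QueueMaxCapExample {
  public static void main(String[] args) {
    float l1Parent1MaxOfRoot = 0.8f;   // L1ParentQueue1 may use up to 80% of rootQueue
    float l2Leaf2MaxOfParent = 0.5f;   // L2LeafQueue2 may use up to 50% of L1ParentQueue1
    float l1Parent2UsedOfRoot = 0.4f;  // L1ParentQueue2 currently uses 40% of rootQueue

    // Current method: only multiplies configured capacities down the hierarchy.
    float naiveMaxCap = l1Parent1MaxOfRoot * l2Leaf2MaxOfParent;              // 0.40

    // Hierarchy-aware view: L1ParentQueue1 cannot exceed what is actually left
    // at the root once its sibling's current usage is subtracted.
    float parentActuallyAvailable =
        Math.min(l1Parent1MaxOfRoot, 1.0f - l1Parent2UsedOfRoot);             // 0.60
    float effectiveMaxCap = parentActuallyAvailable * l2Leaf2MaxOfParent;     // 0.30

    System.out.printf("naive queueMaxCap = %.2f, effective queueMaxCap = %.2f%n",
        naiveMaxCap, effectiveMaxCap);
  }
}
{code}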
[jira] [Commented] (YARN-2069) CS queue level preemption should respect user-limits
[ https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081985#comment-14081985 ] Hadoop QA commented on YARN-2069: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659051/YARN-2069-trunk-9.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4502//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4502//console This message is automatically generated. > CS queue level preemption should respect user-limits > > > Key: YARN-2069 > URL: https://issues.apache.org/jira/browse/YARN-2069 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Mayank Bansal > Attachments: YARN-2069-trunk-1.patch, YARN-2069-trunk-2.patch, > YARN-2069-trunk-3.patch, YARN-2069-trunk-4.patch, YARN-2069-trunk-5.patch, > YARN-2069-trunk-6.patch, YARN-2069-trunk-7.patch, YARN-2069-trunk-8.patch, > YARN-2069-trunk-9.patch > > > This is different from (even if related to, and likely share code with) > YARN-2113. > YARN-2113 focuses on making sure that even if queue has its guaranteed > capacity, it's individual users are treated in-line with their limits > irrespective of when they join in. > This JIRA is about respecting user-limits while preempting containers to > balance queue capacities. -- This message was sent by Atlassian JIRA (v6.2#6252)
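The idea of respecting user-limits while preempting can be illustrated with a small sketch: per user, only the portion above the queue's computed user-limit is a candidate for preemption. This is not Mayank's patch; the class, method, and unit choices below are hypothetical.
{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: cap preemption per user at (used - userLimit).
public class UserLimitAwarePreemption {
  static Map<String, Long> preemptableByUser(Map<String, Long> usedByUser, long userLimit) {
    Map<String, Long> result = new LinkedHashMap<>();
    for (Map.Entry<String, Long> e : usedByUser.entrySet()) {
      // Never take a user below the user-limit, even when balancing queue capacity.
      result.put(e.getKey(), Math.max(0L, e.getValue() - userLimit));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Long> usedMB = new LinkedHashMap<>();
    usedMB.put("alice", 8L * 1024);  // above the limit: partially preemptable
    usedMB.put("bob",   2L * 1024);  // already below the limit: left alone
    System.out.println(preemptableByUser(usedMB, 4L * 1024)); // {alice=4096, bob=0}
  }
}
{code}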
[jira] [Updated] (YARN-2348) ResourceManager web UI should display server-side time instead of UTC time
[ https://issues.apache.org/jira/browse/YARN-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leitao Guo updated YARN-2348: - Attachment: (was: YARN-2348.2.patch) > ResourceManager web UI should display server-side time instead of UTC time > -- > > Key: YARN-2348 > URL: https://issues.apache.org/jira/browse/YARN-2348 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.1 >Reporter: Leitao Guo > Attachments: 3.before-patch.JPG, 4.after-patch.JPG, YARN-2348.2.patch > > > ResourceManager web UI, including application list and scheduler, displays > UTC time in default, this will confuse users who do not use UTC time. This > web UI should display server-side time in default. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2348) ResourceManager web UI should display server-side time instead of UTC time
[ https://issues.apache.org/jira/browse/YARN-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leitao Guo updated YARN-2348: - Attachment: (was: YARN-2348.patch) > ResourceManager web UI should display server-side time instead of UTC time > -- > > Key: YARN-2348 > URL: https://issues.apache.org/jira/browse/YARN-2348 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.1 >Reporter: Leitao Guo > Attachments: 3.before-patch.JPG, 4.after-patch.JPG, YARN-2348.2.patch > > > ResourceManager web UI, including application list and scheduler, displays > UTC time in default, this will confuse users who do not use UTC time. This > web UI should display server-side time in default. -- This message was sent by Atlassian JIRA (v6.2#6252)
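For context on YARN-2348, the requested behaviour amounts to formatting timestamps with the ResourceManager host's own time zone rather than UTC. The snippet below is only a sketch of that difference, not the attached patch, and it says nothing about where the real web UI performs the rendering.
{code}
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class ServerLocalTimeExample {
  public static void main(String[] args) {
    long startTime = System.currentTimeMillis();

    SimpleDateFormat utc = new SimpleDateFormat("EEE MMM dd HH:mm:ss zzz yyyy");
    utc.setTimeZone(TimeZone.getTimeZone("UTC"));             // what users complain about

    SimpleDateFormat serverLocal = new SimpleDateFormat("EEE MMM dd HH:mm:ss zzz yyyy");
    serverLocal.setTimeZone(TimeZone.getDefault());           // the RM host's own time zone

    System.out.println("UTC rendering:    " + utc.format(new Date(startTime)));
    System.out.println("Server rendering: " + serverLocal.format(new Date(startTime)));
  }
}
{code}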
[jira] [Commented] (YARN-2051) Fix bug in PBimpls and add more unit tests with reflection
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081899#comment-14081899 ] Hudson commented on YARN-2051: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5993 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5993/]) YARN-2051. Fix bug in PBimpls and add more unit tests with reflection. (Contributed by Binglin Chang) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1615025) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetApplicationsRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ResourceOption.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetApplicationsRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationReportPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationSubmissionContextPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ResourceBlacklistRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ResourceOptionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/TokenPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/UpdateNodeResourceRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestPBImplRecords.java > Fix bug in PBimpls and add more unit tests with reflection > -- > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch, YARN-2051.v2.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
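The reflection-based testing referenced here follows a general pattern: populate a record's setters with generated values, round-trip the record through its protobuf form, and verify nothing is lost or changed. The sketch below only illustrates that pattern; it is not the committed TestPBImplRecords, and SomeRecordPBImpl is a stand-in name for any PBImpl class.
{code}
import java.lang.reflect.Method;

public class PBImplRoundTripSketch {
  // Fill every single-argument setter with a sample value of the right type.
  static void fillWithSampleValues(Object record) throws Exception {
    for (Method m : record.getClass().getMethods()) {
      if (m.getName().startsWith("set") && m.getParameterTypes().length == 1) {
        Class<?> t = m.getParameterTypes()[0];
        if (t == int.class)          m.invoke(record, 42);
        else if (t == long.class)    m.invoke(record, 42L);
        else if (t == boolean.class) m.invoke(record, true);
        else if (t == String.class)  m.invoke(record, "sample");
        // a real test would also generate enums, lists, and nested records
      }
    }
  }
  // Round trip, in outline (SomeRecordPBImpl is hypothetical):
  //   SomeRecordPBImpl original = new SomeRecordPBImpl();
  //   fillWithSampleValues(original);
  //   SomeRecordPBImpl restored = new SomeRecordPBImpl(original.getProto());
  //   assert original.equals(restored);  // fails if a PBImpl drops or changes a field
}
{code}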
[jira] [Updated] (YARN-2051) Fix bug in PBimpls and add more unit tests with reflection
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-2051: - Summary: Fix bug in PBimpls and add more unit tests with reflection (was: Fix code in PBimpls and add more unit tests with reflection) > Fix bug in PBimpls and add more unit tests with reflection > -- > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch, YARN-2051.v2.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2051) Fix code in PBimpls and add more unit tests with reflection
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-2051: - Summary: Fix code in PBimpls and add more unit tests with reflection (was: Fix code bug and add more unit tests for PBImpls) > Fix code in PBimpls and add more unit tests with reflection > --- > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch, YARN-2051.v2.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2069) CS queue level preemption should respect user-limits
[ https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081845#comment-14081845 ] Wangda Tan commented on YARN-2069: -- Hi [~mayank_bansal], Thanks for uploading, reviewing it now. Wangda > CS queue level preemption should respect user-limits > > > Key: YARN-2069 > URL: https://issues.apache.org/jira/browse/YARN-2069 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Mayank Bansal > Attachments: YARN-2069-trunk-1.patch, YARN-2069-trunk-2.patch, > YARN-2069-trunk-3.patch, YARN-2069-trunk-4.patch, YARN-2069-trunk-5.patch, > YARN-2069-trunk-6.patch, YARN-2069-trunk-7.patch, YARN-2069-trunk-8.patch, > YARN-2069-trunk-9.patch > > > This is different from (even if related to, and likely share code with) > YARN-2113. > YARN-2113 focuses on making sure that even if queue has its guaranteed > capacity, it's individual users are treated in-line with their limits > irrespective of when they join in. > This JIRA is about respecting user-limits while preempting containers to > balance queue capacities. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2008) CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure
[ https://issues.apache.org/jira/browse/YARN-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081844#comment-14081844 ] Wangda Tan commented on YARN-2008: -- Hi [~cwelch], Thanks for updating; the tests now cover all the cases I can think of. A very minor comment: could you please add a small ε to all {{assertEquals}} calls, like the following? bq. +assertEquals(0.1f, result, 0.01f); Thanks, Wangda > CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure > > Key: YARN-2008 > URL: https://issues.apache.org/jira/browse/YARN-2008 > Project: Hadoop YARN > Issue Type: Sub-task > Affects Versions: 2.3.0 > Reporter: Chen He > Assignee: Craig Welch > Attachments: YARN-2008.1.patch, YARN-2008.2.patch, YARN-2008.3.patch, YARN-2008.4.patch, YARN-2008.5.patch > > > Suppose there are two queues, Q1 and Q2, both allowed to use 100% of the actual resources in the cluster. Q1 and Q2 each currently use 50% of the actual cluster's resources and there is no actual space available. If we use the current method to get headroom, the CapacityScheduler thinks there are still available resources for users in Q1, but those resources have already been used by Q2. > If the CapacityScheduler has a hierarchical queue structure, it may report an incorrect queueMaxCap. Here is an example: rootQueue has two children, L1ParentQueue1 (allowed to use up to 80% of its parent) and L1ParentQueue2 (allowed to use 20% in minimum of its parent); L1ParentQueue1 in turn has two children, L2LeafQueue1 (50% of its parent) and L2LeafQueue2 (50% of its parent in minimum). > When we calculate the headroom of a user in L2LeafQueue2, the current method thinks L2LeafQueue2 can use 40% (80% * 50%) of the actual rootQueue resources. However, without checking L1ParentQueue1, we cannot be sure: it is possible that L1ParentQueue2 has used 40% of the rootQueue resources right now, in which case L2LeafQueue2 can only use 30% (60% * 50%). -- This message was sent by Atlassian JIRA (v6.2#6252)
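The ε Wangda asks for is the standard JUnit delta argument: the capacities under test are products of floats, so exact equality is brittle. A minimal illustration, not part of the patch itself:
{code}
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class EpsilonAssertExample {
  @Test
  public void floatComparisonUsesDelta() {
    float result = 0.8f * 0.5f * 0.25f;   // chained capacity math, roughly 0.1
    // The third argument tolerates float rounding instead of requiring bit-exact equality.
    assertEquals(0.1f, result, 0.01f);
  }
}
{code}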
[jira] [Commented] (YARN-2212) ApplicationMaster needs to find a way to update the AMRMToken periodically
[ https://issues.apache.org/jira/browse/YARN-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081841#comment-14081841 ] Xuan Gong commented on YARN-2212: - Test can be passed locally.. > ApplicationMaster needs to find a way to update the AMRMToken periodically > -- > > Key: YARN-2212 > URL: https://issues.apache.org/jira/browse/YARN-2212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-2212.1.patch, YARN-2212.2.patch, > YARN-2212.3.1.patch, YARN-2212.3.patch, YARN-2212.4.patch, YARN-2212.5.patch, > YARN-2212.5.patch, YARN-2212.5.rebase.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
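One way the AM side of YARN-2212 could look is sketched below. This is hypothetical wiring, not the attached patches: it assumes the allocate response carries the rolled-over token through an accessor along the lines of AllocateResponse#getAMRMToken, converts it, and adds it to the current UGI so later heartbeats authenticate with the new token.
{code}
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.security.AMRMTokenIdentifier;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class AMRMTokenRefreshSketch {
  static void maybeUpdateToken(AllocateResponse response, String rmAddress)
      throws IOException {
    org.apache.hadoop.yarn.api.records.Token yarnToken = response.getAMRMToken();
    if (yarnToken == null) {
      return; // the RM did not roll the master key in this heartbeat
    }
    Token<AMRMTokenIdentifier> token =
        ConverterUtils.convertFromYarn(yarnToken, new Text(rmAddress));
    // Make the new token visible to the RPC layer for subsequent heartbeats.
    UserGroupInformation.getCurrentUser().addToken(token);
  }
}
{code}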
[jira] [Updated] (YARN-2288) Data persistent in timelinestore should be versioned
[ https://issues.apache.org/jira/browse/YARN-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-2288: - Attachment: YARN-2288.patch Upload the patch to version timelinestore. > Data persistent in timelinestore should be versioned > > > Key: YARN-2288 > URL: https://issues.apache.org/jira/browse/YARN-2288 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: 2.4.1 >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-2288.patch > > > We have LevelDB-backed TimelineStore, it should have schema version for > changes in schema in future. -- This message was sent by Atlassian JIRA (v6.2#6252)
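A rough sketch of what "versioned" data could mean here: keep a schema version under a reserved key in the store and refuse to start on an incompatible layout. This is not the attached patch; KeyValueStore, the key name, and the major/minor compatibility split are assumptions for illustration.
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class TimelineStoreVersionCheck {
  interface KeyValueStore {            // hypothetical minimal stand-in for the LevelDB store
    byte[] get(byte[] key);
    void put(byte[] key, byte[] value);
  }

  static final byte[] VERSION_KEY =
      "timeline-store-version".getBytes(StandardCharsets.UTF_8);
  static final int CURRENT_MAJOR = 1;
  static final int CURRENT_MINOR = 0;

  static void checkVersion(KeyValueStore store) throws IOException {
    byte[] raw = store.get(VERSION_KEY);
    if (raw == null) {                                   // fresh store: stamp it
      store.put(VERSION_KEY,
          (CURRENT_MAJOR + "." + CURRENT_MINOR).getBytes(StandardCharsets.UTF_8));
      return;
    }
    int storedMajor =
        Integer.parseInt(new String(raw, StandardCharsets.UTF_8).split("\\.")[0]);
    if (storedMajor != CURRENT_MAJOR) {                  // major bump = incompatible
      throw new IOException("Incompatible timeline store version " + storedMajor
          + ", expected " + CURRENT_MAJOR);
    }
    // Same major, different minor: treated as compatible in this sketch.
  }
}
{code}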
[jira] [Commented] (YARN-2372) There are Chinese Characters in the FairScheduler's document
[ https://issues.apache.org/jira/browse/YARN-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081832#comment-14081832 ] Zhijie Shen commented on YARN-2372: --- There're non-unicode double quotes in HdfsDesign.apt.vm and HdfsNfsGateway.apt.vm. It's not a big change, and I think we can fix them in one patch. > There are Chinese Characters in the FairScheduler's document > > > Key: YARN-2372 > URL: https://issues.apache.org/jira/browse/YARN-2372 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.1 >Reporter: Fengdong Yu >Assignee: Fengdong Yu >Priority: Minor > Attachments: YARN-2372.patch, YARN-2372.patch, YARN-2372.patch, > YARN-2372.patch, YARN-2372.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2212) ApplicationMaster needs to find a way to update the AMRMToken periodically
[ https://issues.apache.org/jira/browse/YARN-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081826#comment-14081826 ] Hadoop QA commented on YARN-2212: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658995/YARN-2212.5.rebase.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4501//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4501//console This message is automatically generated. > ApplicationMaster needs to find a way to update the AMRMToken periodically > -- > > Key: YARN-2212 > URL: https://issues.apache.org/jira/browse/YARN-2212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-2212.1.patch, YARN-2212.2.patch, > YARN-2212.3.1.patch, YARN-2212.3.patch, YARN-2212.4.patch, YARN-2212.5.patch, > YARN-2212.5.patch, YARN-2212.5.rebase.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2008) CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure
[ https://issues.apache.org/jira/browse/YARN-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-2008: -- Attachment: YARN-2008.5.patch This time, actually with the additional tests :-) > CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure > > Key: YARN-2008 > URL: https://issues.apache.org/jira/browse/YARN-2008 > Project: Hadoop YARN > Issue Type: Sub-task > Affects Versions: 2.3.0 > Reporter: Chen He > Assignee: Craig Welch > Attachments: YARN-2008.1.patch, YARN-2008.2.patch, YARN-2008.3.patch, YARN-2008.4.patch, YARN-2008.5.patch > > > Suppose there are two queues, Q1 and Q2, both allowed to use 100% of the actual resources in the cluster. Q1 and Q2 each currently use 50% of the actual cluster's resources and there is no actual space available. If we use the current method to get headroom, the CapacityScheduler thinks there are still available resources for users in Q1, but those resources have already been used by Q2. > If the CapacityScheduler has a hierarchical queue structure, it may report an incorrect queueMaxCap. Here is an example: rootQueue has two children, L1ParentQueue1 (allowed to use up to 80% of its parent) and L1ParentQueue2 (allowed to use 20% in minimum of its parent); L1ParentQueue1 in turn has two children, L2LeafQueue1 (50% of its parent) and L2LeafQueue2 (50% of its parent in minimum). > When we calculate the headroom of a user in L2LeafQueue2, the current method thinks L2LeafQueue2 can use 40% (80% * 50%) of the actual rootQueue resources. However, without checking L1ParentQueue1, we cannot be sure: it is possible that L1ParentQueue2 has used 40% of the rootQueue resources right now, in which case L2LeafQueue2 can only use 30% (60% * 50%). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2377) Localization exception stack traces are not passed as diagnostic info
[ https://issues.apache.org/jira/browse/YARN-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated YARN-2377: Attachment: YARN-2377.v01.patch v01 for review. With this you get a more actionable stack trace: {code} 14/07/31 17:46:39 INFO mapreduce.Job: Job job_1406853387336_0001 failed with state FAILED due to: Application application_1406853387336_0001 failed 2 times due to AM Container for appattempt_1406853387336_0001_02 exited with exitCode: -1000 For more detailed output, check application tracking page:http://tw-mbp-gshegalov:8088/proxy/application_1406853387336_0001/Then, click on links to logs of each attempt. Diagnostics: java.net.UnknownHostException: ha-nn-uri-0 java.lang.IllegalArgumentException: java.net.UnknownHostException: ha-nn-uri-0 at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:373) at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:260) at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:153) at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:607) at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:552) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:139) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2590) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2624) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2606) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:248) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:60) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:356) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:354) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:353) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:59) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:695) Caused by: java.net.UnknownHostException: ha-nn-uri-0 ... 
29 more Caused by: ha-nn-uri-0 java.lang.IllegalArgumentException: java.net.UnknownHostException: ha-nn-uri-0 at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:373) at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:260) at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:153) at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:607) at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:552) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:139) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2590) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2624) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2606) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:248) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:60) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:356) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:354) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:353) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:59) at java.util.concurrent.FutureTask$Sync.innerRun
[jira] [Commented] (YARN-2372) There are Chinese Characters in the FairScheduler's document
[ https://issues.apache.org/jira/browse/YARN-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081777#comment-14081777 ] Fengdong Yu commented on YARN-2372: --- I cannot find any more places on this issue by now. Thanks. > There are Chinese Characters in the FairScheduler's document > > > Key: YARN-2372 > URL: https://issues.apache.org/jira/browse/YARN-2372 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.1 >Reporter: Fengdong Yu >Assignee: Fengdong Yu >Priority: Minor > Attachments: YARN-2372.patch, YARN-2372.patch, YARN-2372.patch, > YARN-2372.patch, YARN-2372.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2377) Localization exception stack traces are not passed as diagnostic info
[ https://issues.apache.org/jira/browse/YARN-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated YARN-2377: Description: In the Localizer log one can only see this kind of message {code} 14/07/31 10:29:00 INFO localizer.ResourceLocalizationService: DEBUG: FAILED { hdfs://ha-nn-uri-0:8020/tmp/hadoop-yarn/staging/gshegalov/.staging/job_1406825443306_0004/job.jar, 1406827248944, PATTERN, (?:classes/|lib/).* }, java.net.UnknownHos tException: ha-nn-uri-0 {code} And then only {{ java.net.UnknownHostException: ha-nn-uri-0}} message is propagated as diagnostics. was: In the Localizer log one can only see this kind of message {code} 14/07/31 10:29:00 INFO localizer.ResourceLocalizationService: DEBUG: FAILED { hdfs://ha-nn-uri-0:8020/tmp/hadoop-yarn/staging/gshegalov/.staging/job_1406825443306_0004/job.jar, 1406827248944, PATTERN, (?:classes/|lib/).* }, java.net.UnknownHos tException: ha-nn-uri-0 {code} And then only {{ java.net.UnknownHos tException: ha-nn-uri-0}} message is propagated as diagnostics. > Localization exception stack traces are not passed as diagnostic info > - > > Key: YARN-2377 > URL: https://issues.apache.org/jira/browse/YARN-2377 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > > In the Localizer log one can only see this kind of message > {code} > 14/07/31 10:29:00 INFO localizer.ResourceLocalizationService: DEBUG: FAILED { > hdfs://ha-nn-uri-0:8020/tmp/hadoop-yarn/staging/gshegalov/.staging/job_1406825443306_0004/job.jar, > 1406827248944, PATTERN, (?:classes/|lib/).* }, java.net.UnknownHos > tException: ha-nn-uri-0 > {code} > And then only {{ java.net.UnknownHostException: ha-nn-uri-0}} message is > propagated as diagnostics. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2377) Localization exception stack traces are not passed as diagnostic info
[ https://issues.apache.org/jira/browse/YARN-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated YARN-2377: Description: In the Localizer log one can only see this kind of message {code} 14/07/31 10:29:00 INFO localizer.ResourceLocalizationService: DEBUG: FAILED { hdfs://ha-nn-uri-0:8020/tmp/hadoop-yarn/staging/gshegalov/.staging/job_1406825443306_0004/job.jar, 1406827248944, PATTERN, (?:classes/|lib/).* }, java.net.UnknownHos tException: ha-nn-uri-0 {code} And then only {{ java.net.UnknownHos tException: ha-nn-uri-0}} message is propagated as diagnostics. was: In the Localizer log one can only see this kind of message {code} 14/07/31 10:29:00 INFO localizer.ResourceLocalizationService: DEBUG: FAILED { hdfs://ha-nn-uri-0:8020/tmp/hadoop-yarn/staging/gshegalov/.staging/job_1406825443306_0004/job.jar, 1406827248944, PATTERN, (?:classes/|lib/).* }, java.net.UnknownHos tException: ha-nn-uri-0 {code} And then onlt {{ java.net.UnknownHos tException: ha-nn-uri-0}} message is propagated as diagnostics. > Localization exception stack traces are not passed as diagnostic info > - > > Key: YARN-2377 > URL: https://issues.apache.org/jira/browse/YARN-2377 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > > In the Localizer log one can only see this kind of message > {code} > 14/07/31 10:29:00 INFO localizer.ResourceLocalizationService: DEBUG: FAILED { > hdfs://ha-nn-uri-0:8020/tmp/hadoop-yarn/staging/gshegalov/.staging/job_1406825443306_0004/job.jar, > 1406827248944, PATTERN, (?:classes/|lib/).* }, java.net.UnknownHos > tException: ha-nn-uri-0 > {code} > And then only {{ java.net.UnknownHos tException: ha-nn-uri-0}} message is > propagated as diagnostics. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2377) Localization exception stack traces are not passed as diagnostic info
Gera Shegalov created YARN-2377: --- Summary: Localization exception stack traces are not passed as diagnostic info Key: YARN-2377 URL: https://issues.apache.org/jira/browse/YARN-2377 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov In the Localizer log one can only see this kind of message {code} 14/07/31 10:29:00 INFO localizer.ResourceLocalizationService: DEBUG: FAILED { hdfs://ha-nn-uri-0:8020/tmp/hadoop-yarn/staging/gshegalov/.staging/job_1406825443306_0004/job.jar, 1406827248944, PATTERN, (?:classes/|lib/).* }, java.net.UnknownHos tException: ha-nn-uri-0 {code} And then onlt {{ java.net.UnknownHos tException: ha-nn-uri-0}} message is propagated as diagnostics. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2069) CS queue level preemption should respect user-limits
[ https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-2069: Attachment: YARN-2069-trunk-9.patch Fixing findbug warning and adding one more test case for no user limit. Thanks, Mayank > CS queue level preemption should respect user-limits > > > Key: YARN-2069 > URL: https://issues.apache.org/jira/browse/YARN-2069 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Mayank Bansal > Attachments: YARN-2069-trunk-1.patch, YARN-2069-trunk-2.patch, > YARN-2069-trunk-3.patch, YARN-2069-trunk-4.patch, YARN-2069-trunk-5.patch, > YARN-2069-trunk-6.patch, YARN-2069-trunk-7.patch, YARN-2069-trunk-8.patch, > YARN-2069-trunk-9.patch > > > This is different from (even if related to, and likely share code with) > YARN-2113. > YARN-2113 focuses on making sure that even if queue has its guaranteed > capacity, it's individual users are treated in-line with their limits > irrespective of when they join in. > This JIRA is about respecting user-limits while preempting containers to > balance queue capacities. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2008) CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure
[ https://issues.apache.org/jira/browse/YARN-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081718#comment-14081718 ] Wangda Tan commented on YARN-2008: -- Hi [~cwelch], I found the patch you updated is identical to *.3.patch; could you please check? Thanks > CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure > > Key: YARN-2008 > URL: https://issues.apache.org/jira/browse/YARN-2008 > Project: Hadoop YARN > Issue Type: Sub-task > Affects Versions: 2.3.0 > Reporter: Chen He > Assignee: Craig Welch > Attachments: YARN-2008.1.patch, YARN-2008.2.patch, YARN-2008.3.patch, YARN-2008.4.patch > > > Suppose there are two queues, Q1 and Q2, both allowed to use 100% of the actual resources in the cluster. Q1 and Q2 each currently use 50% of the actual cluster's resources and there is no actual space available. If we use the current method to get headroom, the CapacityScheduler thinks there are still available resources for users in Q1, but those resources have already been used by Q2. > If the CapacityScheduler has a hierarchical queue structure, it may report an incorrect queueMaxCap. Here is an example: rootQueue has two children, L1ParentQueue1 (allowed to use up to 80% of its parent) and L1ParentQueue2 (allowed to use 20% in minimum of its parent); L1ParentQueue1 in turn has two children, L2LeafQueue1 (50% of its parent) and L2LeafQueue2 (50% of its parent in minimum). > When we calculate the headroom of a user in L2LeafQueue2, the current method thinks L2LeafQueue2 can use 40% (80% * 50%) of the actual rootQueue resources. However, without checking L1ParentQueue1, we cannot be sure: it is possible that L1ParentQueue2 has used 40% of the rootQueue resources right now, in which case L2LeafQueue2 can only use 30% (60% * 50%). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (YARN-2376) Too many threads blocking on the global JobTracker lock from getJobCounters, optimize getJobCounters to release global JobTracker lock before access the per job counter i
[ https://issues.apache.org/jira/browse/YARN-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu resolved YARN-2376. - Resolution: Duplicate > Too many threads blocking on the global JobTracker lock from getJobCounters, optimize getJobCounters to release global JobTracker lock before access the per job counter in JobInProgress > > Key: YARN-2376 > URL: https://issues.apache.org/jira/browse/YARN-2376 > Project: Hadoop YARN > Issue Type: Improvement > Reporter: zhihai xu > Assignee: zhihai xu > Attachments: YARN-2376.000.patch > > > Too many threads block on the global JobTracker lock from getJobCounters; optimize getJobCounters to release the global JobTracker lock before accessing the per-job counters in JobInProgress. Many JobClients may call getJobCounters on the JobTracker at the same time, and the current code locks the JobTracker, blocking all of those threads while counters are fetched from JobInProgress. It is better to unlock the JobTracker when getting counters from JobInProgress (job.getCounters(counters)), so all the threads can run in parallel, each accessing its own job's counters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2376) Too many threads blocking on the global JobTracker lock from getJobCounters, optimize getJobCounters to release global JobTracker lock before access the per job counter in
[ https://issues.apache.org/jira/browse/YARN-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-2376: Attachment: YARN-2376.000.patch > Too many threads blocking on the global JobTracker lock from getJobCounters, optimize getJobCounters to release global JobTracker lock before access the per job counter in JobInProgress > > Key: YARN-2376 > URL: https://issues.apache.org/jira/browse/YARN-2376 > Project: Hadoop YARN > Issue Type: Improvement > Reporter: zhihai xu > Assignee: zhihai xu > Attachments: YARN-2376.000.patch > > > Too many threads block on the global JobTracker lock from getJobCounters; optimize getJobCounters to release the global JobTracker lock before accessing the per-job counters in JobInProgress. Many JobClients may call getJobCounters on the JobTracker at the same time, and the current code locks the JobTracker, blocking all of those threads while counters are fetched from JobInProgress. It is better to unlock the JobTracker when getting counters from JobInProgress (job.getCounters(counters)), so all the threads can run in parallel, each accessing its own job's counters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2376) Too many threads blocking on the global JobTracker lock from getJobCounters, optimize getJobCounters to release global JobTracker lock before access the per job counter in
zhihai xu created YARN-2376: --- Summary: Too many threads blocking on the global JobTracker lock from getJobCounters, optimize getJobCounters to release global JobTracker lock before access the per job counter in JobInProgress Key: YARN-2376 URL: https://issues.apache.org/jira/browse/YARN-2376 Project: Hadoop YARN Issue Type: Improvement Reporter: zhihai xu Assignee: zhihai xu Too many threads block on the global JobTracker lock from getJobCounters; optimize getJobCounters to release the global JobTracker lock before accessing the per-job counters in JobInProgress. Many JobClients may call getJobCounters on the JobTracker at the same time, and the current code locks the JobTracker, blocking all of those threads while counters are fetched from JobInProgress. It is better to unlock the JobTracker when getting counters from JobInProgress (job.getCounters(counters)), so all the threads can run in parallel, each accessing its own job's counters. -- This message was sent by Atlassian JIRA (v6.2#6252)
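The optimization described above boils down to narrowing the synchronized region: hold the global lock only for the job lookup, then read counters under the per-job lock. A sketch with simplified stand-ins for JobTracker and JobInProgress (not the attached patch):
{code}
import java.util.HashMap;
import java.util.Map;

public class GetJobCountersSketch {
  static class Counters {}
  static class JobInProgress {
    synchronized void getCounters(Counters into) { /* per-job lock only */ }
  }

  static class JobTracker {
    private final Map<String, JobInProgress> jobs = new HashMap<>();

    Counters getJobCounters(String jobId) {
      JobInProgress job;
      synchronized (this) {          // global lock held only for the lookup
        job = jobs.get(jobId);
      }
      Counters counters = new Counters();
      if (job != null) {
        job.getCounters(counters);   // outside the global lock; jobs proceed in parallel
      }
      return counters;
    }
  }
}
{code}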
[jira] [Commented] (YARN-1994) Expose YARN/MR endpoints on multiple interfaces
[ https://issues.apache.org/jira/browse/YARN-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081535#comment-14081535 ] Hudson commented on YARN-1994: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5992 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5992/]) YARN-1994. Expose YARN/MR endpoints on multiple interfaces. Contributed by Craig Welch, Milan Potocnik,and Arpit Agarwal (xgong: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1614981) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/AppContext.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/MRClientService.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockAppContext.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/jobhistory/JHAdminConfig.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRWebAppUtil.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryClientService.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistory.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/server/HSAdminServer.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryClientService.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/WebServer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/C
[jira] [Commented] (YARN-1994) Expose YARN/MR endpoints on multiple interfaces
[ https://issues.apache.org/jira/browse/YARN-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081508#comment-14081508 ] Xuan Gong commented on YARN-1994: - Committed to trunk and branch-2. Thanks, Craig, Arpit and Milan > Expose YARN/MR endpoints on multiple interfaces > --- > > Key: YARN-1994 > URL: https://issues.apache.org/jira/browse/YARN-1994 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager, webapp >Affects Versions: 2.4.0 >Reporter: Arpit Agarwal >Assignee: Craig Welch > Fix For: 2.6.0 > > Attachments: YARN-1994.0.patch, YARN-1994.1.patch, > YARN-1994.11.patch, YARN-1994.11.patch, YARN-1994.12.patch, > YARN-1994.13.patch, YARN-1994.14.patch, YARN-1994.15-branch2.patch, > YARN-1994.15.patch, YARN-1994.2.patch, YARN-1994.3.patch, YARN-1994.4.patch, > YARN-1994.5.patch, YARN-1994.6.patch, YARN-1994.7.patch > > > YARN and MapReduce daemons currently do not support specifying a wildcard > address for the server endpoints. This prevents the endpoints from being > accessible from all interfaces on a multihomed machine. > Note that if we do specify INADDR_ANY for any of the options, it will break > clients as they will attempt to connect to 0.0.0.0. We need a solution that > allows specifying a hostname or IP-address for clients while requesting > wildcard bind for the servers. > (List of endpoints is in a comment below) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1994) Expose YARN/MR endpoints on multiple interfaces
[ https://issues.apache.org/jira/browse/YARN-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081505#comment-14081505 ] Xuan Gong commented on YARN-1994: - +1 LGTM. Thanks Craig for providing the branch-2 patch > Expose YARN/MR endpoints on multiple interfaces > --- > > Key: YARN-1994 > URL: https://issues.apache.org/jira/browse/YARN-1994 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager, webapp >Affects Versions: 2.4.0 >Reporter: Arpit Agarwal >Assignee: Craig Welch > Attachments: YARN-1994.0.patch, YARN-1994.1.patch, > YARN-1994.11.patch, YARN-1994.11.patch, YARN-1994.12.patch, > YARN-1994.13.patch, YARN-1994.14.patch, YARN-1994.15-branch2.patch, > YARN-1994.15.patch, YARN-1994.2.patch, YARN-1994.3.patch, YARN-1994.4.patch, > YARN-1994.5.patch, YARN-1994.6.patch, YARN-1994.7.patch > > > YARN and MapReduce daemons currently do not support specifying a wildcard > address for the server endpoints. This prevents the endpoints from being > accessible from all interfaces on a multihomed machine. > Note that if we do specify INADDR_ANY for any of the options, it will break > clients as they will attempt to connect to 0.0.0.0. We need a solution that > allows specifying a hostname or IP-address for clients while requesting > wildcard bind for the servers. > (List of endpoints is in a comment below) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1994) Expose YARN/MR endpoints on multiple interfaces
[ https://issues.apache.org/jira/browse/YARN-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-1994: -- Attachment: YARN-1994.15-branch2.patch Adding a version of the patch for branch-2, because the one from trunk doesn't cleanly apply. Minor changes to deal with some other uncommited work from trunk in a unit test. This patch will fail when applied to trunk most likely, that can be ignored. > Expose YARN/MR endpoints on multiple interfaces > --- > > Key: YARN-1994 > URL: https://issues.apache.org/jira/browse/YARN-1994 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager, webapp >Affects Versions: 2.4.0 >Reporter: Arpit Agarwal >Assignee: Craig Welch > Attachments: YARN-1994.0.patch, YARN-1994.1.patch, > YARN-1994.11.patch, YARN-1994.11.patch, YARN-1994.12.patch, > YARN-1994.13.patch, YARN-1994.14.patch, YARN-1994.15-branch2.patch, > YARN-1994.15.patch, YARN-1994.2.patch, YARN-1994.3.patch, YARN-1994.4.patch, > YARN-1994.5.patch, YARN-1994.6.patch, YARN-1994.7.patch > > > YARN and MapReduce daemons currently do not support specifying a wildcard > address for the server endpoints. This prevents the endpoints from being > accessible from all interfaces on a multihomed machine. > Note that if we do specify INADDR_ANY for any of the options, it will break > clients as they will attempt to connect to 0.0.0.0. We need a solution that > allows specifying a hostname or IP-address for clients while requesting > wildcard bind for the servers. > (List of endpoints is in a comment below) -- This message was sent by Atlassian JIRA (v6.2#6252)
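The gist of YARN-1994 is separating the address a server binds to from the address clients are told to use. The sketch below shows that split; the property names and default value are assumptions for illustration, not a statement of the final configuration keys.
{code}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;

public class BindHostExample {
  static InetSocketAddress serverBindAddress(Configuration conf) {
    // Address clients are told to use, e.g. "rm-host:8032".
    InetSocketAddress clientAddr = NetUtils.createSocketAddr(
        conf.get("yarn.resourcemanager.address", "0.0.0.0:8032"));
    // Optional bind-host override, e.g. "0.0.0.0" on a multihomed machine.
    String bindHost = conf.get("yarn.resourcemanager.bind-host");
    return bindHost == null
        ? clientAddr
        : new InetSocketAddress(bindHost, clientAddr.getPort());
  }
}
{code}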
[jira] [Commented] (YARN-2304) Test*WebServices* fails intermittently
[ https://issues.apache.org/jira/browse/YARN-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081487#comment-14081487 ] Tsuyoshi OZAWA commented on YARN-2304: -- [~zjshen], thank you for notifying. After the work by [~jlowe], we don't see the test failure anymore. Closed as a fixed problem. > Test*WebServices* fails intermittently > -- > > Key: YARN-2304 > URL: https://issues.apache.org/jira/browse/YARN-2304 > Project: Hadoop YARN > Issue Type: Test >Reporter: Tsuyoshi OZAWA > Attachments: test-failure-log-RMWeb.txt > > > TestNMWebService, TestRMWebService, and TestAMWebService get failed with > address already get bind. -- This message was sent by Atlassian JIRA (v6.2#6252)
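For reference, a common way to avoid "address already in use" flakiness in such tests is to bind to an ephemeral port and read back what the OS assigned. This is a general sketch, not the specific change referenced in the comment above:
{code}
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortExample {
  static int findFreePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) { // port 0 = let the OS choose
      return socket.getLocalPort();
    }
  }
}
{code}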
[jira] [Resolved] (YARN-2304) Test*WebServices* fails intermittently
[ https://issues.apache.org/jira/browse/YARN-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA resolved YARN-2304. -- Resolution: Fixed > Test*WebServices* fails intermittently > -- > > Key: YARN-2304 > URL: https://issues.apache.org/jira/browse/YARN-2304 > Project: Hadoop YARN > Issue Type: Test >Reporter: Tsuyoshi OZAWA > Attachments: test-failure-log-RMWeb.txt > > > TestNMWebService, TestRMWebService, and TestAMWebService get failed with > address already get bind. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2212) ApplicationMaster needs to find a way to update the AMRMToken periodically
[ https://issues.apache.org/jira/browse/YARN-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-2212: Attachment: YARN-2212.5.rebase.patch > ApplicationMaster needs to find a way to update the AMRMToken periodically > -- > > Key: YARN-2212 > URL: https://issues.apache.org/jira/browse/YARN-2212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-2212.1.patch, YARN-2212.2.patch, > YARN-2212.3.1.patch, YARN-2212.3.patch, YARN-2212.4.patch, YARN-2212.5.patch, > YARN-2212.5.patch, YARN-2212.5.rebase.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2008) CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure
[ https://issues.apache.org/jira/browse/YARN-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081382#comment-14081382 ] Hadoop QA commented on YARN-2008: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658970/YARN-2008.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4500//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4500//console This message is automatically generated. > CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure > > Key: YARN-2008 > URL: https://issues.apache.org/jira/browse/YARN-2008 > Project: Hadoop YARN > Issue Type: Sub-task > Affects Versions: 2.3.0 > Reporter: Chen He > Assignee: Craig Welch > Attachments: YARN-2008.1.patch, YARN-2008.2.patch, YARN-2008.3.patch, YARN-2008.4.patch > > > Suppose there are two queues, Q1 and Q2, both allowed to use 100% of the actual resources in the cluster. Q1 and Q2 each currently use 50% of the actual cluster's resources and there is no actual space available. If we use the current method to get headroom, the CapacityScheduler thinks there are still available resources for users in Q1, but those resources have already been used by Q2. > If the CapacityScheduler has a hierarchical queue structure, it may report an incorrect queueMaxCap. Here is an example: rootQueue has two children, L1ParentQueue1 (allowed to use up to 80% of its parent) and L1ParentQueue2 (allowed to use 20% in minimum of its parent); L1ParentQueue1 in turn has two children, L2LeafQueue1 (50% of its parent) and L2LeafQueue2 (50% of its parent in minimum). > When we calculate the headroom of a user in L2LeafQueue2, the current method thinks L2LeafQueue2 can use 40% (80% * 50%) of the actual rootQueue resources. However, without checking L1ParentQueue1, we cannot be sure: it is possible that L1ParentQueue2 has used 40% of the rootQueue resources right now, in which case L2LeafQueue2 can only use 30% (60% * 50%). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1994) Expose YARN/MR endpoints on multiple interfaces
[ https://issues.apache.org/jira/browse/YARN-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081341#comment-14081341 ] Milan Potocnik commented on YARN-1994: -- [~cwelch] looks good, thanks for the effort! +1 from me > Expose YARN/MR endpoints on multiple interfaces > --- > > Key: YARN-1994 > URL: https://issues.apache.org/jira/browse/YARN-1994 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager, webapp >Affects Versions: 2.4.0 >Reporter: Arpit Agarwal >Assignee: Craig Welch > Attachments: YARN-1994.0.patch, YARN-1994.1.patch, > YARN-1994.11.patch, YARN-1994.11.patch, YARN-1994.12.patch, > YARN-1994.13.patch, YARN-1994.14.patch, YARN-1994.15.patch, > YARN-1994.2.patch, YARN-1994.3.patch, YARN-1994.4.patch, YARN-1994.5.patch, > YARN-1994.6.patch, YARN-1994.7.patch > > > YARN and MapReduce daemons currently do not support specifying a wildcard > address for the server endpoints. This prevents the endpoints from being > accessible from all interfaces on a multihomed machine. > Note that if we do specify INADDR_ANY for any of the options, it will break > clients as they will attempt to connect to 0.0.0.0. We need a solution that > allows specifying a hostname or IP-address for clients while requesting > wildcard bind for the servers. > (List of endpoints is in a comment below) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2069) CS queue level preemption should respect user-limits
[ https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081331#comment-14081331 ] Hadoop QA commented on YARN-2069: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658971/YARN-2069-trunk-8.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4499//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/4499//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4499//console This message is automatically generated. > CS queue level preemption should respect user-limits > > > Key: YARN-2069 > URL: https://issues.apache.org/jira/browse/YARN-2069 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Mayank Bansal > Attachments: YARN-2069-trunk-1.patch, YARN-2069-trunk-2.patch, > YARN-2069-trunk-3.patch, YARN-2069-trunk-4.patch, YARN-2069-trunk-5.patch, > YARN-2069-trunk-6.patch, YARN-2069-trunk-7.patch, YARN-2069-trunk-8.patch > > > This is different from (even if related to, and likely share code with) > YARN-2113. > YARN-2113 focuses on making sure that even if queue has its guaranteed > capacity, it's individual users are treated in-line with their limits > irrespective of when they join in. > This JIRA is about respecting user-limits while preempting containers to > balance queue capacities. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2375) Allow enabling/disabling timeline server per framework
Jonathan Eagles created YARN-2375: - Summary: Allow enabling/disabling timeline server per framework Key: YARN-2375 URL: https://issues.apache.org/jira/browse/YARN-2375 Project: Hadoop YARN Issue Type: Improvement Reporter: Jonathan Eagles -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2033) Investigate merging generic-history into the Timeline Store
[ https://issues.apache.org/jira/browse/YARN-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081227#comment-14081227 ] Hadoop QA commented on YARN-2033: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658949/YARN-2033_ALL.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 20 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4498//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4498//console This message is automatically generated. > Investigate merging generic-history into the Timeline Store > --- > > Key: YARN-2033 > URL: https://issues.apache.org/jira/browse/YARN-2033 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: ProposalofStoringYARNMetricsintotheTimelineStore.pdf, > YARN-2033.1.patch, YARN-2033.2.patch, YARN-2033.3.patch, YARN-2033.4.patch, > YARN-2033.Prototype.patch, YARN-2033_ALL.1.patch, YARN-2033_ALL.2.patch, > YARN-2033_ALL.3.patch, YARN-2033_ALL.4.patch > > > Having two different stores isn't amicable to generic insights on what's > happening with applications. This is to investigate porting generic-history > into the Timeline Store. > One goal is to try and retain most of the client side interfaces as close to > what we have today. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1707) Making the CapacityScheduler more dynamic
[ https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081234#comment-14081234 ] Carlo Curino commented on YARN-1707: Agreed on all of the above. {quote} I think for moving application across queue is not a ReservationSystem specific change. I would suggest to check it will not violate restrictions in target queue before moving it. {quote} This makes sense; we should compile a list of invariants to check for (I have a few in mind, but feedback is likely useful). Thanks, Carlo > Making the CapacityScheduler more dynamic > - > > Key: YARN-1707 > URL: https://issues.apache.org/jira/browse/YARN-1707 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Carlo Curino >Assignee: Carlo Curino > Labels: capacity-scheduler > Attachments: YARN-1707.patch > > > The CapacityScheduler is rather static at the moment, and refreshqueue > provides a rather heavy-handed way to reconfigure it. Moving towards > long-running services (tracked in YARN-896) and to enable more advanced > admission control and resource parcelling we need to make the > CapacityScheduler more dynamic. This is instrumental to the umbrella jira > YARN-1051. > Concretely, this requires the following changes: > * create queues dynamically > * destroy queues dynamically > * dynamically change queue parameters (e.g., capacity) > * modify refreshqueue validation to enforce sum(child.getCapacity())<= 100% > instead of ==100% > We limit this to LeafQueues. -- This message was sent by Atlassian JIRA (v6.2#6252)
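Editor's note: one concrete item in the list above is relaxing the refreshqueue validation from an exact-sum check to an at-most-sum check. Below is a small, self-contained sketch of such a check; the class and method names and the epsilon handling are assumptions for illustration, not the actual CapacityScheduler validation code.
{code}
import java.util.List;

class QueueCapacityValidation {
  /**
   * Relaxed check sketched from the YARN-1707 description: children of a
   * parent queue may sum to at most 100% of the parent instead of exactly 100%.
   */
  static void validateChildCapacities(List<Float> childCapacitiesPercent) {
    float sum = 0f;
    for (float c : childCapacitiesPercent) {
      sum += c;
    }
    if (sum > 100.0f + 0.001f) {   // small epsilon for float rounding
      throw new IllegalArgumentException(
          "child queue capacities sum to " + sum + "%, which exceeds 100%");
    }
  }
}
{code}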
[jira] [Commented] (YARN-1994) Expose YARN/MR endpoints on multiple interfaces
[ https://issues.apache.org/jira/browse/YARN-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081207#comment-14081207 ] Craig Welch commented on YARN-1994: --- [~mipoto] Can you take a look at the latest patch? > Expose YARN/MR endpoints on multiple interfaces > --- > > Key: YARN-1994 > URL: https://issues.apache.org/jira/browse/YARN-1994 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager, webapp >Affects Versions: 2.4.0 >Reporter: Arpit Agarwal >Assignee: Craig Welch > Attachments: YARN-1994.0.patch, YARN-1994.1.patch, > YARN-1994.11.patch, YARN-1994.11.patch, YARN-1994.12.patch, > YARN-1994.13.patch, YARN-1994.14.patch, YARN-1994.15.patch, > YARN-1994.2.patch, YARN-1994.3.patch, YARN-1994.4.patch, YARN-1994.5.patch, > YARN-1994.6.patch, YARN-1994.7.patch > > > YARN and MapReduce daemons currently do not support specifying a wildcard > address for the server endpoints. This prevents the endpoints from being > accessible from all interfaces on a multihomed machine. > Note that if we do specify INADDR_ANY for any of the options, it will break > clients as they will attempt to connect to 0.0.0.0. We need a solution that > allows specifying a hostname or IP-address for clients while requesting > wildcard bind for the servers. > (List of endpoints is in a comment below) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1994) Expose YARN/MR endpoints on multiple interfaces
[ https://issues.apache.org/jira/browse/YARN-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081203#comment-14081203 ] Xuan Gong commented on YARN-1994: - [~mipoto] Do you have any other comments for this ? > Expose YARN/MR endpoints on multiple interfaces > --- > > Key: YARN-1994 > URL: https://issues.apache.org/jira/browse/YARN-1994 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager, webapp >Affects Versions: 2.4.0 >Reporter: Arpit Agarwal >Assignee: Craig Welch > Attachments: YARN-1994.0.patch, YARN-1994.1.patch, > YARN-1994.11.patch, YARN-1994.11.patch, YARN-1994.12.patch, > YARN-1994.13.patch, YARN-1994.14.patch, YARN-1994.15.patch, > YARN-1994.2.patch, YARN-1994.3.patch, YARN-1994.4.patch, YARN-1994.5.patch, > YARN-1994.6.patch, YARN-1994.7.patch > > > YARN and MapReduce daemons currently do not support specifying a wildcard > address for the server endpoints. This prevents the endpoints from being > accessible from all interfaces on a multihomed machine. > Note that if we do specify INADDR_ANY for any of the options, it will break > clients as they will attempt to connect to 0.0.0.0. We need a solution that > allows specifying a hostname or IP-address for clients while requesting > wildcard bind for the servers. > (List of endpoints is in a comment below) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2069) CS queue level preemption should respect user-limits
[ https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-2069: Attachment: YARN-2069-trunk-8.patch > CS queue level preemption should respect user-limits > > > Key: YARN-2069 > URL: https://issues.apache.org/jira/browse/YARN-2069 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Mayank Bansal > Attachments: YARN-2069-trunk-1.patch, YARN-2069-trunk-2.patch, > YARN-2069-trunk-3.patch, YARN-2069-trunk-4.patch, YARN-2069-trunk-5.patch, > YARN-2069-trunk-6.patch, YARN-2069-trunk-7.patch, YARN-2069-trunk-8.patch > > > This is different from (even if related to, and likely share code with) > YARN-2113. > YARN-2113 focuses on making sure that even if queue has its guaranteed > capacity, it's individual users are treated in-line with their limits > irrespective of when they join in. > This JIRA is about respecting user-limits while preempting containers to > balance queue capacities. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2069) CS queue level preemption should respect user-limits
[ https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081193#comment-14081193 ] Mayank Bansal commented on YARN-2069: - Hi [~wangda] , Thanks for your review comments. Updating the patch with the fix. Thanks, Mayank > CS queue level preemption should respect user-limits > > > Key: YARN-2069 > URL: https://issues.apache.org/jira/browse/YARN-2069 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Mayank Bansal > Attachments: YARN-2069-trunk-1.patch, YARN-2069-trunk-2.patch, > YARN-2069-trunk-3.patch, YARN-2069-trunk-4.patch, YARN-2069-trunk-5.patch, > YARN-2069-trunk-6.patch, YARN-2069-trunk-7.patch, YARN-2069-trunk-8.patch > > > This is different from (even if related to, and likely share code with) > YARN-2113. > YARN-2113 focuses on making sure that even if queue has its guaranteed > capacity, it's individual users are treated in-line with their limits > irrespective of when they join in. > This JIRA is about respecting user-limits while preempting containers to > balance queue capacities. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2008) CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure
[ https://issues.apache.org/jira/browse/YARN-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-2008: -- Attachment: YARN-2008.4.patch Some additional tests for direct siblings > CapacityScheduler may report incorrect queueMaxCap if there is hierarchy > queue structure > - > > Key: YARN-2008 > URL: https://issues.apache.org/jira/browse/YARN-2008 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.3.0 >Reporter: Chen He >Assignee: Craig Welch > Attachments: YARN-2008.1.patch, YARN-2008.2.patch, YARN-2008.3.patch, > YARN-2008.4.patch > > > If there are two queues, both allowed to use 100% of the actual resources in > the cluster. Q1 and Q2 currently use 50% of actual cluster's resources and > there is not actual space available. If we use current method to get > headroom, CapacityScheduler thinks there are still available resources for > users in Q1 but they have been used by Q2. > If the CapacityScheduelr has a hierarchy queue structure, it may report > incorrect queueMaxCap. Here is a example > ||||rootQueue|| || > | | / | > \ | > | L1ParentQueue1 | | > L1ParentQueue2| > | (allowed to use up 80% of its parent)| | (allowed to use 20% > in minimum of its parent)| > |/ | \ || > | L2LeafQueue1 |L2LeafQueue2 | | > |(50% of its parent) | (50% of its parent in minimum) | | > When we calculate headroom of a user in L2LeafQueue2, current method will > think L2LeafQueue2 can use 40% (80%*50%) of actual rootQueue resources. > However, without checking L1ParentQueue1, we are not sure. It is possible > that L1ParentQueue2 have used 40% of rootQueue resources right now. Actually, > L2LeafQueue2 can only use 30% (60%*50%). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2283) RM failed to release the AM container
[ https://issues.apache.org/jira/browse/YARN-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081170#comment-14081170 ] Sunil G commented on YARN-2283: --- Thank you [~jlowe]. Yes, I have taken the thread dump and could see ThreadPoolExecutor is still there. I have applied patch and verified the same, it is not creating the same problem. Thank you. > RM failed to release the AM container > - > > Key: YARN-2283 > URL: https://issues.apache.org/jira/browse/YARN-2283 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 > Environment: NM1: AM running > NM2: Map task running > mapreduce.map.maxattempts=1 >Reporter: Nishan Shetty >Priority: Critical > > During container stability test i faced this problem > While job is running map task got killed > Observe that eventhough application is FAILED MRAppMaster process is running > till timeout because RM did not release the AM container > {code} > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: > container_1405318134611_0002_01_05 Container Transitioned from RUNNING to > COMPLETED > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: > Completed container: container_1405318134611_0002_01_05 in state: > COMPLETED event:FINISHED > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=testos > OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS > APPID=application_1405318134611_0002 > CONTAINERID=container_1405318134611_0002_01_05 > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore: > Finish information of container container_1405318134611_0002_01_05 is > written > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter: > Stored the finish data of container container_1405318134611_0002_01_05 > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: > Released container container_1405318134611_0002_01_05 of capacity > on host HOST-10-18-40-153:45026, which currently has > 1 containers, used and > available, release resources=true > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > default used= numContainers=1 user=testos > user-resources= > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > completedContainer container=Container: [ContainerId: > container_1405318134611_0002_01_05, NodeId: HOST-10-18-40-153:45026, > NodeHttpAddress: HOST-10-18-40-153:45025, Resource: , > Priority: 5, Token: Token { kind: ContainerToken, service: 10.18.40.153:45026 > }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, > usedResources=, usedCapacity=0.25, > absoluteUsedCapacity=0.25, numApps=1, numContainers=1 cluster= vCores:8> > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: > completedContainer queue=root usedCapacity=0.25 absoluteUsedCapacity=0.25 > used= cluster= > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: > Re-sorting completed queue: root.default stats: default: capacity=1.0, > absoluteCapacity=1.0, usedResources=, > usedCapacity=0.25, absoluteUsedCapacity=0.25, numApps=1, 
numContainers=1 > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: > Application attempt appattempt_1405318134611_0002_01 released container > container_1405318134611_0002_01_05 on node: host: HOST-10-18-40-153:45026 > #containers=1 available=6144 used=2048 with event: FINISHED > 2014-07-14 14:43:34,924 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > Updating application attempt appattempt_1405318134611_0002_01 with final > state: FINISHING > 2014-07-14 14:43:34,924 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1405318134611_0002_01 State change from RUNNING to FINAL_SAVING > 2014-07-14 14:43:34,924 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating > application application_1405318134611_0002 with final state: FINISHING > 2014-07-14 14:43:34,947 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > Watcher event type: NodeDataChanged with state:SyncConnected for > path:/rmstore/ZKRMStateRoot/RMAppRoot/application_1405318134611_0002/app
[jira] [Updated] (YARN-2033) Investigate merging generic-history into the Timeline Store
[ https://issues.apache.org/jira/browse/YARN-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2033: -- Attachment: YARN-2033_ALL.4.patch > Investigate merging generic-history into the Timeline Store > --- > > Key: YARN-2033 > URL: https://issues.apache.org/jira/browse/YARN-2033 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: ProposalofStoringYARNMetricsintotheTimelineStore.pdf, > YARN-2033.1.patch, YARN-2033.2.patch, YARN-2033.3.patch, YARN-2033.4.patch, > YARN-2033.Prototype.patch, YARN-2033_ALL.1.patch, YARN-2033_ALL.2.patch, > YARN-2033_ALL.3.patch, YARN-2033_ALL.4.patch > > > Having two different stores isn't amicable to generic insights on what's > happening with applications. This is to investigate porting generic-history > into the Timeline Store. > One goal is to try and retain most of the client side interfaces as close to > what we have today. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2033) Investigate merging generic-history into the Timeline Store
[ https://issues.apache.org/jira/browse/YARN-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2033: -- Attachment: YARN-2033.4.patch Rebase against the latest trunk, and fix some bugs > Investigate merging generic-history into the Timeline Store > --- > > Key: YARN-2033 > URL: https://issues.apache.org/jira/browse/YARN-2033 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: ProposalofStoringYARNMetricsintotheTimelineStore.pdf, > YARN-2033.1.patch, YARN-2033.2.patch, YARN-2033.3.patch, YARN-2033.4.patch, > YARN-2033.Prototype.patch, YARN-2033_ALL.1.patch, YARN-2033_ALL.2.patch, > YARN-2033_ALL.3.patch, YARN-2033_ALL.4.patch > > > Having two different stores isn't amicable to generic insights on what's > happening with applications. This is to investigate porting generic-history > into the Timeline Store. > One goal is to try and retain most of the client side interfaces as close to > what we have today. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2212) ApplicationMaster needs to find a way to update the AMRMToken periodically
[ https://issues.apache.org/jira/browse/YARN-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081090#comment-14081090 ] Hadoop QA commented on YARN-2212: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658944/YARN-2212.5.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4497//console This message is automatically generated. > ApplicationMaster needs to find a way to update the AMRMToken periodically > -- > > Key: YARN-2212 > URL: https://issues.apache.org/jira/browse/YARN-2212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-2212.1.patch, YARN-2212.2.patch, > YARN-2212.3.1.patch, YARN-2212.3.patch, YARN-2212.4.patch, YARN-2212.5.patch, > YARN-2212.5.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2212) ApplicationMaster needs to find a way to update the AMRMToken periodically
[ https://issues.apache.org/jira/browse/YARN-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081078#comment-14081078 ] Xuan Gong commented on YARN-2212: - submit the same patch > ApplicationMaster needs to find a way to update the AMRMToken periodically > -- > > Key: YARN-2212 > URL: https://issues.apache.org/jira/browse/YARN-2212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-2212.1.patch, YARN-2212.2.patch, > YARN-2212.3.1.patch, YARN-2212.3.patch, YARN-2212.4.patch, YARN-2212.5.patch, > YARN-2212.5.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2212) ApplicationMaster needs to find a way to update the AMRMToken periodically
[ https://issues.apache.org/jira/browse/YARN-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-2212: Attachment: YARN-2212.5.patch > ApplicationMaster needs to find a way to update the AMRMToken periodically > -- > > Key: YARN-2212 > URL: https://issues.apache.org/jira/browse/YARN-2212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-2212.1.patch, YARN-2212.2.patch, > YARN-2212.3.1.patch, YARN-2212.3.patch, YARN-2212.4.patch, YARN-2212.5.patch, > YARN-2212.5.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2304) Test*WebServices* fails intermittently
[ https://issues.apache.org/jira/browse/YARN-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081029#comment-14081029 ] Zhijie Shen commented on YARN-2304: --- It seems that the test failures don't happen any more. Shall we close the jira? > Test*WebServices* fails intermittently > -- > > Key: YARN-2304 > URL: https://issues.apache.org/jira/browse/YARN-2304 > Project: Hadoop YARN > Issue Type: Test >Reporter: Tsuyoshi OZAWA > Attachments: test-failure-log-RMWeb.txt > > > TestNMWebService, TestRMWebService, and TestAMWebService get failed with > address already get bind. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1994) Expose YARN/MR endpoints on multiple interfaces
[ https://issues.apache.org/jira/browse/YARN-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081025#comment-14081025 ] Craig Welch commented on YARN-1994: --- [~xgong] [~arpitagarwal] [~mipoto] patch .15 should be good to go - please take a look. This is the .11 patch Xuan and Arpit already +1ed with the following two changes: Milan's logic to support overriding the hostname in bind-host + service address cases added back it - factored slightly differently to insure it does not change behavior unless these have been configured, and moved to overloaded methods in Configuration where the base logic resides. The only other change was that I also moved the getSocketAddr to Configuration as well, I had wanted to do this originally to bring it closer to the original code - I didn't bother, but since I was making changes/retesting anyway, I went ahead and did it. The new tests were changed to match. [~mipoto], I successfully tested this with an "introduced hostname" which was not the "base hostname" of the box, and it worked as desired (this overrode the used name/connect address based on bind-host + address configuration to the "introduced hostname") > Expose YARN/MR endpoints on multiple interfaces > --- > > Key: YARN-1994 > URL: https://issues.apache.org/jira/browse/YARN-1994 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager, webapp >Affects Versions: 2.4.0 >Reporter: Arpit Agarwal >Assignee: Craig Welch > Attachments: YARN-1994.0.patch, YARN-1994.1.patch, > YARN-1994.11.patch, YARN-1994.11.patch, YARN-1994.12.patch, > YARN-1994.13.patch, YARN-1994.14.patch, YARN-1994.15.patch, > YARN-1994.2.patch, YARN-1994.3.patch, YARN-1994.4.patch, YARN-1994.5.patch, > YARN-1994.6.patch, YARN-1994.7.patch > > > YARN and MapReduce daemons currently do not support specifying a wildcard > address for the server endpoints. This prevents the endpoints from being > accessible from all interfaces on a multihomed machine. > Note that if we do specify INADDR_ANY for any of the options, it will break > clients as they will attempt to connect to 0.0.0.0. We need a solution that > allows specifying a hostname or IP-address for clients while requesting > wildcard bind for the servers. > (List of endpoints is in a comment below) -- This message was sent by Atlassian JIRA (v6.2#6252)
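Editor's note: the bind-host mechanism discussed above separates the address a server binds to from the address clients connect to. The sketch below illustrates that idea with a plain Map standing in for the configuration; it is not Hadoop's Configuration/getSocketAddr API from the patch, and the fallback behavior shown is an assumption.
{code}
import java.net.InetSocketAddress;
import java.util.Map;

class BindHostSketch {
  /** Address clients connect to: the advertised service address. */
  static InetSocketAddress connectAddress(Map<String, String> conf) {
    String[] hostPort = conf.get("yarn.resourcemanager.address").split(":");
    return new InetSocketAddress(hostPort[0], Integer.parseInt(hostPort[1]));
  }

  /** Address the server binds to: the bind-host (often 0.0.0.0) if set, else the connect host. */
  static InetSocketAddress bindAddress(Map<String, String> conf) {
    InetSocketAddress connect = connectAddress(conf);
    String bindHost = conf.get("yarn.resourcemanager.bind-host");
    if (bindHost == null || bindHost.isEmpty()) {
      return connect;
    }
    return new InetSocketAddress(bindHost, connect.getPort());
  }
}
{code}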
[jira] [Commented] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080961#comment-14080961 ] Hudson commented on YARN-2347: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1848 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1848/]) YARN-2347. Consolidated RMStateVersion and NMDBSchemaVersion into Version in yarn-server-common. Contributed by Junping Du. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1614838) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/Version.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl/pb * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl/pb/VersionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/records * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/proto/yarn_server_nodemanager_recovery.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/TestNMLeveldbStateStoreService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/RMStateVersion.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/impl/pb/RMStateVersionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > -
[jira] [Commented] (YARN-2198) Remove the need to run NodeManager as privileged account for Windows Secure Container Executor
[ https://issues.apache.org/jira/browse/YARN-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080941#comment-14080941 ] Vinod Kumar Vavilapalli commented on YARN-2198: --- Also, a nit: WintuilsProcessStubExecutor.assumeComplete -> assertComplete? > Remove the need to run NodeManager as privileged account for Windows Secure > Container Executor > -- > > Key: YARN-2198 > URL: https://issues.apache.org/jira/browse/YARN-2198 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Labels: security, windows > Attachments: YARN-2198.1.patch, YARN-2198.2.patch > > > YARN-1972 introduces a Secure Windows Container Executor. However this > executor requires a the process launching the container to be LocalSystem or > a member of the a local Administrators group. Since the process in question > is the NodeManager, the requirement translates to the entire NM to run as a > privileged account, a very large surface area to review and protect. > This proposal is to move the privileged operations into a dedicated NT > service. The NM can run as a low privilege account and communicate with the > privileged NT service when it needs to launch a container. This would reduce > the surface exposed to the high privileges. > There has to exist a secure, authenticated and authorized channel of > communication between the NM and the privileged NT service. Possible > alternatives are a new TCP endpoint, Java RPC etc. My proposal though would > be to use Windows LPC (Local Procedure Calls), which is a Windows platform > specific inter-process communication channel that satisfies all requirements > and is easy to deploy. The privileged NT service would register and listen on > an LPC port (NtCreatePort, NtListenPort). The NM would use JNI to interop > with libwinutils which would host the LPC client code. The client would > connect to the LPC port (NtConnectPort) and send a message requesting a > container launch (NtRequestWaitReplyPort). LPC provides authentication and > the privileged NT service can use authorization API (AuthZ) to validate the > caller. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2051) Fix code bug and add more unit tests for PBImpls
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080934#comment-14080934 ] Junping Du commented on YARN-2051: -- +1. Patch looks good to me. Will commit it tomorrow if no more feedback from others. > Fix code bug and add more unit tests for PBImpls > > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch, YARN-2051.v2.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2198) Remove the need to run NodeManager as privileged account for Windows Secure Container Executor
[ https://issues.apache.org/jira/browse/YARN-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080936#comment-14080936 ] Vinod Kumar Vavilapalli commented on YARN-2198: --- Skimmed through the Windows native code and the common changes, look fine overall. Hoping someone with Windows knowledge ([~ivanmi]?) look at the native code and someone else ([~cnauroth]?) at the common changes more carefully. Reviewed the patch with focus on the YARN changes. Some comments follow.. bq. With a helper service the nodemanager no longer gets a free lunch of accessing the task stdout/stderr The NM never explicitly reads the stdout/stderr from the container, the streams are redirected today to their own log files according as the user's code dictates (for e.g in linux bash -c "user-command.sh 1> stderr 2>stdout"). Do we need to do this in the WintuilsProcessStubExecutor ? The LinuxContainerExecutor reads the configuration from a container-executor.cfg. We may want to unify the configuration for the executors if in another JIRA. Rename hadoopwinutilsvc* interfaces, file-names, classes to be something like WindowsContainerLauncherService or similar to be explicit? Not sure to me from the patch as to how the service's port is configured. Is it at the start time or through some configuration? bq. 1. Service Access check. Sorry for repeating what you said but if I understand correctly, we need two things (1) restricting users who can launch the special service and (2) restricting callers who can invoke the RPCs. So, this is done by the combination of the OS doing the authentication and the authorization being explicitly done by the service using the allowed list. Right? > Remove the need to run NodeManager as privileged account for Windows Secure > Container Executor > -- > > Key: YARN-2198 > URL: https://issues.apache.org/jira/browse/YARN-2198 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Labels: security, windows > Attachments: YARN-2198.1.patch, YARN-2198.2.patch > > > YARN-1972 introduces a Secure Windows Container Executor. However this > executor requires a the process launching the container to be LocalSystem or > a member of the a local Administrators group. Since the process in question > is the NodeManager, the requirement translates to the entire NM to run as a > privileged account, a very large surface area to review and protect. > This proposal is to move the privileged operations into a dedicated NT > service. The NM can run as a low privilege account and communicate with the > privileged NT service when it needs to launch a container. This would reduce > the surface exposed to the high privileges. > There has to exist a secure, authenticated and authorized channel of > communication between the NM and the privileged NT service. Possible > alternatives are a new TCP endpoint, Java RPC etc. My proposal though would > be to use Windows LPC (Local Procedure Calls), which is a Windows platform > specific inter-process communication channel that satisfies all requirements > and is easy to deploy. The privileged NT service would register and listen on > an LPC port (NtCreatePort, NtListenPort). The NM would use JNI to interop > with libwinutils which would host the LPC client code. The client would > connect to the LPC port (NtConnectPort) and send a message requesting a > container launch (NtRequestWaitReplyPort). 
LPC provides authentication and > the privileged NT service can use authorization API (AuthZ) to validate the > caller. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2371) Wrong NMToken is issued when NM preserving restarts with containers running
[ https://issues.apache.org/jira/browse/YARN-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080931#comment-14080931 ] Junping Du commented on YARN-2371: -- Look at the code on trunk again. It looks like the check is already on appID rather than appAttemptID, so exception in description above shouldn't happen on latest trunk if only appAttemptID is different. [~zhiguohong], are you using trunk to have this exception or some previous released version? {code} if (!nmTokenIdentifier.getApplicationAttemptId().getApplicationId().equals( containerId.getApplicationAttemptId().getApplicationId())) { unauthorized = true; messageBuilder.append("\nNMToken for application attempt : ") .append(nmTokenIdentifier.getApplicationAttemptId()) .append(" was used for starting container with container token") .append(" issued for application attempt : ") .append(containerId.getApplicationAttemptId()); } {code} Though, the message should be improved to reflect applicationID but not attemptID. > Wrong NMToken is issued when NM preserving restarts with containers running > --- > > Key: YARN-2371 > URL: https://issues.apache.org/jira/browse/YARN-2371 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Hong Zhiguo >Assignee: Hong Zhiguo > Attachments: YARN-2371.patch > > > When application is submitted with > "ApplicationSubmissionContext.getKeepContainersAcrossApplicationAttempts() == > true", and NM is restarted with containers running, wrong NMToken is issued > to AM through RegisterApplicationMasterResponse. > See the NM log: > {code} > 2014-07-30 11:59:58,941 ERROR > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: > Unauthorized request to start container.- > NMToken for application attempt : appattempt_1406691610864_0002_01 was > used for starting container with container token issued for application > attempt : appattempt_1406691610864_0002_02 > {code} > The reason is in below code: > {code} > createAndGetNMToken(String applicationSubmitter, > ApplicationAttemptId appAttemptId, Container container) { > .. > Token token = > createNMToken(container.getId().getApplicationAttemptId(), > container.getNodeId(), applicationSubmitter); > .. > } > {code} > "appAttemptId" instead of "container.getId().getApplicationAttemptId()" > should be passed to "createNMToken". -- This message was sent by Atlassian JIRA (v6.2#6252)
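Editor's note: the issue description above pinpoints the fix as keying the NMToken by the attempt that is registering now rather than by the attempt id recorded on a container recovered across the NM restart. The following is a self-contained sketch of that distinction with simplified stand-in types; it is not the actual RM or NMTokenSecretManager code.
{code}
class NMTokenSketch {
  static String createNMToken(String attemptId, String nodeId, String user) {
    return "NMToken[" + attemptId + "@" + nodeId + " for " + user + "]";
  }

  static String createAndGetNMToken(String applicationSubmitter,
      String currentAttemptId, String recoveredContainerAttemptId, String nodeId) {
    // The buggy variant passed recoveredContainerAttemptId (the attempt the
    // recovered container was originally started under); the fix described in
    // the issue is to key the token by the attempt registering now.
    return createNMToken(currentAttemptId, nodeId, applicationSubmitter);
  }
}
{code}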
[jira] [Commented] (YARN-2371) Wrong NMToken is issued when NM preserving restarts with containers running
[ https://issues.apache.org/jira/browse/YARN-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080925#comment-14080925 ] Junping Du commented on YARN-2371: -- Nice finding, [~zhiguohong]! The fix here looks reasonable to me. It reminds me that we also have recently changes to replace checking appAttemptID with checking appID in authorizing NMToken for the similar reason. For unit test, I suggest to have a separated test method or at least separated code segment for your case with proper document on scenario of cases. > Wrong NMToken is issued when NM preserving restarts with containers running > --- > > Key: YARN-2371 > URL: https://issues.apache.org/jira/browse/YARN-2371 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Hong Zhiguo >Assignee: Hong Zhiguo > Attachments: YARN-2371.patch > > > When application is submitted with > "ApplicationSubmissionContext.getKeepContainersAcrossApplicationAttempts() == > true", and NM is restarted with containers running, wrong NMToken is issued > to AM through RegisterApplicationMasterResponse. > See the NM log: > {code} > 2014-07-30 11:59:58,941 ERROR > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: > Unauthorized request to start container.- > NMToken for application attempt : appattempt_1406691610864_0002_01 was > used for starting container with container token issued for application > attempt : appattempt_1406691610864_0002_02 > {code} > The reason is in below code: > {code} > createAndGetNMToken(String applicationSubmitter, > ApplicationAttemptId appAttemptId, Container container) { > .. > Token token = > createNMToken(container.getId().getApplicationAttemptId(), > container.getNodeId(), applicationSubmitter); > .. > } > {code} > "appAttemptId" instead of "container.getId().getApplicationAttemptId()" > should be passed to "createNMToken". -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080922#comment-14080922 ] Hudson commented on YARN-2347: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1823 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1823/]) YARN-2347. Consolidated RMStateVersion and NMDBSchemaVersion into Version in yarn-server-common. Contributed by Junping Du. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1614838) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/Version.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl/pb * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl/pb/VersionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/records * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/proto/yarn_server_nodemanager_recovery.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/TestNMLeveldbStateStoreService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/RMStateVersion.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/impl/pb/RMStateVersionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > ---
[jira] [Commented] (YARN-2374) YARN trunk build failing TestDistributedShell.testDSShell
[ https://issues.apache.org/jira/browse/YARN-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080914#comment-14080914 ] Naganarasimha G R commented on YARN-2374: - I don't know how much this might help; see this JVM bug: http://bugs.java.com/view_bug.do?bug_id=7166687. > YARN trunk build failing TestDistributedShell.testDSShell > - > > Key: YARN-2374 > URL: https://issues.apache.org/jira/browse/YARN-2374 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2374.0.patch > > > The YARN trunk build has been failing for the last few days in the > distributed shell module. > {noformat} > testDSShell(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) > Time elapsed: 27.269 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:188) > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (YARN-2283) RM failed to release the AM container
[ https://issues.apache.org/jira/browse/YARN-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-2283. -- Resolution: Duplicate Yes, it is very likely a duplicate of MAPREDUCE-5888, especially since it no longer reproduces on later releases. Resolving as a duplicate. The RM is not failing to release the container, rather the RM is intentionally giving the AM some time to clean things up after unregistering (i.e.: the FINISHING state). Unfortunately before MAPREDUCE-5888 was fixed the AM could hang during a failed job because of a non-daemon thread that was lingering around and preventing the JVM from shutting down. The RM eventually decides that the AM has used too much time to cleanup and kills it. > RM failed to release the AM container > - > > Key: YARN-2283 > URL: https://issues.apache.org/jira/browse/YARN-2283 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 > Environment: NM1: AM running > NM2: Map task running > mapreduce.map.maxattempts=1 >Reporter: Nishan Shetty >Priority: Critical > > During container stability test i faced this problem > While job is running map task got killed > Observe that eventhough application is FAILED MRAppMaster process is running > till timeout because RM did not release the AM container > {code} > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: > container_1405318134611_0002_01_05 Container Transitioned from RUNNING to > COMPLETED > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: > Completed container: container_1405318134611_0002_01_05 in state: > COMPLETED event:FINISHED > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=testos > OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS > APPID=application_1405318134611_0002 > CONTAINERID=container_1405318134611_0002_01_05 > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore: > Finish information of container container_1405318134611_0002_01_05 is > written > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter: > Stored the finish data of container container_1405318134611_0002_01_05 > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: > Released container container_1405318134611_0002_01_05 of capacity > on host HOST-10-18-40-153:45026, which currently has > 1 containers, used and > available, release resources=true > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > default used= numContainers=1 user=testos > user-resources= > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > completedContainer container=Container: [ContainerId: > container_1405318134611_0002_01_05, NodeId: HOST-10-18-40-153:45026, > NodeHttpAddress: HOST-10-18-40-153:45025, Resource: , > Priority: 5, Token: Token { kind: ContainerToken, service: 10.18.40.153:45026 > }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, > usedResources=, usedCapacity=0.25, > absoluteUsedCapacity=0.25, numApps=1, numContainers=1 cluster= vCores:8> > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: > 
completedContainer queue=root usedCapacity=0.25 absoluteUsedCapacity=0.25 > used= cluster= > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: > Re-sorting completed queue: root.default stats: default: capacity=1.0, > absoluteCapacity=1.0, usedResources=, > usedCapacity=0.25, absoluteUsedCapacity=0.25, numApps=1, numContainers=1 > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: > Application attempt appattempt_1405318134611_0002_01 released container > container_1405318134611_0002_01_05 on node: host: HOST-10-18-40-153:45026 > #containers=1 available=6144 used=2048 with event: FINISHED > 2014-07-14 14:43:34,924 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > Updating application attempt appattempt_1405318134611_0002_01 with final > state: FINISHING > 2014-07-14 14:43:34,924 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1405318134611_0002_01 State change from RUNNING to FINAL_SAVING > 2014-07-14 14:43:34,924 INFO > org.apache.hadoop.yarn
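Editor's note: the hang described in the resolution comment above is standard JVM behavior, illustrated below with a generic example (not MRAppMaster code): any live non-daemon thread keeps the process alive after main() returns, which is why a lingering pool thread left the AM running until the RM eventually killed it.
{code}
public class NonDaemonThreadHang {
  public static void main(String[] args) {
    Thread lingering = new Thread(() -> {
      try {
        Thread.sleep(60_000L);   // stands in for a pool thread that was never shut down
      } catch (InterruptedException ignored) {
      }
    });
    // lingering.setDaemon(true);  // with this line the JVM could exit right after main()
    lingering.start();
    System.out.println("main() is about to return, but the JVM stays up for the non-daemon thread");
  }
}
{code}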
[jira] [Commented] (YARN-2051) Fix code bug and add more unit tests for PBImpls
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080829#comment-14080829 ] Hadoop QA commented on YARN-2051: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658915/YARN-2051.v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4496//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4496//console This message is automatically generated. > Fix code bug and add more unit tests for PBImpls > > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch, YARN-2051.v2.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2374) YARN trunk build failing TestDistributedShell.testDSShell
[ https://issues.apache.org/jira/browse/YARN-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080805#comment-14080805 ] Varun Vasudev commented on YARN-2374: - >From Jenkins: {noformat} TestDistributedShell.testDSShell:193 Expected host name to start with 'asf905.gq1.ygridcore.net/67.195.81.149', was 'asf905/67.195.81.149'. Expected rpc port to be '-1', was '-1'. {noformat} It looks like the calls to NetUtils.getHostName() can return a short name or a fully qualified domain name. I'm not sure how to resolve this. The test code and the code in distributed shell app master call NetUtils.getHostName() and are getting different results. One solution could be to modify both the distributed shell app master and the test to use fully qualified domain names, but I'm open to suggestions. > YARN trunk build failing TestDistributedShell.testDSShell > - > > Key: YARN-2374 > URL: https://issues.apache.org/jira/browse/YARN-2374 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2374.0.patch > > > The YARN trunk build has been failing for the last few days in the > distributed shell module. > {noformat} > testDSShell(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) > Time elapsed: 27.269 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:188) > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
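As a hedged sketch of the mismatch (using plain JDK calls rather than NetUtils), the snippet below shows how the short and canonical host names can differ depending on resolver configuration, and why normalizing both sides to the fully qualified name makes the comparison stable; the example host names in the comments are taken from the Jenkins output above.
{code}
import java.net.InetAddress;

public class HostNameCheck {
  public static void main(String[] args) throws Exception {
    InetAddress local = InetAddress.getLocalHost();

    String shortName = local.getHostName();          // e.g. "asf905"
    String canonical = local.getCanonicalHostName(); // e.g. "asf905.gq1.ygridcore.net"

    System.out.println("short:     " + shortName);
    System.out.println("canonical: " + canonical);

    // Comparing a short name against an FQDN fails even though both refer to
    // the same host; comparing canonical names (or resolved addresses) is stable.
    System.out.println("equal as strings? " + shortName.equals(canonical));
    System.out.println("same address?     " + InetAddress.getByName(shortName)
        .equals(InetAddress.getByName(canonical)));
  }
}
{code}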
[jira] [Commented] (YARN-2283) RM failed to release the AM container
[ https://issues.apache.org/jira/browse/YARN-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080804#comment-14080804 ] Sunil G commented on YARN-2283: --- Seems to be duplicate to MAPREDUCE-5888 [~jlowe] cud u pls confirm whether its the same issue. > RM failed to release the AM container > - > > Key: YARN-2283 > URL: https://issues.apache.org/jira/browse/YARN-2283 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 > Environment: NM1: AM running > NM2: Map task running > mapreduce.map.maxattempts=1 >Reporter: Nishan Shetty >Priority: Critical > > During container stability test i faced this problem > While job is running map task got killed > Observe that eventhough application is FAILED MRAppMaster process is running > till timeout because RM did not release the AM container > {code} > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: > container_1405318134611_0002_01_05 Container Transitioned from RUNNING to > COMPLETED > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: > Completed container: container_1405318134611_0002_01_05 in state: > COMPLETED event:FINISHED > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=testos > OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS > APPID=application_1405318134611_0002 > CONTAINERID=container_1405318134611_0002_01_05 > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore: > Finish information of container container_1405318134611_0002_01_05 is > written > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter: > Stored the finish data of container container_1405318134611_0002_01_05 > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: > Released container container_1405318134611_0002_01_05 of capacity > on host HOST-10-18-40-153:45026, which currently has > 1 containers, used and > available, release resources=true > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > default used= numContainers=1 user=testos > user-resources= > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > completedContainer container=Container: [ContainerId: > container_1405318134611_0002_01_05, NodeId: HOST-10-18-40-153:45026, > NodeHttpAddress: HOST-10-18-40-153:45025, Resource: , > Priority: 5, Token: Token { kind: ContainerToken, service: 10.18.40.153:45026 > }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, > usedResources=, usedCapacity=0.25, > absoluteUsedCapacity=0.25, numApps=1, numContainers=1 cluster= vCores:8> > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: > completedContainer queue=root usedCapacity=0.25 absoluteUsedCapacity=0.25 > used= cluster= > 2014-07-14 14:43:33,899 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: > Re-sorting completed queue: root.default stats: default: capacity=1.0, > absoluteCapacity=1.0, usedResources=, > usedCapacity=0.25, absoluteUsedCapacity=0.25, numApps=1, numContainers=1 > 2014-07-14 14:43:33,899 INFO > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: > Application attempt appattempt_1405318134611_0002_01 released container > container_1405318134611_0002_01_05 on node: host: HOST-10-18-40-153:45026 > #containers=1 available=6144 used=2048 with event: FINISHED > 2014-07-14 14:43:34,924 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > Updating application attempt appattempt_1405318134611_0002_01 with final > state: FINISHING > 2014-07-14 14:43:34,924 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1405318134611_0002_01 State change from RUNNING to FINAL_SAVING > 2014-07-14 14:43:34,924 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating > application application_1405318134611_0002 with final state: FINISHING > 2014-07-14 14:43:34,947 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > Watcher event type: NodeDataChanged with state:SyncConnected for > path:/rmstore/ZKRMStateRoot/RMAppRoot/application_1405318134611_0002/appattempt_1405318134611_0002_01 > for Service > org.apache.hadoop.yarn.server.resourcemanager.rec
[jira] [Updated] (YARN-2051) Fix code bug and add more unit tests for PBImpls
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated YARN-2051: Attachment: YARN-2051.v2.patch Thanks for the review and comments Junping, I updated the patch addressing your comments. > Fix code bug and add more unit tests for PBImpls > > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch, YARN-2051.v2.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2051) Fix code bug and add more unit tests for PBImpls
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080780#comment-14080780 ] Junping Du commented on YARN-2051: -- Again, good work, [~decster]! Some comments below, most of them are trivial: {code} +System.out.printf("Validate %s %s\n", recordClass.getName(), +protoClass.getName()); {code} Please replace this, and other places that print to the console, with LOG. {code} +ret = Sets.newHashSet(genTypeValue(params[0])); {code} Please remove the unnecessary space at the end of this line. {code} throw new IllegalArgumentException("type not support: " + type); {code} Maybe "type: " + type + " is not supported" would be more readable? {code} + private static Object genByNewInstance(Class clazz) throws Exception { {code} generateNewInstance() sounds like a better name? {code} ret = newInstance.invoke(null, args); {code} The code here risks an NPE if the newInstance method was not found earlier (which is possible, as the newInstance() method is not mandatory, although most classes follow this convention). Better to add some exception handling here. {code} + } else if (clazz.equals(ByteBuffer.class)) { +// return new ByteBuffer every time +// to prevent potential side effects +return ByteBuffer.allocate(4); + } {code} What's a reasonable value to generate here for ByteBuffer? Just an empty one, isn't it? > Fix code bug and add more unit tests for PBImpls > > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
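For reference, a hedged sketch (not the actual patch) of two of the review points above: logging through a LOG instance instead of printing to the console, and guarding the reflective newInstance() lookup so that a record class without the factory method fails with a clear message rather than an NPE. The class and method names here are illustrative.
{code}
import java.lang.reflect.Method;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class PBRecordReflectionSketch {
  private static final Log LOG = LogFactory.getLog(PBRecordReflectionSketch.class);

  static Object generateByNewInstance(Class<?> clazz, Object... args) throws Exception {
    LOG.info("Validating record class " + clazz.getName());

    // Look for a static newInstance(...) factory with a matching arity.
    Method newInstance = null;
    for (Method m : clazz.getMethods()) {
      if (m.getName().equals("newInstance")
          && m.getParameterTypes().length == args.length) {
        newInstance = m;
        break;
      }
    }
    if (newInstance == null) {
      // Without this guard, invoking through a null Method reference below
      // would throw a NullPointerException.
      throw new IllegalArgumentException(
          clazz.getName() + " has no matching newInstance() factory method");
    }
    return newInstance.invoke(null, args);
  }
}
{code}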
[jira] [Commented] (YARN-2051) Fix code bug and add more unit tests for PBImpls
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080750#comment-14080750 ] Hadoop QA commented on YARN-2051: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12655677/YARN-2051.v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4495//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4495//console This message is automatically generated. > Fix code bug and add more unit tests for PBImpls > > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2372) There are Chinese Characters in the FairScheduler's document
[ https://issues.apache.org/jira/browse/YARN-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080735#comment-14080735 ] Junping Du commented on YARN-2372: -- Nice catch, [~azuryy]! Actually, these are special punctuation characters from Chinese input, which are hard to spot. +1 on the patch. [~azuryy], any more places with the same issue? If not, I will commit it shortly. > There are Chinese Characters in the FairScheduler's document > > > Key: YARN-2372 > URL: https://issues.apache.org/jira/browse/YARN-2372 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.1 >Reporter: Fengdong Yu >Assignee: Fengdong Yu >Priority: Minor > Attachments: YARN-2372.patch, YARN-2372.patch, YARN-2372.patch, > YARN-2372.patch, YARN-2372.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
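To answer the "any more places" question systematically, a small hypothetical helper (not part of the patch) like the one below could scan a documentation source file and report every line containing non-ASCII characters, which is how full-width Chinese punctuation would show up; the file to check is passed as the first argument.
{code}
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class NonAsciiScanner {
  public static void main(String[] args) throws Exception {
    List<String> lines = Files.readAllLines(Paths.get(args[0]));
    for (int i = 0; i < lines.size(); i++) {
      String line = lines.get(i);
      // Report any line with a character outside the 7-bit ASCII range.
      if (!line.chars().allMatch(c -> c < 128)) {
        System.out.println((i + 1) + ": " + line);
      }
    }
  }
}
{code}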
[jira] [Commented] (YARN-2051) Fix code bug and add more unit tests for PBImpls
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080727#comment-14080727 ] Junping Du commented on YARN-2051: -- Not sure if patch is still updated, manually kick off Jenkins test again. > Fix code bug and add more unit tests for PBImpls > > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080722#comment-14080722 ] Hudson commented on YARN-2347: -- FAILURE: Integrated in Hadoop-Yarn-trunk #629 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/629/]) YARN-2347. Consolidated RMStateVersion and NMDBSchemaVersion into Version in yarn-server-common. Contributed by Junping Du. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1614838) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/Version.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl/pb * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl/pb/VersionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/records * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/proto/yarn_server_nodemanager_recovery.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/TestNMLeveldbStateStoreService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/RMStateVersion.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/impl/pb/RMStateVersionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > -
[jira] [Commented] (YARN-2051) Fix code bug and add more unit tests for PBImpls
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080719#comment-14080719 ] Junping Du commented on YARN-2051: -- Forgot to mention: +1 on the idea of testing these PB objects automatically. I love it so much! :) > Fix code bug and add more unit tests for PBImpls > > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
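A hedged illustration of the idea being praised here: every record is exercised by a generic "serialize, deserialize, compare" round trip rather than a hand-written test per PBImpl. The functional style below is only for the sketch; the actual patch drives the round trip via reflection over the YARN record classes.
{code}
import static org.junit.Assert.assertEquals;
import java.util.function.Function;

public final class RoundTripCheck {
  // Serialize a record to its wire form, parse it back, and assert nothing
  // was lost or changed. Relies on a meaningful equals() on the record type.
  public static <R, W> void assertRoundTrip(
      R original, Function<R, W> serialize, Function<W, R> deserialize) {
    W wireForm = serialize.apply(original);
    R restored = deserialize.apply(wireForm);
    assertEquals("information lost or changed in serialization round trip",
        original, restored);
  }
}
{code}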
[jira] [Commented] (YARN-2051) Fix code bug and add more unit tests for PBImpls
[ https://issues.apache.org/jira/browse/YARN-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080716#comment-14080716 ] Junping Du commented on YARN-2051: -- Hi [~decster], thanks for working on this. I will review your patch ASAP. > Fix code bug and add more unit tests for PBImpls > > > Key: YARN-2051 > URL: https://issues.apache.org/jira/browse/YARN-2051 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Binglin Chang >Priority: Critical > Attachments: YARN-2051.v1.patch > > > From YARN-2016, we can see some bug could exist in PB implementation of > protocol. The bad news is most of these PBImpl don't have any unit test to > verify the info is not lost or changed after serialization/deserialization. > We should add more tests for it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2372) There are Chinese Characters in the FairScheduler's document
[ https://issues.apache.org/jira/browse/YARN-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080689#comment-14080689 ] Hadoop QA commented on YARN-2372: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658877/YARN-2372.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4494//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4494//console This message is automatically generated. > There are Chinese Characters in the FairScheduler's document > > > Key: YARN-2372 > URL: https://issues.apache.org/jira/browse/YARN-2372 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.1 >Reporter: Fengdong Yu >Assignee: Fengdong Yu >Priority: Minor > Attachments: YARN-2372.patch, YARN-2372.patch, YARN-2372.patch, > YARN-2372.patch, YARN-2372.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080680#comment-14080680 ] Hudson commented on YARN-2347: -- FAILURE: Integrated in Hadoop-trunk-Commit #5991 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5991/]) YARN-2347. Consolidated RMStateVersion and NMDBSchemaVersion into Version in yarn-server-common. Contributed by Junping Du. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1614838) * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/Version.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl/pb * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/records/impl/pb/VersionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/records * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/proto/yarn_server_nodemanager_recovery.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/TestNMLeveldbStateStoreService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java 
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/RMStateVersion.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/impl/pb/RMStateVersionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > ---
[jira] [Updated] (YARN-2372) There are Chinese Characters in the FairScheduler's document
[ https://issues.apache.org/jira/browse/YARN-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated YARN-2372: -- Attachment: YARN-2372.patch > There are Chinese Characters in the FairScheduler's document > > > Key: YARN-2372 > URL: https://issues.apache.org/jira/browse/YARN-2372 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.1 >Reporter: Fengdong Yu >Assignee: Fengdong Yu >Priority: Minor > Attachments: YARN-2372.patch, YARN-2372.patch, YARN-2372.patch, > YARN-2372.patch, YARN-2372.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080669#comment-14080669 ] Hadoop QA commented on YARN-2347: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658862/YARN-2347-v6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4492//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4492//console This message is automatically generated. > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > > > Key: YARN-2347 > URL: https://issues.apache.org/jira/browse/YARN-2347 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-2347-v2.patch, YARN-2347-v3.patch, > YARN-2347-v4.patch, YARN-2347-v5.patch, YARN-2347-v6.patch, YARN-2347.patch > > > We have similar things for version state for RM, NM, TS (TimelineServer), > etc. I think we should consolidate them into a common object. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2212) ApplicationMaster needs to find a way to update the AMRMToken periodically
[ https://issues.apache.org/jira/browse/YARN-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080665#comment-14080665 ] Hadoop QA commented on YARN-2212: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658859/YARN-2212.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebApp org.apache.hadoop.yarn.client.TestResourceTrackerOnHA org.apache.hadoop.yarn.client.api.impl.TestAMRMClientOnAMRMTokenRollOver org.apache.hadoop.yarn.client.TestApplicationMasterServiceOnHA org.apache.hadoop.yarn.client.TestRMFailover org.apache.hadoop.yarn.client.api.impl.TestAMRMClient org.apache.hadoop.yarn.client.api.impl.TestNMClient org.apache.hadoop.yarn.client.TestGetGroups org.apache.hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA org.apache.hadoop.yarn.client.api.impl.TestYarnClient org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueACLs org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerQueueACLs org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService org.apache.hadoop.yarn.server.resourcemanager.TestRMHA org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4491//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4491//console This message is automatically generated. > ApplicationMaster needs to find a way to update the AMRMToken periodically > -- > > Key: YARN-2212 > URL: https://issues.apache.org/jira/browse/YARN-2212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-2212.1.patch, YARN-2212.2.patch, > YARN-2212.3.1.patch, YARN-2212.3.patch, YARN-2212.4.patch, YARN-2212.5.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1572) Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal
[ https://issues.apache.org/jira/browse/YARN-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenwu Peng updated YARN-1572: - Description: we have lower chance to hit NPE in allocateNodeLocal when run benchmark(hit 4 in 20 times). 2014-07-31 04:18:19,653 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1406794589275_0001_01_21 of capacity on host datanode10:57281, which has 6 containers, used and available after allocation 2014-07-31 04:18:19,654 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:311) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:268) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:136) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:683) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:602) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:560) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:488) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:729) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:774) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:101) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:599) at java.lang.Thread.run(Thread.java:662) 2014-07-31 04:18:19,655 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye.. was: we have lower chance to hit NPE in allocateNodeLocal when run benchmark(hit 4 in 20 times). Steps: 1. setup hadoop 2.2.0 environment 2. 
Run for i in {1..10}; do /hadoop/hadoop-smoke/bin/hadoop jar /hadoop/hadoop-smoke/share/hadoop/mapreduce/hadoop-mapreduce-client-common-*.jar org.apache.hadoop.fs.TestDFSIO -write -nrFiles 30 -fileSize 64MB; sleep 10;done 2014-01-08 03:56:14,082 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:291) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:252) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:294) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:614) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:524) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:482) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:419) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:658) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:687) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:95) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440) at java.lang.Thread.run(Thread.java:662) will attach log and configure files later Note: My topology file: 10.111.89.230 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com 10.111.89.231 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com 10.111.89.232 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com 10.111.89.239 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com 10.111.89.233 /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com 10.111.89.234 /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com 10.111.89.240 /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com 10.111.89.236 /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com 10.111.89.241 /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com 10.111.89.238 /QE2/sin2-pekaurora-bdcqe048.en
[jira] [Commented] (YARN-2374) YARN trunk build failing TestDistributedShell.testDSShell
[ https://issues.apache.org/jira/browse/YARN-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080655#comment-14080655 ] Hadoop QA commented on YARN-2374: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658863/apache-yarn-2374.0.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4493//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4493//console This message is automatically generated. > YARN trunk build failing TestDistributedShell.testDSShell > - > > Key: YARN-2374 > URL: https://issues.apache.org/jira/browse/YARN-2374 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2374.0.patch > > > The YARN trunk build has been failing for the last few days in the > distributed shell module. > {noformat} > testDSShell(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) > Time elapsed: 27.269 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:188) > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1572) Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal
[ https://issues.apache.org/jira/browse/YARN-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenwu Peng updated YARN-1572: - Attachment: YARN-1572-log.tar.gz Thanks a lot Junping! please refer to YARN-1572-log.tar.gz for the log of NPE for latest trunk. java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:311) > Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal > -- > > Key: YARN-1572 > URL: https://issues.apache.org/jira/browse/YARN-1572 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.2.0 >Reporter: Wenwu Peng >Assignee: Wenwu Peng > Attachments: YARN-1572-log.tar.gz, conf.tar.gz, log.tar.gz > > > we have lower chance to hit NPE in allocateNodeLocal when run benchmark(hit > 4 in 20 times). > Steps: > 1. setup hadoop 2.2.0 environment > 2. Run for i in {1..10}; do /hadoop/hadoop-smoke/bin/hadoop jar > /hadoop/hadoop-smoke/share/hadoop/mapreduce/hadoop-mapreduce-client-common-*.jar > org.apache.hadoop.fs.TestDFSIO -write -nrFiles 30 -fileSize 64MB; sleep > 10;done > 2014-01-08 03:56:14,082 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:291) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:252) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:294) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:614) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:524) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:482) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:419) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:658) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:687) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:95) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440) > at java.lang.Thread.run(Thread.java:662) > will attach log and configure files later > Note: > My topology file: > 10.111.89.230 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com > 10.111.89.231 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com > 10.111.89.232 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com > 10.111.89.239 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com > 10.111.89.233 /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com > 10.111.89.234 /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com > 10.111.89.240 /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com > 10.111.89.236 /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com > 10.111.89.241 /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com > 10.111.89.238 /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com > 10.111.89.242 /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2374) YARN trunk build failing TestDistributedShell.testDSShell
[ https://issues.apache.org/jira/browse/YARN-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-2374: Attachment: apache-yarn-2374.0.patch Patch with debug information added to figure out root cause. > YARN trunk build failing TestDistributedShell.testDSShell > - > > Key: YARN-2374 > URL: https://issues.apache.org/jira/browse/YARN-2374 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2374.0.patch > > > The YARN trunk build has been failing for the last few days in the > distributed shell module. > {noformat} > testDSShell(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) > Time elapsed: 27.269 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:188) > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-2347: - Attachment: YARN-2347-v6.patch Address latest comments from [~zjshen] in v6 patch. > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > > > Key: YARN-2347 > URL: https://issues.apache.org/jira/browse/YARN-2347 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-2347-v2.patch, YARN-2347-v3.patch, > YARN-2347-v4.patch, YARN-2347-v5.patch, YARN-2347-v6.patch, YARN-2347.patch > > > We have similar things for version state for RM, NM, TS (TimelineServer), > etc. I think we should consolidate them into a common object. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080613#comment-14080613 ] Junping Du commented on YARN-2347: -- Sounds good. Will upload a new patch soon. Thx! > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > > > Key: YARN-2347 > URL: https://issues.apache.org/jira/browse/YARN-2347 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-2347-v2.patch, YARN-2347-v3.patch, > YARN-2347-v4.patch, YARN-2347-v5.patch, YARN-2347.patch > > > We have similar things for version state for RM, NM, TS (TimelineServer), > etc. I think we should consolidate them into a common object. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080610#comment-14080610 ] Zhijie Shen commented on YARN-2347: --- Makes sense. As MR has already used Version, should we at least mark Version as \@LimitedPrivate(\{"YARN", "MAPREDUCE"\})? > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > > > Key: YARN-2347 > URL: https://issues.apache.org/jira/browse/YARN-2347 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-2347-v2.patch, YARN-2347-v3.patch, > YARN-2347-v4.patch, YARN-2347-v5.patch, YARN-2347.patch > > > We have similar things for version state for RM, NM, TS (TimelineServer), > etc. I think we should consolidate them into a common object. -- This message was sent by Atlassian JIRA (v6.2#6252)
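For reference, a hedged sketch of the annotation being suggested: the shared Version record marked as visible to YARN and MapReduce only, using Hadoop's InterfaceAudience/InterfaceStability annotations. The getters shown are illustrative placeholders, and the choice of stability tag is a separate judgment call.
{code}
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

@InterfaceAudience.LimitedPrivate({"YARN", "MAPREDUCE"})
@InterfaceStability.Unstable
public abstract class Version {
  // Illustrative placeholders; the real record exposes its fields through
  // the protobuf-backed PBImpl rather than this exact shape.
  public abstract long getMajorVersion();

  public abstract long getMinorVersion();
}
{code}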
[jira] [Updated] (YARN-2212) ApplicationMaster needs to find a way to update the AMRMToken periodically
[ https://issues.apache.org/jira/browse/YARN-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-2212: Attachment: YARN-2212.5.patch > ApplicationMaster needs to find a way to update the AMRMToken periodically > -- > > Key: YARN-2212 > URL: https://issues.apache.org/jira/browse/YARN-2212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-2212.1.patch, YARN-2212.2.patch, > YARN-2212.3.1.patch, YARN-2212.3.patch, YARN-2212.4.patch, YARN-2212.5.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (YARN-1572) Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal
[ https://issues.apache.org/jira/browse/YARN-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reassigned YARN-1572: Assignee: Wenwu Peng (was: Junping Du) [~gujilangzi], are you working on this? If so, assign this JIRA to you. Please attach the log of NPE for latest trunk, I will also help to look at it. Thx! > Low chance to hit NPE issue in AppSchedulingInfo#allocateNodeLocal > -- > > Key: YARN-1572 > URL: https://issues.apache.org/jira/browse/YARN-1572 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.2.0 >Reporter: Wenwu Peng >Assignee: Wenwu Peng > Attachments: conf.tar.gz, log.tar.gz > > > we have lower chance to hit NPE in allocateNodeLocal when run benchmark(hit > 4 in 20 times). > Steps: > 1. setup hadoop 2.2.0 environment > 2. Run for i in {1..10}; do /hadoop/hadoop-smoke/bin/hadoop jar > /hadoop/hadoop-smoke/share/hadoop/mapreduce/hadoop-mapreduce-client-common-*.jar > org.apache.hadoop.fs.TestDFSIO -write -nrFiles 30 -fileSize 64MB; sleep > 10;done > 2014-01-08 03:56:14,082 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:291) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:252) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:294) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:614) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:524) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:482) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:419) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:658) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:687) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:95) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440) > at java.lang.Thread.run(Thread.java:662) > will attach log and configure files later > Note: > My topology file: > 10.111.89.230 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com > 10.111.89.231 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com > 10.111.89.232 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com > 10.111.89.239 /QE1/sin2-pekaurora-bdcqe046.eng.vmware.com > 10.111.89.233 /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com > 10.111.89.234 /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com > 10.111.89.240 /QE1/sin2-pekaurora-bdcqe017.eng.vmware.com > 10.111.89.236 /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com > 10.111.89.241 /QE2/sin2-pekaurora-bdcqe047.eng.vmware.com > 10.111.89.238 /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com > 10.111.89.242 /QE2/sin2-pekaurora-bdcqe048.eng.vmware.com -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080602#comment-14080602 ] Junping Du commented on YARN-2347: -- Thanks for the review and comments, [~zjshen]! That's a good point, and I agree it could be used by other applications in the future. However, until a real requirement comes in (applications don't have to follow YARN's versioning practice), let's play it safe and keep it private, as it is mostly used among YARN and built-in MR components. We can easily make a private API public in the future, but taking a public API back to private (or changing its interfaces) should never happen. So, IMO, it is better to keep it private for now. We can open a separate JIRA (and work) to discuss further if you feel strongly about making it public. Thoughts? > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > > > Key: YARN-2347 > URL: https://issues.apache.org/jira/browse/YARN-2347 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-2347-v2.patch, YARN-2347-v3.patch, > YARN-2347-v4.patch, YARN-2347-v5.patch, YARN-2347.patch > > > We have similar things for version state for RM, NM, TS (TimelineServer), > etc. I think we should consolidate them into a common object. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2212) ApplicationMaster needs to find a way to update the AMRMToken periodically
[ https://issues.apache.org/jira/browse/YARN-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080598#comment-14080598 ] Xuan Gong commented on YARN-2212: - bq. AMS#registerApplicationMaster changes not needed. Make changes on authorizeRequest() to let it return AMRMTokenIdentifier instead of RMAppAttemptId. So, all the changes in AMS#registerApplicationMaster are for this. bq. May not say stable now. {code} @Stable public abstract Token getAMRMToken(); {code} DONE bq. ApplicationReport#getAMRMToken for unmanaged AM needs to be updated as well. When the AMRMToken is rolled over, we update the AMRMToken for the current attempt, so ApplicationReport#getAMRMToken will be updated as well. bq. we can move the AMRMToken creation from RMAppAttemptImpl to AMLauncher? DONE bq. Use newInstance instead. DONE bq. Test AMRMClient automatically takes care of the new AMRMToken transfer. ADDED bq. Please run on real cluster also and set roll-over interval to a small value to make sure it actually works. Tested. > ApplicationMaster needs to find a way to update the AMRMToken periodically > -- > > Key: YARN-2212 > URL: https://issues.apache.org/jira/browse/YARN-2212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-2212.1.patch, YARN-2212.2.patch, > YARN-2212.3.1.patch, YARN-2212.3.patch, YARN-2212.4.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
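As a hedged sketch of what "the AM picks up the rolled-over AMRMToken" can look like on the client side (the exact wiring in the patch may differ), the new token received from the RM is converted to a security token and added to the current user's credentials so that later allocate() calls authenticate with it. The class and method names of the sketch itself are invented; only the UserGroupInformation and ConverterUtils calls are standard Hadoop APIs.
{code}
import java.net.InetSocketAddress;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.yarn.security.AMRMTokenIdentifier;
import org.apache.hadoop.yarn.util.ConverterUtils;

public final class AMRMTokenUpdateSketch {
  // Convert the YARN-level token from the RM into a security token bound to
  // the scheduler address and add it to the current UGI, so the RPC layer
  // picks up the fresh token on subsequent calls.
  public static void updateToken(org.apache.hadoop.yarn.api.records.Token newAmrmToken,
                                 InetSocketAddress schedulerAddress) throws Exception {
    Token<AMRMTokenIdentifier> securityToken =
        ConverterUtils.convertFromYarn(newAmrmToken, schedulerAddress);
    UserGroupInformation.getCurrentUser().addToken(securityToken);
  }
}
{code}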
[jira] [Commented] (YARN-1149) NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING
[ https://issues.apache.org/jira/browse/YARN-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080591#comment-14080591 ] duanfa commented on YARN-1149: -- I want to ask: which Hadoop version is this change based on, Hadoop 2.0.x or Hadoop 2.1.x? Please send the answer to my email duanfa1...@gmail.com. Thanks! > NM throws InvalidStateTransitonException: Invalid event: > APPLICATION_LOG_HANDLING_FINISHED at RUNNING > - > > Key: YARN-1149 > URL: https://issues.apache.org/jira/browse/YARN-1149 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Ramya Sunil >Assignee: Xuan Gong > Fix For: 2.2.0 > > Attachments: YARN-1149.1.patch, YARN-1149.2.patch, YARN-1149.3.patch, > YARN-1149.4.patch, YARN-1149.5.patch, YARN-1149.6.patch, YARN-1149.7.patch, > YARN-1149.8.patch, YARN-1149.9.patch, YARN-1149_branch-2.1-beta.1.patch > > > When the nodemanager receives a kill signal after an application has finished > execution but before log aggregation has kicked in, an > InvalidStateTransitonException: Invalid event: > APPLICATION_LOG_HANDLING_FINISHED at RUNNING is thrown > {noformat} > 2013-08-25 20:45:00,875 INFO logaggregation.AppLogAggregatorImpl > (AppLogAggregatorImpl.java:finishLogAggregation(254)) - Application just > finished : application_1377459190746_0118 > 2013-08-25 20:45:00,876 INFO logaggregation.AppLogAggregatorImpl > (AppLogAggregatorImpl.java:uploadLogsForContainer(105)) - Starting aggregate > log-file for app application_1377459190746_0118 at > /app-logs/foo/logs/application_1377459190746_0118/_45454.tmp > 2013-08-25 20:45:00,876 INFO logaggregation.LogAggregationService > (LogAggregationService.java:stopAggregators(151)) - Waiting for aggregation > to complete for application_1377459190746_0118 > 2013-08-25 20:45:00,891 INFO logaggregation.AppLogAggregatorImpl > (AppLogAggregatorImpl.java:uploadLogsForContainer(122)) - Uploading logs for > container container_1377459190746_0118_01_04.
Current good log dirs are > /tmp/yarn/local > 2013-08-25 20:45:00,915 INFO logaggregation.AppLogAggregatorImpl > (AppLogAggregatorImpl.java:doAppLogAggregation(182)) - Finished aggregate > log-file for app application_1377459190746_0118 > 2013-08-25 20:45:00,925 WARN application.Application > (ApplicationImpl.java:handle(427)) - Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > APPLICATION_LOG_HANDLING_FINISHED at RUNNING > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:425) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:59) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:697) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:689) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81) > at java.lang.Thread.run(Thread.java:662) > 2013-08-25 20:45:00,926 INFO application.Application > (ApplicationImpl.java:handle(430)) - Application > application_1377459190746_0118 transitioned from RUNNING to null > 2013-08-25 20:45:00,927 WARN monitor.ContainersMonitorImpl > (ContainersMonitorImpl.java:run(463)) - > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl > is interrupted. Exiting. > 2013-08-25 20:45:00,938 INFO ipc.Server (Server.java:stop(2437)) - Stopping > server on 8040 > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
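To make the failure mode above concrete, here is a small, self-contained sketch using the same StateMachineFactory utility from yarn-common. It is illustrative only, with made-up enum names; the actual YARN-1149 fix in ApplicationImpl may handle the event with a dedicated transition class rather than a plain self-transition.
{code}
import org.apache.hadoop.yarn.state.StateMachine;
import org.apache.hadoop.yarn.state.StateMachineFactory;

public class LogHandlingTransitionDemo {
  enum AppState { RUNNING, FINISHED }
  enum AppEvent { APPLICATION_LOG_HANDLING_FINISHED, FINISH }

  // Because APPLICATION_LOG_HANDLING_FINISHED is registered as a RUNNING ->
  // RUNNING self-transition, dispatching it late no longer triggers
  // InvalidStateTransitonException.
  private static final StateMachineFactory<LogHandlingTransitionDemo, AppState, AppEvent, AppEvent>
      FACTORY =
        new StateMachineFactory<LogHandlingTransitionDemo, AppState, AppEvent, AppEvent>(AppState.RUNNING)
          .addTransition(AppState.RUNNING, AppState.RUNNING,
              AppEvent.APPLICATION_LOG_HANDLING_FINISHED)
          .addTransition(AppState.RUNNING, AppState.FINISHED, AppEvent.FINISH)
          .installTopology();

  public static void main(String[] args) {
    StateMachine<AppState, AppEvent, AppEvent> stateMachine =
        FACTORY.make(new LogHandlingTransitionDemo());
    // Would have thrown InvalidStateTransitonException without the transition.
    stateMachine.doTransition(AppEvent.APPLICATION_LOG_HANDLING_FINISHED, null);
    System.out.println("State after late event: " + stateMachine.getCurrentState());
  }
}
{code}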
[jira] [Commented] (YARN-2347) Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080587#comment-14080587 ] Zhijie Shen commented on YARN-2347: --- Sorry for raising another issue so late. When I tried to commit the patch, I realized that ShuffleHandler in the MR project has a reference to Version. Given that, {code} @Private @Unstable public abstract class Version { {code} the \@Private annotation does not seem accurate. Moreover, other applications may implement their own AuxiliaryService as well, right? In that case, their AuxiliaryService is likely to use Version the way ShuffleHandler does. Therefore, should Version be \@Public instead, and be part of o.a.h.y.api.records in hadoop-yarn-api? > Consolidate RMStateVersion and NMDBSchemaVersion into StateVersion in > yarn-server-common > > > Key: YARN-2347 > URL: https://issues.apache.org/jira/browse/YARN-2347 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-2347-v2.patch, YARN-2347-v3.patch, > YARN-2347-v4.patch, YARN-2347-v5.patch, YARN-2347.patch > > > We have similar version-state classes for the RM, NM, TS (TimelineServer), > etc. I think we should consolidate them into a common object. -- This message was sent by Atlassian JIRA (v6.2#6252)
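For illustration, here is roughly what the relocation suggested above could look like. This is a hypothetical sketch of the proposal, not the committed code; the newInstance/accessor shape is assumed from the server-side Version record in the patches.
{code}
package org.apache.hadoop.yarn.api.records;

import org.apache.hadoop.classification.InterfaceAudience.Public;
import org.apache.hadoop.classification.InterfaceStability.Evolving;
import org.apache.hadoop.yarn.util.Records;

// Hypothetical: Version opened up so third-party AuxiliaryService
// implementations (like ShuffleHandler) can depend on a public API.
@Public
@Evolving
public abstract class Version {

  public static Version newInstance(int majorVersion, int minorVersion) {
    Version version = Records.newRecord(Version.class);
    version.setMajorVersion(majorVersion);
    version.setMinorVersion(minorVersion);
    return version;
  }

  public abstract int getMajorVersion();
  public abstract void setMajorVersion(int majorVersion);

  public abstract int getMinorVersion();
  public abstract void setMinorVersion(int minorVersion);
}
{code}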