[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers
[ https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644327#comment-13644327 ]

Bikas Saha commented on YARN-45:
--------------------------------

My understanding is that the containers presented in the PreemptionMessage are going to be preempted by the RM some time in the near future if the RM cannot find free resources elsewhere. The AMs are not supposed to preempt the containers themselves, but they are encouraged to checkpoint and save work. The RM can always choose not to preempt these containers, so it would be sub-optimal for the AM to kill them.

If we want to add information beyond the set of containers-to-be-preempted, then I would prefer ResourceRequest (as in the original patch) and not Resource. Not only is that symmetric with the allocation path, it also allows the RM to provide additional information about where to free containers. A smarter RM could ask for resources to be preempted where the under-allocated job wants them, and a smart AM could help out by choosing containers close to the desired locations. Secondly, Resource is too amorphous by itself. Asking an AM to free 50GB does not tell it whether the RM needs 10*5GB or 50*1GB. Without that information the AM can end up freeing containers in a manner that does not help the RM meet the request of the under-allocated job, thus failing to meet quota and wasting work at the same time.

Scheduler feedback to AM to release containers
----------------------------------------------
Key: YARN-45
URL: https://issues.apache.org/jira/browse/YARN-45
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Chris Douglas
Assignee: Carlo Curino
Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf

The ResourceManager strikes a balance between cluster utilization and strict enforcement of resource invariants in the cluster. Individual allocations of containers must be reclaimed (or reserved) to restore the global invariants when cluster load shifts. In some cases, the ApplicationMaster can respond to fluctuations in resource availability without losing the work already completed by that task (MAPREDUCE-4584). Supplying it with this information would be helpful for overall cluster utilization [1]. To this end, we want to establish a protocol for the RM to ask the AM to release containers.

[1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
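Bikas's 50GB example can be made concrete with a small sketch (hypothetical helper names, not YARN API): a bare aggregate admits many container decompositions, which is exactly the ambiguity a ResourceRequest-style (capability, numContainers) pair removes.

```java
// Hypothetical sketch, not YARN API: why a bare aggregate Resource is ambiguous.
class PreemptionShape {
    /** Total memory (GB) freed by releasing 'count' containers of 'sizeGb' each. */
    static int totalGb(int count, int sizeGb) {
        return count * sizeGb;
    }

    /** A (count, size) pair, as a ResourceRequest-style message would carry. */
    static boolean satisfiesAggregate(int count, int sizeGb, int wantedGb) {
        return totalGb(count, sizeGb) >= wantedGb;
    }
}
```

Both (10, 5GB) and (50, 1GB) satisfy a bare "free 50GB" request, but only one of them yields containers that an under-allocated job wanting 5GB tasks can actually use.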
[jira] [Commented] (YARN-576) RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations
[ https://issues.apache.org/jira/browse/YARN-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644408#comment-13644408 ]

Hudson commented on YARN-576:
-----------------------------

Integrated in Hadoop-Yarn-trunk #198 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/198/])
YARN-576. Modified ResourceManager to reject NodeManagers that don't satisfy minimum resource requirements. Contributed by Kenji Kikushima. (Revision 1476824)

Result = SUCCESS
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1476824
Files :
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMExpiry.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestRMNMRPCResponseId.java

RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations
--------------------------------------------------------------------------------------------
Key: YARN-576
URL: https://issues.apache.org/jira/browse/YARN-576
Project: Hadoop YARN
Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Kenji Kikushima
Labels: newbie
Fix For: 2.0.5-beta
Attachments: YARN-576-2.patch, YARN-576-3.patch, YARN-576-4.patch, YARN-576.patch

If the minimum resource allocation configured for the RM scheduler is 1 GB, the RM should drop all NMs that register with a total capacity of less than 1 GB.
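The check this change adds can be sketched roughly as follows (names are illustrative, not the actual ResourceTrackerService code): an NM whose advertised capacity cannot host even one minimum-sized container is refused at registration time.

```java
// Illustrative sketch of the YARN-576 registration check, not the actual
// ResourceTrackerService code: reject an NM whose total capacity is below
// the scheduler's configured minimum allocation.
class MinAllocationCheck {
    static boolean acceptRegistration(int nmMemoryMb, int minAllocMb) {
        // An NM that cannot fit one minimum-sized container is useless to
        // the scheduler, so its registration is rejected outright.
        return nmMemoryMb >= minAllocMb;
    }
}
```

With a 1 GB (1024 MB) minimum allocation, a node registering with 512 MB total capacity is dropped, while a 2048 MB node is accepted.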
[jira] [Commented] (YARN-576) RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations
[ https://issues.apache.org/jira/browse/YARN-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644462#comment-13644462 ]

Hudson commented on YARN-576:
-----------------------------

Integrated in Hadoop-Hdfs-trunk #1387 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1387/])
YARN-576. Modified ResourceManager to reject NodeManagers that don't satisfy minimum resource requirements. Contributed by Kenji Kikushima. (Revision 1476824)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1476824
Files :
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMExpiry.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestRMNMRPCResponseId.java
[jira] [Commented] (YARN-576) RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations
[ https://issues.apache.org/jira/browse/YARN-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644498#comment-13644498 ]

Hudson commented on YARN-576:
-----------------------------

Integrated in Hadoop-Mapreduce-trunk #1414 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1414/])
YARN-576. Modified ResourceManager to reject NodeManagers that don't satisfy minimum resource requirements. Contributed by Kenji Kikushima. (Revision 1476824)

Result = SUCCESS
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1476824
Files :
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMExpiry.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestRMNMRPCResponseId.java
[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers
[ https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644626#comment-13644626 ]

Chris Douglas commented on YARN-45:
-----------------------------------

I'm also a fan of {{ResourceRequest}}, but we're not really using all its features yet. Similarly, {{Resource}} bakes in the fungibility of resources, which could be awkward as the RM accommodates richer requests (as in YARN-392). We could use {{ResourceRequest}}- so the API is there for extensions- but only populate the capability as an aggregate. With the convention that -1 containers can mean "packed as you see fit", it expresses {{Resource}} (which we need in practice, since the priorities for requests don't always [match the preemption order|https://issues.apache.org/jira/browse/YARN-569?focusedCommentId=13638825&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13638825]), which is sufficient for the current schedulers. If we're adding the contract back with the set of containers, the [semantics|https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950] we discussed earlier still seem OK.
[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers
[ https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644664#comment-13644664 ]

Carlo Curino commented on YARN-45:
----------------------------------

[~acmurthy] I see your point, which was in fact reflected more clearly in our initial proposal. The only caveat is not to make this a capacity-only protocol (which you are not doing, but I wanted to reiterate that there are other use cases). I like [~bikassaha]'s and [~chris.douglas]'s spin on it (i.e., using ResourceRequest), as it gives us the immediate capacity angle but will eventually allow us to evolve the implementations towards something richer (e.g., the preempt-on-behalf-of-a-specific-request that Bikas considered before) without impacting the protocols.

I think there is a slightly cleaner version of Chris's proposal: use ResourceRequest, and to represent a request that only cares about overall capacity, express the ResourceRequest as a multiple of the minimum allocation (i.e., if we want 100GB of RAM back and the minimum container size is 1GB, we ask for 100 x 1GB containers). This achieves Chris's proposal with a slightly prettier use of ResourceRequest. Note that there are size-matching issues (e.g., you have 1.5GB containers and I ask for 1x1GB containers), but we have very similar problems with Resource. As Chris pointed out, [these semantics|https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950], plus the use of ResourceRequest I propose here as a minor variation on Chris's take, should cover Arun's and Bikas's comments (and, I believe, also the prior 45+ messages). Thoughts?
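Carlo's encoding of aggregate capacity as a multiple of the minimum allocation reduces to a ceiling division; a minimal sketch with a hypothetical helper (not committed YARN code):

```java
// Sketch of the proposal above (hypothetical helper, not committed YARN code):
// encode "free wantedMb of memory" as a ResourceRequest for N containers of
// the minimum allocation, where N = ceil(wantedMb / minAllocMb).
class CapacityAsRequest {
    static int containersFor(int wantedMb, int minAllocMb) {
        if (minAllocMb <= 0) {
            throw new IllegalArgumentException("minAllocMb must be positive");
        }
        return (wantedMb + minAllocMb - 1) / minAllocMb; // ceiling division
    }
}
```

Asking for 100 GB back with a 1 GB (1024 MB) minimum yields 100 containers. The size-matching slack Carlo mentions shows up when running containers are not exact multiples of the minimum, e.g. 1536 MB wanted still rounds up to 2 minimum-sized containers.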
[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers
[ https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644675#comment-13644675 ]

Chris Douglas commented on YARN-45:
-----------------------------------

bq. we could express the ResourceRequest as a multiple of the minimum allocation

+1. This is better.
[jira] [Commented] (YARN-513) Verify all clients will wait for RM to restart
[ https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644695#comment-13644695 ]

Xuan Gong commented on YARN-513:
--------------------------------

bq. We could try to reuse existing RetryPolicy etc inside RMClient as long as we maintain the RMClient abstraction.

Reused the RetryPolicy in the new patch. The RetryInvocationHandler provides the retry logic in its invoke method, so we can reuse that.

bq. Are we not missing an RMClient.disconnect()? This one would internally stop the proxy?

Yes, we need that. Added the disconnect code in the new patch.

bq. Looks like NMStatusUpdater.getRMClient() can be removed because createRMClient() is being overridden by all tests.

Removed in the new patch.

bq. Why are we throwing YARNException?

The original code throws YarnException, and I want to stay consistent. I think we will change the exception via YARN-142.

bq. Is any test explicitly testing the new code with a real RM? How about manually doing it?

Tested the new code on a single-node cluster.

Verify all clients will wait for RM to restart
----------------------------------------------
Key: YARN-513
URL: https://issues.apache.org/jira/browse/YARN-513
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Bikas Saha
Assignee: Xuan Gong
Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch

When the RM is restarting, the NM, AM and Clients should wait for some time for the RM to come back up.
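The retry behavior under discussion (the patch itself reuses Hadoop's RetryPolicy via RetryInvocationHandler) has roughly this shape; a standalone, simplified sketch with made-up names, not the actual RMClient code:

```java
import java.util.concurrent.Callable;

// Simplified sketch of "clients wait for the RM to restart": bounded retries
// with a fixed sleep while the RM is unreachable. The real patch gets this
// behavior from Hadoop's RetryPolicy/RetryInvocationHandler machinery.
class RmRetry {
    static <T> T withRetries(Callable<T> call, int maxAttempts, long sleepMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();       // e.g. an RPC to the RM
            } catch (Exception e) {       // connection refused while RM is down
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMs);
                }
            }
        }
        throw last;                       // RM never came back within budget
    }
}
```

A caller wraps each RM-facing RPC in withRetries; if the RM comes back within the retry budget the call succeeds transparently, otherwise the last failure propagates.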
[jira] [Updated] (YARN-513) Verify all clients will wait for RM to restart
[ https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuan Gong updated YARN-513:
---------------------------
Attachment: YARN-513.3.patch
[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-326:
----------------------------
Attachment: FairSchedulerDRFDesignDoc-1.pdf

Uploading a new design doc to reflect the discussion.

Add multi-resource scheduling to the fair scheduler
---------------------------------------------------
Key: YARN-326
URL: https://issues.apache.org/jira/browse/YARN-326
Project: Hadoop YARN
Issue Type: New Feature
Components: scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: FairSchedulerDRFDesignDoc-1.pdf, FairSchedulerDRFDesignDoc.pdf, YARN-326.patch, YARN-326.patch

With YARN-2 in, the capacity scheduler has the ability to schedule based on multiple resources, using dominant resource fairness. The fair scheduler should be able to do multiple resource scheduling as well, also using dominant resource fairness. More details to come on how the corner cases with fair scheduler configs such as min and max resources will be handled.
[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644734#comment-13644734 ]

Karthik Kambatla commented on YARN-326:
---------------------------------------

Sandy - thanks for updating the doc. The approach is clear and fairly straightforward. Nit: might want to add other DRF follow-up papers to the references.
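For reference, the core comparison dominant resource fairness builds on can be sketched as follows (a simplified illustration, not the FairScheduler code): a job's dominant share is its largest share of any single resource, and the scheduler favors the job with the smaller dominant share.

```java
// Simplified DRF sketch (illustrative, not FairScheduler code).
class DominantShare {
    /** A job's dominant share: the larger of its memory and CPU shares. */
    static double dominantShare(double usedMemMb, double usedVcores,
                                double clusterMemMb, double clusterVcores) {
        return Math.max(usedMemMb / clusterMemMb, usedVcores / clusterVcores);
    }

    /** Under DRF, the job with the smaller dominant share goes first. */
    static boolean scheduleABeforeB(double aDominantShare, double bDominantShare) {
        return aDominantShare < bDominantShare;
    }
}
```

For example, on a 100 GB / 10-vcore cluster, a job using 30 GB and 2 vcores is memory-dominated (share 0.3), while a job using 10 GB and 5 vcores is CPU-dominated (share 0.5); DRF would schedule the first job next.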
[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644753#comment-13644753 ]

Sandy Ryza commented on YARN-326:
---------------------------------

Uploaded a new patch that reflects the design changes.
[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644759#comment-13644759 ]

Hadoop QA commented on YARN-326:
--------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12581014/YARN-326-1.patch
against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/837//console

This message is automatically generated.
[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart
[ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644770#comment-13644770 ]

Hadoop QA commented on YARN-582:
--------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12581012/YARN-582.1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/836//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/836//console

This message is automatically generated.

Restore appToken and clientToken for app attempt after RM restart
-----------------------------------------------------------------
Key: YARN-582
URL: https://issues.apache.org/jira/browse/YARN-582
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Bikas Saha
Assignee: Jian He
Attachments: YARN-582.1.patch

These need to be saved and restored on a per-app-attempt basis. This is required only when work-preserving restart is implemented for secure clusters. In a non-work-preserving restart, app attempts are killed, so this does not matter.
[jira] [Commented] (YARN-528) Make IDs read only
[ https://issues.apache.org/jira/browse/YARN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644784#comment-13644784 ]

Robert Joseph Evans commented on YARN-528:
------------------------------------------

Thanks for doing this Sid. I started pulling on the string and there was just too much involved, so I had to stop.

Make IDs read only
------------------
Key: YARN-528
URL: https://issues.apache.org/jira/browse/YARN-528
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Attachments: y528_AppIdPart_01_Refactor.txt, y528_AppIdPart_02_AppIdChanges.txt, y528_AppIdPart_03_fixUsage.txt, y528_ApplicationIdComplete_WIP.txt, YARN-528.txt, YARN-528.txt

I really would like to rip out most if not all of the abstraction layer that sits in-between Protocol Buffers, the RPC, and the actual user code. We have no plans to support any other serialization type, and the abstraction layer just makes it more difficult to change protocols, makes changing them more error prone, and slows down the objects themselves. Completely doing that is a lot of work. This JIRA is a first step towards that. It makes the various ID objects immutable. If this patch is well received I will try to go through other objects/classes of objects and update them in a similar way. This is probably the last time we will be able to make a change like this before 2.0 stabilizes and YARN APIs will not be able to be changed.
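What "read only" means in practice can be sketched like this (an illustrative shape, not the actual YARN-528 patch): final fields set once in the constructor, getters only, and value-based equals/hashCode, so an ID can be shared across threads and used as a map key safely.

```java
// Illustrative immutable-ID sketch, not the actual YARN-528 code: no setters,
// final fields, value semantics. Once constructed it can never change.
final class ImmutableAppId {
    private final long clusterTimestamp;
    private final int id;

    ImmutableAppId(long clusterTimestamp, int id) {
        this.clusterTimestamp = clusterTimestamp;
        this.id = id;
    }

    long getClusterTimestamp() { return clusterTimestamp; }
    int getId() { return id; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ImmutableAppId)) return false;
        ImmutableAppId other = (ImmutableAppId) o;
        return clusterTimestamp == other.clusterTimestamp && id == other.id;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(clusterTimestamp) + id;
    }
}
```

Beyond API safety, immutability lets callers drop the defensive copies that mutable PB-backed records force on them, which is part of the speedup the description alludes to.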
[jira] [Reopened] (YARN-579) Make ApplicationToken part of Container's token list to help RM-restart
[ https://issues.apache.org/jira/browse/YARN-579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp reopened YARN-579: -- This has broken secure clusters. The AM is unable to find the token to register with the RM. I've debugged it far enough to see that localization has put the token in the nm-private dir, so it looks like the AM has amnesia when it connects to the RM. {noformat} 2013-04-29 17:47:02,666 DEBUG [IPC Client (4914628) connection to $RM:8030 from $USER] org.apache.hadoop.ipc.Client: IPC Client (4914628) connection to $RM:8030 from $USER: stopped, remaining connections 1 2013-04-29 17:47:02,667 ERROR [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while registering java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128) at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103) at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153) at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797) at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1369) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1365) at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[KERBEROS, DIGEST] at org.apache.hadoop.ipc.Client.call(Client.java:1229) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202) at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100) ... 12 more 2013-04-29 17:47:02,668 ERROR [main] org.apache.hadoop.yarn.service.CompositeService: Error starting services org.apache.hadoop.mapreduce.v2.app.MRAppMaster org.apache.hadoop.yarn.YarnException: java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:166) at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797) at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1369) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1365) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318) Caused by: java.lang.reflect.UndeclaredThrowableException at 
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128) at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103) at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153) ... 11 more Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[KERBEROS, DIGEST] at org.apache.hadoop.ipc.Client.call(Client.java:1229) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202) at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
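The stack trace above is easier to read with a model of the handshake: a secure RPC client only offers token-based (DIGEST) authentication if it can find a token matching the server's address, and otherwise falls back to SIMPLE, which a secure RM rejects. A minimal sketch of that selection logic follows; it is purely illustrative, not Hadoop's actual RPC code, and all names are hypothetical.

```python
# Conceptual sketch of why a "missing" AMRM token surfaces as
# "SIMPLE authentication is not enabled. Available:[KERBEROS, DIGEST]":
# with no token for the RM's address, the client can only offer SIMPLE.

def select_auth_method(client_tokens, server_address, server_methods):
    """Pick an auth method the way a secure handshake conceptually does."""
    if server_address in client_tokens and "DIGEST" in server_methods:
        return "DIGEST"  # token-based auth (e.g. the AMRM token)
    if "SIMPLE" in server_methods:
        return "SIMPLE"  # plain username, only offered in unsecure mode
    raise PermissionError(
        "SIMPLE authentication is not enabled. Available:%s" % server_methods)

# The AM's credentials are empty (the token was localized to disk but never
# loaded into the UGI), so negotiation against the secure RM fails:
tokens = {}
try:
    select_auth_method(tokens, "$RM:8030", ["KERBEROS", "DIGEST"])
except PermissionError as e:
    failure = str(e)
```

Under this model, the bug is entirely on the client side: the server's offer never changes, and the fix is making the AM load the localized token before registering.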
[jira] [Commented] (YARN-575) ContainerManager APIs should be user accessible
[ https://issues.apache.org/jira/browse/YARN-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644793#comment-13644793 ] Daryn Sharp commented on YARN-575: -- I agree with your 2nd point; I think allowing users to directly stop containers will lead to problems. ContainerManager APIs should be user accessible --- Key: YARN-575 URL: https://issues.apache.org/jira/browse/YARN-575 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Assignee: Vinod Kumar Vavilapalli Priority: Critical Auth for ContainerManager is based on the containerId being accessed - since this is what is used to launch containers (There's likely another jira somewhere to change this to not be containerId based). What this also means is that the API is effectively not usable with kerberos credentials. Also, it should be possible to use this API with some generic tokens (RMDelegation?), instead of with Container specific tokens. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements
[ https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644795#comment-13644795 ] Daryn Sharp commented on YARN-617: -- Does there really need to be different NM behavior? I.e., why can't the NM always require container tokens regardless of the security setting? In unsecure mode, AM can fake resource requirements - Key: YARN-617 URL: https://issues.apache.org/jira/browse/YARN-617 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Priority: Minor Without security, it is impossible to completely avoid AMs faking resources. We can at least make it as difficult as possible by using the same container tokens and the RM-NM shared key mechanism over an unauthenticated RM-NM channel. At a minimum, this will avoid accidental bugs in AMs in unsecure mode.
[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart
[ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644805#comment-13644805 ] Daryn Sharp commented on YARN-582: -- I've only glanced over the patch, but do these tokens actually need to be handled specially? Is it feasible to handle all tokens in an opaque credentials within the store? I think that may reduce the copy-n-paste code throughout the stores for restoring these tokens. Restore appToken and clientToken for app attempt after RM restart - Key: YARN-582 URL: https://issues.apache.org/jira/browse/YARN-582 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-582.1.patch These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter.
[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container
[ https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644809#comment-13644809 ] Daryn Sharp commented on YARN-613: -- Question: How do you plan for NMs to authenticate the AM tokens? Create NM proxy per NM instead of per container --- Key: YARN-613 URL: https://issues.apache.org/jira/browse/YARN-613 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Vinod Kumar Vavilapalli Currently a new NM proxy has to be created per container since the secure authentication is using a containertoken from the container.
[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements
[ https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644814#comment-13644814 ] Vinod Kumar Vavilapalli commented on YARN-617: -- bq. Does there really need to be different NM behavior? I.e., why can't the NM always require container tokens regardless of the security setting? That is what I meant in my points above. ContainerTokens will always be sent irrespective of security and are used for *authorization*. I just put them as separate points to highlight that in secure mode, we also use ContainerTokens for *authentication*. In unsecure mode, AM can fake resource requirements - Key: YARN-617 URL: https://issues.apache.org/jira/browse/YARN-617 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Priority: Minor Without security, it is impossible to completely avoid AMs faking resources. We can at least make it as difficult as possible by using the same container tokens and the RM-NM shared key mechanism over an unauthenticated RM-NM channel. At a minimum, this will avoid accidental bugs in AMs in unsecure mode.
[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container
[ https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644816#comment-13644816 ] Vinod Kumar Vavilapalli commented on YARN-613: -- bq. Question: How do you plan for NMs to authenticate the AM tokens? I thought I covered it but missed stating that - RM will share the underlying secret key corresponding to AM tokens as part of node-registration just like the one corresponding to ContainerTokens. Create NM proxy per NM instead of per container --- Key: YARN-613 URL: https://issues.apache.org/jira/browse/YARN-613 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Vinod Kumar Vavilapalli Currently a new NM proxy has to be created per container since the secure authentication is using a containertoken from the container.
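The shared-key scheme described in the comment above can be sketched as follows. This is a conceptual model, not YARN's implementation, and all names are hypothetical: the RM signs tokens with a master key and hands that key to the NM during node registration, after which the NM can authenticate one connection per AM rather than one per container.

```python
# Sketch of the shared-key idea (hypothetical names, illustrative only):
# the RM gives the NM a secret master key at node registration, and the NM
# can then authenticate any token whose HMAC was produced with that key.
import hashlib
import hmac
import os

MASTER_KEY = os.urandom(32)  # generated by the RM, one per key roll

def rm_issue_token(payload):
    """RM side: sign a token (e.g. an AM token) with the master key."""
    return payload, hmac.new(MASTER_KEY, payload, hashlib.sha256).digest()

def nm_verify_token(payload, signature, shared_key):
    """NM side: verify using the key received during node registration."""
    expected = hmac.new(shared_key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

# Node registration conceptually delivers MASTER_KEY to the NM, so a single
# NM-level connection can be authenticated instead of one per container.
payload, sig = rm_issue_token(b"appattempt_1234_0001 resource=2048MB")
assert nm_verify_token(payload, sig, MASTER_KEY)
assert not nm_verify_token(b"appattempt_1234_0001 resource=9999MB", sig, MASTER_KEY)
```

The same mechanism covers the YARN-617 point above: because verification is a cheap HMAC check, the NM can require tokens irrespective of the security setting.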
[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart
[ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644820#comment-13644820 ] Vinod Kumar Vavilapalli commented on YARN-582: -- bq. Is it feasible to handle all tokens in an opaque credentials within the store? Agreed. But because there are two types of tokens - application level and application-attempt level, we should have two credential fields. Restore appToken and clientToken for app attempt after RM restart - Key: YARN-582 URL: https://issues.apache.org/jira/browse/YARN-582 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-582.1.patch These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter.
[jira] [Commented] (YARN-620) TestContainerLocalizer.testContainerLocalizerMain failed on branch-2
[ https://issues.apache.org/jira/browse/YARN-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644826#comment-13644826 ] Jian He commented on YARN-620: -- checked, it works fine now TestContainerLocalizer.testContainerLocalizerMain failed on branch-2 - Key: YARN-620 URL: https://issues.apache.org/jira/browse/YARN-620 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He Attachments: YARN-620.1.patch Argument(s) are different! Wanted: localFs.mkdir( /Users/jhe/hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer/0/usercache/yak/filecache, isA(org.apache.hadoop.fs.permission.FsPermission), false ); - at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer.testContainerLocalizerMain(TestContainerLocalizer.java:170) Actual invocation has different arguments: localFs.mkdir( file:/Users/jhe/hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer/0/usercache/yak/filecache, rwxr-xr-x, false ); - at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer.testContainerLocalizerMain(TestContainerLocalizer.java:162) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer.testContainerLocalizerMain(TestContainerLocalizer.java:170) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) -- 
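The Mockito failure in YARN-620 above is an argument-matcher mismatch: the test expected an unqualified path while the code passed a file:-qualified URI. A Python analogue of the same pitfall using unittest.mock, whose assert_called_with plays roughly the role of Mockito's verify (paths shortened for illustration):

```python
# Reproducing the shape of the failure: an exact-value expectation breaks
# when the code under test qualifies the path, while matching the actual
# (qualified) value with a wildcard for the permission argument passes.
from unittest import mock

fs = mock.Mock()
# What the code under test actually does: a fully qualified file: URI.
fs.mkdir("file:/target/usercache/yak/filecache", "rwxr-xr-x", False)

# Expecting the raw path fails, because the "file:" qualification differs:
try:
    fs.mkdir.assert_called_with("/target/usercache/yak/filecache",
                                "rwxr-xr-x", False)
    exact_match = True
except AssertionError:
    exact_match = False

# Matching the qualified value, with ANY standing in for the permission
# (like Mockito's isA(FsPermission)), succeeds:
fs.mkdir.assert_called_with("file:/target/usercache/yak/filecache",
                            mock.ANY, False)
```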
[jira] [Created] (YARN-624) Support gang scheduling in the AM RM protocol
Sandy Ryza created YARN-624: --- Summary: Support gang scheduling in the AM RM protocol Key: YARN-624 URL: https://issues.apache.org/jira/browse/YARN-624 Project: Hadoop YARN Issue Type: Sub-task Components: api, scheduler Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Per discussion on YARN-392 and elsewhere, gang scheduling, in which a scheduler runs a set of tasks when they can all be run at the same time, would be a useful feature for YARN schedulers to support. Currently, AMs can approximate this by holding on to containers until they get all the ones they need. However, this lends itself to deadlocks when different AMs are waiting on the same containers.
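The deadlock risk described in the issue, and the all-or-nothing check that gang scheduling would add, can be sketched with a toy model (hypothetical, not a scheduler API):

```python
# A minimal all-or-nothing ("gang") allocation check, versus AMs hoarding
# partial allocations until their full set arrives.

def gang_allocate(free_containers, gang_size):
    """Grant either the whole gang or nothing, so no AM strands a partial set."""
    return gang_size if free_containers >= gang_size else 0

# Two AMs each need 3 containers on a cluster with 4 free. With hoarding,
# each could grab 2 and wait forever on the other's containers (deadlock).
# With a gang check, one AM gets all 3 and can run to completion:
free = 4
granted_a = gang_allocate(free, 3)   # whole gang fits, granted
free -= granted_a
granted_b = gang_allocate(free, 3)   # refused outright rather than stranded
assert (granted_a, granted_b) == (3, 0)
```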
[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart
[ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644828#comment-13644828 ] Jian He commented on YARN-582: -- Yes, the application-level token is stored along with the ApplicationSubmissionContext, so no additional handling is needed for it. Restore appToken and clientToken for app attempt after RM restart - Key: YARN-582 URL: https://issues.apache.org/jira/browse/YARN-582 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-582.1.patch These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter.
[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart
[ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644845#comment-13644845 ] Bikas Saha commented on YARN-582: - The RMStore stores applications and their attempts, and it is used to restore applications and their attempts from the data that they had earlier stored. This allows the recovery code to follow existing code paths to the fullest extent and prevents recovery logic from diverging from the normal code path. So I would like to avoid storing tokens separately from apps/attempts and then having to manage their relationship later on during recovery. As for saving the appToken and clientToken, I agree it would be nice to have a single object store all attempt tokens in one place. The ApplicationSubmissionContext does that for app tokens. Restore appToken and clientToken for app attempt after RM restart - Key: YARN-582 URL: https://issues.apache.org/jira/browse/YARN-582 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-582.1.patch These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter.
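The "opaque credentials" idea under discussion can be sketched like this. Names are hypothetical and pickle stands in for Hadoop's credentials serialization: each attempt record carries a single serialized blob of all its tokens, saved and restored with the attempt itself rather than per token.

```python
# Sketch: the store treats an attempt's tokens as one opaque blob, so no
# per-token copy-and-paste logic is needed in each store implementation.
import pickle

class AttemptState:
    def __init__(self, attempt_id, credentials):
        self.attempt_id = attempt_id
        # Opaque to the store: just bytes, regardless of which tokens exist.
        self.credentials_blob = pickle.dumps(credentials)

    def restore_credentials(self):
        return pickle.loads(self.credentials_blob)

store = {}
state = AttemptState("appattempt_1_000001",
                     {"appToken": b"app-token-bytes",
                      "clientToken": b"client-token-bytes"})
store[state.attempt_id] = state  # saved alongside the attempt, not separately

restored = store["appattempt_1_000001"].restore_credentials()
assert set(restored) == {"appToken", "clientToken"}
```

Vinod's point above maps onto this sketch as two such blobs: one stored with the application record and one with each attempt record.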
[jira] [Commented] (YARN-579) Make ApplicationToken part of Container's token list to help RM-restart
[ https://issues.apache.org/jira/browse/YARN-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644859#comment-13644859 ] Vinod Kumar Vavilapalli commented on YARN-579: -- I validated this on trunk; I can run it successfully even now. It seems like it is failing on branch-2. Something at the RPC level, I suppose; digging through... Make ApplicationToken part of Container's token list to help RM-restart --- Key: YARN-579 URL: https://issues.apache.org/jira/browse/YARN-579 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.4-alpha Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 2.0.5-beta Attachments: YARN-579-20130422.1.txt, YARN-579-20130422.1_YARNChanges.txt Container is already persisted for helping RM restart. Instead of explicitly setting ApplicationToken in AM's env, if we change it to be in Container, we can avoid the env and can also help restart.
[jira] [Commented] (YARN-513) Verify all clients will wait for RM to restart
[ https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644891#comment-13644891 ] Hadoop QA commented on YARN-513: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581001/YARN-513.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1366 javac compiler warnings (more than the trunk's current 1365 warnings). {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/838//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/838//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/838//console This message is automatically generated. 
Verify all clients will wait for RM to restart -- Key: YARN-513 URL: https://issues.apache.org/jira/browse/YARN-513 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch When the RM is restarting, the NM, AM and Clients should wait for some time for the RM to come back up.
[jira] [Updated] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number
[ https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-618: - Attachment: YARN-618.patch This patch changed RM_INVALID_IDENTIFIER to a -ve number and changed the tests accordingly. Modify RM_INVALID_IDENTIFIER to a -ve number - Key: YARN-618 URL: https://issues.apache.org/jira/browse/YARN-618 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He Attachments: YARN-618.patch RM_INVALID_IDENTIFIER set to 0 doesn't sound right as many tests set it to 0. Probably a -ve number is what we want.
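The rationale for a negative sentinel can be shown in a few lines (illustrative only, assuming, as in YARN, that real RM identifiers derive from a startup timestamp and are therefore non-negative): 0 is a plausible real value, while any negative number is unambiguously invalid.

```python
# A negative sentinel cannot collide with a legitimate identifier,
# unlike 0, which tests were setting as a "real" value.
RM_INVALID_IDENTIFIER = -1

def is_valid_rm_id(rm_id):
    return rm_id != RM_INVALID_IDENTIFIER and rm_id >= 0

assert not is_valid_rm_id(RM_INVALID_IDENTIFIER)
assert is_valid_rm_id(0)              # a 0 sentinel would reject this real value
assert is_valid_rm_id(1367000000000)  # timestamp-like identifier
```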
[jira] [Commented] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number
[ https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644954#comment-13644954 ] Hadoop QA commented on YARN-618: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581054/YARN-618.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/839//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/839//console This message is automatically generated. Modify RM_INVALID_IDENTIFIER to a -ve number - Key: YARN-618 URL: https://issues.apache.org/jira/browse/YARN-618 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He Attachments: YARN-618.patch RM_INVALID_IDENTIFIER set to 0 doesnt sound right as many tests set it to 0. Probably a -ve number is what we want. 
[jira] [Commented] (YARN-575) ContainerManager APIs should be user accessible
[ https://issues.apache.org/jira/browse/YARN-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644964#comment-13644964 ] Siddharth Seth commented on YARN-575: - I'm fine going the route of getting container status from the RM - when required - assuming we keep the NM equivalent, though, for AMs to use. The AppTokens will be used for authentication as well as authorization for getContainerStatus calls? ContainerManager APIs should be user accessible --- Key: YARN-575 URL: https://issues.apache.org/jira/browse/YARN-575 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Assignee: Vinod Kumar Vavilapalli Priority: Critical Auth for ContainerManager is based on the containerId being accessed - since this is what is used to launch containers (There's likely another jira somewhere to change this to not be containerId based). What this also means is that the API is effectively not usable with kerberos credentials. Also, it should be possible to use this API with some generic tokens (RMDelegation?), instead of with Container specific tokens.
[jira] [Commented] (YARN-528) Make IDs read only
[ https://issues.apache.org/jira/browse/YARN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644970#comment-13644970 ] Siddharth Seth commented on YARN-528: - bq. Thanks for doing this Sid. I started pulling on the string and there was just too much involved, so I had to stop. Any thoughts on the approach used in the patch? Making IDs immutable should be reasonably fast using this - changing the PB mechanisms for other classes is a different beast though. Make IDs read only -- Key: YARN-528 URL: https://issues.apache.org/jira/browse/YARN-528 Project: Hadoop YARN Issue Type: Sub-task Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: y528_AppIdPart_01_Refactor.txt, y528_AppIdPart_02_AppIdChanges.txt, y528_AppIdPart_03_fixUsage.txt, y528_ApplicationIdComplete_WIP.txt, YARN-528.txt, YARN-528.txt I really would like to rip out most if not all of the abstraction layer that sits in-between Protocol Buffers, the RPC, and the actual user code. We have no plans to support any other serialization type, and the abstraction layer just makes it more difficult to change protocols, makes changing them more error prone, and slows down the objects themselves. Completely doing that is a lot of work. This JIRA is a first step towards that. It makes the various ID objects immutable. If this patch is well received, I will try to go through other objects/classes of objects and update them in a similar way. This is probably the last time we will be able to make a change like this before 2.0 stabilizes, and YARN APIs will not be able to be changed.
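The immutability goal behind the patch can be illustrated with a frozen value type. This is a Python sketch of the idea only, not the PB-backed record classes the patch actually reworks, and the field names are assumptions:

```python
# An ID as an immutable value object: mutation is rejected at runtime,
# and instances are safely usable as map keys.
from dataclasses import FrozenInstanceError, dataclass

@dataclass(frozen=True)
class ApplicationId:
    cluster_timestamp: int
    id: int

app_id = ApplicationId(1367000000000, 42)
try:
    app_id.id = 43  # any write raises FrozenInstanceError
    mutated = True
except FrozenInstanceError:
    mutated = False
assert not mutated

# Value equality and hashability come for free with immutability:
assert {app_id: "running"}[ApplicationId(1367000000000, 42)] == "running"
```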
[jira] [Updated] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number
[ https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-618: - Attachment: YARN-618.1.patch Fixed the test failure. Modify RM_INVALID_IDENTIFIER to a -ve number - Key: YARN-618 URL: https://issues.apache.org/jira/browse/YARN-618 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He Attachments: YARN-618.1.patch, YARN-618.patch RM_INVALID_IDENTIFIER set to 0 doesn't sound right as many tests set it to 0. Probably a -ve number is what we want.
[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-326: Attachment: YARN-326-1.patch Add multi-resource scheduling to the fair scheduler --- Key: YARN-326 URL: https://issues.apache.org/jira/browse/YARN-326 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Affects Versions: 2.0.2-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: FairSchedulerDRFDesignDoc-1.pdf, FairSchedulerDRFDesignDoc.pdf, YARN-326-1.patch, YARN-326-1.patch, YARN-326.patch, YARN-326.patch With YARN-2 in, the capacity scheduler has the ability to schedule based on multiple resources, using dominant resource fairness. The fair scheduler should be able to do multiple resource scheduling as well, also using dominant resource fairness. More details to come on how the corner cases with fair scheduler configs such as min and max resources will be handled.
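Dominant resource fairness, the policy the design doc brings to the fair scheduler, ranks each job by its largest share across resource types and schedules the job whose dominant share is currently smallest. A small sketch with illustrative numbers (not the scheduler's actual data structures):

```python
# DRF in miniature: a job's dominant share is its biggest fractional claim
# on any single resource; the scheduler serves the lowest dominant share.

CLUSTER = {"memory": 100, "cpu": 100}

def dominant_share(usage):
    return max(usage[r] / CLUSTER[r] for r in CLUSTER)

def next_to_schedule(jobs):
    """Pick the job with the lowest dominant share."""
    return min(jobs, key=lambda j: dominant_share(jobs[j]))

# Job A is memory-heavy (dominant share 0.30), job B is cpu-heavy (0.20):
jobs = {"A": {"memory": 30, "cpu": 5},
        "B": {"memory": 5, "cpu": 20}}
assert dominant_share(jobs["A"]) == 0.30
assert next_to_schedule(jobs) == "B"  # B's 0.20 dominant share < A's 0.30
```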
[jira] [Commented] (YARN-506) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute
[ https://issues.apache.org/jira/browse/YARN-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644996#comment-13644996 ] Hadoop QA commented on YARN-506: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12580366/YARN-506.commonfileutils.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/841//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/841//console This message is automatically generated. Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute Key: YARN-506 URL: https://issues.apache.org/jira/browse/YARN-506 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: YARN-506.commonfileutils.2.patch, YARN-506.commonfileutils.patch Move to common utils described in HADOOP-9413 that work well cross-platform. -- This message is automatically generated by JIRA. 
[jira] [Commented] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number
[ https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644997#comment-13644997 ] Hadoop QA commented on YARN-618: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581060/YARN-618.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/840//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/840//console This message is automatically generated. Modify RM_INVALID_IDENTIFIER to a -ve number - Key: YARN-618 URL: https://issues.apache.org/jira/browse/YARN-618 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He Attachments: YARN-618.1.patch, YARN-618.patch RM_INVALID_IDENTIFIER set to 0 doesnt sound right as many tests set it to 0. Probably a -ve number is what we want. -- This message is automatically generated by JIRA. 
[jira] [Updated] (YARN-513) Verify all clients will wait for RM to restart
[ https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-513: --- Attachment: YARN-513.4.patch Fix -1 on javadoc warning Verify all clients will wait for RM to restart -- Key: YARN-513 URL: https://issues.apache.org/jira/browse/YARN-513 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, YARN-513.4.patch When the RM is restarting, the NM, AM and Clients should wait for some time for the RM to come back up.
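The behavior under test here, clients waiting for the RM to come back instead of failing fast, amounts to retry-with-deadline around the connect call. A minimal sketch with a hypothetical helper, not YARN's actual retry policy:

```python
# Retry connect() with a bounded wait, as NMs, AMs, and clients should do
# while the RM restarts. Intervals shortened for illustration.
import time

def wait_for_rm(connect, max_wait_s=1.0, interval_s=0.05):
    """Retry connect() until it succeeds or max_wait_s elapses."""
    deadline = time.monotonic() + max_wait_s
    while True:
        try:
            return connect()
        except ConnectionError:
            if time.monotonic() >= deadline:
                raise  # RM never came back within the wait window
            time.sleep(interval_s)

# Simulate an RM that becomes reachable on the third attempt:
attempts = {"n": 0}
def connect():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("RM restarting")
    return "registered"

assert wait_for_rm(connect) == "registered"
assert attempts["n"] == 3
```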
[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645015#comment-13645015 ] Hadoop QA commented on YARN-326: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581061/YARN-326-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/842//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/842//console This message is automatically generated. 
Add multi-resource scheduling to the fair scheduler --- Key: YARN-326 URL: https://issues.apache.org/jira/browse/YARN-326 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Affects Versions: 2.0.2-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: FairSchedulerDRFDesignDoc-1.pdf, FairSchedulerDRFDesignDoc.pdf, YARN-326-1.patch, YARN-326-1.patch, YARN-326.patch, YARN-326.patch With YARN-2 in, the capacity scheduler has the ability to schedule based on multiple resources, using dominant resource fairness. The fair scheduler should be able to do multiple resource scheduling as well, also using dominant resource fairness. More details to come on how the corner cases with fair scheduler configs such as min and max resources will be handled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
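Dominant resource fairness, the policy YARN-326 brings to the fair scheduler, orders applications by their largest per-resource share of the cluster. A minimal sketch of that computation; the vector layout and method names are illustrative, not the FairScheduler API:

```java
// Sketch of the DRF ordering criterion: an app's dominant share is
// max over resource types of usage[i] / clusterCapacity[i], and DRF
// offers the next container to the app with the smallest dominant share.
public class DrfSketch {
    static double dominantShare(double[] usage, double[] capacity) {
        double max = 0.0;
        for (int i = 0; i < usage.length; i++) {
            max = Math.max(max, usage[i] / capacity[i]);
        }
        return max;
    }

    public static void main(String[] args) {
        double[] capacity = {100.0, 200.0}; // e.g. 100 vCores, 200 GB
        double[] cpuHeavy = {30.0, 20.0};   // dominant in CPU: 30/100
        double[] memHeavy = {10.0, 80.0};   // dominant in memory: 80/200
        assert dominantShare(cpuHeavy, capacity) == 0.3;
        assert dominantShare(memHeavy, capacity) == 0.4;
        // cpuHeavy has the smaller dominant share, so DRF serves it next.
        System.out.println("cpuHeavy goes next under DRF");
    }
}
```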
[jira] [Commented] (YARN-506) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute
[ https://issues.apache.org/jira/browse/YARN-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645023#comment-13645023 ] Hudson commented on YARN-506: - Integrated in Hadoop-trunk-Commit #3695 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3695/]) YARN-506. Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute. Contributed by Ivan Mitic. (Revision 1477408) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1477408 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthScriptRunner.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeHealthService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute Key: YARN-506 URL: https://issues.apache.org/jira/browse/YARN-506 Project: Hadoop YARN Issue Type: Bug 
Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Fix For: 3.0.0 Attachments: YARN-506.commonfileutils.2.patch, YARN-506.commonfileutils.patch Move to common utils described in HADOOP-9413 that work well cross-platform. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-513) Verify all clients will wait for RM to restart
[ https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645042#comment-13645042 ] Hadoop QA commented on YARN-513: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581065/YARN-513.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/843//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/843//console This message is automatically generated. 
Verify all clients will wait for RM to restart -- Key: YARN-513 URL: https://issues.apache.org/jira/browse/YARN-513 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, YARN-513.4.patch When the RM is restarting, the NM, AM and Clients should wait for some time for the RM to come back up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-142) Change YARN APIs to throw IOException
[ https://issues.apache.org/jira/browse/YARN-142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645060#comment-13645060 ] Siddharth Seth commented on YARN-142: - After HADOOP-9343, it should be possible for YarnException to not be rooted at IOException. So all methods can declare IOException and YarnException - and have the specializations of YarnException listed in the Javadoc. Change YARN APIs to throw IOException - Key: YARN-142 URL: https://issues.apache.org/jira/browse/YARN-142 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Siddharth Seth Assignee: Xuan Gong Priority: Blocker Attachments: YARN-142.1.patch, YARN-142.2.patch, YARN-142.3.patch, YARN-142.4.patch Ref: MAPREDUCE-4067 All YARN APIs currently throw YarnRemoteException. 1) This cannot be extended in its current form. 2) The RPC layer can throw IOExceptions. These end up showing up as UndeclaredThrowableExceptions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
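The API shape Siddharth describes can be sketched as follows. The stand-in YarnException and the submit/classify methods are hypothetical illustrations, not the real YARN protocol classes:

```java
import java.io.IOException;

// Sketch: once YarnException no longer extends IOException (post
// HADOOP-9343), a protocol method can declare both, so callers can tell
// RPC-level IO failures apart from YARN-level failures without hitting
// UndeclaredThrowableException.
public class ExceptionSketch {
    // Stand-in for the real YarnException, now rooted at Exception.
    static class YarnException extends Exception {
        YarnException(String msg) { super(msg); }
    }

    // A protocol method declares both exception types.
    static String submit(boolean rpcUp, boolean valid)
            throws IOException, YarnException {
        if (!rpcUp) throw new IOException("connection refused");
        if (!valid) throw new YarnException("invalid request");
        return "accepted";
    }

    // A caller can now distinguish the two failure modes cleanly.
    static String classify(boolean rpcUp, boolean valid) {
        try {
            return submit(rpcUp, valid);
        } catch (IOException e) {
            return "transport-error";
        } catch (YarnException e) {
            return "yarn-error";
        }
    }

    public static void main(String[] args) {
        assert classify(true, true).equals("accepted");
        assert classify(false, true).equals("transport-error");
        assert classify(true, false).equals("yarn-error");
        System.out.println("ok");
    }
}
```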
[jira] [Updated] (YARN-599) Refactoring submitApplication in ClientRMService and RMAppManager
[ https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-599: - Attachment: YARN-599.2.patch In the newer patch, I've updated the comments in ClientRMService and RMAppManager, and added audit logging for user and duplicate-ID exceptions. Refactoring submitApplication in ClientRMService and RMAppManager - Key: YARN-599 URL: https://issues.apache.org/jira/browse/YARN-599 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Attachments: YARN-599.1.patch, YARN-599.2.patch Currently, ClientRMService#submitApplication calls RMAppManager#handle, and consequently calls RMAppManager#submitApplication directly, though the code looks like scheduling an APP_SUBMIT event. In addition, the validation code before creating an RMApp instance is not well organized. Ideally, the dynamic validation, which depends on the RM's configuration, should be put in RMAppManager#submitApplication. RMAppManager#submitApplication is called by ClientRMService#submitApplication and RMAppManager#recover. Since the configuration may be changed after RM restarts, the validation needs to be done again even in recovery mode. Therefore, resource request validation, which is based on min/max resource limits, should be moved from ClientRMService#submitApplication to RMAppManager#submitApplication. On the other hand, the static validation, which is independent of the RM's configuration, should be put in ClientRMService#submitApplication, because it only needs to be done once, during the first submission. Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: RMAppManager#submitApplication is not synchronized. 
If two application submissions with the same application ID enter the function, and one progresses to the completion of RMApp instantiation, and the other progresses to the completion of putting the RMApp instance into rmContext, the slower submission will cause an exception due to the duplicate application ID. However, the exception will cause the RMApp instance already in rmContext (belonging to the faster submission) to be rejected under the current code flow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
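The duplicate-ID race described in this issue can be closed by making the check-and-register step atomic. A sketch with simplified stand-ins for RMContext and RMApp, not the actual fix in the patch:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the race and one way to close it: use putIfAbsent so
// "check id, then register app" is a single atomic step, instead of a
// separate containsKey + put that two submissions can interleave.
public class SubmitRaceSketch {
    // Stand-in for rmContext's application map.
    static final ConcurrentHashMap<Integer, String> apps =
            new ConcurrentHashMap<>();

    // Returns true if this submission won; a duplicate id loses without
    // disturbing the app the faster submission already registered.
    static boolean submit(int appId, String app) {
        return apps.putIfAbsent(appId, app) == null;
    }

    public static void main(String[] args) {
        assert submit(42, "fast");          // first submission wins
        assert !submit(42, "slow");         // duplicate is rejected...
        assert apps.get(42).equals("fast"); // ...and the winner survives
        System.out.println("ok");
    }
}
```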
[jira] [Commented] (YARN-578) NodeManager should use SecureIOUtils for serving logs and intermediate outputs
[ https://issues.apache.org/jira/browse/YARN-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645240#comment-13645240 ] Vinod Kumar Vavilapalli commented on YARN-578: -- Can you use this only for the YARN changes, i.e. serving logs, and open a separate MAPREDUCE ticket for the ShuffleHandler? For the YARN changes: - Remove the comment above the code which talks about SecureIOUtils ;) - I think we should separate the exception message to clearly say whether this was a permission issue or something else. NodeManager should use SecureIOUtils for serving logs and intermediate outputs -- Key: YARN-578 URL: https://issues.apache.org/jira/browse/YARN-578 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Attachments: yarn-578-20130426.patch Log servlets for serving logs and the ShuffleService for serving intermediate outputs both should use SecureIOUtils for avoiding symlink attacks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
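The symlink-attack defense this issue calls for can be illustrated roughly as: resolve the real path, then verify the file's owner before serving it, so a symlink planted in a log directory cannot leak another user's file. This is a sketch of the idea only, not the SecureIOUtils implementation; the method names and check are assumptions:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of owner-checked file serving.
public class SecureServeSketch {
    static byte[] serveLog(Path p, String expectedOwner) throws IOException {
        // Resolve symlinks first so the ownership check applies to the
        // real target, not the link an attacker planted.
        Path real = p.toRealPath();
        String owner = Files.getOwner(real).getName();
        if (!owner.equals(expectedOwner)) {
            throw new IOException("Owner '" + owner + "' of " + real
                    + " does not match expected '" + expectedOwner + "'");
        }
        return Files.readAllBytes(real);
    }

    // Demo: serve a temp file as its own owner, and reject a mismatch.
    static boolean demo() {
        try {
            Path tmp = Files.createTempFile("log", ".txt");
            Files.write(tmp, "container log".getBytes());
            String me = Files.getOwner(tmp).getName();
            boolean served = new String(serveLog(tmp, me))
                    .equals("container log");
            boolean rejected;
            try {
                serveLog(tmp, me + "-imposter");
                rejected = false;
            } catch (IOException e) {
                rejected = true; // mismatched owner is refused
            }
            Files.delete(tmp);
            return served && rejected;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        assert demo();
        System.out.println("ok");
    }
}
```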
[jira] [Assigned] (YARN-621) RM triggers web auth failure before first job
[ https://issues.apache.org/jira/browse/YARN-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reassigned YARN-621: Assignee: Vinod Kumar Vavilapalli (was: Omkar Vinit Joshi) Allen, can you share your environment details? I am not able to reproduce this in my setup. RM triggers web auth failure before first job - Key: YARN-621 URL: https://issues.apache.org/jira/browse/YARN-621 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.0.4-alpha Reporter: Allen Wittenauer Assignee: Vinod Kumar Vavilapalli Priority: Critical On a secure YARN setup, before the first job is executed, going to the web interface of the resource manager triggers authentication errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-599) Refactoring submitApplication in ClientRMService and RMAppManager
[ https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645248#comment-13645248 ] Hadoop QA commented on YARN-599: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581118/YARN-599.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/844//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/844//console This message is automatically generated. Refactoring submitApplication in ClientRMService and RMAppManager - Key: YARN-599 URL: https://issues.apache.org/jira/browse/YARN-599 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Attachments: YARN-599.1.patch, YARN-599.2.patch Currently, ClientRMService#submitApplication calls RMAppManager#handle, and consequently calls RMAppManager#submitApplication directly, though the code looks like scheduling an APP_SUBMIT event. 
In addition, the validation code before creating an RMApp instance is not well organized. Ideally, the dynamic validation, which depends on the RM's configuration, should be put in RMAppManager#submitApplication. RMAppManager#submitApplication is called by ClientRMService#submitApplication and RMAppManager#recover. Since the configuration may be changed after RM restarts, the validation needs to be done again even in recovery mode. Therefore, resource request validation, which is based on min/max resource limits, should be moved from ClientRMService#submitApplication to RMAppManager#submitApplication. On the other hand, the static validation, which is independent of the RM's configuration, should be put in ClientRMService#submitApplication, because it only needs to be done once, during the first submission. Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: RMAppManager#submitApplication is not synchronized. If two application submissions with the same application ID enter the function, and one progresses to the completion of RMApp instantiation, and the other progresses to the completion of putting the RMApp instance into rmContext, the slower submission will cause an exception due to the duplicate application ID. However, the exception will cause the RMApp instance already in rmContext (belonging to the faster submission) to be rejected under the current code flow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-599) Refactoring submitApplication in ClientRMService and RMAppManager
[ https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645288#comment-13645288 ] Vinod Kumar Vavilapalli commented on YARN-599: -- Hm, it isn't straightforward to see that failures during RMAppManager.submitApplication() are properly recorded in the audit logs. But they are; I just verified. The latest patch looks good to me. +1, checking it in. Refactoring submitApplication in ClientRMService and RMAppManager - Key: YARN-599 URL: https://issues.apache.org/jira/browse/YARN-599 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Attachments: YARN-599.1.patch, YARN-599.2.patch Currently, ClientRMService#submitApplication calls RMAppManager#handle, and consequently calls RMAppManager#submitApplication directly, though the code looks like scheduling an APP_SUBMIT event. In addition, the validation code before creating an RMApp instance is not well organized. Ideally, the dynamic validation, which depends on the RM's configuration, should be put in RMAppManager#submitApplication. RMAppManager#submitApplication is called by ClientRMService#submitApplication and RMAppManager#recover. Since the configuration may be changed after RM restarts, the validation needs to be done again even in recovery mode. Therefore, resource request validation, which is based on min/max resource limits, should be moved from ClientRMService#submitApplication to RMAppManager#submitApplication. On the other hand, the static validation, which is independent of the RM's configuration, should be put in ClientRMService#submitApplication, because it only needs to be done once, during the first submission. Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: RMAppManager#submitApplication is not synchronized. 
If two application submissions with the same application ID enter the function, and one progresses to the completion of RMApp instantiation, and the other progresses to the completion of putting the RMApp instance into rmContext, the slower submission will cause an exception due to the duplicate application ID. However, the exception will cause the RMApp instance already in rmContext (belonging to the faster submission) to be rejected under the current code flow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-599) Refactoring submitApplication in ClientRMService and RMAppManager
[ https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645300#comment-13645300 ] Hudson commented on YARN-599: - Integrated in Hadoop-trunk-Commit #3698 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3698/]) YARN-599. Refactoring submitApplication in ClientRMService and RMAppManager to separate out various validation checks depending on whether they rely on RM configuration or not. Contributed by Zhijie Shen. (Revision 1477478) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1477478 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManagerEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManagerSubmitEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java Refactoring submitApplication in ClientRMService and RMAppManager - Key: YARN-599 URL: https://issues.apache.org/jira/browse/YARN-599 Project: Hadoop YARN Issue 
Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.0.5-beta Attachments: YARN-599.1.patch, YARN-599.2.patch Currently, ClientRMService#submitApplication calls RMAppManager#handle, and consequently calls RMAppManager#submitApplication directly, though the code looks like scheduling an APP_SUBMIT event. In addition, the validation code before creating an RMApp instance is not well organized. Ideally, the dynamic validation, which depends on the RM's configuration, should be put in RMAppManager#submitApplication. RMAppManager#submitApplication is called by ClientRMService#submitApplication and RMAppManager#recover. Since the configuration may be changed after RM restarts, the validation needs to be done again even in recovery mode. Therefore, resource request validation, which is based on min/max resource limits, should be moved from ClientRMService#submitApplication to RMAppManager#submitApplication. On the other hand, the static validation, which is independent of the RM's configuration, should be put in ClientRMService#submitApplication, because it only needs to be done once, during the first submission. Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: RMAppManager#submitApplication is not synchronized. If two application submissions with the same application ID enter the function, and one progresses to the completion of RMApp instantiation, and the other progresses to the completion of putting the RMApp instance into rmContext, the slower submission will cause an exception due to the duplicate application ID. However, the exception will cause the RMApp instance already in rmContext (belonging to the faster submission) to be rejected under the current code flow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira