[jira] [Commented] (YARN-600) Hook up cgroups CPU settings to the number of virtual cores allocated

2013-05-01 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646436#comment-13646436
 ] 

Sandy Ryza commented on YARN-600:
-

(by which I mean YARN doesn't give hints to the OS on how and when to assign 
cores to processes)

 Hook up cgroups CPU settings to the number of virtual cores allocated
 -

 Key: YARN-600
 URL: https://issues.apache.org/jira/browse/YARN-600
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 YARN-3 introduced CPU isolation and monitoring through cgroups.  YARN-2 
 introduced CPU scheduling in the capacity scheduler, and YARN-326 will 
 introduce it in the fair scheduler.  The number of virtual cores allocated to 
 a container should be used to weight the number of cgroups CPU shares given 
 to it.
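
As an illustration of the proposal, here is a minimal sketch of weighting a 
container's cgroup cpu.shares by its virtual cores; the constant, helper, and 
paths are hypothetical, not the actual patch:

{code:java}
import java.io.FileWriter;
import java.io.IOException;

// Illustrative sketch only: weight a container's cgroup cpu.shares by its
// virtual cores. CPU_DEFAULT_WEIGHT and the paths are hypothetical.
public class CpuSharesSketch {
  // 1024 is the cgroup default weight for a single "share" of CPU.
  static final int CPU_DEFAULT_WEIGHT = 1024;

  static int cpuShares(int virtualCores) {
    return CPU_DEFAULT_WEIGHT * Math.max(1, virtualCores);
  }

  static void writeCpuShares(String containerCgroupDir, int virtualCores)
      throws IOException {
    try (FileWriter w = new FileWriter(containerCgroupDir + "/cpu.shares")) {
      w.write(Integer.toString(cpuShares(virtualCores)));
    }
  }

  public static void main(String[] args) {
    // A container allocated 4 vcores gets 4x the shares of a 1-vcore one.
    System.out.println(cpuShares(4)); // prints 4096
  }
}
{code}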

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-600) Hook up cgroups CPU settings to the number of virtual cores allocated

2013-05-01 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646435#comment-13646435
 ] 

Sandy Ryza commented on YARN-600:
-

I don't believe anything special happens when it's not enabled.

 Hook up cgroups CPU settings to the number of virtual cores allocated
 -

 Key: YARN-600
 URL: https://issues.apache.org/jira/browse/YARN-600
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 YARN-3 introduced CPU isolation and monitoring through cgroups.  YARN-2 
 introduced CPU scheduling in the capacity scheduler, and YARN-326 will 
 introduce it in the fair scheduler.  The number of virtual cores allocated to 
 a container should be used to weight the number of cgroups CPU shares given 
 to it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-422) Add AM-NM client library

2013-05-01 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-422:
-

Attachment: YARN-422.1.patch

Here's the first patch, which is ready for review. Based on the previous 
definition file, the patch has the following updates:

1. Refactor some code (add more logs, rephrase the javadoc, etc.)

2. Rename AMNMClient to NMClient since not only the AM will use this client.

3. In NMClientAsync, join the threadpool's threads, which are set to non-daemon.

4. Enhance the test cases.

As the patch is already big, I suggest deferring the code changes that use the 
client in the AM and RM to follow-up patches. In addition, I found that maven 
seems not to distinguish scope when checking cyclic dependencies. Therefore, 
making the resourcemanager project depend on client (to use NMClient in 
AMLauncher) will fail the build.
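
On point 3, a generic sketch of why a non-daemon thread pool must be shut down 
and joined on stop (plain java.util.concurrent; not the actual NMClientAsync 
code):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncClientShutdownSketch {
  // Non-daemon threads keep the JVM alive until the pool is shut down,
  // so stop() must explicitly drain and join them.
  private final ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
    Thread t = new Thread(r, "nm-client-async");
    t.setDaemon(false);
    return t;
  });

  public void stop() throws InterruptedException {
    pool.shutdown(); // stop accepting new callbacks
    if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
      pool.shutdownNow(); // interrupt anything still running
    }
  }
}
{code}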

 Add AM-NM client library
 

 Key: YARN-422
 URL: https://issues.apache.org/jira/browse/YARN-422
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Zhijie Shen
 Attachments: AMNMClient_Defination.txt, 
 AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf, 
 YARN-422.1.patch


 Create a simple wrapper over the AM-NM container protocol to hide the 
 details of the protocol implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-422) Add AM-NM client library

2013-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646448#comment-13646448
 ] 

Hadoop QA commented on YARN-422:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581346/YARN-422.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

  org.apache.hadoop.yarn.client.TestNMClientAsync

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/852//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/852//console

This message is automatically generated.

 Add AM-NM client library
 

 Key: YARN-422
 URL: https://issues.apache.org/jira/browse/YARN-422
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Zhijie Shen
 Attachments: AMNMClient_Defination.txt, 
 AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf, 
 YARN-422.1.patch


 Create a simple wrapper over the AM-NM container protocol to hide the 
 details of the protocol implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-422) Add AM-NM client library

2013-05-01 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-422:
-

Attachment: YARN-422.2.patch

Trying to fix the test failure. The problem seems to be that the mockNMClient 
is changed before all the test cases are completed. So, the success and the 
failure tests are split.
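
An illustrative JUnit/Mockito sketch of the failure mode and the fix: a single 
mock re-stubbed mid-run leaks failure stubbing into later cases, so each 
scenario gets its own mock (the Client interface is a stand-in, not the real 
test code):

{code:java}
import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

// Illustrative only: if one mock is reconfigured mid-run to simulate
// failures, later success cases can observe the failure stubbing. Giving
// each scenario its own mock (or test class) removes the shared state.
public class SplitMockTestsSketch {
  interface Client {
    String start(String containerId) throws Exception;
  }

  @Test
  public void testStartSucceeds() throws Exception {
    Client ok = mock(Client.class);
    when(ok.start("c1")).thenReturn("RUNNING");
    assertEquals("RUNNING", ok.start("c1"));
  }

  @Test(expected = Exception.class)
  public void testStartFails() throws Exception {
    Client broken = mock(Client.class); // separate mock: no shared state
    when(broken.start("c1")).thenThrow(new Exception("launch failed"));
    broken.start("c1");
  }
}
{code}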

 Add AM-NM client library
 

 Key: YARN-422
 URL: https://issues.apache.org/jira/browse/YARN-422
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Zhijie Shen
 Attachments: AMNMClient_Defination.txt, 
 AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf, 
 YARN-422.1.patch, YARN-422.2.patch


 Create a simple wrapper over the AM-NM container protocol to hide the 
 details of the protocol implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-422) Add AM-NM client library

2013-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646473#comment-13646473
 ] 

Hadoop QA commented on YARN-422:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581348/YARN-422.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/853//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/853//console

This message is automatically generated.

 Add AM-NM client library
 

 Key: YARN-422
 URL: https://issues.apache.org/jira/browse/YARN-422
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Zhijie Shen
 Attachments: AMNMClient_Defination.txt, 
 AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf, 
 YARN-422.1.patch, YARN-422.2.patch


 Create a simple wrapper over the AM-NM container protocol to hide the 
 details of the protocol implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container

2013-05-01 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646558#comment-13646558
 ] 

Daryn Sharp commented on YARN-613:
--

I just have general concerns with assuming the entire hadoop environment is 
trusted and thus introducing weaknesses at a global level.  E.g., a weakness is 
introduced every time one entity shares a secret to validate a token created by 
another entity.  Compromising one of hundreds or thousands of nodes shouldn't 
put the entire cluster at risk.  If I can gain access to one NM host and its 
keytab, I believe I can secretly launch a malicious NM?  NMs currently share a 
global key for container token secrets, but there is a jira to move to per-NM 
secrets, so sharing a global AM secret would be another step backwards.

Exploring alternate avenues to avoid global trust: is passing the am token that 
is allowed to get status and stop the container along with the launch request 
not feasible?

 Create NM proxy per NM instead of per container
 ---

 Key: YARN-613
 URL: https://issues.apache.org/jira/browse/YARN-613
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Vinod Kumar Vavilapalli

 Currently a new NM proxy has to be created per container since the secure 
 authentication is using a containertoken from the container.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-05-01 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646572#comment-13646572
 ] 

Daryn Sharp commented on YARN-617:
--

bq. we are trying to change the auth to use AMTokens and authorization will 
continue to be via ContainerTokens

I may have misinterpreted the other jira...  I thought the goal is continue to 
auth container launches with a container token, but change status and stop to 
authenticate with the am token?  Are you saying the goal is to auth container 
launches with the am token too?

bq. An RPC server also enables SASL DIGEST-MD5 if a secret manager is active.
bq. Off topic, but this is what I guessed is the reason underlying YARN-626, do 
you know when this got merged into branch-2?

The SASL changes HADOOP-8783/HADOOP-8784 went in Oct 3-4 2012.  The change 
allowed servers to accept tokens regardless of security setting if a secret 
manager is present, and for clients to always use a token, if present, 
regardless of security setting.  This didn't change behavior for secure 
clusters, so YARN-626 can't be related, because there security is enabled and 
the AM is lacking a token for the RM in its UGI.



 In unsecure mode, AM can fake resource requirements 
 -

 Key: YARN-617
 URL: https://issues.apache.org/jira/browse/YARN-617
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Minor

 Without security, it is impossible to completely avoid AMs faking resources. 
 We can at least make it as difficult as possible by using the same container 
 tokens and the RM-NM shared key mechanism over an unauthenticated RM-NM 
 channel.
 At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-05-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-18:
---

Attachment: Pluggable topologies with NodeGroup for YARN.pdf

Implementation doc for NodeGroup layer support in YARN.

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, 
 YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, 
 YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch, 
 YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality, which were updated to give preference 
 to running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, like SchedulerNode, RMNodeImpl, etc. The 
 inner class ScheduledRequests was made a package-level class so it would be 
 easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-05-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-18:
---

Attachment: YARN-18-v6.patch

Synced the patch to the latest trunk, keeping it consistent with the doc just 
attached.

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, 
 YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, 
 YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch, 
 YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch, YARN-18-v6.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality, which were updated to give preference 
 to running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, like SchedulerNode, RMNodeImpl, etc. The 
 inner class ScheduledRequests was made a package-level class so it would be 
 easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-05-01 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646652#comment-13646652
 ] 

Junping Du commented on YARN-18:


Thanks, Luke, for the explanation. 
Hey Arun ([~acmurthy]), I just attached a doc with implementation details for 
this patch and YARN-19. Hopefully it will be helpful for your review. Your 
thought to abstract the notion of topology logic in the scheduler makes great 
sense to me. We already did this for each scheduler, but you mean we need to 
abstract the common part for all schedulers, which is a little complicated but 
still doable. Can we do this code refactoring work in a separate jira? I am 
glad to work on it. For a new NodeGroup scheduler, I think it may not be 
necessary, as it addresses different topologies rather than a different 
algorithm to isolate/prioritise jobs. So even under a topology with a NodeGroup 
layer, different users still need different schedulers like Fair, Capacity, 
etc. Thoughts?
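
For readers following the discussion, a hypothetical sketch of what a pluggable 
locality ordering could look like, with nodegroup-local slotted between 
node-local and rack-local (names are illustrative, not the patch):

{code:java}
import java.util.List;

// Hypothetical: locality preference as a pluggable ordering, so a topology
// with a NodeGroup layer adds a level without changing each scheduler's
// own queueing/prioritization algorithm.
interface LocalityResolver {
  List<String> preferenceOrder(); // most- to least-preferred levels
}

class DefaultLocality implements LocalityResolver {
  public List<String> preferenceOrder() {
    return List.of("node-local", "rack-local", "off-switch");
  }
}

class NodeGroupLocality implements LocalityResolver {
  public List<String> preferenceOrder() {
    return List.of("node-local", "nodegroup-local", "rack-local", "off-switch");
  }
}
{code}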

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, 
 YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, 
 YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch, 
 YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch, YARN-18-v6.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality, which were updated to give preference 
 to running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, like SchedulerNode, RMNodeImpl, etc. The 
 inner class ScheduledRequests was made a package-level class so it would be 
 easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646659#comment-13646659
 ] 

Hadoop QA commented on YARN-18:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581362/YARN-18-v6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/854//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/854//console

This message is automatically generated.

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, 
 YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, 
 YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch, 
 YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch, YARN-18-v6.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality, which were updated to give preference 
 to running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, like SchedulerNode, RMNodeImpl, etc. The 
 inner class ScheduledRequests was made a package-level class so it would be 
 easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-582) Restore appToken for app attempt after RM restart

2013-05-01 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-582:
-

Attachment: YARN-582.3.patch

 Restore appToken for app attempt after RM restart
 -

 Key: YARN-582
 URL: https://issues.apache.org/jira/browse/YARN-582
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Jian He
 Attachments: YARN-582.1.patch, YARN-582.2.patch, YARN-582.3.patch


 These need to be saved and restored on a per app attempt basis. This is 
 required only when work preserving restart is implemented for secure 
 clusters. In non-preserving restart app attempts are killed and so this does 
 not matter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-582) Restore appToken for app attempt after RM restart

2013-05-01 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646806#comment-13646806
 ] 

Jian He commented on YARN-582:
--

New patch addresses the last comments.

 Restore appToken for app attempt after RM restart
 -

 Key: YARN-582
 URL: https://issues.apache.org/jira/browse/YARN-582
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Jian He
 Attachments: YARN-582.1.patch, YARN-582.2.patch, YARN-582.3.patch


 These need to be saved and restored on a per app attempt basis. This is 
 required only when work preserving restart is implemented for secure 
 clusters. In non-preserving restart app attempts are killed and so this does 
 not matter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-582) Restore appToken for app attempt after RM restart

2013-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646835#comment-13646835
 ] 

Hadoop QA commented on YARN-582:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581393/YARN-582.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/855//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/855//console

This message is automatically generated.

 Restore appToken for app attempt after RM restart
 -

 Key: YARN-582
 URL: https://issues.apache.org/jira/browse/YARN-582
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Jian He
 Attachments: YARN-582.1.patch, YARN-582.2.patch, YARN-582.3.patch


 These need to be saved and restored on a per app attempt basis. This is 
 required only when work preserving restart is implemented for secure 
 clusters. In non-preserving restart app attempts are killed and so this does 
 not matter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-513) Create common proxy client for communicating with RM

2013-05-01 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646914#comment-13646914
 ] 

Xuan Gong commented on YARN-513:


Actually, RMClient.invoke() can be removed. I followed the pattern of how 
NMProxies creates the proxy. So, in the current patch, we just need to expose 
the proxy object, and the client calls proxy.method(). It should be good to go. 
I will do further testing. 
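
Since YARN-513 is about clients waiting out an RM restart, here is a generic 
sketch (not the actual patch) of exposing a proxy object whose invocation 
handler retries, so callers simply invoke proxy.method():

{code:java}
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.net.ConnectException;

public final class RetryProxySketch {
  // Wrap a protocol implementation so every proxy.method() call is retried
  // while the RM is unreachable. The retry policy here is illustrative.
  @SuppressWarnings("unchecked")
  public static <T> T create(Class<T> protocol, T target,
                             int maxRetries, long waitMs) {
    return (T) Proxy.newProxyInstance(
        protocol.getClassLoader(), new Class<?>[] { protocol },
        (proxy, method, args) -> {
          for (int attempt = 0; ; attempt++) {
            try {
              return method.invoke(target, args);
            } catch (InvocationTargetException e) {
              if (!(e.getCause() instanceof ConnectException)
                  || attempt >= maxRetries) {
                throw e.getCause(); // give up: rethrow the real failure
              }
              Thread.sleep(waitMs); // RM may still be coming back up
            }
          }
        });
  }
}
{code}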

 Create common proxy client for communicating with RM
 

 Key: YARN-513
 URL: https://issues.apache.org/jira/browse/YARN-513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, 
 YARN-513.4.patch


 When the RM is restarting, the NM, AM and Clients should wait for some time 
 for the RM to come back up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1

2013-05-01 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646993#comment-13646993
 ] 

Chris Riccomini commented on YARN-614:
--

bq. One solution could be to move the check from finishAttempt() to 
createAttempt(). finishAttempt() always enqueues a new attempt. the new attempt 
creation checks if one can still be created based on failed count etc.

This wouldn't fix the problem with RMAppManager.recover(), would it? Whether we 
enqueue attempts in finishAttempt or createAttempt, if the attempt count ever 
goes above maxAppAttempts, it seems like RMAppManager would not recover the 
app, right?

Are you proposing we always call appImpl.recover() in RMAppManager, always 
retry in RMAppImpl.AttemptFailedTransition, and call 
RMAppImpl.countFailureToAttemptLimit() inside RMAppImpl.createNewAttempt?
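
A hypothetical sketch of the alternative under discussion, using names from the 
quoted comment (finishAttempt, createAttempt, maxAppAttempts); the real 
RMAppImpl state machine is more involved:

{code:java}
// Hypothetical sketch of the proposal quoted above: finishAttempt() always
// enqueues a new attempt, and createAttempt() is the single place that
// enforces the failure limit, so recovery can reuse the same path.
public class AttemptRetrySketch {
  private final int maxAppAttempts;
  private int failedAttempts;

  AttemptRetrySketch(int maxAppAttempts) {
    this.maxAppAttempts = maxAppAttempts;
  }

  void finishAttempt(boolean failed) {
    if (failed) {
      failedAttempts++;
    }
    createAttempt(); // unconditional enqueue, per the quoted suggestion
  }

  boolean createAttempt() {
    if (failedAttempts >= maxAppAttempts) {
      return false; // the app fails here, not in finishAttempt()
    }
    // ... construct and start the new attempt ...
    return true;
  }
}
{code}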

 Retry attempts automatically for hardware failures or YARN issues and set 
 default app retries to 1
 --

 Key: YARN-614
 URL: https://issues.apache.org/jira/browse/YARN-614
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Bikas Saha
 Attachments: YARN-614-0.patch


 Attempts can fail due to a large number of user errors and they should not be 
 retried unnecessarily. The only reason YARN should retry an attempt is when 
 the hardware fails or YARN has an error. NM failing, lost NM and NM disk 
 errors are the hardware errors that come to mind.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number

2013-05-01 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647062#comment-13647062
 ] 

Vinod Kumar Vavilapalli commented on YARN-618:
--

+1, checking it in.

 Modify RM_INVALID_IDENTIFIER to  a -ve number
 -

 Key: YARN-618
 URL: https://issues.apache.org/jira/browse/YARN-618
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-618.1.patch, YARN-618.2.patch, YARN-618.3.patch, 
 YARN-618.patch


 RM_INVALID_IDENTIFIER set to 0 doesn't sound right, as many tests set it to 0. 
 Probably a -ve number is what we want.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number

2013-05-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647075#comment-13647075
 ] 

Hudson commented on YARN-618:
-

Integrated in Hadoop-trunk-Commit #3708 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3708/])
YARN-618. Modified RM_INVALID_IDENTIFIER to be -1 instead of zero. 
Contributed by Jian He. (Revision 1478230)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1478230
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ResourceManagerConstants.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestEventFlow.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java
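
Per the commit message, the substance of the change in 
ResourceManagerConstants.java is moving the constant off zero; a sketch of the 
idea (the exact declaration may differ):

{code:java}
public interface ResourceManagerConstants {
  // Previously 0, which collided with the value many tests used; -1 cannot
  // be mistaken for a real RM identifier.
  long RM_INVALID_IDENTIFIER = -1;
}
{code}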


 Modify RM_INVALID_IDENTIFIER to  a -ve number
 -

 Key: YARN-618
 URL: https://issues.apache.org/jira/browse/YARN-618
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-618.1.patch, YARN-618.2.patch, YARN-618.3.patch, 
 YARN-618.patch


 RM_INVALID_IDENTIFIER set to 0 doesn't sound right, as many tests set it to 0. 
 Probably a -ve number is what we want.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-513) Create common proxy client for communicating with RM

2013-05-01 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647085#comment-13647085
 ] 

Xuan Gong commented on YARN-513:


The new patch includes:
1. Remove RMClient and create RMProxy, following the pattern of NameNodeProxy; 
it includes several static CreateXXX() methods (see the sketch after this 
list). This will be much simpler.
2. At NodeManagerUpdaterImpl, there is no RMProxy object anymore; use 
RMProxy.createRMProxy() to create the ResourceTracker object directly.
3. Remove the invoke method and call proxy.method() directly. 
4. Change the testcases, including changing localRMClient to LocalRMProxy
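
A hedged sketch of the static-factory shape described in point 1; the 
single-method protocol and the method bodies are illustrative stand-ins, not 
the actual YARN-513 code:

{code:java}
public final class RMProxySketch {
  // Single-method stand-in for the real ResourceTracker protocol.
  public interface ResourceTracker {
    String nodeHeartbeat(String nodeId);
  }

  private RMProxySketch() {}

  // Static factory: callers receive the protocol interface directly and
  // just call proxy.method(), matching points 1-3 above.
  public static ResourceTracker createResourceTracker(String rmAddress) {
    // Real code would build an RPC proxy to rmAddress; stubbed here.
    return nodeId -> "heartbeat-ack from " + rmAddress + " for " + nodeId;
  }

  public static void main(String[] args) {
    ResourceTracker tracker = createResourceTracker("rm-host:8031");
    System.out.println(tracker.nodeHeartbeat("nm-42"));
  }
}
{code}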

 Create common proxy client for communicating with RM
 

 Key: YARN-513
 URL: https://issues.apache.org/jira/browse/YARN-513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, 
 YARN-513.4.patch


 When the RM is restarting, the NM, AM and Clients should wait for some time 
 for the RM to come back up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-513) Create common proxy client for communicating with RM

2013-05-01 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-513:
---

Attachment: YARN.513.5.patch

 Create common proxy client for communicating with RM
 

 Key: YARN-513
 URL: https://issues.apache.org/jira/browse/YARN-513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, 
 YARN-513.4.patch, YARN.513.5.patch


 When the RM is restarting, the NM, AM and Clients should wait for some time 
 for the RM to come back up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number

2013-05-01 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-618:
-

Attachment: YARN-618.3-branch-2.patch

The patch didn't apply cleanly against branch-2. Here's one that I generated 
myself; it compiles successfully and passes TestContainerManager, which had the 
merge conflict.

 Modify RM_INVALID_IDENTIFIER to  a -ve number
 -

 Key: YARN-618
 URL: https://issues.apache.org/jira/browse/YARN-618
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-618.1.patch, YARN-618.2.patch, 
 YARN-618.3-branch-2.patch, YARN-618.3.patch, YARN-618.patch


 RM_INVALID_IDENTIFIER set to 0 doesn't sound right, as many tests set it to 0. 
 Probably a -ve number is what we want.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-513) Create common proxy client for communicating with RM

2013-05-01 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647089#comment-13647089
 ] 

Xuan Gong commented on YARN-513:


The new patch is YARN.513.5.patch

 Create common proxy client for communicating with RM
 

 Key: YARN-513
 URL: https://issues.apache.org/jira/browse/YARN-513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, 
 YARN-513.4.patch, YARN.513.5.patch


 When the RM is restarting, the NM, AM and Clients should wait for some time 
 for the RM to come back up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-422) Add NM client library

2013-05-01 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-422:
-

Description: Create a simple wrapper over the ContainerManager protocol to 
hide the details of the protocol implementation.  (was: Create a simple 
wrapper over the AM-NM container protocol to hide the details of the protocol 
implementation.)

 Add NM client library
 -

 Key: YARN-422
 URL: https://issues.apache.org/jira/browse/YARN-422
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Zhijie Shen
 Attachments: AMNMClient_Defination.txt, 
 AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf, 
 YARN-422.1.patch, YARN-422.2.patch


 Create a simple wrapper over the ContainerManager protocol to hide the 
 details of the protocol implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-422) Add NM client library

2013-05-01 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-422:
-

Summary: Add NM client library  (was: Add AM-NM client library)

 Add NM client library
 -

 Key: YARN-422
 URL: https://issues.apache.org/jira/browse/YARN-422
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Zhijie Shen
 Attachments: AMNMClient_Defination.txt, 
 AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf, 
 YARN-422.1.patch, YARN-422.2.patch


 Create a simple wrapper over the AM-NM container protocol to hide the 
 details of the protocol implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-639) Make AM of Distributed Shell Use NMClient

2013-05-01 Thread Zhijie Shen (JIRA)
Zhijie Shen created YARN-639:


 Summary: Make AM of Distributed Shell Use NMClient
 Key: YARN-639
 URL: https://issues.apache.org/jira/browse/YARN-639
 Project: Hadoop YARN
  Issue Type: Bug
 Environment: YARN-422 adds 
Reporter: Zhijie Shen




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-639) Make AM of Distributed Shell Use NMClient

2013-05-01 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-639:
-

Description: YARN-422 adds

 Make AM of Distributed Shell Use NMClient
 -

 Key: YARN-639
 URL: https://issues.apache.org/jira/browse/YARN-639
 Project: Hadoop YARN
  Issue Type: Bug
 Environment: YARN-422 adds 
Reporter: Zhijie Shen

 YARN-422 adds

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-513) Create common proxy client for communicating with RM

2013-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647109#comment-13647109
 ] 

Hadoop QA commented on YARN-513:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581439/YARN.513.5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/856//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/856//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/856//console

This message is automatically generated.

 Create common proxy client for communicating with RM
 

 Key: YARN-513
 URL: https://issues.apache.org/jira/browse/YARN-513
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, 
 YARN-513.4.patch, YARN.513.5.patch


 When the RM is restarting, the NM, AM and Clients should wait for some time 
 for the RM to come back up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-639) Make AM of Distributed Shell Use NMClient

2013-05-01 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen reassigned YARN-639:


Assignee: Zhijie Shen

 Make AM of Distributed Shell Use NMClient
 -

 Key: YARN-639
 URL: https://issues.apache.org/jira/browse/YARN-639
 Project: Hadoop YARN
  Issue Type: Bug
 Environment: YARN-422 adds 
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 YARN-422 adds NMClient. AM of Distributed Shell should use it instead of 
 using ContainerManager directly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-639) Make AM of Distributed Shell Use NMClient

2013-05-01 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-639:
-

Environment: (was: YARN-422 adds )

 Make AM of Distributed Shell Use NMClient
 -

 Key: YARN-639
 URL: https://issues.apache.org/jira/browse/YARN-639
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 YARN-422 adds NMClient. AM of Distributed Shell should use it instead of 
 using ContainerManager directly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-640) Make AM of M/R Use NMClient

2013-05-01 Thread Zhijie Shen (JIRA)
Zhijie Shen created YARN-640:


 Summary: Make AM of M/R Use NMClient
 Key: YARN-640
 URL: https://issues.apache.org/jira/browse/YARN-640
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen


YARN-422 adds NMClient. AM of mapreduce should use it instead of using the raw 
ContainerManager proxy directly. ContainerLauncherImpl needs to be changed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-638) Add RMDelegationTokens back to DelegationTokenSecretManager after RM Restart

2013-05-01 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-638:
-

Summary: Add RMDelegationTokens back to DelegationTokenSecretManager after 
RM Restart  (was: Add DelegationTokens back to DelegationTokenSecretManager 
after RM Restart)

 Add RMDelegationTokens back to DelegationTokenSecretManager after RM Restart
 

 Key: YARN-638
 URL: https://issues.apache.org/jira/browse/YARN-638
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-638.1.patch


 This was missed in YARN-581. After RM restart, delegation tokens need to be 
 added both in DelegationTokenRenewer (addressed in YARN-581) and in the 
 delegationTokenSecretManager

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-606) negative queue metrics apps Failed

2013-05-01 Thread nemon lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nemon lou updated YARN-606:
---

Assignee: nemon lou

 negative queue metrics apps Failed
 -

 Key: YARN-606
 URL: https://issues.apache.org/jira/browse/YARN-606
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: nemon lou
Assignee: nemon lou
Priority: Minor

 Queue metrics apps Failed can be negative in some cases (more than one 
 attempt for an application can cause this).
 It's confusing if we use this metric directly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-638) Add RMDelegationTokens back to DelegationTokenSecretManager after RM Restart

2013-05-01 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-638:
-

Description: This was missed in YARN-581. After RM restart, 
RMDelegationTokens need to be added both in DelegationTokenRenewer (addressed 
in YARN-581) and in the delegationTokenSecretManager  (was: This is missed in 
YARN-581. After RM restart, delegation tokens need to be added both in 
DelegationTokenRenewer (addressed in YARN-581), and 
delegationTokenSecretManager)

 Add RMDelegationTokens back to DelegationTokenSecretManager after RM Restart
 

 Key: YARN-638
 URL: https://issues.apache.org/jira/browse/YARN-638
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-638.1.patch


 This was missed in YARN-581. After RM restart, RMDelegationTokens need to be 
 added both in DelegationTokenRenewer (addressed in YARN-581) and in the 
 delegationTokenSecretManager

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-05-01 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi reassigned YARN-617:
--

Assignee: Omkar Vinit Joshi  (was: Vinod Kumar Vavilapalli)

 In unsecure mode, AM can fake resource requirements 
 -

 Key: YARN-617
 URL: https://issues.apache.org/jira/browse/YARN-617
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi
Priority: Minor

 Without security, it is impossible to completely avoid AMs faking resources. 
 We can at least make it as difficult as possible by using the same container 
 tokens and the RM-NM shared key mechanism over an unauthenticated RM-NM 
 channel.
 At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-05-01 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-617:
---

Attachment: YARN-617.20130501.patch

 In unsecure mode, AM can fake resource requirements 
 -

 Key: YARN-617
 URL: https://issues.apache.org/jira/browse/YARN-617
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi
Priority: Minor
 Attachments: YARN-617.20130501.patch


 Without security, it is impossible to completely avoid AMs faking resources. 
 We can at least make it as difficult as possible by using the same container 
 tokens and the RM-NM shared key mechanism over an unauthenticated RM-NM 
 channel.
 At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-05-01 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-617:
---

Attachment: YARN-617.20130501.1.patch

 In unsecure mode, AM can fake resource requirements 
 -

 Key: YARN-617
 URL: https://issues.apache.org/jira/browse/YARN-617
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi
Priority: Minor
 Attachments: YARN-617.20130501.1.patch, YARN-617.20130501.patch


 Without security, it is impossible to completely avoid AMs faking resources. 
 We can at least make it as difficult as possible by using the same container 
 tokens and the RM-NM shared key mechanism over an unauthenticated RM-NM 
 channel.
 At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-05-01 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647211#comment-13647211
 ] 

Omkar Vinit Joshi commented on YARN-617:


I am attaching the patch (JUnit tests not included). I will update the patch 
with tests soon.

* At present the master key is exchanged between RM and NM only if the 
environment is secured. I am updating this to make sure that RM and NM exchange 
the master key in both scenarios, secured and unsecured:
** During NM registration
** During NM heartbeat (status updater, only if the key is updated, as it is 
today)
* At present the master key is not generated/sent during container launch in 
the unsecured case. Now making sure that it is sent as part of the payload from 
AMLauncher to the NodeManager. On the NodeManager this token will be used to 
verify the container start request (see the verification sketch below).
** For the secured case, retrieving the token from remoteUgi
** For the unsecured case, retrieving the token from the passed-in container 
payload.

There are some other changes related to this patch:
* startContainer requires the UGI username to be that of the container-id ... I 
still have not understood why. (ContainerLauncherImpl)
* Making sure that NMContainerTokenSecretManager is created for both cases.
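
A minimal sketch of the container-token check implied above, assuming an 
HMAC-based token password; the field layout and key distribution are 
illustrative, not the actual patch:

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: a container start request is accepted when the token's
// HMAC matches the RM-NM shared master key, which is why the key must reach
// the NM (register/heartbeat) even without Kerberos.
public final class ContainerTokenCheckSketch {
  static boolean verify(byte[] masterKey, String tokenFields, byte[] password)
      throws Exception {
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(new SecretKeySpec(masterKey, "HmacSHA1"));
    byte[] expected = mac.doFinal(tokenFields.getBytes(StandardCharsets.UTF_8));
    return MessageDigest.isEqual(expected, password); // constant-time compare
  }
}
{code}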

 In unsecure mode, AM can fake resource requirements 
 -

 Key: YARN-617
 URL: https://issues.apache.org/jira/browse/YARN-617
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi
Priority: Minor
 Attachments: YARN-617.20130501.1.patch, YARN-617.20130501.patch


 Without security, it is impossible to completely avoid AMs faking resources. 
 We can at least make it as difficult as possible by using the same container 
 tokens and the RM-NM shared key mechanism over an unauthenticated RM-NM 
 channel.
 At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-629) Make YarnRemoteException not be rooted at IOException

2013-05-01 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647229#comment-13647229
 ] 

Xuan Gong commented on YARN-629:


bq. Need a specific test which validates that the final exception is the same 
as the one thrown on the remote side. Can you check if TestNodeManagerResync 
and TestContainerManager can be changed to validate this? Look for refs to 
YARN-142 in those tests.
Should it be YarnRemoteException?

 Make YarnRemoteException not be rooted at IOException
 -

 Key: YARN-629
 URL: https://issues.apache.org/jira/browse/YARN-629
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-629.1.patch, YARN-629.2.patch


 After HADOOP-9343, it should be possible for YarnRemoteException to not be 
 rooted at IOException.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-629) Make YarnRemoteException not be rooted at IOException

2013-05-01 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647228#comment-13647228
 ] 

Xuan Gong commented on YARN-629:


bq. Need a specific test which validates that the final exception is the same 
as the one thrown on the remote side. Can you check if TestNodeManagerResync 
and TestContainerManager can be changed to validate this? Look for refs to 
YARN-142 in those tests.
bq. Please create a sister JIRA in MapReduce, I'll review MR changes there but 
commit that together with this patch.
Created the MR ticket: MAPREDUCE-5204.
bq. Investigated the test-failures?
They are all passing now.
bq. TestClientTokens: Explicitly catch YarnRemoteException and fail instead of 
instance checks?
bq. Can't understand the changes in TestClientRMService, explain please why we 
need to only now do e.getCause().getMessage() to capture the remote exception 
message.
YarnRemoteException is not rooted at IOException now. ugi.doAs() throws 
IOException, InterruptedException, UndeclaredThrowableException, etc., but it 
does not throw YarnRemoteException, so we cannot explicitly catch 
YarnRemoteException. I think that if a YarnRemoteException is thrown inside 
doAs(), it will be wrapped, so calling getCause() will get it back (a minimal 
sketch of this follows below).
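
To make the wrapping behavior concrete, here is a minimal, plain-JDK sketch 
(not Hadoop's UGI; RemoteishException is a hypothetical stand-in for 
YarnRemoteException): a checked exception thrown inside a privileged action 
surfaces wrapped, so the caller must unwrap it via getCause() rather than 
catching it directly.

{code:java}
// Hedged sketch using plain java.security types, not Hadoop's UGI.
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;

public class DoAsUnwrapDemo {
  // Stand-in for YarnRemoteException; the real class lives in YARN.
  static class RemoteishException extends Exception {
    RemoteishException(String msg) { super(msg); }
  }

  // Mimics how doAs-style wrappers surface checked exceptions.
  static <T> T runPrivileged(PrivilegedExceptionAction<T> action)
      throws PrivilegedActionException {
    try {
      return action.run();
    } catch (RuntimeException e) {
      throw e;                                // unchecked passes through
    } catch (Exception e) {
      throw new PrivilegedActionException(e); // checked gets wrapped
    }
  }

  public static void main(String[] args) {
    try {
      runPrivileged(() -> {
        throw new RemoteishException("thrown on the remote side");
      });
    } catch (PrivilegedActionException e) {
      // The original exception is only reachable through the cause.
      System.out.println(e.getCause().getMessage());
    }
  }
}
{code}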

 Make YarnRemoteException not be rooted at IOException
 -

 Key: YARN-629
 URL: https://issues.apache.org/jira/browse/YARN-629
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-629.1.patch, YARN-629.2.patch


 After HADOOP-9343, it should be possible for YarnRemoteException to not be 
 rooted at IOException.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-05-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-18:
---

Attachment: YARN-18-v6.1.patch

Addressed a minor javadoc issue in the v6.1 patch. 

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, 
 YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, 
 YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch, 
 YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch, YARN-18-v6.1.patch, 
 YARN-18-v6.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms related to data locality that were updated to give preference to 
 running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The 
 inner class ScheduledRequests was made a package-level class so it would be 
 easier to create a subclass, ScheduledRequestsWithNodeGroup.
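
As a rough illustration of the subclassing pattern just described, here is a 
minimal sketch; the class names only loosely mirror the proposal, and the 
locality-level hook is hypothetical, not the patch's actual interface:

{code:java}
// Hedged sketch: a base scheduling hook exposing its locality levels, and a
// nodegroup-aware subclass slotting an extra level in. Illustrative only.
public class PluggableLocalityDemo {
  static class ScheduledRequests {
    /** Locality levels tried in order when assigning a container. */
    protected String[] localityLevels() {
      return new String[] {"node-local", "rack-local", "off-switch"};
    }
  }

  static class ScheduledRequestsWithNodeGroup extends ScheduledRequests {
    @Override
    protected String[] localityLevels() {
      // Prefer nodegroup-local placement between node and rack locality.
      return new String[] {"node-local", "nodegroup-local", "rack-local", "off-switch"};
    }
  }

  public static void main(String[] args) {
    ScheduledRequests requests = new ScheduledRequestsWithNodeGroup();
    for (String level : requests.localityLevels()) {
      System.out.println(level);
    }
  }
}
{code}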

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-05-01 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647242#comment-13647242
 ] 

Vinod Kumar Vavilapalli commented on YARN-617:
--

bq. Are you saying the goal is to auth container launches with the am token too?
Yes. All communication with the NM is to be authenticated by the AMToken.

We could keep it separate from startContainer() and stop/getStatus, but we 
want to solve YARN-613 too. Having the authentication via container-token 
forces us to create a connection per container. You must have seen the gory MR 
ContainerLauncher resorting to tricks like creating lots of threads and 
opening and closing connections immediately to avoid hitting ulimits, etc. 
Some of that ugliness will go away if we perform all authentication using 
AMTokens and use ContainerTokens for authorization (a rough sketch of this 
split follows below).

Thanks for the tip on HADOOP-8783/HADOOP-8784.
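
Purely as a sketch of that authentication/authorization split (all names 
below are hypothetical, not YARN's protocol API): the connection is 
authenticated once with an AM-level credential, and each startContainer() 
request then carries a per-container token that is checked only for 
authorization.

{code:java}
// Hedged sketch: one authentication check per AM connection, one
// authorization check per container launch. Illustrative names only.
import java.util.Set;

public class AmNmAuthDemo {
  private final Set<String> knownAmTokens;        // authenticates AM connections
  private final Set<String> knownContainerTokens; // authorizes individual launches

  public AmNmAuthDemo(Set<String> amTokens, Set<String> containerTokens) {
    this.knownAmTokens = amTokens;
    this.knownContainerTokens = containerTokens;
  }

  /** Checked once, when the AM opens its connection to the NM. */
  public void authenticateConnection(String amToken) {
    if (!knownAmTokens.contains(amToken)) {
      throw new SecurityException("AM connection not authenticated");
    }
  }

  /** Checked per request, using the container token for authorization. */
  public void startContainer(String containerToken) {
    if (!knownContainerTokens.contains(containerToken)) {
      throw new SecurityException("container launch not authorized");
    }
    // ... proceed to launch the container ...
  }
}
{code}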

 In unsecure mode, AM can fake resource requirements 
 -

 Key: YARN-617
 URL: https://issues.apache.org/jira/browse/YARN-617
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi
Priority: Minor
 Attachments: YARN-617.20130501.1.patch, YARN-617.20130501.patch


 Without security, it is impossible to completely avoid AMs faking resources. 
 We can at the least make it as difficult as possible by using the same 
 container tokens and the RM-NM shared key mechanism over an unauthenticated 
 RM-NM channel.
 At the minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647249#comment-13647249
 ] 

Hadoop QA commented on YARN-18:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581468/YARN-18-v6.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/857//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/857//console

This message is automatically generated.

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, 
 YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, 
 YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch, 
 YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch, YARN-18-v6.1.patch, 
 YARN-18-v6.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms related to data locality that were updated to give preference to 
 running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The 
 inner class ScheduledRequests was made a package-level class so it would be 
 easier to create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container

2013-05-01 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647263#comment-13647263
 ] 

Vinod Kumar Vavilapalli commented on YARN-613:
--

bq. I just have general concerns with assuming the entire Hadoop environment 
is trusted and thus introducing weaknesses at a global level. E.g., a weakness 
is introduced every time one entity shares a secret to validate a token 
created by another entity. Compromising one of hundreds or thousands of nodes 
shouldn't put the entire cluster at risk.
Agree with you in general. Read on.

bq. If I can gain access to one NM host and its keytab, I believe I can 
secretly launch a malicious NM?
That is true in general, and I am not sure how we can even contain such a 
break-in. I suppose going the way of the DataNode and starting the server on 
privileged ports would contain it [1]. If one can get hold of the keytab 
(owned by the YARN user), I suppose at that point they can launch the 
container-executor binary too, which will give them root access. So it's all 
predicated on the secure setup not doing stupid things.

bq. NMs currently share a global key for container token secrets, but there is 
a JIRA to move to per-NM secrets, so sharing a global AM secret would be 
another step backwards.
Agreed.

bq. Exploring alternate avenues to avoid global trust, is passing the AM token 
allowed to get status and stop the container with the launch request not 
feasible?
Maybe it isn't clear in my proposal, but let me state it again anyway, mostly 
repeating what I just commented on YARN-617:
 - Having the authentication via container-token forces us to create a 
connection per container.
 - MR's ContainerLauncher, for example, resorts to tricks like creating lots 
of threads and opening and closing connections immediately to avoid hitting 
ulimits, etc.
 - Most of that ugliness will go away if we perform all authentication using 
AMTokens for *all* AM-NM APIs and use ContainerTokens for authorization of 
startContainer() requests.

Maybe we should just do [1] above (privileged ports).

To sum it up, I am open to suggestions. My fundamental requirements are:
 - If possible, AMs should open only one connection - a secure one - to each 
NM, not one per container (a minimal sketch of per-NM proxy caching follows 
below).
 - All connections (all APIs) between the AM and NM should be authenticated - 
DIGEST-based at best here - and, if possible, without AMs having to latch on 
to things like ContainerTokens for potentially long periods.
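
A minimal sketch of the one-connection-per-NM idea above, assuming a generic 
proxy type and a caller-supplied factory; neither is YARN's actual 
ContainerManager client API:

{code:java}
// Hedged sketch: cache one client proxy per NM address so an AM reuses a
// single connection per node instead of opening one per container.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

public class PerNodeProxyCache<P> {
  private final ConcurrentMap<String, P> proxies = new ConcurrentHashMap<>();
  private final Function<String, P> factory; // e.g. opens an authenticated RPC

  public PerNodeProxyCache(Function<String, P> factory) {
    this.factory = factory;
  }

  /** Returns the cached proxy for this NM address, creating it on first use. */
  public P get(String nmAddress) {
    return proxies.computeIfAbsent(nmAddress, factory);
  }
}
{code}

Under a scheme like this, start/stop/getStatus calls for every container on a 
node would all go through the same cached, authenticated proxy.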

 Create NM proxy per NM instead of per container
 ---

 Key: YARN-613
 URL: https://issues.apache.org/jira/browse/YARN-613
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Vinod Kumar Vavilapalli

 Currently a new NM proxy has to be created per container, since secure 
 authentication uses a container token from the container.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-629) Make YarnRemoteException not be rooted at IOException

2013-05-01 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647272#comment-13647272
 ] 

Vinod Kumar Vavilapalli commented on YARN-629:
--

bq. YarnRemoteException is not rooted at IOException now. For ugi.doAs(), it 
throws IOException, InterruptedException, UndeclaredThrowableException, etc., 
but does not throw YarnRemoteException. So, we cannot explicitly catch 
YarnRemoteException.
Then why was this code catching YarnRemoteException *before* your patch? Try 
changing client.ping() to throw YarnRemoteException, like we do in real 
protocols.

bq. I think if YarnRemoteException is thrown inside the doAs(), it will be 
wrapped. So calling getCause() will get it back.
Why didn't we need this before, and only now?

bq. Should it be YarnRemoteException?
No, see NMNotYetReadyException and InvalidContainerException (a minimal sketch 
of this subclass pattern follows below).
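
To illustrate, a minimal sketch of that subclass pattern, with stand-in class 
names rather than the real YARN exceptions:

{code:java}
// Hedged sketch: specific failures subclass a remote-exception root, so
// callers catch the subclass while generic code can still catch the root.
public class ExceptionHierarchyDemo {
  static class RemoteRootException extends Exception {
    RemoteRootException(String msg) { super(msg); }
  }
  static class NotYetReadyException extends RemoteRootException {
    NotYetReadyException(String msg) { super(msg); }
  }

  public static void main(String[] args) {
    try {
      throw new NotYetReadyException("NM still starting");
    } catch (NotYetReadyException e) {
      // Readiness-aware callers handle the specific subclass...
      System.out.println("retry later: " + e.getMessage());
    } catch (RemoteRootException e) {
      // ...while generic callers can still catch the root type.
      System.out.println("remote failure: " + e.getMessage());
    }
  }
}
{code}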

 Make YarnRemoteException not be rooted at IOException
 -

 Key: YARN-629
 URL: https://issues.apache.org/jira/browse/YARN-629
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-629.1.patch, YARN-629.2.patch


 After HADOOP-9343, it should be possible for YarnRemoteException to not be 
 rooted at IOException.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira