[jira] [Commented] (YARN-435) Make it easier to access cluster topology information in an AM

2014-04-14 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13968558#comment-13968558
 ] 

Bikas Saha commented on YARN-435:
-

Pasting the description from YARN-1722, which was closed as a dup of this.
{code}There is no way for an AM to find out the names of all the nodes in the 
cluster via the AMRMProtocol. An AM can at best ask for containers at the * 
location. The only way to get that information is via the ClientRMProtocol, but 
that is secured by Kerberos or RMDelegationToken while the AM has an AMRMToken. 
This is a pretty important piece of missing functionality. There are other 
jiras opened about getting cluster topology etc., but they haven't been 
addressed, perhaps for lack of a clear definition of cluster topology. Adding a 
means to at least get the node information would be a good first step.{code}
This jira may have stalled in trying to figure out how to lay out the topology. 
YARN-1722 simply asks for the list of nodes in the cluster. While defining a 
way to generically describe topology may be tricky, all such methods must list 
all the nodes in the cluster, so YARN-1722 is a much simpler problem. Whatever 
object we define for topology can start with a simple list of nodes and then 
use the integer id of each node in the list (for compaction) to reference it 
in the other objects describing the hierarchy.
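
The idea above (a flat node list, with the hierarchy referring to nodes by 
their integer position in that list) could look roughly like the sketch below. 
The class and method names (ClusterTopologySketch, addNode, assignToRack, 
nodesInRack) are made up for illustration and are not part of any YARN API; 
racks are used as the only level of hierarchy purely as an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical topology object: a flat node list plus groups that refer to nodes by index. */
class ClusterTopologySketch {
    // Position in this list is the node's integer id.
    private final List<String> nodes = new ArrayList<>();
    private final Map<String, Integer> idByName = new LinkedHashMap<>();
    // Hierarchy objects store only integer ids, never host names.
    private final Map<String, List<Integer>> nodeIdsByRack = new LinkedHashMap<>();

    /** Adds a node if absent and returns its integer id (its position in the node list). */
    int addNode(String hostName) {
        Integer existing = idByName.get(hostName);
        if (existing != null) {
            return existing;
        }
        int id = nodes.size();
        nodes.add(hostName);
        idByName.put(hostName, id);
        return id;
    }

    /** Records that a node belongs to a rack, storing only the node's integer id. */
    void assignToRack(String rack, String hostName) {
        int id = addNode(hostName);
        List<Integer> ids = nodeIdsByRack.get(rack);
        if (ids == null) {
            ids = new ArrayList<>();
            nodeIdsByRack.put(rack, ids);
        }
        if (!ids.contains(id)) {
            ids.add(id);
        }
    }

    /** The plain list of nodes, which is all that YARN-1722 asks for. */
    List<String> allNodes() {
        return Collections.unmodifiableList(nodes);
    }

    /** Resolves a rack's stored ids back to host names; empty if the rack is unknown. */
    List<String> nodesInRack(String rack) {
        List<String> out = new ArrayList<>();
        List<Integer> ids = nodeIdsByRack.get(rack);
        if (ids != null) {
            for (int id : ids) {
                out.add(nodes.get(id));
            }
        }
        return out;
    }
}
```

With this shape, a consumer that only needs the node list reads allNodes() and 
ignores the rest, while richer hierarchy objects can be added later without 
repeating host names.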

 Make it easier to access cluster topology information in an AM
 --

 Key: YARN-435
 URL: https://issues.apache.org/jira/browse/YARN-435
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Omkar Vinit Joshi

 ClientRMProtocol exposes a getClusterNodes api that provides a report on all 
 nodes in the cluster including their rack information. 
 However, this requires the AM to open and establish a separate connection to 
 the RM in addition to one for the AMRMProtocol. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-435) Make it easier to access cluster topology information in an AM

2013-08-26 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750755#comment-13750755
 ] 

Thomas Weise commented on YARN-435:
---

For DataTorrent, we require the node report to implement locality constraints 
for streaming. We need the initial list (currently only available through 
ClientRMProtocol), and then use incremental updates from AMResponse.
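
The pattern described here (one full node list at startup, then incremental 
updates) might be kept on the AM side roughly as follows. This is only a 
sketch: NodeViewSketch and its Report class are hypothetical stand-ins, not the 
YARN NodeReport type, and the "usable" flag is an assumed simplification of 
whatever node state the real reports carry.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical AM-side cluster view: one full snapshot, then incremental updates. */
class NodeViewSketch {
    /** Simplified stand-in for a node report; NOT the YARN NodeReport class. */
    static final class Report {
        final String host;
        final String rack;
        final boolean usable;

        Report(String host, String rack, boolean usable) {
            this.host = host;
            this.rack = rack;
            this.usable = usable;
        }
    }

    private final Map<String, Report> usableByHost = new TreeMap<>();

    /** Replaces the view with the full list fetched once at startup. */
    void loadInitial(Iterable<Report> fullList) {
        usableByHost.clear();
        applyUpdates(fullList);
    }

    /** Applies changed reports: usable nodes are added or refreshed, others are dropped. */
    void applyUpdates(Iterable<Report> changed) {
        for (Report r : changed) {
            if (r.usable) {
                usableByHost.put(r.host, r);
            } else {
                usableByHost.remove(r.host);
            }
        }
    }

    /** Returns the rack of a usable host, or null if the host is unknown or unusable. */
    String rackOf(String host) {
        Report r = usableByHost.get(host);
        return r == null ? null : r.rack;
    }

    int usableCount() {
        return usableByHost.size();
    }
}
```

The AM would call loadInitial once with the full report and applyUpdates on 
each heartbeat response that carries changed nodes.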


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-435) Make it easier to access cluster topology information in an AM

2013-08-13 Thread shenhong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13738647#comment-13738647
 ] 

shenhong commented on YARN-435:
---

Firstly, if every AM gets all nodes in the cluster, including their rack 
information, by calling the RM, this will increase pressure on the RM's 
network, for example when the cluster has more than 5000 datanodes.

Secondly, if the YARN cluster has only 100 nodemanagers but the HDFS it 
accesses is a cluster with more than 5000 datanodes, the RM cannot return all 
of those nodes and their rack information. However, the AM needs the 
information for every datanode listed in its job.splitmetainfo file in order 
to init TaskAttempts. In this case, we can't get all nodes by calling the RM.



[jira] [Commented] (YARN-435) Make it easier to access cluster topology information in an AM

2013-05-29 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669570#comment-13669570
 ] 

Omkar Vinit Joshi commented on YARN-435:


After discussion with [~vinodkv] yesterday, here are the possible solutions to 
this:
* At the time of application master registration, issue a 
ClientDelegationToken along with the AMToken, or maybe issue it on request in 
AMRMProtocol.
** Advantages:
*** AM doesn't need to request a token to get additional information (no 
Kerberos authentication for the secure case).
** Disadvantages:
*** AM has to open 2 connections and manage 2 tokens (AMToken and 
ClientDelegationToken).
*** AM can now do all the activities which earlier only the client was allowed 
to do (getAllApps, forceKillApp, submitApp).

* Create a new interface, something like ClusterInfo, and add ClusterNodes, 
ClusterMetrics and similar info to it. Let ClientRMProtocol and AMRMProtocol 
extend this.
** Advantages:
*** AM has to create only one connection.
*** AM doesn't get a ClientDelegationToken, either by default or on request 
(for secure env.). So AM is not allowed to do app activities on 
ClientRMProtocol (submitApp, killApp, getAppStatus) unless the client itself 
shares this token with the AM.
*** Connection management will be very simple for the AM, and it will get all 
required info.
** Disadvantages:
*** AMRMProtocol will get modified. As the AM-RM heartbeat is mandatory for all 
active AMs, adding this will add burden on ApplicationMasterService.

* Allow ClientRMProtocol to accept AMToken too, along with 
ClientDelegationToken, so that the AM can communicate with the RM on 
ClientRMProtocol as well. 
** Advantages:
*** No need to share/create a ClientDelegationToken.
** Disadvantages:
*** Same as issuing a ClientDelegationToken.
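
To make the second option concrete, here is a rough sketch of a shared 
read-only interface that both protocols extend. Only the name ClusterInfo comes 
from the comment above; ClientProtocolSketch, AmProtocolSketch, RmSketch and 
all method names are hypothetical and heavily simplified (plain strings instead 
of real request/response records).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical read-only cluster queries shared by both protocols. */
interface ClusterInfo {
    List<String> getClusterNodes();
}

/** Client-facing protocol sketch: cluster info plus application control. */
interface ClientProtocolSketch extends ClusterInfo {
    void killApplication(String appId);
}

/** AM-facing protocol sketch: cluster info plus heartbeat, no application control. */
interface AmProtocolSketch extends ClusterInfo {
    void heartbeat();
}

/** One server object can back both views; each caller sees only its own interface. */
class RmSketch implements ClientProtocolSketch, AmProtocolSketch {
    private final List<String> nodes;
    private int heartbeats;

    RmSketch(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    @Override
    public List<String> getClusterNodes() {
        return Collections.unmodifiableList(nodes);
    }

    @Override
    public void killApplication(String appId) {
        // Client-only operation; intentionally left empty in this sketch.
    }

    @Override
    public void heartbeat() {
        heartbeats++;
    }

    int heartbeatCount() {
        return heartbeats;
    }
}
```

An AM holding only an AmProtocolSketch reference can list nodes over its 
existing connection but has no way to reach killApplication, which is the 
separation the advantages above rely on.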

 Make it easier to access cluster topology information in an AM
 --

 Key: YARN-435
 URL: https://issues.apache.org/jira/browse/YARN-435
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Hitesh Shah

 ClientRMProtocol exposes a getClusterNodes api that provides a report on all 
 nodes in the cluster including their rack information. 
 However, this requires the AM to open and establish a separate connection to 
 the RM in addition to one for the AMRMProtocol. 



[jira] [Commented] (YARN-435) Make it easier to access cluster topology information in an AM

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642450#comment-13642450
 ] 

Vinod Kumar Vavilapalli commented on YARN-435:
--

I think that opening a new connection isn't really a problem. I propose:
 - Keeping the protocols separate (ClientRMProtocol and AMRMProtocol) and not 
duplicating any APIs
 - Expecting AMs to always open two connections. The AMRMClient library can do 
this for Java users
 - For secure mode, making the ClientRMProtocol also accept AMTokens. Once we 
do that, any AM, even in secure mode, can talk to the RM on both protocols. 
After this and YARN-613, AMToken becomes a single sign-on token: with an 
AMToken, an AM can talk to the ResourceManager on ClientRMProtocol and 
AMRMProtocol and also to the NodeManager on ContainerManagerProtocol. 
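
The "single sign-on" outcome of this proposal can be pictured as a small 
acceptance table. The sketch below is purely illustrative: the enum and class 
names are invented, real YARN token checks do not work through such a table, 
and the ContainerManagerProtocol entry simply assumes the state described above 
for after YARN-613.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical table of which token kind each protocol accepts under the proposal. */
class TokenAcceptanceSketch {
    enum Protocol { CLIENT_RM, AM_RM, CONTAINER_MANAGER }
    enum TokenKind { AM_TOKEN, CLIENT_DELEGATION_TOKEN }

    private final Map<Protocol, Set<TokenKind>> accepted = new EnumMap<>(Protocol.class);

    private TokenAcceptanceSketch() {
        for (Protocol p : Protocol.values()) {
            accepted.put(p, EnumSet.noneOf(TokenKind.class));
        }
    }

    /** Builds the table as the proposal describes it. */
    static TokenAcceptanceSketch proposed() {
        TokenAcceptanceSketch t = new TokenAcceptanceSketch();
        t.accepted.get(Protocol.CLIENT_RM).add(TokenKind.CLIENT_DELEGATION_TOKEN);
        t.accepted.get(Protocol.CLIENT_RM).add(TokenKind.AM_TOKEN); // the proposed change
        t.accepted.get(Protocol.AM_RM).add(TokenKind.AM_TOKEN);
        t.accepted.get(Protocol.CONTAINER_MANAGER).add(TokenKind.AM_TOKEN); // assumed, after YARN-613
        return t;
    }

    boolean accepts(Protocol protocol, TokenKind kind) {
        return accepted.get(protocol).contains(kind);
    }

    /** True if the given token kind is accepted on every protocol in the table. */
    boolean isSingleSignOn(TokenKind kind) {
        for (Protocol p : Protocol.values()) {
            if (!accepts(p, kind)) {
                return false;
            }
        }
        return true;
    }
}
```

In this table only the AMToken is accepted everywhere, which is what makes it 
a single sign-on token for the AM.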
