squito commented on issue #23951: [SPARK-13704][CORE][YARN] Re-implement 
RackResolver to reduce resolving time
URL: https://github.com/apache/spark/pull/23951#issuecomment-474023289
 
 
   wrt `AMRMClient` -- I don't think we should make any change related to it as 
part of this PR.  But I was thinking that we could:
   
   1) Since `ContainerRequest` explicitly says it's going to add all racks for 
the hosts in the request:
   
   
https://github.com/apache/hadoop/blob/6fa229891e06eea62cb9634efde755f40247e816/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AMRMClient.java#L136-L139
   
   Spark should probably not do the same thing itself:
   
   
https://github.com/apache/spark/blob/7043aee1ba95e92e1cbd0ebafcc5b09b69ee3082/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/LocalityPreferredContainerPlacementStrategy.scala#L141-L143
   
   2a) YARN could add another API that does *not* add racks for 
hosts in the list
   2b) Spark could then send the rack from the scheduler to the 
YarnAllocator and use the new API, avoiding another set of rack 
lookups in the AM.
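
   To make the difference concrete, here is a rough, self-contained sketch of the two paths. It does not use the real YARN or Spark classes: `RackRequestSketch`, `resolveRack`, `racksInferredFromHosts` and `racksSuppliedByCaller` are all hypothetical names made up for illustration, and the topology map stands in for whatever the real resolver consults. The only point is that passing racks the scheduler already knows avoids a second round of lookups.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RackRequestSketch {
    // Counts calls to the stand-in resolver so the two paths can be compared.
    public static int lookups = 0;

    // Hypothetical stand-in for a rack resolver (NOT Hadoop's RackResolver).
    public static String resolveRack(Map<String, String> topology, String host) {
        lookups++;
        return topology.getOrDefault(host, "/default-rack");
    }

    // Models the behaviour described in (1): racks are derived from the hosts
    // when the request is built, costing one lookup per host.
    public static Set<String> racksInferredFromHosts(Map<String, String> topology,
                                                     List<String> hosts) {
        Set<String> racks = new LinkedHashSet<>();
        for (String host : hosts) {
            racks.add(resolveRack(topology, host));
        }
        return racks;
    }

    // Models the idea in (2a)/(2b): the caller hands over racks it already
    // knows (assumed here to be a host -> rack map), so no resolver calls happen.
    public static Set<String> racksSuppliedByCaller(List<String> hosts,
                                                    Map<String, String> knownRacks) {
        Set<String> racks = new LinkedHashSet<>();
        for (String host : hosts) {
            String rack = knownRacks.get(host);
            if (rack != null) {
                racks.add(rack);
            }
        }
        return racks;
    }

    public static void main(String[] args) {
        Map<String, String> topology = Map.of("h1", "/r1", "h2", "/r1", "h3", "/r2");
        List<String> hosts = List.of("h1", "h2", "h3");

        System.out.println(racksInferredFromHosts(topology, hosts) + " lookups=" + lookups);

        lookups = 0;
        System.out.println(racksSuppliedByCaller(hosts, topology) + " lookups=" + lookups);
    }
}
```

   Both paths end up with the same set of racks; the second one just never touches the resolver, which is the saving (2b) is after.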
