tgravescs commented on pull request #32804:
URL: https://github.com/apache/spark/pull/32804#issuecomment-866006388
Thanks for working on this, it looks very interesting.
> Add LocalityPreferredSchedulingRequestContainerPlacementStrategy for compute locality for SchedulingRequest
Can you add more description about this? This is a lot of changes and not
what I expected from the description. I was expecting us just to pass the
node attributes along to YARN, but this is much more than that, so please
describe in detail how it works. How exactly does this interact with
constraints vs. data locality? Is there some sort of timed wait or error
handling if the constraint never becomes satisfiable or the cluster doesn't
have a node with that attribute? You had about 4 examples of things you
could do, but are there more? What dependencies does it need? How does an
attribute allow you to get cardinality or affinity?
Are there any constraints on the Hadoop version? (I thought node attributes
were only introduced in 3.2.0.)
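To make my questions concrete, here is roughly what I'd expect the
SchedulingRequest side to look like. This is just a minimal sketch against
the Hadoop 3.2 node-attribute/placement-constraint APIs, not necessarily
what this PR does; the attribute name and value ("gpu-type" / "a100") and
the allocation tag are made up for illustration. It's also why I'm asking
about the Hadoop version, since these APIs don't exist in older releases:

```scala
import java.util.Collections

import org.apache.hadoop.yarn.api.records.{ExecutionType, ExecutionTypeRequest,
  NodeAttributeOpCode, Priority, Resource, ResourceSizing, SchedulingRequest}
import org.apache.hadoop.yarn.api.resource.PlacementConstraints
import org.apache.hadoop.yarn.api.resource.PlacementConstraints.PlacementTargets

object NodeAttributeRequestSketch {
  // Hard constraint: only place containers on nodes whose attribute
  // "gpu-type" equals "a100" (attribute name/value are hypothetical).
  val constraint = PlacementConstraints.build(
    PlacementConstraints.targetNodeAttribute(
      PlacementConstraints.NODE,
      NodeAttributeOpCode.EQ,
      PlacementTargets.nodeAttribute("gpu-type", "a100")))

  // A SchedulingRequest carrying the constraint. Unlike a plain
  // ResourceRequest there is no host/rack list here, which is why I'm
  // asking how this composes with data locality preferences.
  val request = SchedulingRequest.newBuilder()
    .priority(Priority.newInstance(1))
    .allocationRequestId(0L)
    .allocationTags(Collections.singleton("spark-executor"))
    .executionType(ExecutionTypeRequest.newInstance(ExecutionType.GUARANTEED))
    .resourceSizing(ResourceSizing.newInstance(1, Resource.newInstance(4096, 2)))
    .placementConstraintExpression(constraint)
    .build()
}
```

If what the PR builds differs from this, that's exactly the kind of detail
the description should spell out.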
Eventually the documentation .md file would need to be updated too.