xkrogen commented on pull request #31936: URL: https://github.com/apache/spark/pull/31936#issuecomment-805023081
Working on addressing test failures, will update the PR soon ... @tgravescs thanks for the insightful comments! > this was tried once before under #26000 so just putting it here for reference. I think the main thing is if we are going to support that we make sure its truly configurable and not just half done. But it sounds like you are trying to address the concerns brought up there but I need to do a more detailed review. Thanks a lot for sharing this, I did not find it in my previous search. I agree that the concerns raised there around supporting different configurations should be addressed by this PR. Just making the shuffle service name configurable was the easy part :) > One thing doing a quick skim I don't see if in YarnShuffleService.java its still registering the AuxiliaryService with the name spark_shuffle, Is that not really used by the yarn auxiliary service and it uses what you configure in the node manager? In the very least that needs to be documented in the code. Great callout. I mentioned this in the JIRA description but it looks like I forgot to include it in the PR description. The name in the NodeManager config (`yarn.nodemanager.aux-services`) is the source-of-truth for the name. The name provided by the service (hard-coded to `spark_shuffle` in our case) is compared to the configured one, and the NodeManager will log a warning if it doesn't match because it indicates you might have a misconfiguration, but then it is never used again. Actually when using the isolated classloader, the hard-coded name is completely ignored, and instead the name of the class is used: https://github.com/apache/hadoop/blob/a89ca56a1b0eb949f56e7c6c5c25fdf87914a02f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxiliaryServiceWithCustomClassLoader.java#L170-L172 I would use the value of `spark.shuffle.service.name` from the configuration instead of hard-coding it, just to keep things clean, but we need the name at the time of class instantiation, and we don't get a `Configuration` object until `serviceInit` is called, at which point it's too late. I will update the code comments accordingly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
