[ https://issues.apache.org/jira/browse/SPARK-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162254#comment-14162254 ]
Andrew Or commented on SPARK-3797:
----------------------------------
Thanks for detailing the considerations, Sandy. I agree with every one of the
drawbacks you listed.
The alternative of launching the shuffle service inside containers has been
given much thought. However, it would be overkill to allocate one such service
per executor, or even per application. In general, these services are intended
to be long-running local resource managers, which makes them better suited to
run per-node. As you suggested, they tend to have low memory requirements, so
giving each its own container would force it to take up more memory than it
needs.
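For reference, the NM-hosted variant plugs into YARN's AuxiliaryService API, so
a single instance serves every executor on the node. A minimal sketch, assuming
a hypothetical `ShuffleAuxService` class (the lifecycle hooks are YARN's actual
API; the service name and method bodies are illustrative):

```java
import java.nio.ByteBuffer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;

// Illustrative skeleton: the NodeManager creates one instance per node,
// registered in yarn-site.xml via yarn.nodemanager.aux-services (name)
// and yarn.nodemanager.aux-services.<name>.class (this class).
public class ShuffleAuxService extends AuxiliaryService {

  public ShuffleAuxService() {
    super("spark_shuffle"); // the aux-service name advertised to YARN
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    // Start the block server once per node, shared by all executors.
    super.serviceInit(conf);
  }

  @Override
  public void initializeApplication(ApplicationInitializationContext context) {
    // Called when an application's first container starts on this node,
    // e.g. to register the application's shuffle credentials.
  }

  @Override
  public void stopApplication(ApplicationTerminationContext context) {
    // Called when the application finishes: clean up its shuffle files.
  }

  @Override
  public ByteBuffer getMetaData() {
    // Returned to the ApplicationMaster, e.g. the shuffle server's port.
    return ByteBuffer.allocate(0);
  }
}
```

The NM instantiates this once per node, which is exactly the per-node
granularity argued for above.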
On the rolling-upgrades point, we can add logic, as MR does, to handle short
outages, as Tom suggested. The dependency and deployment stories are harder to
work around. The point here is that either way we need to offer an alternative
of running the service independently of the NM, in case the cluster has
conflicting dependencies. Perhaps we'll need some `start-shuffle-service.sh`
script to launch these containers on all nodes before running any actual Spark
application. I should note that our shuffle service is intended to be fairly
lightweight and will have very limited dependencies (e.g. we are considering
building it with Java because we don't want to bundle Scala). Hopefully that
mitigates the issue.
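As a rough sketch of the outage-handling logic mentioned above (assuming a
placeholder `ShuffleClient` interface rather than any real Spark API), the
fetch path could simply retry with a backoff sized to cover an NM restart:

```java
import java.io.IOException;

// Sketch of "tolerate a short NM outage" on the fetch path. ShuffleClient
// and fetchBlock() are placeholders, not real Spark or YARN APIs.
public class RetryingFetcher {
  private static final int MAX_RETRIES = 3;
  private static final long BACKOFF_MS = 5000; // roughly an NM restart window

  interface ShuffleClient {
    byte[] fetchBlock(String blockId) throws IOException;
  }

  public byte[] fetchWithRetry(ShuffleClient client, String blockId)
      throws IOException, InterruptedException {
    IOException lastFailure = null;
    for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
      try {
        return client.fetchBlock(blockId);
      } catch (IOException e) {
        // Likely "connection refused" while the NM (and our service) restarts.
        lastFailure = e;
        if (attempt < MAX_RETRIES) {
          Thread.sleep(BACKOFF_MS);
        }
      }
    }
    throw lastFailure;
  }
}
```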
> Run the shuffle service inside the YARN NodeManager as an AuxiliaryService
> --------------------------------------------------------------------------
>
> Key: SPARK-3797
> URL: https://issues.apache.org/jira/browse/SPARK-3797
> Project: Spark
> Issue Type: Sub-task
> Components: YARN
> Reporter: Patrick Wendell
> Assignee: Andrew Or
>
> It's also worth considering running the shuffle service in a YARN container
> beside the executor(s) on each node.