xkrogen opened a new pull request #33090:
URL: https://github.com/apache/spark/pull/33090


   ### What changes were proposed in this pull request?
   Refactor the logic for constructing the user classpath from 
`yarn.ApplicationMaster` into `yarn.Client` so that it can be leveraged on the 
executor side as well, instead of having the driver construct it and pass it to 
the executor via command-line arguments. A new method, `getUserClassPath`, is 
added to `CoarseGrainedExecutorBackend` which defaults to `Nil` (consistent 
with the existing behavior where non-YARN resource managers do not configure 
the user classpath). `YarnCoarseGrainedExecutorBackend` overrides this to 
construct the user classpath from the existing `APP_JAR` and `SECONDARY_JARS` 
configs.
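
   A minimal sketch of the override pattern described above, written in plain Java rather than Spark's Scala. The class names `ExecutorBackend` and `YarnExecutorBackend` and the two config keys are hypothetical stand-ins for illustration, not the names used in the patch; the real implementation reads Spark's `APP_JAR` and `SECONDARY_JARS` config entries.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UserClassPathSketch {
    // Hypothetical keys standing in for the APP_JAR and SECONDARY_JARS configs.
    static final String APP_JAR_KEY = "example.app.jar";
    static final String SECONDARY_JARS_KEY = "example.secondary.jars";

    /** Base backend: no user classpath unless a subclass supplies one. */
    static class ExecutorBackend {
        List<URL> getUserClassPath(Map<String, String> conf) {
            return Collections.emptyList();
        }
    }

    /** YARN-style backend: derives the user classpath from configuration. */
    static class YarnExecutorBackend extends ExecutorBackend {
        @Override
        List<URL> getUserClassPath(Map<String, String> conf) {
            List<URL> urls = new ArrayList<>();
            String appJar = conf.get(APP_JAR_KEY);
            if (appJar != null && !appJar.isEmpty()) {
                urls.add(toUrl(appJar));
            }
            String secondary = conf.get(SECONDARY_JARS_KEY);
            if (secondary != null && !secondary.isEmpty()) {
                // Secondary JARs are assumed here to be a comma-separated list.
                for (String jar : secondary.split(",")) {
                    String trimmed = jar.trim();
                    if (!trimmed.isEmpty()) {
                        urls.add(toUrl(trimmed));
                    }
                }
            }
            return urls;
        }
    }

    /** Converts a local file path to a file: URL. */
    static URL toUrl(String path) {
        try {
            return Paths.get(path).toUri().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad JAR path: " + path, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(APP_JAR_KEY, "/tmp/app.jar");
        conf.put(SECONDARY_JARS_KEY, "/tmp/a.jar,/tmp/b.jar");
        // The base backend ignores the config; the YARN-style one reads it.
        System.out.println(new ExecutorBackend().getUserClassPath(conf).size());
        System.out.println(new YarnExecutorBackend().getUserClassPath(conf).size());
    }
}
```

   Because the executor derives the list from configuration it already receives, nothing about the JARs has to appear on its command line.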
   
   ### Why are the changes needed?
   User-provided JARs are made available to executors using a custom 
classloader, so they do not appear on the standard Java classpath. Instead, 
they are passed as a list to the executor which then creates a classloader out 
of the URLs. Currently in the case of YARN, this list of JARs is crafted by the 
Driver (in `ExecutorRunnable`), which then passes the information to the 
executors (`CoarseGrainedExecutorBackend`) by specifying each JAR on the 
executor command line as `--user-class-path /path/to/myjar.jar`. With many JARs, 
the argument list can become long enough to exceed the OS argument length 
limit, typically manifesting as the error message:
   
   > /bin/bash: Argument list too long
   
   A [Google 
search](https://www.google.com/search?q=spark%20%22%2Fbin%2Fbash%3A%20argument%20list%20too%20long%22&oq=spark%20%22%2Fbin%2Fbash%3A%20argument%20list%20too%20long%22)
 indicates that this is not a theoretical problem and afflicts real users, 
including ours. Passing this list through the Spark configuration instead of 
the command line resolves this issue.
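
   To make the mechanism concrete, the sketch below (plain Java, with hypothetical helper names `buildLoader` and `argumentLength`) shows two things: how a classloader can be created from a list of JAR URLs, and how the command-line cost of the per-JAR flag grows linearly with the number of JARs. It is an illustration of the idea under those assumptions, not the code in Spark.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class UserJarLoaderSketch {
    static final String FLAG = "--user-class-path";

    /** Builds a classloader over the given JAR paths (files need not exist yet). */
    static URLClassLoader buildLoader(List<String> jarPaths) throws Exception {
        List<URL> urls = new ArrayList<>();
        for (String p : jarPaths) {
            urls.add(Paths.get(p).toUri().toURL());
        }
        return new URLClassLoader(
            urls.toArray(new URL[0]),
            UserJarLoaderSketch.class.getClassLoader());
    }

    /** Characters added to a command line when each JAR gets its own flag. */
    static int argumentLength(List<String> jarPaths) {
        int total = 0;
        for (String p : jarPaths) {
            // flag + space + path + one separating space
            total += FLAG.length() + 1 + p.length() + 1;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        List<String> jars = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            jars.add("/path/to/myjar" + i + ".jar");
        }
        try (URLClassLoader loader = buildLoader(jars)) {
            System.out.println("URLs on loader: " + loader.getURLs().length);
        }
        System.out.println("Extra command-line characters: " + argumentLength(jars));
    }
}
```

   The classloader itself is unaffected by how the list arrives; only the transport (command line versus configuration) determines whether the OS limit comes into play.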
   
   ### Does this PR introduce _any_ user-facing change?
   No, other than fixing the bug so that larger JAR lists can be passed 
successfully. Configuration of JARs is identical to before.
   
   ### How was this patch tested?
   New unit tests were added in `YarnClusterSuite`. Also, we have been running 
a similar fix internally for 4 months with great success.
   
   Note that this is a backport of #32810 with minor conflicts around imports.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


