xkrogen opened a new pull request #32810:
URL: https://github.com/apache/spark/pull/32810


   ### What changes were proposed in this pull request?
   Pass the list of user JARs (the `userClassPath`) to executors 
(`CoarseGrainedExecutorBackend`) using a new internal configuration, 
`spark.executor.userClassPath.entries`. Executors honor this config under every 
resource manager, but only the YARN `Client` sets it. This is consistent with 
the existing behavior, where only YARN drivers pass the list of user JARs using 
the `--user-class-path` option. The configurations are written to a file and 
distributed via the YARN distributed cache, which bypasses the command line and 
scales to much longer JAR lists.
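
As a rough illustration of the file-based approach, the sketch below writes the JAR list under the new config key to a file and reads it back. Only the key name comes from this PR; the comma-joined encoding, the `key=value` file layout, and the helper names `write_conf` and `read_user_class_path` are assumptions made for the example, not the format Spark actually uses.

```python
import os
import tempfile

# Config key named in this PR description.
CONF_KEY = "spark.executor.userClassPath.entries"


def write_conf(path, jar_paths):
    """Write the JAR list as one key=value line.

    Assumption: entries are comma-joined; the PR does not describe the encoding,
    and paths containing commas would need escaping in a real implementation.
    """
    with open(path, "w", encoding="utf-8") as f:
        f.write(f"{CONF_KEY}={','.join(jar_paths)}\n")


def read_user_class_path(path):
    """Return the JAR list stored under CONF_KEY, or an empty list if absent."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, sep, value = line.rstrip("\n").partition("=")
            if sep and key == CONF_KEY:
                return [p for p in value.split(",") if p]
    return []


if __name__ == "__main__":
    # Hypothetical JAR paths, for illustration only.
    jars = [f"/path/to/lib/dep-{i}.jar" for i in range(3)]
    with tempfile.TemporaryDirectory() as tmp:
        conf_file = os.path.join(tmp, "executor.conf")
        write_conf(conf_file, jars)
        print(read_user_class_path(conf_file))
```

Because the list travels inside a file, its size is limited by the file system and not by the length of the executor command line.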
   
   ### Why are the changes needed?
   User-provided JARs are made available to executors through a custom 
classloader, so they do not appear on the standard Java classpath. Instead, 
they are passed as a list to the executor, which then creates a classloader from 
the URLs. Currently, on YARN, this list of JARs is built by the driver (in 
`ExecutorRunnable`), which passes it to the executors 
(`CoarseGrainedExecutorBackend`) by specifying each JAR on the executor command 
line as `--user-class-path /path/to/myjar.jar`. With many JARs, the argument 
list becomes extremely long and can exceed the OS limit on argument length, 
which typically manifests as the error message:
   
   > /bin/bash: Argument list too long
   
   A [Google 
search](https://www.google.com/search?q=spark%20%22%2Fbin%2Fbash%3A%20argument%20list%20too%20long%22&oq=spark%20%22%2Fbin%2Fbash%3A%20argument%20list%20too%20long%22)
 indicates that this is not a theoretical problem and afflicts real users, 
including ours. Passing this list using the configurations instead resolves 
this issue.
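
To get a feel for how quickly the command line grows, the following sketch builds the repeated `--user-class-path <jar>` arguments and compares their approximate size with the limit reported by the OS. The helper names and the generated JAR paths are hypothetical. `SC_ARG_MAX` covers arguments and environment combined, and some systems apply additional per-argument limits, so treat the comparison as a rough estimate only.

```python
import os


def user_class_path_args(jar_paths):
    """Build one `--user-class-path <jar>` pair per JAR."""
    args = []
    for path in jar_paths:
        args.extend(["--user-class-path", path])
    return args


def argv_bytes(args):
    """Approximate argument size: each string's bytes plus a NUL terminator."""
    return sum(len(a.encode("utf-8")) + 1 for a in args)


def arg_max():
    """Return the OS limit on argument + environment size, or None if unknown."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        return None


if __name__ == "__main__":
    # Hypothetical JAR paths, for illustration only.
    jars = [f"/path/to/lib/dependency-{i:05d}.jar" for i in range(20000)]
    used = argv_bytes(user_class_path_args(jars))
    limit = arg_max()
    print(f"user class path arguments: {used} bytes")
    if limit is None:
        print("OS argument limit is not available on this platform")
    else:
        print(f"OS limit (SC_ARG_MAX): {limit} bytes")
```

Each JAR contributes its path length plus the fixed 18-byte cost of the option name and its terminator, so the total grows linearly with the number of JARs.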
   
   ### Does this PR introduce _any_ user-facing change?
   No, other than fixing the bug so that larger JAR lists can be passed 
successfully. JARs are configured exactly as before.
   
   ### How was this patch tested?
   New unit tests were added in `YarnClusterSuite`. Also, we have been running 
this fix internally for 4 months with great success.




