shanyu opened a new pull request #27781: [SPARK-31028] Add 
"-XX:ActiveProcessorCount" to Spark driver and executor in Yarn mode
URL: https://github.com/apache/spark/pull/27781
 
 
   ### What changes were proposed in this pull request?
   When the Spark driver and executors start on a Yarn cluster, the JVM process 
can discover all CPU cores on the system and sizes its thread pools and GC 
threads based on that value. We should limit the number of cores the JVM sees 
to the value set by the user (spark.driver.cores or spark.executor.cores) by 
passing "-XX:ActiveProcessorCount", which was introduced in Java 8u191.
   
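A minimal sketch of the idea, not the code in this PR: the configured core count is appended to the JVM options as an "-XX:ActiveProcessorCount" flag. The class name, the method `withActiveProcessorCount`, and the sample option "-Xmx2g" are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ActiveProcessorCountOption {
    // Hypothetical helper: returns a copy of the given JVM options with
    // -XX:ActiveProcessorCount=<cores> appended. The cores argument stands in
    // for spark.driver.cores or spark.executor.cores.
    public static List<String> withActiveProcessorCount(List<String> javaOpts, int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1, got " + cores);
        }
        List<String> result = new ArrayList<>(javaOpts);
        result.add("-XX:ActiveProcessorCount=" + cores);
        return result;
    }

    public static void main(String[] args) {
        // Assumed example values: one existing option and 4 configured cores.
        List<String> opts = withActiveProcessorCount(Arrays.asList("-Xmx2g"), 4);
        System.out.println(String.join(" ", opts));
    }
}
```

How the real launch command is assembled in Spark's Yarn client is not shown here; the sketch only illustrates the shape of the flag.
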
   This matters especially when running Spark on Yarn inside a Kubernetes 
container, where the number of CPU cores discovered is sometimes 1, which means 
the JVM always uses a single thread for the default thread pool and for GC.
   
   ### Why are the changes needed?
   Without the change, when running Spark on Yarn, the number of available 
processors discovered by the JVM is not correct. The user has assigned a number 
of cores to the driver and executors, and we should honor that. A simple test 
is to run this Java code:
       Runtime.getRuntime().availableProcessors()
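
For a self-contained check, the call above can be wrapped in a small program and run with and without the flag. This is a sketch for manual verification only; the class name `VisibleProcessors` is hypothetical and not part of the PR.

```java
public class VisibleProcessors {
    // Returns the processor count the JVM reports, which is the value
    // thread pools and GC sizing typically start from.
    public static int visibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors = " + visibleProcessors());
    }
}
```

On a JVM that supports the flag (Java 8u191 or later), running `java -XX:ActiveProcessorCount=2 VisibleProcessors` should report 2, while running it without the flag reports whatever the JVM detects on the host or container.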
   
   ### Does this PR introduce any user-facing change?
   No
   
   ### How was this patch tested?
   It is a simple change to the JVM start command, verified manually.
   

