jzhuge opened a new pull request, #51948:
URL: https://github.com/apache/spark/pull/51948

   ### What changes were proposed in this pull request?
   
   When starting the Spark driver and executors on a YARN cluster, the JVM discovers all CPU cores on the host and sizes its default thread pools and GC threads from that value. We should instead cap the processor count the JVM sees at the number of cores the user configured (spark.driver.cores or spark.executor.cores) via `-XX:ActiveProcessorCount`, a JVM flag introduced in Java 8u191.
   
   This matters especially when running Spark on YARN inside a Kubernetes container, where the JVM sometimes discovers only 1 CPU core, so the default thread pools and GC are stuck at a single thread regardless of how many cores were actually allocated.
   
   ### Why are the changes needed?
   
   Without the change, the number of available processors the JVM discovers when running Spark on YARN is not correct: the user has already assigned the driver and executors a number of cores to use, and we should honor that. A simple way to observe the value the JVM sees is this Java call:
   
   ```java
   Runtime.getRuntime().availableProcessors()
   ```
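   
   For example, a quick check from spark-shell (illustrative; assumes a running application on YARN) that reports what each executor JVM sees:
   
   ```scala
   // Runs one task per default-parallelism slot and prints the processor
   // count each executor JVM observes; with -XX:ActiveProcessorCount applied
   // this should match spark.executor.cores, not the host's full core count.
   sc.parallelize(1 to sc.defaultParallelism)
     .map(_ => Runtime.getRuntime().availableProcessors())
     .collect()
     .foreach(println)
   ```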
   
   ### Does this PR introduce any user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   New unit tests.

