HyukjinKwon edited a comment on issue #25545: [SPARK-28843][PYTHON] Set OMP_NUM_THREADS to executor cores for python
URL: https://github.com/apache/spark/pull/25545#issuecomment-525960367

Some users might want to use fewer cores in executors but more cores in Python workers, although that sounds a bit odd. We don't know what UDFs or RDD operations will run. They might intermittently use more threads during execution and then suddenly hit a performance "regression" after this change, which would be tricky to debug. The change affects _all_ functionality in PySpark: RDD, SQL, and ML. I hope we can stay on the safe side and investigate thoroughly, so the impact is clear before we go ahead.
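To make the concern concrete, here is a minimal sketch of how a default could be resolved so that users who deliberately want more threads in Python workers are not affected. It assumes the default applies only when the user has not set the variable themselves; that behavior, and the helper name `effective_omp_threads`, are illustrative assumptions rather than the PR's actual code. The only real Spark setting referenced is the `spark.executorEnv.[EnvironmentVariableName]` configuration family, which passes environment variables to executors.

```python
def effective_omp_threads(executor_cores, executor_env):
    """Return the OMP_NUM_THREADS value a Python worker would see.

    Hypothetical helper for illustration only; it is not part of PySpark.
    `executor_env` stands in for the variables a user passed through
    spark.executorEnv.* settings.
    """
    explicit = executor_env.get("OMP_NUM_THREADS")
    if explicit is not None:
        # The user opted in to a specific thread count; leave it alone.
        return str(explicit)
    # Otherwise fall back to the number of executor cores (assumed default).
    return str(executor_cores)


# A job with 2 executor cores and no explicit setting gets the default.
default_case = effective_omp_threads(2, {})

# A user who wants more threads in Python workers keeps the explicit value.
override_case = effective_omp_threads(2, {"OMP_NUM_THREADS": "8"})
```

Under this assumption, a user hit by the slowdown described above could restore the previous thread count by setting `spark.executorEnv.OMP_NUM_THREADS` explicitly, but they would first have to trace the regression back to this variable, which is the debugging difficulty the comment raises.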
