John Zhuge created SPARK-42596:
----------------------------------
Summary: [YARN] OMP_NUM_THREADS not set to number of executor cores by default
Key: SPARK-42596
URL: https://issues.apache.org/jira/browse/SPARK-42596
Project: Spark
Issue Type: Bug
Components: PySpark, YARN
Affects Versions: 3.3.2
Reporter: John Zhuge
Run this PySpark script with `spark.executor.cores=1`:
{code:python}
import os

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf

spark = SparkSession.builder.getOrCreate()

var_name = 'OMP_NUM_THREADS'

def get_env_var():
    # Evaluated inside the Python worker, so this reads the executor-side environment.
    return os.getenv(var_name)

udf_get_env_var = udf(get_env_var)
spark.range(1).toDF("id").withColumn(
    f"env_{var_name}", udf_get_env_var()
).show(truncate=False)
{code}
Output with release `3.3.2`:
{noformat}
+---+-----------------------+
|id |env_OMP_NUM_THREADS    |
+---+-----------------------+
|0  |null                   |
+---+-----------------------+
{noformat}
Output with release `3.3.0`:
{noformat}
+---+-----------------------+
|id |env_OMP_NUM_THREADS    |
+---+-----------------------+
|0  |1                      |
+---+-----------------------+
{noformat}
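The expected default implied by the summary and by the `3.3.0` output can be modeled roughly as below. This is only a sketch, not Spark's actual implementation: the function name `expected_omp_num_threads` and the plain-dict representation of the Spark conf and executor environment are hypothetical, and the rule that an explicitly set `OMP_NUM_THREADS` takes precedence is an assumption read from "by default".

```python
from typing import Mapping, Optional


def expected_omp_num_threads(
    conf: Mapping[str, str],
    executor_env: Mapping[str, str],
) -> Optional[str]:
    """Hypothetical model of the expected OMP_NUM_THREADS seen by a Python worker.

    `conf` stands in for the Spark configuration and `executor_env` for any
    environment variables explicitly set for executors; both are plain dicts
    here purely for illustration.
    """
    # Assumption: an explicitly provided value is left untouched.
    if "OMP_NUM_THREADS" in executor_env:
        return executor_env["OMP_NUM_THREADS"]
    # Expected default per this report: the number of executor cores.
    # Returns None when spark.executor.cores is not configured.
    return conf.get("spark.executor.cores")


print(expected_omp_num_threads({"spark.executor.cores": "1"}, {}))
```

With `spark.executor.cores=1` and no explicit override, this model yields `1`, which matches the `3.3.0` output above; `3.3.2` returns `null` for the same configuration.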