[ 
https://issues.apache.org/jira/browse/SPARK-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123167#comment-14123167
 ] 

Kousuke Saruta commented on SPARK-3399:
---------------------------------------

Some PySpark tests, for instance those in rdd.py, use NamedTemporaryFile to 
create input data.
NamedTemporaryFile creates a temporary file under /tmp on the local filesystem.
rdd.py is run through the pyspark script from python/run-tests.
If the environment variable HADOOP_CONF_DIR or YARN_CONF_DIR is set in 
spark-env.sh before testing, the pyspark command picks up those values.
After loading those values, Spark expects the input data to be on the 
filesystem configured through those variables, not on the local filesystem.
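
To illustrate the mismatch: a path without a scheme is resolved against the default filesystem from the Hadoop configuration, so a bare /tmp/... path can end up pointing at the cluster filesystem. The sketch below only shows where NamedTemporaryFile puts the file and how an explicit file: URI could be built for it. The helper name local_uri is hypothetical, and passing such a URI to Spark is an assumed workaround, not something the tests currently do.

```python
import os
import tempfile
from pathlib import Path


def local_uri(path):
    """Return a file: URI for a local path (hypothetical helper)."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(path).resolve().as_uri()


# NamedTemporaryFile creates the file in the local temp directory
# (commonly /tmp), regardless of any Hadoop configuration.
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as tmp:
    tmp.write("Hello World!\n")

uri = local_uri(tmp.name)
print(tmp.name)  # bare local path, no scheme
print(uri)       # same file with an explicit file: scheme

os.unlink(tmp.name)  # clean up the temporary file
```

With an explicit scheme the path would no longer depend on the default filesystem, but that would mean touching every test that creates temporary input.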


> Test for PySpark should ignore HADOOP_CONF_DIR and YARN_CONF_DIR
> ----------------------------------------------------------------
>
>                 Key: SPARK-3399
>                 URL: https://issues.apache.org/jira/browse/SPARK-3399
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>            Reporter: Kousuke Saruta
>
> Some tests for PySpark create temporary files under /tmp on the local file 
> system, but if the environment variable HADOOP_CONF_DIR or YARN_CONF_DIR is 
> set in spark-env.sh, the tests expect the temporary files to be on the 
> FileSystem configured in core-site.xml, even though the actual files are on 
> the local file system.
> I think we should ignore HADOOP_CONF_DIR and YARN_CONF_DIR.
> If we need those variables in some tests, we should set them explicitly in 
> those tests.
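
A minimal sketch of what "ignoring" the two variables could look like from a Python test runner: build a copy of the environment without them and hand it to the child process. The names IGNORED_VARS and scrubbed_env are hypothetical and not part of Spark.

```python
import os

# Variables that point Spark at a cluster's Hadoop/YARN configuration.
IGNORED_VARS = ("HADOOP_CONF_DIR", "YARN_CONF_DIR")


def scrubbed_env(base=None, ignored=IGNORED_VARS):
    """Return a copy of the environment without the ignored variables."""
    env = dict(os.environ if base is None else base)
    for name in ignored:
        env.pop(name, None)  # no error if the variable is not set
    return env


example = {"HADOOP_CONF_DIR": "/etc/hadoop/conf", "PATH": "/usr/bin"}
print(sorted(scrubbed_env(example)))  # ['PATH']
```

The result can be passed as the env argument of subprocess.Popen when launching a test. Note that this only cleans the caller's environment: if spark-env.sh itself exports the variables and the launcher script sources that file, they would be set again, so the fix would probably have to live in the launch or test scripts.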



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
