HyukjinKwon commented on code in PR #37288:
URL: https://github.com/apache/spark/pull/37288#discussion_r929816651
##########
python/run-tests.py:
##########
@@ -107,20 +118,26 @@ def run_individual_python_test(target_dir, test_name, pyspark_python):
     env["PYSPARK_SUBMIT_ARGS"] = " ".join(spark_args)
     output_prefix = get_valid_filename(pyspark_python + "__" + test_name + "__").lstrip("_")
-    per_test_output = tempfile.NamedTemporaryFile(prefix=output_prefix, suffix=".log")
+
+    if keep_test_output:
+        # The location is unique because the test is already in a unique directory.
Review Comment:
One option is to set `env["TMPDIR"] = tmp_dir` when `TMPDIR` isn't already set. That way, users can run the script with `TMPDIR=custom_directory ./run-tests.py ...`, and everything will be stored under the custom directory (both the captured stdout/stderr and the temp files written by Spark).
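As a hedged sketch of that option (the `tmp_dir` default below is hypothetical; `run-tests.py` would compute its own), the runner could populate `TMPDIR` only when the user has not supplied one:

```python
import os

# Hypothetical per-run scratch directory; run-tests.py would compute its own.
tmp_dir = os.path.join(os.getcwd(), "target", "tmp")

env = dict(os.environ)
if "TMPDIR" not in env:
    # Respect a user-supplied TMPDIR=custom_directory; otherwise fall back
    # to the per-run directory so all temp files land in one place.
    env["TMPDIR"] = tmp_dir

# `env` would then be passed to subprocess.Popen(..., env=env) so both the
# captured logs and Spark's own temp files resolve under TMPDIR.
print("TMPDIR" in env)
```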
Technically, the test output and the files generated by Spark are slightly different: here we capture the stdout/stderr of the top-level process, while Spark writes its temp files from within that process.
So, setting `delete=not opts.keep_test_output` would only keep the stdout/stderr from the top-level process but still remove the temp files written by Spark.
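For the `delete` flag specifically, note the negation: keeping the output means *not* deleting the file. A minimal sketch (the `keep_test_output` variable stands in for `opts.keep_test_output`, and the prefix is illustrative):

```python
import os
import tempfile

keep_test_output = True  # stands in for opts.keep_test_output

# delete=False makes the log survive close(); this only affects the
# top-level stdout/stderr capture, not Spark's own temp files.
per_test_output = tempfile.NamedTemporaryFile(
    prefix="pyspark__test__", suffix=".log", delete=not keep_test_output
)
per_test_output.write(b"captured stdout/stderr\n")
per_test_output.close()

print(os.path.exists(per_test_output.name))  # the log file is kept on disk
```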
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]