grundprinzip commented on code in PR #37288:
URL: https://github.com/apache/spark/pull/37288#discussion_r929762629
##########
python/run-tests.py:
##########
@@ -107,20 +118,26 @@ def run_individual_python_test(target_dir, test_name, pyspark_python):
     env["PYSPARK_SUBMIT_ARGS"] = " ".join(spark_args)
     output_prefix = get_valid_filename(pyspark_python + "__" + test_name + "__").lstrip("_")
-    per_test_output = tempfile.NamedTemporaryFile(prefix=output_prefix, suffix=".log")
+
+    if keep_test_output:
+        # The location is unique because the test is already in a unique directory.
Review Comment:
I actually looked into this first. However, if you look at the code above,
we already pass in a target directory that is used for the Hive warehouse
path. For debugging purposes, if you want to retain the test output, the
files the test actually wrote will be interesting as well.
So my proposal is to reuse the existing target dir option and simply move
the log output into that path.
If we add another path option, it becomes harder to explain why one is
deleted and the other is not, and we would need an additional flag to
allow retaining the other one too.
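A rough sketch of what I have in mind, not the final patch. The helper name `open_per_test_output` is hypothetical and only for illustration; `target_dir`, `output_prefix` and `keep_test_output` are the names from the hunk above, and the `.log` file layout inside `target_dir` is an assumption.

```python
import os
import tempfile


def open_per_test_output(target_dir, output_prefix, keep_test_output):
    """Return a writable binary file object for one test's log output.

    Hypothetical helper: it only illustrates reusing target_dir for the log.
    """
    if keep_test_output:
        # Assumed layout: keep the log next to the other files the test
        # writes into target_dir, so one flag controls retention of both.
        os.makedirs(target_dir, exist_ok=True)
        log_path = os.path.join(target_dir, output_prefix + ".log")
        return open(log_path, "w+b")
    # Otherwise keep the existing behaviour: a temp file removed on close.
    return tempfile.NamedTemporaryFile(prefix=output_prefix, suffix=".log")
```

With this shape there is a single directory to inspect after a failed run, and no second path option whose lifetime has to be explained separately.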
WDYT?