This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 949f1cd70350 [SPARK-55858][PYTHON][INFRA] Fix UDF logging tests under coverage
949f1cd70350 is described below
commit 949f1cd7035055799bb743a52acc7086a65404f3
Author: Tian Gao <[email protected]>
AuthorDate: Tue Mar 10 08:41:15 2026 +0800
[SPARK-55858][PYTHON][INFRA] Fix UDF logging tests under coverage
### What changes were proposed in this pull request?
When we patch workers for coverage, make the `sitecustomize` module
that coverage uses pretend to be the worker module.
### Why are the changes needed?
UDF logging uses the module name to determine whether code is user code.
Under coverage, the extra `sitecustomize` module shows up in the call path,
and UDF logging treats it as user code, which breaks the test.
We should not modify our production code to make coverage work, so we patch
the coverage code.
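
The mechanism can be illustrated with a small, self-contained sketch. It is not PySpark's actual UDF logging code: `caller_module_name`, `looks_like_user_code`, and the stand-in name `pyspark.worker` are hypothetical and chosen only for illustration. It shows that a check based on a frame's `f_globals["__name__"]` changes its answer once the wrapping module overwrites its own `__name__`.

```python
import sys
import types

# Stand-in for the worker module's name (an assumption for this sketch).
WORKER_NAME = "pyspark.worker"


def caller_module_name():
    """Return the __name__ stored in the globals of the calling frame."""
    return sys._getframe(1).f_globals.get("__name__")


def looks_like_user_code(module_name):
    """Hypothetical heuristic: anything outside the pyspark package is user code."""
    return not (module_name == "pyspark" or module_name.startswith("pyspark."))


# Build a fake "sitecustomize" module containing a wrapper function,
# mimicking the coverage shim that sits between the worker and the UDF.
shim = types.ModuleType("sitecustomize")
shim.__dict__["caller_module_name"] = caller_module_name
exec(
    "def wrapper():\n"
    "    return caller_module_name()\n",
    shim.__dict__,
)

# Before the patch, the wrapper's frame reports "sitecustomize".
before = shim.wrapper()

# Same idea as globals()["__name__"] = ... in the patch: overwrite the name.
shim.__dict__["__name__"] = WORKER_NAME
after = shim.wrapper()

print(before, looks_like_user_code(before))
print(after, looks_like_user_code(after))
```

Because the frame's `f_globals` is the module's own dictionary, overwriting `__name__` there is enough to change what a name-based check sees, without touching the code being measured.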
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
The previously failing test now passes locally.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #54649 from gaogaotiantian/fix-udf-logging-coverage.
Authored-by: Tian Gao <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
python/test_coverage/sitecustomize.py | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/python/test_coverage/sitecustomize.py b/python/test_coverage/sitecustomize.py
index b84e62a20c33..f1b8e494accb 100644
--- a/python/test_coverage/sitecustomize.py
+++ b/python/test_coverage/sitecustomize.py
@@ -63,6 +63,11 @@ try:
return wrapper
frame.f_globals["worker"] = save_when_exit(frame.f_globals["worker"])
+ # Pretend that this module has the same name as the worker module.
+ # UDF logging checks where pyspark code firstly calls into user code, and if
+ # the module name is "sitecustomize", it will confuse UDF logging and make
+ # it believe this is user code, which will result in a wrong context.
+ globals()["__name__"] = frame.f_globals.get("__name__", globals()["__name__"])
os.register_at_fork(after_in_child=patch_worker)