This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new a0539bf440fb [SPARK-50310][CONNECT][PYTHON][FOLLOW-UP] Delay is_debugging_enabled call after modules are initialized
a0539bf440fb is described below
commit a0539bf440fb645d14d955d386e6df2413e08d86
Author: Hyukjin Kwon <[email protected]>
AuthorDate: Fri Dec 27 15:07:56 2024 +0900
[SPARK-50310][CONNECT][PYTHON][FOLLOW-UP] Delay is_debugging_enabled call after modules are initialized
### What changes were proposed in this pull request?
This PR is a retry of https://github.com/apache/spark/pull/49054 that avoids the hacky monkey patching.
### Why are the changes needed?
- This disables DataFrameQueryContext for `pyspark.sql.functions` too (see the sketch below).
- It avoids a circular import in the pyspark-connect package.
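A minimal, hypothetical sketch of the pattern this followup relies on (not the actual PySpark code): the debugging check runs inside the decorator's wrapper at call time rather than at decoration time, so module-level functions such as those in `pyspark.sql.functions` can be decorated at import, before the configuration is readable.
```python
import functools


def is_debugging_enabled() -> bool:
    # Placeholder for the real check, which consults the Spark configuration
    # `spark.python.sql.dataFrameDebugging.enabled`.
    return True


def with_origin(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Deferred: evaluated on each call, after modules are initialized.
        if is_debugging_enabled():
            pass  # capture call-site (origin) information here
        return func(*args, **kwargs)

    return wrapper


@with_origin  # applied at import time; no configuration lookup happens yet
def col(name: str) -> str:
    return name
```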
### Does this PR introduce _any_ user-facing change?
Yes, after this followup, `spark.python.sql.dataFrameDebugging.enabled`
also works with `pyspark.sql.functions.*`.
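For illustration, a hedged usage sketch (assuming the conf is applied when the session is created; the exact conf handling may differ): with the flag turned off, column construction via `pyspark.sql.functions` skips the call-site capture as well.
```python
# Hypothetical usage sketch: disable DataFrame debugging so call-site capture
# is skipped for pyspark.sql.functions too.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = (
    SparkSession.builder
    .config("spark.python.sql.dataFrameDebugging.enabled", "false")
    .getOrCreate()
)

# With debugging disabled, building expressions such as col("id") avoids the
# per-call DataFrameQueryContext bookkeeping.
spark.range(10).select(col("id")).show()
```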
### How was this patch tested?
Manually ran profilers:
```python
import cProfile
from pyspark.sql.functions import col

def foo():
    for _ in range(1000):
        col("id")

cProfile.run('foo()', sort='tottime')
```
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #49311 from HyukjinKwon/SPARK-50310-followup.
Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/pyspark/errors/utils.py | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/python/pyspark/errors/utils.py b/python/pyspark/errors/utils.py
index 416a2323b170..f9f60637bd57 100644
--- a/python/pyspark/errors/utils.py
+++ b/python/pyspark/errors/utils.py
@@ -255,7 +255,8 @@ def _with_origin(func: FuncT) -> FuncT:
         from pyspark.sql.utils import is_remote
 
         spark = SparkSession.getActiveSession()
-        if spark is not None and hasattr(func, "__name__"):
+
+        if spark is not None and hasattr(func, "__name__") and is_debugging_enabled():
             if is_remote():
                 global current_origin
@@ -313,10 +314,7 @@ def with_origin_to_class(
         return lambda cls: with_origin_to_class(cls, ignores)
     else:
         cls = cls_or_ignores
-        if (
-            os.environ.get("PYSPARK_PIN_THREAD", "true").lower() == "true"
-            and is_debugging_enabled()
-        ):
+        if os.environ.get("PYSPARK_PIN_THREAD", "true").lower() == "true":
             skipping = set(
                 ["__init__", "__new__", "__iter__", "__nonzero__", "__repr__", "__bool__"]
                 + (ignores or [])
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]