This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 1c1216f18f3 [SPARK-38882][PYTHON] Fix usage logger attachment to
handle static methods properly
1c1216f18f3 is described below
commit 1c1216f18f3008b410a601516b2fde49a9e27f7d
Author: Takuya UESHIN <[email protected]>
AuthorDate: Wed Apr 13 09:21:55 2022 +0900
[SPARK-38882][PYTHON] Fix usage logger attachment to handle static methods
properly
### What changes were proposed in this pull request?
Fixes usage logger attachment to handle static methods properly.
### Why are the changes needed?
The usage logger attachment logic has an issue when handling static methods.
For example,
```
$ PYSPARK_PANDAS_USAGE_LOGGER=pyspark.pandas.usage_logging.usage_logger ./bin/pyspark
```
```py
>>> import pyspark.pandas as ps
>>> psdf = ps.DataFrame({"a": [1,2,3], "b": [4,5,6]})
>>> psdf.from_records([(1, 2), (3, 4)])
A function `DataFrame.from_records(data, index, exclude, columns,
coerce_float, nrows)` was failed after 2007.430 ms: 0
Traceback (most recent call last):
...
```
Without the usage logger:
```py
>>> import pyspark.pandas as ps
>>> psdf = ps.DataFrame({"a": [1,2,3], "b": [4,5,6]})
>>> psdf.from_records([(1, 2), (3, 4)])
   0  1
0  1  2
1  3  4
```
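The failure mode can be reproduced without Spark. The following is a minimal sketch, not the actual `instrumentation_utils` code: `Example`, `double`, and `wrap` are hypothetical names, and `wrap` only forwards the call instead of logging. It assumes the root cause is that `inspect.getmembers(..., inspect.isfunction)` yields the plain function behind a `staticmethod`, so setting the wrapper back on the class turns it into a regular method.

```python
import functools
import inspect


class Example:
    @staticmethod
    def double(x):
        return x * 2


def wrap(func):
    # Hypothetical stand-in for the logging wrapper; it only forwards the call.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


# getmembers resolves the staticmethod descriptor and yields the plain function.
members = dict(inspect.getmembers(Example, inspect.isfunction))

# Re-attaching a plain function makes it a regular method, so a call through
# an instance now passes the instance as the first argument.
setattr(Example, "double", wrap(members["double"]))

class_call = Example.double(3)  # still fine: no implicit first argument

try:
    Example().double(3)  # the wrapper receives (instance, 3)
    instance_call_failed = False
except TypeError:
    instance_call_failed = True
```

In this sketch the instance call raises `TypeError`; in the `from_records` case above, the instance is presumably passed as `data` instead, which would explain why the call fails later with a different error.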
### Does this PR introduce _any_ user-facing change?
Yes. For users who attach the usage logger, static methods now work as
static methods.
### How was this patch tested?
Manually tested.
```py
>>> import pyspark.pandas as ps
>>> import logging
>>> import sys
>>> root = logging.getLogger()
>>> root.setLevel(logging.INFO)
>>> handler = logging.StreamHandler(sys.stdout)
>>> handler.setLevel(logging.INFO)
>>>
>>> formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
>>> handler.setFormatter(formatter)
>>> root.addHandler(handler)
>>> psdf = ps.DataFrame({"a": [1,2,3], "b": [4,5,6]})
2022-04-12 14:43:52,254 - pyspark.pandas.usage_logger - INFO - A function
`DataFrame.__init__(self, data, index, columns, dtype, copy)` was successfully
finished after 2714.896 ms.
>>> psdf.from_records([(1, 2), (3, 4)])
2022-04-12 14:43:59,765 - pyspark.pandas.usage_logger - INFO - A function
`DataFrame.from_records(data, index, exclude, columns, coerce_float, nrows)`
was successfully finished after 51.105 ms.
2022-04-12 14:44:01,371 - pyspark.pandas.usage_logger - INFO - A function
`DataFrame.__repr__(self)` was successfully finished after 1605.759 ms.
   0  1
0  1  2
1  3  4
>>> ps.DataFrame.from_records([(1, 2), (3, 4)])
2022-04-12 14:44:25,301 - pyspark.pandas.usage_logger - INFO - A function
`DataFrame.from_records(data, index, exclude, columns, coerce_float, nrows)`
was successfully finished after 43.446 ms.
2022-04-12 14:44:25,493 - pyspark.pandas.usage_logger - INFO - A function
`DataFrame.__repr__(self)` was successfully finished after 192.053 ms.
   0  1
0  1  2
1  3  4
```
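The same behavior could also be checked by a small standalone script. This is only a sketch of the pattern used in the patch, under the assumption that re-wrapping with `staticmethod` is all that is needed; `Example`, `wrap`, and `attach` are hypothetical names, and `wrap` does no logging.

```python
import functools
import inspect


class Example:
    @staticmethod
    def double(x):
        return x * 2

    def triple(self, x):
        return x * 3


def wrap(func):
    # Hypothetical stand-in for the logging wrapper; it only forwards the call.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


def attach(target_class):
    for name, func in inspect.getmembers(target_class, inspect.isfunction):
        if name.startswith("_"):
            continue
        # getattr_static bypasses the descriptor protocol, so the raw
        # staticmethod object stored on the class is visible here.
        is_static = isinstance(inspect.getattr_static(target_class, name), staticmethod)
        wrapped = wrap(func)
        setattr(target_class, name, staticmethod(wrapped) if is_static else wrapped)


attach(Example)

# Static methods stay callable from both the class and an instance,
# and regular methods are unaffected.
results = (Example.double(3), Example().double(3), Example().triple(3))
```

The key design point is that the static check must use `inspect.getattr_static` rather than plain `getattr`, because `getattr` has already unwrapped the `staticmethod` by the time the function is seen.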
Closes #36167 from ueshin/issues/SPARK-38882/staticmethod.
Authored-by: Takuya UESHIN <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/pyspark/instrumentation_utils.py | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/python/pyspark/instrumentation_utils.py b/python/pyspark/instrumentation_utils.py
index 908f5cbb3d4..256c09068cd 100644
--- a/python/pyspark/instrumentation_utils.py
+++ b/python/pyspark/instrumentation_utils.py
@@ -163,7 +163,14 @@ def _attach(
     for name, func in inspect.getmembers(target_class, inspect.isfunction):
         if name.startswith("_") and name not in special_functions:
             continue
-        setattr(target_class, name, _wrap_function(target_class.__name__, name, func, logger))
+        try:
+            isstatic = isinstance(inspect.getattr_static(target_class, name), staticmethod)
+        except AttributeError:
+            isstatic = False
+        wrapped_function = _wrap_function(target_class.__name__, name, func, logger)
+        setattr(
+            target_class, name, staticmethod(wrapped_function) if isstatic else wrapped_function
+        )
 
     for name, prop in inspect.getmembers(target_class, lambda o: isinstance(o, property)):
         if name.startswith("_"):