GitHub user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20137
Hey @gatorsmile, I was just looking into this now. How about we add an
`_unwrapped` attribute to the wrapped function, so that a wrapped function
returns a wrapped function and a `UserDefinedFunction` returns a
`UserDefinedFunction`? Roughly, in `udf.py`:
```diff
         wrapper.returnType = self.returnType
         wrapper.evalType = self.evalType
-        wrapper.asNondeterministic = self.asNondeterministic
+        wrapper.asNondeterministic = lambda: self.asNondeterministic()._wrapped()
+        wrapper._unwrapped = lambda: self
         return wrapper
```
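To make the round-trip concrete, here is a minimal standalone sketch (untested against the PR; `FakeUDF` is a stand-in for illustration, not Spark's `UserDefinedFunction`):

```python
import functools


class FakeUDF(object):
    """Stand-in for `UserDefinedFunction`, for illustration only."""

    def __init__(self, func, deterministic=True):
        self.func = func
        self._deterministic = deterministic

    def __call__(self, *args):
        return self.func(*args)

    def asNondeterministic(self):
        self._deterministic = False
        return self

    def _wrapped(self):
        # `functools.wraps` keeps the original function's name/docstring,
        # which is what keeps pydoc working on the wrapped function.
        @functools.wraps(self.func)
        def wrapper(*args):
            return self(*args)
        wrapper.asNondeterministic = lambda: self.asNondeterministic()._wrapped()
        wrapper._unwrapped = lambda: self
        return wrapper


f = FakeUDF(lambda: 1)._wrapped()
g = f.asNondeterministic()             # wrapped function in, wrapped function out
assert callable(g) and not isinstance(g, FakeUDF)
assert g._unwrapped()._deterministic is False
```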
and then we do something like this?
```python
if hasattr(f, "_unwrapped"):
    f = f._unwrapped()
if isinstance(f, UserDefinedFunction):
    udf = UserDefinedFunction(
        f.func, returnType=returnType, name=name,
        evalType=PythonEvalType.SQL_BATCHED_UDF)
    udf = udf if f._deterministic else udf.asNondeterministic()
else:
    # Existing logic.
```
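For example, continuing the stand-in sketch above (`fake_register` and `registry` are hypothetical, only to show the unwrap-then-rewrap flow):

```python
registry = {}


def fake_register(name, f):
    """Hypothetical registration helper mirroring the flow above."""
    if hasattr(f, "_unwrapped"):
        f = f._unwrapped()
    if isinstance(f, FakeUDF):
        udf = FakeUDF(f.func, deterministic=f._deterministic)
    else:
        udf = FakeUDF(f)  # existing path: a plain Python function
    registry[name] = udf
    return udf._wrapped()


wrapped = FakeUDF(lambda: 1)._wrapped().asNondeterministic()
fake_register("one", wrapped)
assert registry["one"]._deterministic is False   # nondeterminism survives
```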
Returning a `UserDefinedFunction` from the wrapped function's
`asNondeterministic` actually seems to be an issue because it breaks pydoc,
for example:
```python
from pyspark.sql.functions import udf
help(udf(lambda: 1, "integer").asNondeterministic())
```
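As I read it, the problem is that `help` on the returned object shows `UserDefinedFunction`'s class documentation instead of the original function's docstring, which `functools.wraps` otherwise preserves on the wrapped function. With the stand-in sketch above:

```python
def one():
    """Returns the constant 1."""
    return 1

w = FakeUDF(one)._wrapped()
assert w.__doc__ == "Returns the constant 1."
# With the suggested change, the docstring survives asNondeterministic():
assert w.asNondeterministic().__doc__ == "Returns the constant 1."
# If asNondeterministic returned the UDF object itself, help()/pydoc would
# show the class documentation and the function's docstring would be lost.
```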
I haven't tested the suggestion above, but I think it will roughly work
and resolve both issues as well.