GitHub user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20137
Hey @gatorsmile, I was just looking into this now. How about we add an
`_unwrapped` attribute to the wrapped function, so that a wrapped function
returns a wrapped function and a `UserDefinedFunction` returns a
`UserDefinedFunction`? Roughly, in `udf.py`:
```diff
         wrapper.returnType = self.returnType
         wrapper.evalType = self.evalType
-        wrapper.asNondeterministic = self.asNondeterministic
+        wrapper.asNondeterministic = lambda: self.asNondeterministic()._wrapped()
+        wrapper._unwrapped = lambda: self
         return wrapper
```
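To make the round-trip concrete, here is a minimal standalone sketch (untested against the PR; `FakeUDF` is a stand-in for illustration, not Spark's `UserDefinedFunction`):

```python
import functools


class FakeUDF(object):
    """Stand-in for `UserDefinedFunction`, for illustration only."""

    def __init__(self, func, deterministic=True):
        self.func = func
        self._deterministic = deterministic

    def __call__(self, *args):
        return self.func(*args)

    def asNondeterministic(self):
        self._deterministic = False
        return self

    def _wrapped(self):
        # `functools.wraps` keeps the original function's name/docstring,
        # which is what keeps pydoc working on the wrapped function.
        @functools.wraps(self.func)
        def wrapper(*args):
            return self(*args)
        wrapper.asNondeterministic = lambda: self.asNondeterministic()._wrapped()
        wrapper._unwrapped = lambda: self
        return wrapper


f = FakeUDF(lambda: 1)._wrapped()
g = f.asNondeterministic()             # wrapped function in, wrapped function out
assert callable(g) and not isinstance(g, FakeUDF)
assert g._unwrapped()._deterministic is False
```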
and then we do something like this?
```python
if hasattr(f, "_unwrapped"):
    f = f._unwrapped()
if isinstance(f, UserDefinedFunction):
    udf = UserDefinedFunction(
        f.func, returnType=returnType, name=name,
        evalType=PythonEvalType.SQL_BATCHED_UDF)
    udf = udf if f._deterministic else udf.asNondeterministic()
else:
    # Existing logic.
```
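For example, continuing the stand-in sketch above (`fake_register` and `registry` are hypothetical, only to show the unwrap-then-rewrap flow):

```python
registry = {}


def fake_register(name, f):
    """Hypothetical registration helper mirroring the flow above."""
    if hasattr(f, "_unwrapped"):
        f = f._unwrapped()
    if isinstance(f, FakeUDF):
        udf = FakeUDF(f.func, deterministic=f._deterministic)
    else:
        udf = FakeUDF(f)  # existing path: a plain Python function
    registry[name] = udf
    return udf._wrapped()


wrapped = FakeUDF(lambda: 1)._wrapped().asNondeterministic()
fake_register("one", wrapped)
assert registry["one"]._deterministic is False   # nondeterminism survives
```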
Returning a `UserDefinedFunction` from the wrapped function's
`asNondeterministic` actually seems to be an issue because it breaks pydoc,
for example:
```python
from pyspark.sql.functions import udf
help(udf(lambda: 1, "integer").asNondeterministic())
```
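As I read it, the problem is that `help` on the returned object shows `UserDefinedFunction`'s class documentation instead of the original function's docstring, which `functools.wraps` otherwise preserves on the wrapped function. With the stand-in sketch above:

```python
def one():
    """Returns the constant 1."""
    return 1

w = FakeUDF(one)._wrapped()
assert w.__doc__ == "Returns the constant 1."
# With the suggested change, the docstring survives asNondeterministic():
assert w.asNondeterministic().__doc__ == "Returns the constant 1."
# If asNondeterministic returned the UDF object itself, help()/pydoc would
# show the class documentation and the function's docstring would be lost.
```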
I haven't tested the suggestion above, but I think it will roughly work
and resolve both issues as well.