Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19592#discussion_r147892336
  
    --- Diff: python/pyspark/worker.py ---
    @@ -105,8 +105,14 @@ def read_single_udf(pickleSer, infile, eval_type):
         elif eval_type == PythonEvalType.SQL_PANDAS_GROUPED_UDF:
             # a groupby apply udf has already been wrapped under apply()
             return arg_offsets, row_func
    -    else:
    +    elif eval_type == PythonEvalType.SQL_BATCHED_UDF:
             return arg_offsets, wrap_udf(row_func, return_type)
    +    elif eval_type == PythonEvalType.SQL_BATCHED_OPT_UDF:
    --- End diff ---
    
    One possibility is to do the wrapping when creating UDFs on the Python 
side. Even for UDFs not used in conditional expressions, we would still add an 
extra boolean argument to the end of the argument list. We wouldn't need 
another eval_type with this fix.
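    A minimal sketch of that idea (the names here are hypothetical, not the 
actual PySpark code): every UDF is wrapped at creation time so its last 
argument is a boolean flag, and the wrapper skips evaluation when the flag is 
false:

```python
def wrap_with_flag(f):
    """Wrap a UDF so an extra trailing boolean controls evaluation.

    Hypothetical illustration of the wrapping idea; not the real
    PySpark worker code.
    """
    def wrapped(*args):
        *real_args, flag = args  # last argument is the extra boolean
        if not flag:
            return None  # skip evaluation when the flag is false
        return f(*real_args)
    return wrapped

# Usage: the wrapped function takes one extra trailing argument.
add = wrap_with_flag(lambda a, b: a + b)
```

Since every UDF would carry the same calling convention, the worker could 
dispatch them uniformly without a separate eval_type.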
    
    But for now I think documenting the behavior is a more acceptable fix.


---
