Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19592#discussion_r147892336
  
    --- Diff: python/pyspark/worker.py ---
    @@ -105,8 +105,14 @@ def read_single_udf(pickleSer, infile, eval_type):
         elif eval_type == PythonEvalType.SQL_PANDAS_GROUPED_UDF:
             # a groupby apply udf has already been wrapped under apply()
             return arg_offsets, row_func
    -    else:
    +    elif eval_type == PythonEvalType.SQL_BATCHED_UDF:
             return arg_offsets, wrap_udf(row_func, return_type)
    +    elif eval_type == PythonEvalType.SQL_BATCHED_OPT_UDF:
    --- End diff ---
    
    One possibility is to do the wrapping when creating UDFs on the Python 
side. Even for UDFs not used in conditional expressions, we would still add an 
extra boolean argument to the end of the argument list. We wouldn't need 
another eval_type with this fix.
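    A minimal sketch of that idea (the names here are hypothetical, not the 
actual PySpark code): every UDF is wrapped at creation time so its last 
argument is a boolean flag, and the wrapper skips evaluation when the flag is 
false:

```python
def wrap_with_flag(f):
    """Wrap a UDF so an extra trailing boolean controls evaluation.

    Hypothetical illustration of the wrapping idea; not the real
    PySpark worker code.
    """
    def wrapped(*args):
        *real_args, flag = args  # last argument is the extra boolean
        if not flag:
            return None  # skip evaluation when the flag is false
        return f(*real_args)
    return wrapped

# Usage: the wrapped function takes one extra trailing argument.
add = wrap_with_flag(lambda a, b: a + b)
```

Since every UDF would carry the same calling convention, the worker could 
dispatch them uniformly without a separate eval_type.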
    
    But for now I think documenting the behavior is a more acceptable fix.


---
