[GitHub] spark pull request #19505: [SPARK-20396][SQL][PySpark][FOLLOW-UP] groupby()....

cloud-fan Mon, 16 Oct 2017 00:13:16 -0700

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19505#discussion_r144768380
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2044,7 +2044,7 @@ class UserDefinedFunction(object):
     
         .. versionadded:: 1.3
         """
    -    def __init__(self, func, returnType, name=None, vectorized=False):
    +    def __init__(self, func, returnType, name=None, vectorized=False, grouped=False):
    --- End diff --
    
    `vectorized=False, grouped=True` is an invalid combination. How about we
    introduce a `udfType`, where `0` means a normal UDF, `1` means a pandas UDF,
    and `2` means a pandas grouped UDF? We can create something like
    `object PythonEvalType` to keep this encoding in sync between Python and Java.
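
    For illustration only, a minimal Python sketch of the single-field encoding
    suggested above. It is not code from the pull request: the constant names,
    the `UserDefinedFunctionSketch` class, and the validation step are all
    assumptions; only the `udfType` name and the values `0`, `1`, `2` come from
    the comment.

```python
class PythonEvalType(object):
    """Hypothetical integer codes for the kind of UDF (names are illustrative).

    The same values would have to be mirrored on the JVM side so that both
    sides agree on the encoding.
    """
    NORMAL_UDF = 0          # plain row-at-a-time Python UDF
    PANDAS_UDF = 1          # vectorized pandas UDF
    PANDAS_GROUPED_UDF = 2  # pandas UDF applied per group

    _ALL = (NORMAL_UDF, PANDAS_UDF, PANDAS_GROUPED_UDF)


class UserDefinedFunctionSketch(object):
    """Stand-in for UserDefinedFunction, reduced to the constructor arguments."""

    def __init__(self, func, returnType, name=None,
                 udfType=PythonEvalType.NORMAL_UDF):
        # A single field replaces the two booleans, so the invalid
        # combination (vectorized=False, grouped=True) cannot be expressed.
        if udfType not in PythonEvalType._ALL:
            raise ValueError("Invalid udfType: %r" % (udfType,))
        self.func = func
        self.returnType = returnType
        self._name = name or getattr(func, "__name__", type(func).__name__)
        self.udfType = udfType
```

    With one integer field, a caller would pass
    `udfType=PythonEvalType.PANDAS_GROUPED_UDF` instead of setting two flags,
    and any value outside the known set is rejected at construction time.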


