[GitHub] spark pull request #22620: [SPARK-25601][PYTHON] Register Grouped aggregate ...

icexelloss Wed, 03 Oct 2018 13:31:54 -0700

Github user icexelloss commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22620#discussion_r222456993
  
    --- Diff: python/pyspark/sql/udf.py ---
    @@ -310,9 +319,11 @@ def register(self, name, f, returnType=None):
                         "Invalid returnType: data type can not be specified when f is"
                         "a user-defined function, but got %s." % returnType)
                 if f.evalType not in [PythonEvalType.SQL_BATCHED_UDF,
    -                                  PythonEvalType.SQL_SCALAR_PANDAS_UDF]:
    +                                  PythonEvalType.SQL_SCALAR_PANDAS_UDF,
    +                                  PythonEvalType.SQL_GROUPED_AGG_PANDAS_UDF]:
    --- End diff --
    
    We don't need it here:
    
    Users only ever specify GROUPED_AGG. GROUPED_AGG is converted to the
    WINDOW_AGG eval type in WindowInPandasExec.
    
    Admittedly, this is a bit confusing and something we can improve. Before
    WINDOW_AGG, we never had a user-specified UDF type that maps to multiple
    eval types.
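
To illustrate the point above, here is a minimal, self-contained sketch of why the registration check only needs the eval types a user can actually specify. It does not use pyspark: `EvalType`, `check_registerable`, and `eval_type_for_execution` are hypothetical stand-ins, the enum values are illustrative, and the window rewrite is only a model of what the comment says happens in WindowInPandasExec.

```python
from enum import IntEnum


class EvalType(IntEnum):
    # Hypothetical stand-in for pyspark's PythonEvalType; the numeric
    # values here are illustrative, not the real constants.
    SQL_BATCHED_UDF = 1
    SQL_SCALAR_PANDAS_UDF = 2
    SQL_GROUPED_AGG_PANDAS_UDF = 3
    SQL_WINDOW_AGG_PANDAS_UDF = 4


# Eval types a user can attach to a function, and therefore the only
# ones a register()-style check has to accept.
REGISTERABLE = (
    EvalType.SQL_BATCHED_UDF,
    EvalType.SQL_SCALAR_PANDAS_UDF,
    EvalType.SQL_GROUPED_AGG_PANDAS_UDF,
)


def check_registerable(eval_type):
    """Mimic the evalType check in the diff (assumed shape, not Spark code)."""
    if eval_type not in REGISTERABLE:
        raise ValueError("Invalid eval type for registration: %s" % eval_type.name)
    return eval_type


def eval_type_for_execution(user_eval_type, over_window):
    """Model of the rewrite described in the comment: a GROUPED_AGG UDF
    used over a window is executed with the WINDOW_AGG eval type."""
    if over_window and user_eval_type == EvalType.SQL_GROUPED_AGG_PANDAS_UDF:
        return EvalType.SQL_WINDOW_AGG_PANDAS_UDF
    return user_eval_type
```

In this model WINDOW_AGG never reaches `check_registerable`, because it only appears after the user-specified GROUPED_AGG type has been rewritten for window execution. That is the one-to-many mapping the comment calls confusing.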



