Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21082#discussion_r183416353
--- Diff: python/pyspark/sql/functions.py ---
@@ -2321,7 +2323,30 @@ def pandas_udf(f=None, returnType=None, functionType=None):
| 2| 6.0|
+---+-----------+
- .. seealso:: :meth:`pyspark.sql.GroupedData.agg`
+ This example shows using grouped aggregated UDFs as window functions. Note that only
+ unbounded window frame is supported at the moment:
+
+ >>> from pyspark.sql.functions import pandas_udf, PandasUDFType
+ >>> from pyspark.sql import Window
+ >>> df = spark.createDataFrame(
+ ... [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0)],
+ ... ("id", "v"))
+ >>> @pandas_udf("double", PandasUDFType.GROUPED_AGG)  # doctest: +SKIP
--- End diff ---
So we don't have a `PandasUDFType.WINDOW_AGG`, and a pandas UDF defined with
`PandasUDFType.GROUPED_AGG` can be used with both `groupby` and `Window`?
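To illustrate the semantics being asked about, here is a minimal pure-Python sketch (not using PySpark, so it runs without a Spark session) of why the same aggregation can serve both roles: under `groupby` it produces one result per group, while under an unbounded window partitioned by the same key it produces one result per input row, with the group's value broadcast to every row. The data mirrors the example in the diff; the variable names are illustrative only:

```python
# Rows of (id, v), matching the DataFrame in the diff above.
rows = [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0)]

# groupby-style aggregation: one mean per group.
groups = {}
for id_, v in rows:
    groups.setdefault(id_, []).append(v)
group_means = {id_: sum(vs) / len(vs) for id_, vs in groups.items()}

# window-style with an unbounded frame partitioned by id:
# the same per-group mean, but broadcast back to every input row.
window_means = [group_means[id_] for id_, _ in rows]

print(group_means)   # one value per group
print(window_means)  # one value per row
```

The aggregation logic itself is identical in both cases; only the shape of the output differs, which is consistent with one UDF type covering both uses.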
---