[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

HyukjinKwon Sun, 01 Oct 2017 06:46:16 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18732#discussion_r142029720
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2181,31 +2186,69 @@ def udf(f=None, returnType=StringType()):
     @since(2.3)
     def pandas_udf(f=None, returnType=StringType()):
         """
    -    Creates a :class:`Column` expression representing a user defined function (UDF) that accepts
    -    `Pandas.Series` as input arguments and outputs a `Pandas.Series` of the same length.
    +    Creates a :class:`Column` expression representing a vectorized user defined function (UDF).
    +
    +    The user-defined function can define one of the following transformations:
    +    1. One or more `pandas.Series` -> A `pandas.Series`
    +
    +       This udf is used with `DataFrame.withColumn` and `DataFrame.select`.
    +       The returnType should be a primitive data type, e.g., DoubleType()
    --- End diff --
    
    Little nit: wrap this in backticks, i.e. `` `DoubleType()` ``.
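
For readers unfamiliar with the first transformation in the docstring (one or more `pandas.Series` -> a `pandas.Series`), here is a minimal sketch of the kind of plain pandas function such a UDF would wrap. The name `plus_one` and the sample values are hypothetical and not taken from the pull request. The Spark wiring appears only as a comment, since it assumes a running Spark session and the `pandas_udf` signature shown in the diff.

```python
import pandas as pd


def plus_one(v: pd.Series) -> pd.Series:
    """Hypothetical element-wise function: takes a Series, returns a Series of the same length."""
    return v + 1.0


# Plain pandas check of the contract the docstring describes.
values = pd.Series([1.0, 2.0, 3.0])
result = plus_one(values)
assert len(result) == len(values)  # output has the same length as the input
print(result.tolist())

# Assumed Spark usage (not executed here; requires pyspark and a DataFrame `df`
# with a numeric column "x"):
#   from pyspark.sql.functions import pandas_udf
#   from pyspark.sql.types import DoubleType
#   plus_one_udf = pandas_udf(plus_one, returnType=DoubleType())
#   df.withColumn("x_plus_one", plus_one_udf(df["x"]))
```

The function returns floats, which is why the commented-out Spark usage declares `DoubleType()` as the return type.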



---

