[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

cloud-fan Mon, 16 Oct 2017 09:26:41 -0700

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18732
  
    > Use different function name for different input/output type
    
    Yeah, it's a bad idea since there are many combinations. I just want to use 
different APIs for different scenarios, e.g. `@pandas_udf` for 
select/withColumn (`Series* -> Series`), `@pandas_grouped_udf` for 
groupBy (apply: `DataFrame -> DataFrame`, reduce: `DataFrame -> Scalar`, and 
more), and `@pandas_udaf` for aggregate.
    
    Different scenarios usually have different requirements, so having different 
APIs lets us satisfy those requirements individually.
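
To make the three function shapes named above concrete, here is a minimal sketch in plain pandas, with no Spark involved. It does not use or define the proposed decorators. The function names (`add_columns`, `subtract_group_mean`, `value_range`) and the sample data are hypothetical and chosen only for illustration.

```python
import pandas as pd


# Series* -> Series: element-wise work over one or more columns,
# the shape suggested for select/withColumn. (Hypothetical example.)
def add_columns(a: pd.Series, b: pd.Series) -> pd.Series:
    return a + b


# DataFrame -> DataFrame: transform one group at a time,
# the "apply" shape suggested for groupBy. (Hypothetical example.)
def subtract_group_mean(group: pd.DataFrame) -> pd.DataFrame:
    return group.assign(v=group["v"] - group["v"].mean())


# DataFrame -> Scalar: collapse one group to a single value,
# the "reduce" shape suggested for groupBy. (Hypothetical example.)
def value_range(group: pd.DataFrame) -> float:
    return float(group["v"].max() - group["v"].min())


# Made-up sample data: two groups keyed by "id".
df = pd.DataFrame({"id": [1, 1, 2, 2], "v": [1.0, 3.0, 10.0, 20.0]})

# Drive each shape by hand instead of through Spark.
summed = add_columns(df["id"], df["v"])
centered = pd.concat([subtract_group_mean(g) for _, g in df.groupby("id")])
ranges = {int(key): value_range(g) for key, g in df.groupby("id")}
```

The three functions differ only in what they accept and return, which is the distinction the separate APIs are meant to capture: the first preserves row count, the second returns a frame per group, and the third returns one value per group.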



