Li Jin created SPARK-22239:
------------------------------
Summary: User-defined window functions with pandas udf
Key: SPARK-22239
URL: https://issues.apache.org/jira/browse/SPARK-22239
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 2.2.0
Environment:
Reporter: Li Jin
Window functions are another place where we can benefit from vectorized UDFs,
and supporting them would add another useful function type to the pandas_udf suite.
Example usage (preliminary):
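{code:python}
from pyspark.sql import Window
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import DoubleType

# Range frame per id: rows whose time lies within [time - 200, time]
w = Window.partitionBy('id').orderBy('time').rangeBetween(-200, 0)

@pandas_udf(DoubleType())
def ema(v1):
    return v1.ewm(alpha=0.5).mean().iloc[-1]

df.withColumn('v1_ema', ema(df.v1).over(w))
{code}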
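To make the intended result concrete, here is a rough sketch in plain pandas (no Spark) of what the example above might be expected to compute. It assumes the UDF receives the values of each row's window frame as a pandas Series and that rangeBetween(-200, 0) is inclusive on both ends; the helper name ema_over_range_window and the sample data are hypothetical and only for illustration.

```python
import pandas as pd


def ema(v1: pd.Series) -> float:
    # Exponentially weighted mean of the frame, evaluated at its last row.
    return float(v1.ewm(alpha=0.5).mean().iloc[-1])


def ema_over_range_window(df: pd.DataFrame, lower: int = -200, upper: int = 0) -> pd.Series:
    # Hypothetical helper: emulates
    # partitionBy('id').orderBy('time').rangeBetween(lower, upper)
    # by building each row's frame explicitly (quadratic, for illustration only).
    out = pd.Series(index=df.index, dtype="float64")
    for _, part in df.groupby("id"):
        part = part.sort_values("time")
        for idx, t in part["time"].items():
            # Rows of the same id whose time lies in [t + lower, t + upper].
            in_frame = part["time"].between(t + lower, t + upper)
            out.loc[idx] = ema(part.loc[in_frame, "v1"])
    return out


# Made-up sample data.
df = pd.DataFrame({
    "id": [1, 1, 1, 2],
    "time": [0, 100, 300, 0],
    "v1": [1.0, 2.0, 4.0, 10.0],
})
df["v1_ema"] = ema_over_range_window(df)
```

In this sample, the row at time 300 for id 1 only sees the rows at times 100 and 300, since the row at time 0 falls outside the 200-unit range.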