[
https://issues.apache.org/jira/browse/SPARK-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662233#comment-14662233
]
Sean Owen commented on SPARK-9744:
----------------------------------
Yeah, that's the idea. It works, though it is potentially slow. There are a few
things you could do to make it more efficient.
I think the mapPartitions idea is the way to go, by far -- assuming you are OK
with missing windows across partition boundaries.
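A minimal sketch of the per-partition step, under some assumptions: the class
LagLeadWindows and its windows method are hypothetical names and not part of
Spark, and the whole partition is buffered in memory for simplicity. Windows
that would reach past a partition's edge get the default value instead of the
neighbouring partition's elements, which is the trade-off mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Spark API.
final class LagLeadWindows {

    // For each element of one partition, collect the elements found at the
    // given relative offsets (negative = lag, 0 = current, positive = lead).
    // Offsets that fall outside the partition yield defaultValue.
    static <T> Iterator<List<T>> windows(Iterator<T> partition,
                                         List<Integer> offsets,
                                         T defaultValue) {
        // Buffer the partition so lags and leads can be looked up by index.
        List<T> elems = new ArrayList<>();
        while (partition.hasNext()) {
            elems.add(partition.next());
        }

        List<List<T>> out = new ArrayList<>(elems.size());
        for (int i = 0; i < elems.size(); i++) {
            List<T> window = new ArrayList<>(offsets.size());
            for (int offset : offsets) {
                int j = i + offset;
                boolean inRange = j >= 0 && j < elems.size();
                window.add(inRange ? elems.get(j) : defaultValue);
            }
            out.add(window);
        }
        return out.iterator();
    }

    public static void main(String[] args) {
        Iterator<List<Integer>> it = windows(
                Arrays.asList(1, 2, 3).iterator(), // one partition's elements
                Arrays.asList(-1, 0, 1),           // lag 1, current, lead 1
                0);                                // default value
        while (it.hasNext()) {
            System.out.println(it.next());
        }
        // prints [0, 1, 2], then [1, 2, 3], then [2, 3, 0]
    }
}
```

A function like this could be handed to JavaRDD.mapPartitions, with the user's
own List<T> function applied to each window afterwards. Whether mapPartitions
expects an Iterable or an Iterator back depends on the Spark version, so the
return type here may need adapting.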
Why just the Java RDD?
> Add Java RDD method to map with lag and lead
> --------------------------------------------
>
> Key: SPARK-9744
> URL: https://issues.apache.org/jira/browse/SPARK-9744
> Project: Spark
> Issue Type: Wish
> Reporter: Jerry Z
> Priority: Minor
>
> To avoid zipping with index and doing numerous maps and joins, it would help
> to have a single map method that takes two additional parameters: (1) a list
> of offsets (negative for lag, 0 for current, positive for lead) and (2) a
> default value. The other difference from map is that the supplied function
> takes an argument of List<T> and not just T.