Amit Sela created BEAM-1815:
-------------------------------
Summary: Avoid shuffling twice in GABW
Key: BEAM-1815
URL: https://issues.apache.org/jira/browse/BEAM-1815
Project: Beam
Issue Type: Bug
Components: runner-spark
Reporter: Amit Sela
Assignee: Amit Sela
The Spark runner implementation of GABW includes a "built-in" groupByKey, but
the BOBK before it already groups by key. To avoid an unnecessary second
shuffle, we need to force a {{Partitioner}} on the RDDs involved.
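
To illustrate why a shared partitioner removes the second shuffle, here is a
minimal toy model in plain Python. It is not Spark or Beam code, and it is not
the fix itself. ToyRDD, ToyHashPartitioner and count_shuffles are hypothetical
names made up for this sketch. The model assumes the behaviour Spark documents
for pair RDDs: grouping by key with a partitioner equal to the one the data
already carries needs no shuffle, a values-only transformation keeps the
partitioner, and a general map drops it.

```python
from collections import defaultdict


class ToyHashPartitioner:
    """Hypothetical stand-in for a hash partitioner (not a Spark class)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions

    def partition(self, key):
        return hash(key) % self.num_partitions

    def __eq__(self, other):
        # Two partitioners are interchangeable when they place keys identically.
        return (isinstance(other, ToyHashPartitioner)
                and other.num_partitions == self.num_partitions)

    def __hash__(self):
        return hash(self.num_partitions)


class ToyRDD:
    """Toy partitioned collection of (key, value) pairs that counts shuffles."""

    shuffles = 0  # incremented every time records are moved between partitions

    def __init__(self, partitions, partitioner=None):
        self.partitions = partitions
        self.partitioner = partitioner

    def group_by_key(self, partitioner):
        if self.partitioner == partitioner:
            # Equal keys are already co-located: group inside each partition.
            source = self.partitions
        else:
            # Unknown or different layout: redistribute every record.
            ToyRDD.shuffles += 1
            source = [[] for _ in range(partitioner.num_partitions)]
            for part in self.partitions:
                for key, value in part:
                    source[partitioner.partition(key)].append((key, value))
        grouped = []
        for part in source:
            groups = defaultdict(list)
            for key, value in part:
                groups[key].append(value)
            grouped.append(list(groups.items()))
        return ToyRDD(grouped, partitioner)

    def map_values(self, fn):
        # Keys are untouched, so the partitioner is still valid and is kept.
        parts = [[(k, fn(v)) for k, v in part] for part in self.partitions]
        return ToyRDD(parts, self.partitioner)

    def map(self, fn):
        # Keys may change, so the partitioner is dropped.
        parts = [[fn(kv) for kv in part] for part in self.partitions]
        return ToyRDD(parts, None)


def count_shuffles(keep_partitioner):
    """Group twice by key and return (number of shuffles, final records)."""
    ToyRDD.shuffles = 0
    partitioner = ToyHashPartitioner(4)
    rdd = ToyRDD([[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]])
    grouped = rdd.group_by_key(partitioner)  # first grouping always shuffles
    if keep_partitioner:
        step = grouped.map_values(list)
    else:
        step = grouped.map(lambda kv: (kv[0], list(kv[1])))
    regrouped = step.group_by_key(partitioner)  # the "built-in" second grouping
    records = sorted(kv for part in regrouped.partitions for kv in part)
    return ToyRDD.shuffles, records


if __name__ == "__main__":
    print(count_shuffles(keep_partitioner=True)[0])
    print(count_shuffles(keep_partitioner=False)[0])
```

In this model the partitioner-preserving path performs one shuffle and the
other path performs two, while both produce the same grouped records. How the
partitioner should actually be threaded through the runner's RDDs is not
covered by this sketch.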