[ 
https://issues.apache.org/jira/browse/BEAM-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-490:
---------------------------
    Fix Version/s: First stable release

> Swap to using CoGBK as grouping primitive instead of GBK
> --------------------------------------------------------
>
>                 Key: BEAM-490
>                 URL: https://issues.apache.org/jira/browse/BEAM-490
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model-fn-api, beam-model-runner-api, runner-core
>            Reporter: Luke Cwik
>              Labels: backwards-incompatible
>             Fix For: First stable release
>
>
> The intent is for the semantics of both GBK and CoGBK to be
> unchanged, just swapping their status as primitives.
> CoGBK is a more powerful operator than GBK, which allows for two key benefits:
> 1) SDKs are simplified: transforming a CoGBK into a GBK is trivial while the 
> reverse is not.
> 2) It will be easier for runners to provide more efficient implementations of 
> CoGBK as they will be responsible for the logic which takes their own 
> internal grouping implementation and maps it onto a CoGBK.
> This requires the following modifications to the Beam code base:
> 1) Make GBK a composite transform in terms of CoGBK.
> 2) Move the CoGBK from contrib to runners-core as an adapter*. Runners that 
> more naturally support GBK can just use this and everything executes exactly 
> as before.
> *just like GroupByKeyViaGroupByKeyOnly and UnboundedReadFromBoundedSource
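
The two directions described in the issue can be sketched with a small in-memory model. This is only an illustration of the semantics, not Beam code: every function name below is hypothetical, collections are plain Python lists of (key, value) pairs, and value order within a group is preserved here only because the toy model is sequential. The first half treats CoGBK as the primitive and derives GBK as a single-input CoGBK. The second half is the adapter direction for a runner whose native grouping is GBK: tag each value with its input index, group once, then split the values back out per input.

```python
from collections import defaultdict

# NOTE: all names here are hypothetical; none of them are Beam APIs.


def co_group_by_key(inputs):
    """Toy CoGBK primitive: maps each key to a tuple holding one
    list of values per input collection (empty if the key is absent)."""
    n = len(inputs)
    grouped = {}
    for index, collection in enumerate(inputs):
        for key, value in collection:
            if key not in grouped:
                grouped[key] = tuple([] for _ in range(n))
            grouped[key][index].append(value)
    return grouped


def group_by_key(collection):
    """GBK as a composite: a CoGBK over exactly one input, unwrapped."""
    return {
        key: groups[0]
        for key, groups in co_group_by_key([collection]).items()
    }


def native_group_by_key(collection):
    """Stand-in for a runner's own GBK implementation."""
    grouped = defaultdict(list)
    for key, value in collection:
        grouped[key].append(value)
    return dict(grouped)


def co_group_by_key_via_gbk(inputs):
    """Adapter: CoGBK built on top of a native GBK.

    Each value is tagged with the index of the input it came from,
    all inputs are flattened and grouped once, and the tagged values
    are then split back into one list per input.
    """
    n = len(inputs)
    tagged = [
        (key, (index, value))
        for index, collection in enumerate(inputs)
        for key, value in collection
    ]
    result = {}
    for key, tagged_values in native_group_by_key(tagged).items():
        groups = tuple([] for _ in range(n))
        for index, value in tagged_values:
            groups[index].append(value)
        result[key] = groups
    return result
```

Under these assumptions, calling co_group_by_key and co_group_by_key_via_gbk on the same list of inputs yields the same mapping, which is the property that lets a GBK-based runner keep executing as before.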



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
