Luke Cwik created BEAM-490:
------------------------------

             Summary: Swap to using CoGBK as grouping primitive instead of GBK
                 Key: BEAM-490
                 URL: https://issues.apache.org/jira/browse/BEAM-490
             Project: Beam
          Issue Type: Improvement
          Components: beam-model-fn-api, beam-model-runner-api, runner-core
            Reporter: Luke Cwik


The intent is to leave the semantics of both GBK (GroupByKey) and CoGBK
(CoGroupByKey) unchanged and only swap which of the two is the primitive.

CoGBK is a more powerful operator than GBK, which allows for two key benefits:

1) SDKs are simplified: expressing a GBK in terms of a CoGBK is trivial, while 
the reverse is not.
2) It will be easier for runners to provide more efficient implementations of 
CoGBK as they will be responsible for the logic which takes their own internal 
grouping implementation and maps it onto a CoGBK.
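
To illustrate the first benefit, here is a minimal sketch in plain Python. It
is not the Beam SDK API: `co_group_by_key` and `group_by_key` are hypothetical
names, and a collection is modelled as a plain list of (key, value) pairs. It
only shows that a GBK falls out of a CoGBK with a single tagged input.

```python
from collections import defaultdict


def co_group_by_key(tagged_inputs):
    """Toy model of CoGBK (hypothetical helper, not a Beam API).

    tagged_inputs maps a tag to a list of (key, value) pairs.
    Returns {key: {tag: [values]}}, with every tag present for every key.
    """
    result = defaultdict(lambda: {tag: [] for tag in tagged_inputs})
    for tag, pairs in tagged_inputs.items():
        for key, value in pairs:
            result[key][tag].append(value)
    return dict(result)


def group_by_key(pairs):
    """GBK as a composite: a CoGBK over one input, then unwrap the single tag."""
    grouped = co_group_by_key({"only": pairs})
    return {key: by_tag["only"] for key, by_tag in grouped.items()}
```

For example, `group_by_key([("a", 1), ("b", 2), ("a", 3)])` groups the two
values for "a" together and leaves "b" with a single value.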

This requires the following modifications to the Beam code base:

1) Make GBK a composite transform in terms of CoGBK.
2) Move the CoGBK from contrib to runners-core as an adapter*. Runners that 
more naturally support GBK can just use this and everything executes exactly as 
before.

*just like GroupByKeyViaGroupByKeyOnly and UnboundedReadFromBoundedSource
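
The adapter idea in step 2 could look roughly like the sketch below, again in
plain Python rather than real Beam code. The names `native_group_by_key` and
`co_group_by_key_via_gbk` are hypothetical, and the tag/flatten/group/split
approach is an assumption about how such an adapter might work, not a
description of the existing implementation.

```python
from collections import defaultdict


def native_group_by_key(pairs):
    """Stand-in for a runner's own GBK over (key, value) pairs (hypothetical)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def co_group_by_key_via_gbk(tagged_inputs):
    """Adapter sketch: build CoGBK on top of a runner's native GBK.

    tagged_inputs maps a tag to a list of (key, value) pairs.
    Returns {key: {tag: [values]}}.
    """
    # Step 1: tag each value with the input it came from and flatten all inputs.
    flattened = [
        (key, (tag, value))
        for tag, pairs in tagged_inputs.items()
        for key, value in pairs
    ]
    # Step 2: run a single native GBK over the flattened, tagged values.
    grouped = native_group_by_key(flattened)
    # Step 3: split each key's values back out by tag.
    result = {}
    for key, tagged_values in grouped.items():
        by_tag = {tag: [] for tag in tagged_inputs}
        for tag, value in tagged_values:
            by_tag[tag].append(value)
        result[key] = by_tag
    return result
```

The extra tagging and splitting steps are what make this direction less
trivial than wrapping a GBK around a CoGBK, which is the asymmetry the issue
points to.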



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
