I would like to propose a change to Beam to make CoGBK the basis for
grouping instead of GBK. The idea behind this proposal is that CoGBK is a
more powerful operator than GBK, which gives two key benefits:

1) SDKs are simplified: transforming a CoGBK into a GBK is trivial while
the reverse is not.
2) It will be easier for runners to provide more efficient implementations
of CoGBK, since each runner will own the logic that maps its internal
grouping implementation onto a CoGBK.
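
To illustrate point 1, here is a minimal sketch in plain Python. It is an
in-memory model, not the Beam API, and the names co_group_by_key and
group_by_key are hypothetical. It only shows that once a CoGBK exists, a GBK
is a CoGBK over a single input followed by unwrapping the one result group.

```python
from collections import defaultdict


def co_group_by_key(inputs):
    """Group several keyed inputs at once (hypothetical in-memory model).

    `inputs` is a list of iterables of (key, value) pairs. The result maps
    each key to a list holding one list of values per input, in input order.
    """
    n = len(inputs)
    grouped = defaultdict(lambda: [[] for _ in range(n)])
    for index, pairs in enumerate(inputs):
        for key, value in pairs:
            grouped[key][index].append(value)
    return dict(grouped)


def group_by_key(pairs):
    """GBK expressed as a composite: CoGBK over exactly one input."""
    return {key: groups[0] for key, groups in co_group_by_key([pairs]).items()}


if __name__ == "__main__":
    print(group_by_key([("a", 1), ("a", 2), ("b", 3)]))
```

The reverse direction needs extra machinery (tagging values by input and
splitting them again after grouping), which is why it is the less trivial of
the two.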

This requires the following modifications to the Beam code base:

1) Make GBK a composite transform in terms of CoGBK.
2) Move the CoGBK from contrib to runners-core as an adapter*. Runners that
more naturally support GBK can just use this and everything executes
exactly as before.

*just like GroupByKeyViaGroupByKeyOnly and UnboundedReadFromBoundedSource
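
As a rough picture of what the adapter in step 2 would do, here is a sketch
in plain Python. Again this is an in-memory model rather than Beam code, and
gbk_primitive and co_group_by_key_via_gbk are hypothetical names. The
assumption is that the adapter tags each value with the index of its input,
flattens the inputs, runs the runner's existing GBK, and then splits each
group back out by tag.

```python
from collections import defaultdict


def gbk_primitive(pairs):
    """Stand-in for a runner's native GBK (hypothetical in-memory model)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def co_group_by_key_via_gbk(inputs):
    """Adapter sketch: implement CoGBK on top of a plain GBK."""
    n = len(inputs)
    # Tag every value with the index of the input it came from, then flatten.
    tagged = [
        (key, (index, value))
        for index, pairs in enumerate(inputs)
        for key, value in pairs
    ]
    result = {}
    for key, tagged_values in gbk_primitive(tagged).items():
        # Split the single group back into one list per input.
        groups = [[] for _ in range(n)]
        for index, value in tagged_values:
            groups[index].append(value)
        result[key] = groups
    return result


if __name__ == "__main__":
    print(co_group_by_key_via_gbk([[("a", 1)], [("a", "x"), ("b", "y")]]))
```

A runner with a native multi-input grouping would skip this adapter and
provide its own CoGBK directly.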
