GitHub user jkff opened a pull request:
https://github.com/apache/incubator-beam/pull/1036
Introduces the Rebundle transform
It's similar to Reshuffle in that it prevents fusion of the surrounding
transforms. However, whereas Reshuffle requires the input collection to
consist of KVs, Rebundle efficiently generates sufficiently unique keys
itself.
This PR also uses Rebundle in Datastore. The transform will also be useful
in JdbcIO.
(I tried adapting it to also support a fixed number of bundles, as in the
Write transform, but this has hairy semantics in the unbounded case, so I
decided not to do it.)
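For readers unfamiliar with the pattern, here is a rough plain-Java sketch
of the idea described above: pair each element with a generated key, group
by that key, then drop the keys again. It does not use the Beam API and is
not the code in this PR. The class and method names are hypothetical, and
using a random int as the key is only an assumption about what
"sufficiently unique" could look like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical illustration only; not the Rebundle implementation from this PR.
public class RebundleSketch {

    // Steps 1 and 2: attach a generated key to each element and group by it.
    // Assumption: a random int is "sufficiently unique" for spreading elements.
    static <T> Map<Integer, List<T>> groupByRandomKey(List<T> input, Random random) {
        Map<Integer, List<T>> groups = new TreeMap<>();
        for (T element : input) {
            int key = random.nextInt();
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(element);
        }
        return groups;
    }

    // Step 3: discard the keys and emit only the values.
    static <T> List<T> dropKeys(Map<Integer, List<T>> groups) {
        List<T> output = new ArrayList<>();
        for (List<T> values : groups.values()) {
            output.addAll(values);
        }
        return output;
    }

    // The whole pattern: the caller never sees or supplies a key.
    static <T> List<T> rebundle(List<T> input, Random random) {
        return dropKeys(groupByRandomKey(input, random));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c", "d");
        // Same elements come back; their order is not guaranteed.
        System.out.println(rebundle(input, new Random()));
    }
}
```

In a real pipeline the grouping step is what would act as the fusion
barrier; in this sketch it only shows that the elements survive the round
trip while the caller stays unaware of the keys.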
R: @bjchambers
CC: @jbonofre @dhalperi
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jkff/incubator-beam reparallelize
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/1036.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1036
----
commit 58dc834c88b2a1249645f307276c97c9b9fa0e78
Author: Eugene Kirpichov <[email protected]>
Date: 2016-09-30T18:18:38Z
Introduces the Rebundle transform
It's similar to Reshuffle in that it prevents fusion
of the surrounding transforms, however while Reshuffle
requires the input collection to be KVs, Rebundle
efficiently generates sufficiently unique keys itself.
----