[
https://issues.apache.org/jira/browse/BEAM-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058370#comment-16058370
]
ASF GitHub Bot commented on BEAM-1377:
--------------------------------------
GitHub user jkff opened a pull request:
https://github.com/apache/beam/pull/3417
[BEAM-1377] Uses KV in SplittableParDo expansion instead of
ElementAndRestriction
This is a workaround for the following issue.
ElementAndRestriction is in runners-core, which may be shaded by runners
(and is shaded by the Dataflow runner), so it should be *both* produced
and consumed by workers. Currently, however, it is produced by (shaded)
SplittableParDo and consumed by (differently shaded) ProcessFn in the
runner's worker code.
There are several ways out of this, e.g. moving ElementAndRestriction
(EAR) into the SDK (icky because it's an implementation detail of
SplittableParDo), or using a type that's already in the SDK, which is
what this PR does with KV. There may be other more complicated ways too.
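
To illustrate the mismatch described above, here is a minimal,
self-contained sketch. It is not Beam code: ProducerEar and WorkerEar are
hypothetical stand-ins for the two differently relocated copies of
ElementAndRestriction, and java.util.Map.Entry stands in for a type that
both sides share unshaded (the role KV plays in this PR). Real shading
renames packages in bytecode; two distinct classes are assumed here to
model that effect.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

public class ShadingSketch {
    // Hypothetical stand-in: the copy of ElementAndRestriction relocated
    // into the jar that expands SplittableParDo.
    static final class ProducerEar<E, R> {
        final E element;
        final R restriction;
        ProducerEar(E element, R restriction) {
            this.element = element;
            this.restriction = restriction;
        }
    }

    // Hypothetical stand-in: the same source class, relocated under a
    // different prefix inside the runner's worker. To the JVM it is an
    // unrelated type.
    static final class WorkerEar<E, R> {
        final E element;
        final R restriction;
        WorkerEar(E element, R restriction) {
            this.element = element;
            this.restriction = restriction;
        }
    }

    // Worker-side consumer that expects its own copy of the class.
    public static String consumeEar(Object input) {
        // Throws ClassCastException when handed a ProducerEar.
        WorkerEar<?, ?> ear = (WorkerEar<?, ?>) input;
        return ear.element + "@" + ear.restriction;
    }

    // Worker-side consumer that expects a type shared by both sides.
    public static String consumeShared(Object input) {
        Map.Entry<?, ?> kv = (Map.Entry<?, ?>) input;
        return kv.getKey() + "@" + kv.getValue();
    }

    // True if handing the producer's class to the worker fails.
    public static boolean earHandoffFails() {
        try {
            consumeEar(new ProducerEar<>("file.txt", "[0, 100)"));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // The same hand-off using the shared pair type.
    public static String sharedHandoff() {
        return consumeShared(new SimpleImmutableEntry<>("file.txt", "[0, 100)"));
    }

    public static void main(String[] args) {
        System.out.println("EAR hand-off fails: " + earHandoffFails());
        System.out.println("Shared-type hand-off: " + sharedHandoff());
    }
}
```

The point of the sketch is only that a pair type visible to both sides
under one name avoids the cast failure; how the actual runner surfaces
the mismatch may differ.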
(This PR will require building a compatible Dataflow worker, so it will
naturally not pass tests initially)
R: @kennknowles
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jkff/incubator-beam sdf-kv
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/beam/pull/3417.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3417
----
commit c4885833c6edd37fcd3161a124292044ebe820da
Author: Eugene Kirpichov <[email protected]>
Date: 2017-06-21T22:21:32Z
Uses KV in SplittableParDo expansion instead of ElementAndRestriction
----
> Support Splittable DoFn in Dataflow streaming runner
> ----------------------------------------------------
>
> Key: BEAM-1377
> URL: https://issues.apache.org/jira/browse/BEAM-1377
> Project: Beam
> Issue Type: New Feature
> Components: runner-dataflow
> Reporter: Eugene Kirpichov
> Assignee: Eugene Kirpichov
> Fix For: 2.1.0
>
>
> Dataflow runner should support splittable DoFn.
> However, Dataflow batch and streaming runners will support it quite
> differently, streaming being the somewhat easier one. The current issue is
> about the streaming runner.