[
https://issues.apache.org/jira/browse/BEAM-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188816#comment-16188816
]
Eugene Kirpichov commented on BEAM-2949:
----------------------------------------
Per discussion with Luke, the design can be simple: expand
Read.from(BoundedSource) into (runner-specific primitive impulse source) +
ParDo(initial split into chunks of 64MB) + reshuffle + ParDo(read). This might
not even require any modifications on the worker.
Supporting UnboundedSource is the interesting part and will require SDF support
(i.e. the ParDo(read) will be a splittable one).
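
To make the proposed expansion concrete, here is a rough sketch of the four
stages in plain Python. It does not use the Beam SDK. OffsetRangeSource,
impulse, and expand_read are hypothetical names invented for illustration, and
the in-memory shuffle is only a stand-in for a real runner reshuffle, which
would redistribute the chunks across workers.

```python
import random
from dataclasses import dataclass
from typing import Iterator, List, Optional

DESIRED_CHUNK_BYTES = 64 * 1024 * 1024  # the 64MB chunk size from the comment


@dataclass(frozen=True)
class OffsetRangeSource:
    """Hypothetical stand-in for a BoundedSource over offsets [start, stop)."""
    start: int
    stop: int

    def split(self, desired_size: int) -> Iterator["OffsetRangeSource"]:
        # Initial split: cut the range into consecutive chunks of at most
        # desired_size; the last chunk may be smaller.
        if desired_size <= 0:
            raise ValueError("desired_size must be positive")
        pos = self.start
        while pos < self.stop:
            end = min(pos + desired_size, self.stop)
            yield OffsetRangeSource(pos, end)
            pos = end

    def read(self) -> Iterator[int]:
        # Toy reader: one record per offset in the range.
        return iter(range(self.start, self.stop))


def impulse() -> List[bytes]:
    # Stand-in for the runner-specific primitive: emits a single empty element.
    return [b""]


def expand_read(source: OffsetRangeSource,
                desired_size: int = DESIRED_CHUNK_BYTES,
                rng: Optional[random.Random] = None) -> List[int]:
    """impulse -> ParDo(split) -> reshuffle -> ParDo(read), simulated locally."""
    rng = rng or random.Random()
    # ParDo(initial split): each impulse element fans out into sub-sources.
    chunks = [c for _ in impulse() for c in source.split(desired_size)]
    # Reshuffle (assumption: modeled here as a local random permutation).
    rng.shuffle(chunks)
    # ParDo(read): each chunk is read independently of the others.
    return [record for chunk in chunks for record in chunk.read()]
```

With a small chunk size, for example expand_read(OffsetRangeSource(0, 10),
desired_size=4), the source is cut into three sub-sources and every offset is
read exactly once, though not necessarily in order.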
> Initial splitting
> -----------------
>
> Key: BEAM-2949
> URL: https://issues.apache.org/jira/browse/BEAM-2949
> Project: Beam
> Issue Type: Sub-task
> Components: beam-model
> Reporter: Henning Rohde
> Assignee: Eugene Kirpichov
> Labels: portability
>
> Initial splitting is a useful feature that should work in this setup,
> especially for runners that do not support dynamic splitting. Design TBD.