[ https://issues.apache.org/jira/browse/BEAM-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854916#comment-15854916 ]
ASF GitHub Bot commented on BEAM-65:
------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/1895

> SplittableDoFn
> --------------
>
> Key: BEAM-65
> URL: https://issues.apache.org/jira/browse/BEAM-65
> Project: Beam
> Issue Type: New Feature
> Components: beam-model
> Reporter: Daniel Halperin
> Assignee: Eugene Kirpichov
> Priority: Minor
>
> SplittableDoFn is a proposed enhancement for "dynamically splittable work" to
> the Beam model.
> Among other things, it would allow a unified implementation of
> bounded/unbounded sources with dynamic work rebalancing and the ability to
> express multiple scalable steps (e.g., global expansion -> file sizing &
> parsing -> splitting files into independently-processable blocks) via
> composition rather than inheritance.
> This would make it much easier to implement many types of sources, to modify
> and reuse existing sources. Also, it would improve scalability of the Beam
> model by moving things like splitting a source from the control plane (where
> it is today -- glob -> List<FileBasedSource> sent over service APIs) into the
> data plane (PCollection<Glob> -> PCollection<FileName> -> ...).

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)