GitHub user mxm opened a pull request:

    https://github.com/apache/incubator-beam/pull/104

    [BEAM-158] add support for bounded sources in streaming

    Apart from a few improvements, this PR adds support for bounded sources in 
streaming mode. The BoundedSource wrapper (`SourceInputFormat`) is the same one 
used by the batch part of the runner. When reading from the source, the 
translator assigns ingestion-time watermarks and processing-time timestamps. 
Watermark generation could be made more flexible if we had an UnboundedSource 
wrapper for a BoundedSource.
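
    As a rough illustration of the timestamp and watermark assignment described 
above, here is a minimal sketch in plain Java. It is not the translator code 
from this PR: `IngestionTimeSketch`, `read`, and the string-based output are 
hypothetical names made up for the example, and emitting a final maximum 
watermark once the input is exhausted is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

public class IngestionTimeSketch {

  // Hypothetical helper, not part of the PR: drains a bounded input and
  // records what a streaming reader would emit downstream.
  static <T> List<String> read(Iterator<T> source, LongSupplier clock) {
    List<String> emitted = new ArrayList<>();
    long lastTimestamp = Long.MIN_VALUE;
    long lastWatermark = Long.MIN_VALUE;
    while (source.hasNext()) {
      T element = source.next();
      // Timestamp = processing time at the moment of reading, kept non-decreasing.
      lastTimestamp = Math.max(clock.getAsLong(), lastTimestamp);
      emitted.add("element " + element + " @ " + lastTimestamp);
      // The watermark trails the timestamp by one so that a later element
      // carrying the same timestamp is not considered late.
      long watermark = lastTimestamp - 1;
      if (watermark > lastWatermark) {
        lastWatermark = watermark;
        emitted.add("watermark " + watermark);
      }
    }
    // Assumption of this sketch: once the bounded input is exhausted,
    // a final maximum watermark signals that no more data will arrive.
    emitted.add("watermark " + Long.MAX_VALUE);
    return emitted;
  }

  public static void main(String[] args) {
    List<String> words = Arrays.asList("a", "b", "c");
    for (String line : read(words.iterator(), System::currentTimeMillis)) {
      System.out.println(line);
    }
  }
}
```

    Passing the clock in as a `LongSupplier` keeps the sketch testable with a 
fixed sequence of times instead of the wall clock.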
    
    Perhaps we could have a common utility for runners to handle serialization 
of PipelineOptions, since every runner has to ship them at some point. I had to 
change the serialization code because I ran into a bug that sent serialization 
into a loop. Debugging this was almost impossible because the stack trace 
doesn't show all serialization calls due to some magic in the VM. I didn't find 
any cyclic references between the PipelineOptions and Flink components, so I'm 
assuming this is a bug, and the workaround of serializing the options to a byte 
array seems fair enough. See `SourceInputFormat`.
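
    The byte-array workaround mentioned above can be sketched roughly as 
follows. This is only an illustration under stated assumptions, not the code 
in `SourceInputFormat`: `Options` is a hypothetical stand-in for 
PipelineOptions, `SourceWrapper` is a hypothetical stand-in for the wrapper, 
and plain `java.io` serialization is used to produce the bytes, whereas the 
actual change may encode the options differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ByteArrayOptionsSketch {

  /** Hypothetical stand-in for PipelineOptions. */
  static class Options implements Serializable {
    private static final long serialVersionUID = 1L;
    final String jobName;
    final int parallelism;

    Options(String jobName, int parallelism) {
      this.jobName = jobName;
      this.parallelism = parallelism;
    }
  }

  /** Hypothetical stand-in for a wrapper that has to be shipped to workers. */
  static class SourceWrapper implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: the outer serializer never walks the options object graph.
    private transient Options options;
    // The options travel as an opaque byte array instead.
    private final byte[] serializedOptions;

    SourceWrapper(Options options) throws IOException {
      this.options = options;
      this.serializedOptions = toBytes(options);
    }

    // Restores the options lazily after the wrapper has been deserialized.
    Options getOptions() throws IOException, ClassNotFoundException {
      if (options == null) {
        options = (Options) fromBytes(serializedOptions);
      }
      return options;
    }
  }

  static byte[] toBytes(Object value) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(value);
    }
    return buffer.toByteArray();
  }

  static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      return in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    SourceWrapper original = new SourceWrapper(new Options("demo", 4));
    // Simulate shipping the wrapper to another JVM.
    SourceWrapper shipped = (SourceWrapper) fromBytes(toBytes(original));
    System.out.println(shipped.getOptions().jobName + " " + shipped.getOptions().parallelism);
  }
}
```

    Because the options are serialized eagerly in the constructor and stored 
as bytes, whatever framework serializes the wrapper only ever sees a `byte[]` 
field and cannot recurse into the options.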
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mxm/incubator-beam BEAM-158

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #104
    
----
commit f26674e7a7b30ee3f992edfc8e473df2a7ee3e80
Author: Maximilian Michels <[email protected]>
Date:   2016-03-30T14:43:04Z

    [flink] improve InputFormat wrapper and ReadSourceITCase

commit 03404c7f4a5656bb5c5c0a2510f12e33292fef01
Author: Maximilian Michels <[email protected]>
Date:   2016-03-30T17:05:27Z

    [flink] improvements to UnboundedSource translation

commit de574136a5b4ba9b75231b321d0190e23af3bac2
Author: Maximilian Michels <[email protected]>
Date:   2016-03-31T08:18:01Z

    [BEAM-158] add support for bounded sources in streaming

----


