Hi Aljoscha,

Great idea.

 - The logic for matching up the windows is WindowFn#getSideInputWindow [1]
 - The SDK used to have something along the lines of what you describe [2]
   but we thought it was too runner-specific, directly referencing Dataflow
   details, and with a particular model of buffering + timer. Perhaps it is
   a starting place for your design?

Kenn

[1] https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/WindowFn.java#L131
[2] https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/1e5524a7f5d0d774488cb0206ea6433085461775/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker/StreamingSideInputDoFnRunner.java

On Wed, Apr 20, 2016 at 4:25 AM, Jean-Baptiste Onofré <[email protected]> wrote:

> Hi Aljoscha
>
> AFAIR, the Runner API Proposal document (from Kenneth) contains some
> points about side input.
>
> https://drive.google.com/folderview?id=0B-IhJZh9Ab52OFBVZHpsNjc4eXc&usp=sharing
>
> I don't think it goes into the details of side inputs and windows, but
> definitely the document we should extend.
>
> Regards
> JB
>
> On 04/20/2016 11:55 AM, Aljoscha Krettek wrote:
>
>> Hi,
>> for https://issues.apache.org/jira/browse/BEAM-102 we will need to have
>> some functionality that deals with side inputs and windows (of both the
>> main input and the side inputs) and how they get matched and how we wait
>> for windows (blocking). I imagine that we could add some component that
>> is similar to ReduceFnRunner but for side inputs: We would just
>> instantiate it with a factory for state storage, then push elements into
>> it while processing and it would provide a way to get a SideInputReader.
>>
>> I think this would not be specific to the Flink runner because other
>> runner implementors will face similar problems. Are there any ideas/design
>> docs about such a thing already? If not, we should probably start
>> designing.
>>
>> What do you think?
>>
>> Cheers,
>> Aljoscha
>>
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
