For portability reasons, the PushbackSideInputDoFnRunner will go away in
the long term since the Runner will have to filter elements before sending
them to the SDK for processing. Performing this filtering by a prior step
within the Runner is a reasonable solution and what Dataflow has adopted

On Fri, Mar 2, 2018 at 1:15 PM, Reuven Lax <> wrote:

> The point of PushbackSideInputDoFnRunner is to buffer the main input until
> the side input is ready (for a  sometimes complicated definition of ready).
> One possibility is instead to add a new prior step in the graph that is
> responsible for buffering these inputs. That way there's no need for a
> special DoFnRunner at all here.
> On Fri, Mar 2, 2018 at 1:01 PM Romain Manni-Bucau <>
> wrote:
>> Hi guys,
>> what's the rational behind PushbackSideInputDoFnRunner?
>> Why not using a DoFnRunner<WindowedValue<InputT>, OutputT>?
>> It is the same thing I think, better represents what it does (most is
>> delegated in general) and avoids yet another API which is not even
>> implemented completely in 1 of the 2 implementation cause the interface is
>> not relevant for one case (onTimer in ProcessFnRunner).
>> Worse case we keep the pushbacksideinputdofnrunner interface but extends
>> the dofn one to avoid to define other methods and we just break the process
>> method name which is the only one which was not copied (not sure why).
>> Personally I'd be to drop completely the interface but aliasing the first
>> one with a default method to bridge the process methods sounds good as well
>> and allows to reduce the forked code between both branches.
>> wdyt?
>> Romain Manni-Bucau
>> @rmannibucau <> |  Blog
>> <> | Old Blog
>> <> | Github
>> <> | LinkedIn
>> <> | Book
>> <>

Reply via email to