Cool! Thanks Kenn.
Jacob
On Mon, Nov 20, 2017 at 9:57 AM, Kenneth Knowles wrote:
> I wanted to follow up that this has been reproduced and diagnosed, and a
> fix is underway. The ticket to follow is
> https://issues.apache.org/jira/browse/BEAM-3219.
>
> Kenn
>
> On Fri, Nov 17, 2017 at 12:23 P
I wanted to follow up that this has been reproduced and diagnosed, and a
fix is underway. The ticket to follow is
https://issues.apache.org/jira/browse/BEAM-3219.
Kenn
On Fri, Nov 17, 2017 at 12:23 PM, Jacob Marble wrote:
> Here is a small pipeline job that fails using the Dataflow runner, but
On Fri, Nov 17, 2017 at 8:38 PM, Jacob Marble wrote:
> I also notice that stateful DoFns seem to only be instantiated once in
> Dataflow, but multiple instances do end up being created in the direct
> runner. Is there a story behind that?
>
The runner is free to instantiate a DoFn as often as i
I also notice that stateful DoFns seem to only be instantiated once in
Dataflow, but multiple instances do end up being created in the direct
runner. Is there a story behind that?
Jacob
On Fri, Nov 17, 2017 at 7:22 PM, Jacob Marble wrote:
> Noticing some related and unexpected differences betw
Noticing some related and unexpected differences between batch and
streaming pipelines.
Why does a stateful DoFn behave like GroupByKey (no data output until all
data input is complete) in a batch pipeline, but not in a streaming
pipeline? It looks like BatchStatefulParDoOverrides has something to
Here is a small pipeline job that fails using the Dataflow runner, but
doesn't fail using the direct runner.
https://gist.github.com/jacobmarble/804c2edb9c80a2863f3e671d6851a55f
Jacob
On Fri, Nov 17, 2017 at 9:27 AM, Kenneth Knowles wrote:
> It is definitely a big deal if @Setup is not getting
It is definitely a big deal if @Setup is not getting called! There are no
special cases that would skip @Setup. Please do report what you can.
That said, lazily doing setup (via a null check or some such, as you mention)
is perfectly fine and often a more robust programming pattern. Upside: you
can't
I tried to write a simpler DoFn that induces the error, but it works fine.
Working around the issue today by using @StartBundle with a null check, and
that seems to be working.
If this really is a big deal, then it needs to be reported, so I'll try to
find time to write a broken example.
Jacob
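For readers following the thread, here is a minimal sketch of the lazy-initialization workaround described above (a null check run at the start of each bundle). It is written without Beam dependencies so it runs on its own; the class names LazySetupSketch, ExpensiveClient and LazyFn are hypothetical, and the comments mark where the @StartBundle and @ProcessElement annotations would go in a real DoFn. This is not the code from the original pipeline, only an illustration of the pattern.

```java
import java.util.ArrayList;
import java.util.List;

public class LazySetupSketch {

    /** Hypothetical stand-in for an expensive resource (client, connection, parser). */
    static final class ExpensiveClient {
        static int created = 0; // counts constructions, for illustration only

        ExpensiveClient() {
            created++;
        }

        String process(String element) {
            return element.toUpperCase();
        }
    }

    /** Plain class shaped like a DoFn; assumed annotations are noted in comments. */
    static final class LazyFn {
        // Would be transient in a real DoFn so it is not serialized with the function.
        private transient ExpensiveClient client;

        // In a Beam DoFn this method would carry @StartBundle.
        void startBundle() {
            if (client == null) {
                // Runs once per instance, even if a setup hook was never invoked.
                client = new ExpensiveClient();
            }
        }

        // In a Beam DoFn this method would carry @ProcessElement.
        String processElement(String element) {
            return client.process(element);
        }
    }

    public static void main(String[] args) {
        LazyFn fn = new LazyFn();
        List<String> out = new ArrayList<>();
        // Simulate two bundles handled by the same instance.
        for (String[] bundle : new String[][] {{"a", "b"}, {"c"}}) {
            fn.startBundle();
            for (String element : bundle) {
                out.add(fn.processElement(element));
            }
        }
        System.out.println(out + " clients created: " + ExpensiveClient.created);
    }
}
```

Because the null check runs on every bundle, the resource is built once per instance no matter how many bundles that instance processes, and the pattern does not depend on whether or how often the runner creates new instances.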
O
Could you give more details, e.g. a code snippet that reproduces the issue,
and describe how you determine that @Setup hasn't been called?
On Thu, Nov 16, 2017 at 6:58 PM Derek Hao Hu wrote:
> I've been using the DoFn.Setup method in Dataflow and it seems to be working
> fine.
>
> On Thu, Nov 16,
I've been using the DoFn.Setup method in Dataflow and it seems to be working
fine.
On Thu, Nov 16, 2017 at 4:56 PM, Jacob Marble wrote:
> This one is weird.
>
> A DoFn I wrote:
> - stateful
> - used plenty in a streaming pipeline
> - direct and dataflow runners
> - works fine
>
> Now:
> - new batc