Cool! Thanks Kenn.
Jacob
On Mon, Nov 20, 2017 at 9:57 AM, Kenneth Knowles wrote:
> I wanted to follow up that this has been reproduced and diagnosed, and a
> fix is underway. The ticket to follow is
> https://issues.apache.org/jira/browse/BEAM-3219.
>
> Kenn
>
> On Fri, Nov 17, 2017 at 12:23 PM, Jacob Marble
> wrote:
>
>> Here is a small pipeline job that fails using the Dataflow runner, but
>> doesn't fail using the direct runner.
>>
>> https://gist.github.com/jacobmarble/804c2edb9c80a2863f3e671d6851a55f
>>
>> Jacob
>>
>> On Fri, Nov 17, 2017 at 9:27 AM, Kenneth Knowles wrote:
>>
>>> It is definitely a big deal if @Setup is not getting called! There are
>>> no special cases that would skip @Setup. Please do report what you can.
>>>
>>> That said, lazily doing setup (via null check or some such as you
>>> mention) is perfectly fine and often a more robust programming pattern.
>>> Upside: you can't accidentally use uninitialized things. Downside: it might
>>> mask repeated initialization and only manifest as poor performance.
>>>
>>> Kenn
>>>
>>> On Fri, Nov 17, 2017 at 9:00 AM, Jacob Marble
>>> wrote:
>>>
I tried to write a simpler DoFn that induces the error, but it works
fine. Working around the issue today by using @StartBundle with a null
check, and that seems to be working.
If this really is a big deal, then it needs to be reported, so I'll try
to find time to write a broken example.
Jacob
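
For readers following along, the @StartBundle null-check workaround described above might look roughly like the sketch below. It is plain Java with no Beam dependency, so it runs on its own. The class names (ExpensiveClient, LazySetupSketch) and the initCount counter are hypothetical and are not taken from the pipeline in this thread. In a real DoFn the three methods would presumably carry the @Setup, @StartBundle and @ProcessElement annotations.

```java
// Hypothetical stand-in for whatever expensive resource the DoFn needs
// (a client, a connection, a parsed config, ...).
class ExpensiveClient {
    String send(String element) {
        return "sent:" + element;
    }
}

// Plain-Java sketch of lazy setup via a null check. This is NOT a real
// Beam DoFn; comments mark where the Beam annotations would go.
class LazySetupSketch {
    // Null until first initialized. A DoFn is serialized before it is
    // shipped to workers, so such a field would typically be transient.
    private transient ExpensiveClient client;

    // Counts real initializations, so repeated setup shows up as a number
    // instead of only as poor performance.
    private int initCount = 0;

    // Would be annotated @Setup in a Beam DoFn.
    void setup() {
        ensureInitialized();
    }

    // Would be annotated @StartBundle. Safe even if setup() never ran.
    void startBundle() {
        ensureInitialized();
    }

    // Would be annotated @ProcessElement.
    String processElement(String element) {
        return client.send(element);
    }

    // The null check: initialize once, no matter which hook runs first.
    private void ensureInitialized() {
        if (client == null) {
            client = new ExpensiveClient();
            initCount++;
        }
    }

    int initCount() {
        return initCount;
    }
}
```

Calling startBundle() several times on one instance initializes the client only once. Logging or exporting a counter like initCount is one way to notice the repeated-initialization downside Kenn mentions.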
On Thu, Nov 16, 2017 at 10:27 PM, Eugene Kirpichov <
kirpic...@google.com> wrote:
> Could you give more details, e.g. a code snippet that reproduces the
> issue, and describe how you determine that @Setup hasn't been called?
>
> On Thu, Nov 16, 2017 at 6:58 PM Derek Hao Hu
> wrote:
>
>> I've been using DoFn.Setup method in Dataflow and it seems to be
>> working fine.
>>
>> On Thu, Nov 16, 2017 at 4:56 PM, Jacob Marble
>> wrote:
>>
>>> This one is weird.
>>>
>>> A DoFn I wrote:
>>> - stateful
>>> - used plenty in a streaming pipeline
>>> - direct and dataflow runners
>>> - works fine
>>>
>>> Now:
>>> - new batch pipeline
>>> - @DoFn.Setup method not called
>>> - direct runner works properly (logs from setup method are output)
>>> - dataflow runner simply doesn't call the setup method
>>>
>>> Is this possibly a Beam misuse? The Javadoc for DoFn.Setup doesn't hint
>>> at anything, so I suspect a Dataflow bug.
>>>
>>> Jacob
>>>
>>
>>
>>
>> --
>> Derek Hao Hu
>>
>> Software Engineer | Snapchat
>> Snap Inc.
>>