Re: Flink Runner logging FAILED_TO_UNCOMPRESS

2019-09-17 Thread Kyle Weaver
Actually, the reported issues are already fixed on head. We're just trying to prevent similar issues in the future. Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com On Tue, Sep 17, 2019 at 3:38 PM Ahmet Altay wrote: > > > On Tue, Sep 17, 2019 at 2:26 PM Maximilian

Re: Flink Runner logging FAILED_TO_UNCOMPRESS

2019-09-17 Thread Ahmet Altay
On Tue, Sep 17, 2019 at 2:26 PM Maximilian Michels wrote: > > Is not this flag set automatically for the portable runner > > Yes, the flag is set automatically, but it has been broken before and > likely will be again. It just adds additional complexity to portable > Runners. There is no other

Re: Flink Runner logging FAILED_TO_UNCOMPRESS

2019-09-17 Thread Maximilian Michels
Is not this flag set automatically for the portable runner Yes, the flag is set automatically, but it has been broken before and likely will be again. It just adds additional complexity to portable Runners. There is no other portability API than the Fn API. This flag historically had its

Re: Flink Runner logging FAILED_TO_UNCOMPRESS

2019-09-17 Thread Ahmet Altay
Could you make that change and see if it would have addressed the issue here? On Tue, Sep 17, 2019 at 2:18 PM Kyle Weaver wrote: > The flag is automatically set, but not in a smart way. Taking another look > at the code, a more resilient fix would be to just check if the runner > isinstance of

Re: Flink Runner logging FAILED_TO_UNCOMPRESS

2019-09-17 Thread Kyle Weaver
The flag is automatically set, but not in a smart way. Taking another look at the code, a more resilient fix would be to just check if the runner isinstance of PortableRunner. Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com On Tue, Sep 17, 2019 at 2:14 PM Ahmet Altay
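
A minimal sketch of the isinstance-based check suggested above, not the actual Beam code. The classes below are stand-ins defined only for illustration (they do not import apache_beam), and the helper name with_fn_api_experiment is hypothetical. The idea is that the experiment is added for any runner deriving from PortableRunner, rather than depending on how the runner was named or configured.

```python
# Stand-in class hierarchy for illustration only; these are NOT Beam's
# real classes, just minimal placeholders with the same names.
class PipelineRunner:
    """Placeholder base class for all runners."""


class PortableRunner(PipelineRunner):
    """Placeholder for a runner that executes pipelines via the Fn API."""


class FlinkRunner(PortableRunner):
    """Placeholder portable runner (assumed to derive from PortableRunner)."""


class DirectRunner(PipelineRunner):
    """Placeholder non-portable runner."""


FN_API_EXPERIMENT = "beam_fn_api"


def with_fn_api_experiment(runner, experiments):
    """Return a copy of `experiments`, adding beam_fn_api for portable runners.

    Hypothetical helper: the check is on the runner's type, so every
    subclass of PortableRunner gets the experiment automatically.
    """
    result = list(experiments)
    if isinstance(runner, PortableRunner) and FN_API_EXPERIMENT not in result:
        result.append(FN_API_EXPERIMENT)
    return result


if __name__ == "__main__":
    print(with_fn_api_experiment(FlinkRunner(), []))
    print(with_fn_api_experiment(DirectRunner(), ["some_other_experiment"]))
```

Because the check relies on the class hierarchy, a new portable runner would pick up the experiment without any extra bookkeeping, which is the resilience the message above is after.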

Re: Flink Runner logging FAILED_TO_UNCOMPRESS

2019-09-17 Thread Ahmet Altay
Is not this flag set automatically for the portable runner here [1] ? [1] https://github.com/apache/beam/blob/f0aa877b8703eed4143957b4cd212aa026238a6e/sdks/python/apache_beam/pipeline.py#L160 On Tue, Sep 17, 2019 at 2:07 PM Robert Bradshaw wrote: > On Tue, Sep 17, 2019 at 1:43 PM Thomas Weise

Re: Flink Runner logging FAILED_TO_UNCOMPRESS

2019-09-17 Thread Robert Bradshaw
On Tue, Sep 17, 2019 at 1:43 PM Thomas Weise wrote: > > +1 for making --experiments=beam_fn_api default. > > Can the Dataflow runner driver just remove the setting if it is not > compatible? The tricky bit would be undoing the differences in graph construction due to this flag flip. But I would

Re: Flink Runner logging FAILED_TO_UNCOMPRESS

2019-09-17 Thread Thomas Weise
+1 for making --experiments=beam_fn_api default. Can the Dataflow runner driver just remove the setting if it is not compatible? On Tue, Sep 17, 2019 at 11:33 AM Maximilian Michels wrote: > +dev > > The beam_fn_api flag and the way it is automatically set is error-prone. > Is there anything

Re: Possible Python SDK performance regression

2019-09-17 Thread Thomas Weise
Hi Valentyn, Thanks for the reminder. The bisect is on my TODO list. Hopefully this week. I saw the discussion about declaring 2.16 LTS. We probably need to sort these performance concerns out prior to doing so. Thomas On Tue, Sep 17, 2019 at 12:02 PM Valentyn Tymofieiev wrote: > Hi

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-17 Thread Ning Kang
Thank you Hannah! It works for me! On Mon, Sep 16, 2019 at 9:56 PM Mark Liu wrote: > Thank you Hannah! > > BTW, the fix is https://github.com/apache/beam/pull/9588. > Since this affects release, https://github.com/apache/beam/pull/9595 will > be cherry-picked to release branch. > > On Mon, Sep

Re: Possible Python SDK performance regression

2019-09-17 Thread Valentyn Tymofieiev
Hi Thomas, Just a reminder that 2.16.0 was cut and soon the voting may start, so to avoid the regression that you reported blocking the vote, it would be great to start investigating it if it is reproducible. Thanks, Valentyn On Tue, Sep 10, 2019 at 1:53 PM Valentyn Tymofieiev wrote: > Thomas,

Re: Pointers on Contributing to Structured Streaming Spark Runner

2019-09-17 Thread Rui Wang
+1 on merging the runner into master, which makes it more discoverable and easier to contribute to (I am also interested in contributing). -Rui On Tue, Sep 17, 2019 at 3:36 AM Alexey Romanenko wrote: > Hi Xinyu, > > Great to hear that you wish to contribute to the new Spark runner! We used > to have

Re: Flink Runner logging FAILED_TO_UNCOMPRESS

2019-09-17 Thread Maximilian Michels
+dev The beam_fn_api flag and the way it is automatically set is error-prone. Is there anything that prevents us from removing it? I understand that some Runners, e.g. the Dataflow Runner, have two modes of executing Python pipelines (legacy and portable), but at this point it seems clear that

Re: Pointers on Contributing to Structured Streaming Spark Runner

2019-09-17 Thread Alexey Romanenko
Hi Xinyu, Great to hear that you wish to contribute to the new Spark runner! We used to have sync meetings about all Spark runners in general every two weeks, so feel free to let us know if you want to participate too. Also, as one of the contributors to the Structured Streaming Spark runner