Yes, apparently it might have been an error related to the Dataflow runner;
Google is investigating the case. They gave a few suggestions that solved
the issue, but I am not sure which one was the real solution or why.

On Wed, 24 Feb 2021 at 03:11, Yichi Zhang <[email protected]> wrote:

> Reshuffle doesn't change your windowing or grouping; it simply
> redistributes the elements to different workers. The output should match
> the input of the Reshuffle step. Are you seeing fewer elements coming out
> of Reshuffle compared to the input?
>
> On Wed, Feb 17, 2021 at 9:11 AM Manninger, Matyas <
> [email protected]> wrote:
>
>> Dear Beam users,
>>
>> I have a problem running a Python pipeline in Dataflow. Because of many
>> side inputs and a complicated architecture, Google told us that their
>> optimization algorithm gets messed up and that adding a Reshuffle to the
>> pipeline solves the issue. Unfortunately, it seems like the Reshuffle step
>> is not working properly. I added a 60-second fixed window in front of it,
>> as this is a streaming pipeline. It seems like elements get added to the
>> step but remain grouped or something like that, as only very few elements
>> come out of the step. Any ideas what I might be doing wrong? The code is
>> very long and complicated, and I would rather not share it, but are there
>> any typical mistakes regarding reshuffling?
>>
>> Thanks for any tips,
>> Matyas
>>
>
