Re: Limiting elements on streaming pipelines

2020-11-03 Thread Reza Ardeshir Rokni
Hi, You may want to use more than one element in your Create to start the FlatMap process as with a runner that does Fusion , the code will end up only being able to parallelize to 1. So make use of a Create

Limiting elements on streaming pipelines

2020-11-03 Thread André Rocha Silva
Fellow users I am not very used to making streaming pipelines, but I have a batch to write to pub/sub. My pipeline starts with a 'fake' element only to trigger the next step. Then in a FlatMap I use a For that yields many elements inside a for. But in the last step I've got only 100 elements

Re: October 2020, Beam Community Update

2020-11-03 Thread Matthias Baetens
This is really awesome indeed, Brittany :) If I could only read Alexey's mind, but my hunch is that this could come in blogpost format on the website (as well) and sharing it on the Twitter handle? On Tue, 3 Nov 2020 at 20:54, Pablo Estrada wrote: > Hi Alexey, > Do you have any other place in

Re: October 2020, Beam Community Update

2020-11-03 Thread Pablo Estrada
Hi Alexey, Do you have any other place in mind? I don't think Brittany has current plans to publish this elsewhere, but if you have any good ideas, I imagine she could consider them : ) Best -P. On Tue, Nov 3, 2020 at 8:23 AM Alexey Romanenko wrote: > Thanks for doing this! > > Is it going to

Re: October 2020, Beam Community Update

2020-11-03 Thread Alexey Romanenko
Thanks for doing this! Is it going to be published somewhere else than only the mailing lists? > On 31 Oct 2020, at 00:03, Brittany Hermann wrote: > > Hi everyone, > > Attached is the October 2020 Beam Community Update. The purpose of this > newsletter is purely community focused - giving

ParquetIO.sink fails on embedded Flink runner in Java classic

2020-11-03 Thread Chris Hinds
Hey, Flink is super useful for Beam development, but I’m having trouble writing data to Parquet. Everything works fine on DirectRunner, DataflowRunner, and FlinkRunner against a local cluster (1.10.2). However, when I use FlinkRunner in embedded mode, only a subset of my data arrive on the