Re: A problem with nexmark build

2021-05-17 Thread Brian Hulette
Hm, it looks like there may be a bug in our Gradle config: it doesn't seem to make a shaded jar for use with Spark (see this comment on the PR that added this to the website [1]). Maybe we need to add a shadowJar configuration to :sdks:java:testing:nexmark? +dev, does anyone have context on this?

Re: Streaming Beam/Dataflow with schema evolution

2021-05-17 Thread Jing Zhang
I feel like this is related to https://www.youtube.com/watch?v=7lJyq1hw_KI#t=18m > On May 17, 2021, at 3:36 PM, Pierre Oberholzer > wrote: > > Dear Beam Community, > > We would like to run a streaming job from Pub/Sub to BigQuery that handles > schema updates "smoothly" (i.e. without having t

Streaming Beam/Dataflow with schema evolution

2021-05-17 Thread Pierre Oberholzer
Dear Beam Community, We would like to run a streaming job from Pub/Sub to BigQuery that handles schema updates "smoothly" (i.e. without having to restart a new job). Any suggestion on a suitable method/architecture to achieve this? We found the below unresolved question [1] on Stack Overflow whe
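
As one possible starting point (a hedged sketch, not from this thread): BigQueryIO in the Java SDK can be told to tolerate fields the destination table does not know yet, which keeps a streaming job alive while the table schema catches up. The table name is a placeholder, and `rows` is assumed to be a PCollection<TableRow> parsed from the Pub/Sub messages.

```
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;

// Assumption: `rows` is a PCollection<TableRow> built from the Pub/Sub payloads.
rows.apply(
    "WriteToBigQuery",
    BigQueryIO.writeTableRows()
        .to("my-project:my_dataset.events")   // placeholder table
        .ignoreUnknownValues()                // drop fields the table schema does not have yet
        .withCreateDisposition(CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(WriteDisposition.WRITE_APPEND));
```

This only smooths over additive changes on the message side; actually adding new columns to the table still has to happen outside the job (or via load jobs with schema update options).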

Beam 2.29.0 throwing warning/error when reading using the BigQuery Storage Read API

2021-05-17 Thread Filip Popić
Hi, I am trying to read data from a BigQuery table using the BigQuery Storage (Direct Read) API with Beam 2.29.0, providing a simple negated boolean row restriction (`NOT field_x`), but I am getting warnings and errors: ``` *~*~*~ Channel ManagedChannelI
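
For reference, a hedged sketch of how such a Storage Read API read is typically set up in the Java SDK; the table name is a placeholder and a Pipeline named `pipeline` is assumed. This only illustrates where the row restriction from the report goes; it is not a fix for the channel warning.

```
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.Method;
import org.apache.beam.sdk.values.PCollection;

PCollection<TableRow> rows =
    pipeline.apply(
        "DirectRead",
        BigQueryIO.readTableRows()
            .from("my-project:my_dataset.my_table")  // placeholder table
            .withMethod(Method.DIRECT_READ)          // BigQuery Storage Read API
            .withRowRestriction("NOT field_x"));     // restriction from the report
```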

Re: Is there a way (settings) to limit the number of elements per worker machine

2021-05-17 Thread Robert Bradshaw
Note that workers generally process one element per thread at a time. The number of threads defaults to the number of cores of the VM that you're using. On Mon, May 17, 2021 at 10:18 AM Brian Hulette wrote: > What type of files are you reading? If they can be split and read by > multiple workers
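
For Dataflow specifically, the harness thread count can be overridden via pipeline options; a hedged sketch (the value 4 is arbitrary, the option lives on the Dataflow runner's options, and `args` is assumed to come from main):

```
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

DataflowPipelineOptions options =
    PipelineOptionsFactory.fromArgs(args).withValidation().as(DataflowPipelineOptions.class);
// With one in-flight element per harness thread (as described above),
// this caps concurrent elements per worker VM at 4.
options.setNumberOfWorkerHarnessThreads(4);
```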

Re: Is there a way (settings) to limit the number of elements per worker machine

2021-05-17 Thread Brian Hulette
What type of files are you reading? If they can be split and read by multiple workers, this might be a good candidate for a Splittable DoFn (SDF). Brian On Wed, May 12, 2021 at 6:18 AM Eila Oriel Research wrote: > Hi, > I am running out of resources on the worker machines. > The reasons are: >
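
For context, a rough sketch of the SDF shape being suggested (illustrative only: it splits a local file's byte range and just emits the claimed offsets; a real implementation would read actual records):

```
import java.io.File;
import org.apache.beam.sdk.io.range.OffsetRange;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker;
import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;

/** Takes a local file path and lets the runner split its byte range across workers. */
@DoFn.BoundedPerElement
public class SplitFileFn extends DoFn<String, String> {

  @GetInitialRestriction
  public OffsetRange initialRestriction(@Element String path) {
    // The restriction is the file's byte range; the runner is free to split it.
    return new OffsetRange(0, new File(path).length());
  }

  @SplitRestriction
  public void splitRestriction(@Restriction OffsetRange range, OutputReceiver<OffsetRange> out) {
    // Pre-split into ~64 MB chunks so several workers can pick up pieces of one file.
    for (OffsetRange chunk : range.split(64 * 1024 * 1024, 1024 * 1024)) {
      out.output(chunk);
    }
  }

  @NewTracker
  public OffsetRangeTracker newTracker(@Restriction OffsetRange range) {
    return new OffsetRangeTracker(range);
  }

  @ProcessElement
  public void process(
      @Element String path,
      RestrictionTracker<OffsetRange, Long> tracker,
      OutputReceiver<String> out) {
    for (long offset = tracker.currentRestriction().getFrom();
        tracker.tryClaim(offset);
        offset++) {
      // A real SDF would read and emit the record at `offset`; this sketch
      // only emits the claimed position to stay short.
      out.output(path + "@" + offset);
    }
  }
}
```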

Re: DirectRunner, Fusion, and Triggers

2021-05-17 Thread Brian Hulette
> P.S. I need this pipeline to work both on a distributed runner and also on a local machine with many cores. That's why the performance of DirectRunner is important to me. IIUC the DirectRunner has intentionally made some trade-offs to make it less performant, so that it better verifies pipelines
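
If the slowdown is partly those extra verifications, the DirectRunner exposes options to relax them. A hedged sketch (with `args` assumed to come from main), and with the caveat that turning these off removes exactly the checks that make the DirectRunner useful for validating a pipeline:

```
import org.apache.beam.runners.direct.DirectOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

DirectOptions options =
    PipelineOptionsFactory.fromArgs(args).withValidation().as(DirectOptions.class);
// These checks exist to catch pipeline bugs early; disabling them trades safety for speed.
options.setEnforceImmutability(false);
options.setEnforceEncodability(false);
```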

Re: Unsubscribe

2021-05-17 Thread Brian Hulette
Hi Tarek, Pasan, You can unsubscribe by writing to user-unsubscr...@beam.apache.org [1] [1] https://apache.org/foundation/mailinglists.html#request-addresses-for-unsubscribing On Sun, May 16, 2021 at 6:04 AM Pasan Kamburugamuwa <pasankamburugamu...@gmail.com> wrote: > > On Sun, May 16, 2021,

Re: DirectRunner, Fusion, and Triggers

2021-05-17 Thread Jan Lukavský
On 5/17/21 3:46 PM, Bashir Sadjad wrote: Thanks Jan. Two points: - I was running all the experiments I reported with `--targetParallelism=1` to make sure concurrent threads do not mess up the logs. I think that is what causes what you see. Try to increase the parallelism to a number higher than
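
For reference, a hedged sketch of raising the DirectRunner parallelism Jan mentions (the value 4 is arbitrary):

```
import org.apache.beam.runners.direct.DirectOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

DirectOptions options = PipelineOptionsFactory.create().as(DirectOptions.class);
// Equivalent to passing --targetParallelism=4 on the command line.
options.setTargetParallelism(4);
```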

Re: DirectRunner, Fusion, and Triggers

2021-05-17 Thread Bashir Sadjad
Thanks Jan. Two points: - I was running all the experiments I reported with `--targetParallelism=1` to make sure concurrent threads do not mess up the logs. - I have been tracking bundles too (see @StartBundle log messages in the mini-example in my previous reply to Kenn). So I don't think bundle
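
As an aside, a minimal version of the kind of bundle-boundary logging described here (names are illustrative, not taken from the mini-example in the thread):

```
import org.apache.beam.sdk.transforms.DoFn;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Pass-through DoFn that logs when each bundle starts and finishes. */
public class LogBundlesFn extends DoFn<String, String> {
  private static final Logger LOG = LoggerFactory.getLogger(LogBundlesFn.class);

  @StartBundle
  public void startBundle() {
    LOG.info("Starting a bundle on thread {}", Thread.currentThread().getName());
  }

  @ProcessElement
  public void process(@Element String element, OutputReceiver<String> out) {
    out.output(element);
  }

  @FinishBundle
  public void finishBundle() {
    LOG.info("Finished a bundle");
  }
}
```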

Re: DirectRunner, Fusion, and Triggers

2021-05-17 Thread Jan Lukavský
Hi Bashir, the behavior you describe should be expected. DirectRunner splits the input work into bundles; processing each bundle might result in zero, one, or more new bundles. The executor executes the work associated with these bundles, enqueuing new bundles into a queue, until there are no
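
To make that description concrete, a purely conceptual sketch of the loop (this is not DirectRunner source; Bundle and process() are stand-ins):

```
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

public class BundleQueueSketch {
  // Stand-in for a bundle of elements; the runner's real bundle type is richer.
  static class Bundle {}

  // Stand-in for running the DoFns over one bundle; may produce new bundles.
  static List<Bundle> process(Bundle bundle) {
    return Collections.emptyList();
  }

  public static void main(String[] args) {
    Deque<Bundle> workQueue = new ArrayDeque<>();
    workQueue.add(new Bundle()); // initial bundle(s) read from the source
    while (!workQueue.isEmpty()) {
      // Processing one bundle may enqueue zero, one, or more new bundles.
      workQueue.addAll(process(workQueue.poll()));
    }
  }
}
```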