Re: [VOTE] Release 2.26.0, release candidate #1

2020-12-08 Thread Robert Burke
I'm +1 on RC1 based on the 7 tests I know I can check successfully. I'll be trying more tomorrow, but remember that release validation requires the community to validate it meets our standards, and I can't do it alone. Remember you can participate in the release validation by reviewing parts of

Re: Proposal: Scheduled tasks

2020-12-08 Thread Chad Dombrova
Thanks! On Tue, Dec 8, 2020 at 6:54 AM Pablo Estrada wrote: > Hi Chad! > I've been meaning to review this; I've just not carved out the time. I'll > try to get back to you this week with some thoughts! > Thanks! > -P. > > On Wed, Dec 2, 2020 at 10:31 AM Chad Dombrova wrote: > >> Hi everyone, >>

Re: Use case for reading to dynamic Pub/Sub subscriptions?

2020-12-08 Thread Vincent Marquez
KafkaIO has a readAll method that returns a PTransform taking a PCollection as input and producing a PCollection. It could then read from a 'dynamic' number of topics generated somewhere else. Is that what you mean? *~Vincent* On Tue, Dec 8, 2020 at 5:15 PM Daniel Collins wrote: > /s/Combine/Flatten > > On

Re: Use case for reading to dynamic Pub/Sub subscriptions?

2020-12-08 Thread Daniel Collins
/s/Combine/Flatten On Tue, Dec 8, 2020 at 8:06 PM Daniel Collins wrote: > Hi all, > > I'm trying to figure out if there's any possible use for reading from a > dynamic set of Pub/Sub [Lite] subscriptions in a beam pipeline, although > the same logic would apply to kafka topics. Does anyone know

Use case for reading to dynamic Pub/Sub subscriptions?

2020-12-08 Thread Daniel Collins
Hi all, I'm trying to figure out if there's any possible use for reading from a dynamic set of Pub/Sub [Lite] subscriptions in a beam pipeline, although the same logic would apply to kafka topics. Does anyone know of a use case where you'd want to apply the same set of processing logic to all

Re: Throttling stream outputs per trigger?

2020-12-08 Thread Boyuan Zhang
I think your understanding is correct. Does the CommitOffset transform have side-effects on your pipeline? On Tue, Dec 8, 2020 at 3:35 PM Vincent Marquez wrote: > > *~Vincent* > > > On Tue, Dec 8, 2020 at 3:13 PM Boyuan Zhang wrote: > >> Please note that each record output from

Re: Throttling stream outputs per trigger?

2020-12-08 Thread Vincent Marquez
*~Vincent* On Tue, Dec 8, 2020 at 3:13 PM Boyuan Zhang wrote: > Please note that each record output from ReadFromKafkaDoFn is in a > GlobalWindow. The workflow is: > ReadFromKafkaDoFn -> Reshuffle -> Window.into(FixedWindows) -> > Max.longsPerKey -> CommitDoFn >

Re: Throttling stream outputs per trigger?

2020-12-08 Thread Boyuan Zhang
Please note that each record output from ReadFromKafkaDoFn is in a GlobalWindow. The workflow is: ReadFromKafkaDoFn -> Reshuffle -> Window.into(FixedWindows) -> Max.longsPerKey -> CommitDoFn | --->

Re: Caching issue in BigQueryIO

2020-12-08 Thread Reuven Lax
How long does it take to rebuild? Even for thousands of tables I would not expect it to take very long, unless you are hitting quota rate limits with BigQuery. If that's the case, maybe a better solution is to see if those quotas could be raised? On Fri, Dec 4, 2020 at 9:57 AM Vasu Gupta wrote:

[PROPOSAL] Preparing for Beam release 2.27.0

2020-12-08 Thread Pablo Estrada
Hello everyone! The next Beam release (2.27.0) is scheduled to be cut on December 16th according to the release calendar [1]. I'd like to volunteer to handle this release. I plan on cutting the branch on December 16th as scheduled. I'll keep updates on this thread. Any comments or objections?

Re: Implementing ARR_AGG

2020-12-08 Thread Robin Qiu
Hi Sonam, I replied directly to your draft PR. Please see my comments there and let me know if that is helpful. On Mon, Dec 7, 2020 at 4:37 AM Sonam Ramchand < sonam.ramch...@venturedive.com> wrote: > Hi Devs, > I have tried to implement the ARR_AGG function for Zetasql dialect by > following

Re: Throttling stream outputs per trigger?

2020-12-08 Thread Vincent Marquez
*~Vincent* On Tue, Dec 8, 2020 at 1:34 PM Boyuan Zhang wrote: > Hi Vincent, > > Window.into(FixedWindows.of(Duration.standardMinutes(5))) just > applies the window information to each element; it does not actually > perform the grouping. And in the commit transform, there is a combine >

Re: Throttling stream outputs per trigger?

2020-12-08 Thread Boyuan Zhang
Hi Vincent, Window.into(FixedWindows.of(Duration.standardMinutes(5))) just applies the window information to each element; it does not actually perform the grouping. And in the commit transform, there is a combine transform applied (Max.longsPerKey()).
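
The distinction drawn in this message, between assigning a window to each element and actually grouping, can be sketched in plain Python. This is a minimal illustration only, not Beam code: the helper names `assign_window`, `window_into`, and `max_per_key_and_window`, the record layout `(key, offset, timestamp)`, and the sample data are all assumptions made up for the example.

```python
# Plain-Python sketch (NOT the Beam API): all names and data here are hypothetical.

WINDOW_SECONDS = 5 * 60  # five-minute fixed windows, as in the thread


def assign_window(timestamp, size=WINDOW_SECONDS):
    """Return the [start, end) fixed window that contains the timestamp."""
    start = timestamp - (timestamp % size)
    return (start, start + size)


def window_into(records, size=WINDOW_SECONDS):
    """Tag each (key, offset, timestamp) record with its window.

    Nothing is grouped or held back here: one record in, one record out.
    """
    return [(key, offset, assign_window(ts, size)) for key, offset, ts in records]


def max_per_key_and_window(windowed):
    """Group by (key, window) and keep the largest offset in each group.

    This is the step where grouping actually happens.
    """
    result = {}
    for key, offset, window in windowed:
        group = (key, window)
        result[group] = max(offset, result.get(group, offset))
    return result


# Hypothetical sample: (partition key, offset, event time in seconds)
records = [("p0", 10, 5), ("p0", 12, 200), ("p0", 15, 310), ("p1", 7, 299)]
windowed = window_into(records)
maxima = max_per_key_and_window(windowed)
```

In this sketch `windowed` still holds four records, each only labelled with a window, while `maxima` holds one value per key and window. That mirrors the point above: the windowing step by itself reduces nothing, and the combine is what collapses elements per window.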

Re: Throttling stream outputs per trigger?

2020-12-08 Thread Vincent Marquez
If it is the case that the pipeline has no way of enforcing fixed time windows, how does this work: https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaCommitOffset.java#L126 Isn't this supposed to only trigger every five minutes,

Re: Add emilymye to JIRA contributers

2020-12-08 Thread Pablo Estrada
Hi Emily, I've checked and someone has added you. Best -P. On Tue, Dec 8, 2020 at 10:18 AM Griselda Cuevas wrote: > Hi Emily, did you get access? > > On Mon, 7 Dec 2020 at 13:33, Emily Ye wrote: > >> Hi dev@ >> >> Just realized I didn't request contributor access to JIRA a while back - >>

Re: Docker Development Environment

2020-12-08 Thread Alex Kosolapov
Hi! Thank you for creating the Docker build environment; it makes build environment setup so much easier! I ran start-build-env.sh on macOS and ran into some items that I wanted to share, along with proposals for improving the Docker build environment's macOS support: 1. ./start-build-env.sh: line 75:

Getting Sprint Management rights on Jira

2020-12-08 Thread Griselda Cuevas
Hi folks, I'd like to request Sprint management rights on Jira to organize the Website redesign work around the content creation part. Could someone in the PMC help me with this?

Re: Add emilymye to JIRA contributers

2020-12-08 Thread Griselda Cuevas
Hi Emily, did you get access? On Mon, 7 Dec 2020 at 13:33, Emily Ye wrote: > Hi dev@ > > Just realized I didn't request contributor access to JIRA a while back - > would someone be able to add me so I can self-assign my issues? > > Thank you! > Emily >

Re: Proposal: Scheduled tasks

2020-12-08 Thread Pablo Estrada
Hi Chad! I've been meaning to review this; I've just not carved out the time. I'll try to get back to you this week with some thoughts! Thanks! -P. On Wed, Dec 2, 2020 at 10:31 AM Chad Dombrova wrote: > Hi everyone, > Beam's niche is low latency, high throughput workloads, but Beam has >