Re: [VOTE] Mark 2.7.0 branch as a long term support (LTS) branch

2018-11-12 Thread Connell O'Callaghan
+1 On Fri, Nov 9, 2018 at 11:06 PM Manu Zhang wrote: > +1 > > Thanks, > Manu Zhang > On Nov 10, 2018, 12:55 PM +0800, Kenneth Knowles , wrote: > > +1 > > On Fri, Nov 9, 2018, 12:04 Scott Wegner >> +1 Thanks for driving this Ahmet! >> >> On Fri, Nov 9, 2018 at 9:33 AM Chamikara Jayalath >>

Re: Bigquery streaming TableRow size limit

2018-11-12 Thread Reuven Lax
I'm a bit worried about making this automatic, as it can have unexpected side effects on BigQuery load-job quota. This is a 24-hour quota, so if it's accidentally exceeded, all load jobs for the project may be blocked for the next 24 hours. However, if the user opts in (possibly via a builder

Re: Evolving a Coder for an added field

2018-11-12 Thread Reuven Lax
A few thoughts: 1. I agree with you about coder versioning. The lack of a good story around versioning has been a huge pain here, and it's unfortunate that nobody ever worked on this. 2. I think versioning schemas will be easier than versioning coders (especially for adding new fields). In many

Beam Dependency Check Report (2018-11-12)

2018-11-12 Thread Apache Jenkins Server
High Priority Dependency Updates Of Beam Python SDK: Dependency Name Current Version Latest Version Release Date Of the Current Used Version Release Date Of The Latest Release JIRA Issue future 0.16.0 0.17.1 2016-10-27

Re: [Call for items] November Beam Newsletter

2018-11-12 Thread Matthias Baetens
Looks great, thanks for the effort and for including the Summit blogpost, Rose! On Thu, 8 Nov 2018 at 22:55 Rose Nguyen wrote: > Hi Beamers: > > Time to sync with the community on all the awesome stuff we've been doing! > > *Add the highlights from October to now (or planned events and talks)

Re: Design review for supporting AutoValue Coders and conversions to Row

2018-11-12 Thread Jeff Klukas
Reuven - A SchemaProvider makes sense. It's not clear to me, though, whether that's more limited than a Coder. Do all values of the schema have to be simple types, or does Beam SQL support nested schemas? Put another way, would a user be able to create an AutoValue class comprised of simple types

Re: Design review for supporting AutoValue Coders and conversions to Row

2018-11-12 Thread Reuven Lax
On Mon, Nov 12, 2018 at 11:38 PM Jeff Klukas wrote: > Reuven - A SchemaProvider makes sense. It's not clear to me, though, > whether that's more limited than a Coder. Do all values of the schema have > to be simple types, or does Beam SQL support nested schemas? > Nested schemas, collection

Bigquery streaming TableRow size limit

2018-11-12 Thread Wout Scheepers
Hey all, The TableRow size limit is 1 MB when streaming into BigQuery. To prevent data loss, I’m going to implement a TableRow size check and add a fan out to do a BigQuery load job in case the size is above the limit. Of course this load job would be windowed. I know it doesn’t make sense to
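The size check and fan out described in this message could look roughly like the sketch below. It is plain Python with no Beam or BigQuery calls. The names `row_size_bytes` and `split_by_size` are hypothetical, the 1,000,000-byte constant is only a stand-in for the "1mb" limit mentioned above, and measuring a row by its JSON encoding is an assumption, not necessarily how BigQuery counts row size.

```python
import json

# Stand-in for the streaming limit quoted in the thread; the exact byte
# count BigQuery enforces is an assumption here.
STREAMING_ROW_LIMIT_BYTES = 1_000_000


def row_size_bytes(row):
    """Approximate a row's size as the length of its UTF-8 JSON encoding."""
    return len(json.dumps(row).encode("utf-8"))


def split_by_size(rows, limit=STREAMING_ROW_LIMIT_BYTES):
    """Partition rows into (small, oversized).

    Small rows would continue to the streaming insert; oversized rows
    would be fanned out to a (windowed) load job instead.
    """
    small, oversized = [], []
    for row in rows:
        if row_size_bytes(row) > limit:
            oversized.append(row)
        else:
            small.append(row)
    return small, oversized
```

In a pipeline the same predicate would presumably drive a two-way partition of the input collection, with the oversized branch windowed before the load job.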

Re: [BEAM-5442] Store duplicate unknown (runner) options in a list argument

2018-11-12 Thread Maximilian Michels
Thank you Robert and Lukasz for your points. Note that I believe that we will want to have multiple URLs to support cross language pipelines since we will want to be able to ask other SDK languages/versions for their "list" of supported PipelineOptions. Why is that? The Runner itself is the

Re: Evolving a Coder for an added field

2018-11-12 Thread Jeff Klukas
Conversation here has fizzled, but it sounds like there's basically a consensus on the need for a new concept of Coder versioning that's accessible at the Java level in order to allow an evolution path. Further, it sounds like my open PR [0] for adding a new field to Metadata is essentially

Re: Bigquery streaming TableRow size limit

2018-11-12 Thread Lukasz Cwik
Having data ingestion work without needing to worry about how big the blobs are would be nice if it was automatic for users. On Mon, Nov 12, 2018 at 1:03 AM Wout Scheepers < wout.scheep...@vente-exclusive.com> wrote: > Hey all, > > > > The TableRow size limit is 1mb when streaming into bigquery.

Re: Evolving a Coder for an added field

2018-11-12 Thread Lukasz Cwik
Jeff, I reached out to a few people at Google who previously worked on the Coder update issue (being able to update streaming pipelines) to see if they can work with you or at least provide context. On Mon, Nov 12, 2018 at 10:16 AM Jeff Klukas wrote: > Conversation here has

JIRA contributor access

2018-11-12 Thread Andrea Foegler
Hi all - I'm part of the Dataflow team at Google and would like to add a few features to the Dataflow Runner. Could someone add me as a JIRA contributor to track the work? Thanks! Andrea

Re: [BEAM-5442] Store duplicate unknown (runner) options in a list argument

2018-11-12 Thread Lukasz Cwik
On Mon, Nov 12, 2018 at 9:38 AM Maximilian Michels wrote: > Thank you Robert and Lukasz for your points. > > > Note that I believe that we will want to have multiple URLs to support > cross language pipelines since we will want to be able to ask other SDK > languages/versions for their "list" of