Re: [CANCEL][VOTE] Release 2.3.0, release candidate #1

2018-02-04 Thread Jean-Baptiste Onofré
Hi guys, Quick update on the RC2 preparation: * BEAM-3587 (TextIO with Flink) seems related to a custom build with Gradle (not using artifacts created by Maven). Anyway, I will take a look today. * BEAM-3186 has a PR. Aljoscha will do the review pretty soon. * I'm also taking a look at the

Re: coder evolutions?

2018-02-04 Thread Jean-Baptiste Onofré
Done Regards JB On 02/04/2018 09:14 PM, Romain Manni-Bucau wrote: > Works for me. So a Jira with target version = 3. > > Can someone with the karma check that we have a 3.0.0 version in the Jira system, please? > > On Feb 4, 2018 at 20:46, "Reuven Lax" wrote: >

Re: [DISCUSS] State of the project: Culture and governance

2018-02-04 Thread Ted Yu
bq. 1. Propose a pull request to their fix branch. This is my favorite and I've mentioned it. Everything is straightforward and explicit. The above makes sense. When the contributor merges the reviewer's pull request, it signifies their willingness to adopt the suggestion, making the combined

Re: [DISCUSS] State of the project: Culture and governance

2018-02-04 Thread Romain Manni-Bucau
On Feb 4, 2018 at 19:53, "Kenneth Knowles" wrote: On Wed, Jan 31, 2018 at 1:11 AM, Ismaël Mejía wrote: > > For example if a > first-time contributor fixes some error and has the expected quality > we should accept it quickly, not being picky about some

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Romain Manni-Bucau
I'm off tonight, but can we try to do it next week (tomorrow)? If not, please reply to this thread with the outcomes and I'll catch up tomorrow morning. On Feb 4, 2018 at 20:23, "Reuven Lax" wrote: Cool, let's chat about this on slack for a bit (which I realized I've been signed out of

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
Works for me. So a Jira with target version = 3. Can someone with the karma check that we have a 3.0.0 version in the Jira system, please? On Feb 4, 2018 at 20:46, "Reuven Lax" wrote: > Seems fine to me. At some point we might want to do an audit of existing > Jira issues, because I suspect

Re: coder evolutions?

2018-02-04 Thread Reuven Lax
Seems fine to me. At some point we might want to do an audit of existing Jira issues, because I suspect there are issues that should be targeted to 3.0 but are not yet tagged. On Sun, Feb 4, 2018 at 11:41 AM, Jean-Baptiste Onofré wrote: > I would prefer to use Jira, with

Re: coder evolutions?

2018-02-04 Thread Jean-Baptiste Onofré
I would prefer to use Jira, with "wish"/"ideas", and adding Beam 3.0.0 version. WDYT ? Regards JB On 02/04/2018 07:55 PM, Reuven Lax wrote: > Do we have a good place to track the items for Beam 3.0, or is Jira the best > place? Romain has a good point - if this gets forgotten when we do Beam

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Reuven Lax
Cool, let's chat about this on slack for a bit (which I realized I've been signed out of for some time). Reuven On Sun, Feb 4, 2018 at 9:21 AM, Jean-Baptiste Onofré wrote: > Sorry guys, I was off today. Happy to be part of the party too ;) > > Regards > JB > > On 02/04/2018

Re: coder evolutions?

2018-02-04 Thread Reuven Lax
Do we have a good place to track the items for Beam 3.0, or is Jira the best place? Romain has a good point - if this gets forgotten when we do Beam 3.0, then we're stuck waiting around till Beam 4.0. Reuven On Sun, Feb 4, 2018 at 9:27 AM, Jean-Baptiste Onofré wrote: >

Re: [DISCUSS] State of the project: Culture and governance

2018-02-04 Thread Kenneth Knowles
On Wed, Jan 31, 2018 at 1:11 AM, Ismaël Mejía wrote: > > For example if a > first-time contributor fixes some error and has the expected quality > we should accept it quickly, not being picky about some other part > that can be improved that was discovered during the review,

Re: coder evolutions?

2018-02-04 Thread Jean-Baptiste Onofré
That's a good point. In the roadmap for Beam 3, I think it makes sense to add a point about this. Regards JB On 02/04/2018 06:18 PM, Eugene Kirpichov wrote: > I think doing a change that would break pipeline update for every single user > of > Flink and Dataflow needs to be postponed until a

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
Yep, sadly :( How should we track it properly so we don't forget it for v3? (I don't trust Jira much, but if we don't have anything better...) When do we start Beam 3? Next week? :) Romain Manni-Bucau

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Jean-Baptiste Onofré
Sorry guys, I was off today. Happy to be part of the party too ;) Regards JB On 02/04/2018 06:19 PM, Reuven Lax wrote: > Romain, since you're interested maybe the two of us should put together a > proposal for how to set these things (hints, schema) on PCollections? I don't > think it'll be hard

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Reuven Lax
Romain, since you're interested maybe the two of us should put together a proposal for how to set these things (hints, schema) on PCollections? I don't think it'll be hard - the previous list thread on hints already agreed on a general approach, and we would just need to flesh it out. BTW in the

Re: coder evolutions?

2018-02-04 Thread Eugene Kirpichov
I think a change that would break pipeline update for every single user of Flink and Dataflow needs to be postponed until the next major version. Pipeline update is a very frequently used feature, especially by the largest users. We've had those users get significantly upset even when we

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
I like this idea of migration support at the coder level. It would require adding metadata to all outputs to represent the version; coders could then handle the logic properly depending on the version - we can assume a coder dev bumps the version when they break the representation, I hope ;).
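
The version-tagged encoding described above could look roughly like the following. This is only a minimal sketch in plain Python, not Beam's Coder API: the class name VersionedCoder, the one-byte version prefix, and both payload layouts are assumptions made up for illustration.

```python
import struct


class VersionedCoder:
    """Hypothetical coder that prefixes every encoded value with a format version.

    Not a Beam class; the names and byte layouts are illustrative assumptions.
    """

    CURRENT_VERSION = 2

    def encode(self, value: int) -> bytes:
        # Assumed version 2 layout: 1-byte version tag + 8-byte big-endian signed int.
        return bytes([self.CURRENT_VERSION]) + struct.pack(">q", value)

    def decode(self, data: bytes) -> int:
        # Read the version tag first, then pick the matching decoding logic.
        version, payload = data[0], data[1:]
        if version == 1:
            # Assumed legacy layout: 4-byte big-endian signed int.
            return struct.unpack(">i", payload)[0]
        if version == 2:
            return struct.unpack(">q", payload)[0]
        raise ValueError(f"unsupported coder version: {version}")
```

With this shape, data written by the older layout stays readable after the representation changes, at the cost of one extra byte per element and of the coder author remembering to bump the version.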

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Romain Manni-Bucau
2018-02-04 17:53 GMT+01:00 Reuven Lax : > > > On Sun, Feb 4, 2018 at 8:42 AM, Romain Manni-Bucau > wrote: > >> >> 2018-02-04 17:37 GMT+01:00 Reuven Lax : >> >>> I'm not sure where proto comes from here. Proto is one example of a type >>>

Re: coder evolutions?

2018-02-04 Thread Reuven Lax
It would already break quite a number of users at this point. I think what we should be doing is moving forward on the snapshot/update proposal. That proposal actually provides a way forward when coders change (it proposes a way to map an old snapshot to one using the new coder, so changes to

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
I fully understand that, and this is one of the reasons why solving these issues ASAP is very important. My conclusion is that we must break it now to avoid doing it later, when usage will be far more developed - I would be very happy to be wrong on that point - so I started this PR and

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Reuven Lax
On Sun, Feb 4, 2018 at 8:42 AM, Romain Manni-Bucau wrote: > > 2018-02-04 17:37 GMT+01:00 Reuven Lax : > >> I'm not sure where proto comes from here. Proto is one example of a type >> that has a schema, but only one example. >> >> 1. In the initial

Re: coder evolutions?

2018-02-04 Thread Reuven Lax
Unfortunately several runners (at least Flink and Dataflow) support in-place update of streaming pipelines as a key feature, and changing coder format breaks this. This is a very important feature of both runners, and we should endeavor not to break them. In-place snapshot and update is also a

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
Sadly yes, and that is why the PR is actually WIP. As mentioned, it modifies it and requires some updates in other languages and the standard_coders.yml file (I didn't find how this file was generated). Since coders must be about volatile data, I don't think it is a big deal to change it though. Romain

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Romain Manni-Bucau
2018-02-04 17:37 GMT+01:00 Reuven Lax : > I'm not sure where proto comes from here. Proto is one example of a type > that has a schema, but only one example. > > 1. In the initial prototype I want to avoid modifying the PCollection API. > So I think it's best to create a special

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Reuven Lax
I'm not sure where proto comes from here. Proto is one example of a type that has a schema, but only one example. 1. In the initial prototype I want to avoid modifying the PCollection API. So I think it's best to create a special SchemaCoder, and pass the schema into this coder. Later we might

Re: coder evolutions?

2018-02-04 Thread Reuven Lax
One question - does this change the actual byte encoding of elements? We've tried hard not to do that so far for reasons of compatibility. Reuven On Sun, Feb 4, 2018 at 6:44 AM, Romain Manni-Bucau wrote: > Hi guys, > > I submitted a PR on coders to enhance 1. the user

Build failed in Jenkins: beam_PostRelease_NightlySnapshot #16

2018-02-04 Thread Apache Jenkins Server
See Changes: [robertwb] [BEAM-3207] Create a standard location to enumerate and document URNs. [robertwb] Revert URNs that are currently hard-coded in the Dataflow worker. [kedin] [SQL] Add

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Romain Manni-Bucau
@Reuven: is the proto only about passing the schema or also the generic type? There are 2.5 topics to solve this issue: 1. How to pass the schema 1.a. hints? 2. What is the generic record type associated with a schema and how to express a schema relative to it I would be happy to help on 1.a and 2