Re: big data blog

2020-02-13 Thread Etienne Chauchot
Hi all, I just sent the link to the blog articles on @ApacheBeam twitter as Kenn suggested. Etienne On 10/02/2020 10:01, Etienne Chauchot wrote: Yes sure, Here is the link to the spreadsheet for review of the tweet:

Re: daily dataflow job failing today

2020-02-13 Thread Ismaël Mejía
For info Avro has published a new version 1.9.2.1 that fixes the issue: https://issues.apache.org/jira/browse/AVRO-2737 I just submitted a PR to make the dependency consistent with Avro versioning and verify that everything works as intended with the upgraded dependency on the python SDK. Can you

Re: A new reworked Elasticsearch 7+ IO module

2020-02-13 Thread Etienne Chauchot
Hi Cham, thanks for your comments! I just sent an email to user ML with a survey link to count ES uses per version: https://lists.apache.org/thread.html/rc8185afb8af86a2a032909c13f569e18bd89e75a5839894d5b5d4082%40%3Cuser.beam.apache.org%3E Best Etienne On 10/02/2020 19:46, Chamikara

Re: FnAPI proto backwards compatibility

2020-02-13 Thread Luke Cwik
On Wed, Feb 12, 2020 at 2:24 PM Kenneth Knowles wrote: > > > On Wed, Feb 12, 2020 at 12:04 PM Robert Bradshaw > wrote: > >> On Wed, Feb 12, 2020 at 11:08 AM Luke Cwik wrote: >> > >> > We can always detect on the runner/SDK side whether there is an unknown >> field[1] within a payload and fail

Re: daily dataflow job failing today

2020-02-13 Thread Valentyn Tymofieiev
Thank you, Ismaël. Good to know Avro doesn't follow semantic versioning. Replied on the PR. On Thu, Feb 13, 2020 at 5:24 AM Ismaël Mejía wrote: > For info Avro has published a new version 1.9.2.1 that fixes the issue: > https://issues.apache.org/jira/browse/AVRO-2737 > > I just submitted a PR

Re: [PROPOSAL] Transition released containers to the official ASF dockerhub organization

2020-02-13 Thread Ahmet Altay
Could we ask them to bulk add a list of people to add to the list? We could add all PMC members and previous release managers to the list. That might cover a good chunk of the future releases. On Wed, Feb 12, 2020 at 10:10 PM Hannah Jiang wrote: > Thanks everyone for supporting it. > > Yes,

Re: daily dataflow job failing today

2020-02-13 Thread Ahmet Altay
Thank you, Ismaël. I did not know that Avro was not using semantic versioning either. On Thu, Feb 13, 2020 at 9:44 AM Valentyn Tymofieiev wrote: > Thank you, Ismaël. Good to know Avro doesn't follow semantic versioning. > Replied on the PR. > > On Thu, Feb 13, 2020 at 5:24 AM Ismaël Mejía

Re: [PROPOSAL] Transition released containers to the official ASF dockerhub organization

2020-02-13 Thread Robert Burke
+1 to a bulk add. A shared account removes all accountability and is at risk of abuse. As it stands, the release managers could abuse their privilege, but we'd have the opportunity to know about whodunnit. On Thu, Feb 13, 2020, 9:51 AM Robert Bradshaw wrote: > +1, granting permission to

Re: FnAPI proto backwards compatibility

2020-02-13 Thread Jan Lukavský
Hi, +1 for adding pipeline required features. I think being able to reject a pipeline with an unknown requirement is pretty much needed, mostly because that enables runners to completely decouple from SDKs, while being able to recognize when a pipeline constructed with incompatible version of
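
As a rough illustration of the rejection idea in this message, here is a minimal sketch. The URN strings, the `check_requirements` helper, and the exception type are all hypothetical names chosen for the example; they are not taken from Beam or from the thread.

```python
# Hypothetical requirement URNs, for illustration only (not real Beam identifiers).
SUPPORTED_REQUIREMENTS = frozenset({
    "example:requirement:stateful_dofn:v1",
    "example:requirement:splittable_dofn:v1",
})


class UnsupportedRequirementError(Exception):
    """Raised when a pipeline declares a requirement the runner does not know."""


def check_requirements(pipeline_requirements, supported=SUPPORTED_REQUIREMENTS):
    """Reject a pipeline that lists any requirement URN unknown to this runner."""
    unknown = sorted(set(pipeline_requirements) - set(supported))
    if unknown:
        raise UnsupportedRequirementError(
            "unsupported requirements: " + ", ".join(unknown))
```

Under these assumptions the runner needs no knowledge of which SDK built the pipeline; it only compares the declared set against what it supports.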

Re: Jenkins jobs not running for my PR 10438

2020-02-13 Thread Tomo Suzuki
Ahmet, thanks. But it seems Jenkins is not reporting the status correctly. Will check tomorrow. On Thu, Feb 13, 2020 at 2:45 PM Tomo Suzuki wrote: > > Hi Beam committers, > > Would you run precommit checks on https://github.com/apache/beam/pull/10765 > with the following 6 additional commands? >

Re: FnAPI proto backwards compatibility

2020-02-13 Thread Robert Burke
One thing that doesn't appear to have been suggested yet is we could "batch" urns together under a "super urn" so that adding one super urn is like adding each of the represented batch of features. This prevents needing to send dozens of urns to be individually sent over. The super urns would
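
The "super urn" idea could be sketched roughly as below. The mapping, the URN strings, and the `expand_urns` helper are hypothetical and only illustrate how one declared URN might stand for a batch of features.

```python
# Hypothetical mapping from a "super urn" to the feature URNs it represents.
SUPER_URNS = {
    "example:super:core_bundle:v1": frozenset({
        "example:feature:timers:v1",
        "example:feature:state:v1",
    }),
}


def expand_urns(declared, super_urns=SUPER_URNS):
    """Replace each known super urn with the features it batches together.

    URNs that are not super urns pass through unchanged.
    """
    expanded = set()
    for urn in declared:
        expanded |= super_urns.get(urn, {urn})
    return expanded
```

With this shape, sending one super urn is equivalent to sending every feature in its batch, so the receiving side can expand it before checking capabilities.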

Re: FnAPI proto backwards compatibility

2020-02-13 Thread Robert Burke
+1 to deferring for now. Since they should not be modified after adoption, it makes sense not to get ahead of ourselves. On Thu, Feb 13, 2020, 10:59 AM Robert Bradshaw wrote: > On Thu, Feb 13, 2020 at 10:12 AM Robert Burke wrote: > > > > One thing that doesn't appear to have been suggested yet

Re: daily dataflow job failing today

2020-02-13 Thread Kenneth Knowles
But pip doesn't try to reconcile user's requested version and Beam's listed dep, right? (https://github.com/pypa/pip/issues/988 still open) Kenn On Thu, Feb 13, 2020 at 9:48 AM Ahmet Altay wrote: > Thank you, Ismaël. I did not know that Avro was not using semantic > versioning either. > > On

Re: FnAPI proto backwards compatibility

2020-02-13 Thread Kenneth Knowles
On Thu, Feb 13, 2020 at 12:42 PM Jan Lukavský wrote: > Hi, > > +1 for adding pipeline required features. I think being able to reject > pipeline with unknown requirement is pretty much needed, mostly because > that enables runners to completely decouple from SDKs, while being able to > recognize

Re: FnAPI proto backwards compatibility

2020-02-13 Thread Robert Burke
Wrt per DoFn/ParDo level, there's the similar case of whether the DoFn has an Urn for requiring something or it's an annotation for saying the DoFn provides something (e.g. provides K-anonymization with k defined). The general theme of this thread seems to be trying to ensure a runner can reject a

Re: FnAPI proto backwards compatibility

2020-02-13 Thread Kyle Weaver
> we can take advantage of these pipeline features to get rid of the categories of @ValidatesRunner tests, because we could have just simply @ValidatesRunner and each test would be matched against runner capabilities +1, I think the potential to formally integrate our idea of compatibility and

Contributor permission for Beam Jira tickets

2020-02-13 Thread Wenbing Bai
Hi there, I am Wenbing from Cruise. I would like to make some contributions to the Python SDK for Beam. Can someone add me as a contributor in the Beam Jira? My username is wenbing-bai. Thank you! Wenbing -- Wenbing Bai Senior Software Engineer, MLP Cruise Pronouns: She/Her --

Re: Labels on PR

2020-02-13 Thread Kyle Weaver
I'm really enjoying this feature so far! The "Pull Requests" page for Beam is now way more readable. Thanks Alex :) On Wed, Feb 12, 2020 at 9:18 PM Alex Van Boxel wrote: > What do you exactly mean with github grep... where is it an issue. I find > it useful for searching here: > > [image:

Re: [PROPOSAL] Transition released containers to the official ASF dockerhub organization

2020-02-13 Thread Robert Bradshaw
+1, granting permission to individual accounts is preferable to trying to share a single account. On Thu, Feb 13, 2020 at 9:44 AM Ahmet Altay wrote: > > Could we ask them to bulk add a list of people to add to the list? We could > add all PMC members and previous release managers to the list.

Re: FnAPI proto backwards compatibility

2020-02-13 Thread Robert Bradshaw
On Thu, Feb 13, 2020 at 10:12 AM Robert Burke wrote: > > One thing that doesn't appear to have been suggested yet is we could "batch" > urns together under a "super urn" so that adding one super urn is like adding > each of the represented batch of features. This prevents needing to send >

Re: Jenkins jobs not running for my PR 10438

2020-02-13 Thread Tomo Suzuki
Hi Beam committers, Would you run precommit checks on https://github.com/apache/beam/pull/10765 with the following 6 additional commands? Run Java PostCommit Run Java HadoopFormatIO Performance Test Run BigQueryIO Streaming Performance Test Java Run Dataflow ValidatesRunner Run Spark

Re: Python2.7 Beam End-of-Life Date

2020-02-13 Thread Ismaël Mejía
> I would suggest re-evaluating this within the next 3 months again. We need to balance between user pain/contributor pain/our ability to continuously test with python 2 in a shifting environment. Good idea for the in 3 months evaluation, at that point also distributions will probably be phasing

Re: daily dataflow job failing today

2020-02-13 Thread Ismaël Mejía
> I can argue for not pinning and bounding with major version ranges. This gives flexibility to users to mix other third party libraries that share common dependencies with Beam. Our expectation is that dependencies follow semantic versioning and do not introduce breaking changes unless there is a
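
To make the contrast between pinning and bounding by major version concrete, here is a small sketch. The helper names and the example bounds are assumptions for illustration; this is not how pip or Beam evaluates dependency specifiers.

```python
def parse_version(version):
    """Turn a dotted numeric version such as '1.9.2.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_pinned_match(version, pinned):
    """Pinning: only one exact version is accepted."""
    return parse_version(version) == parse_version(pinned)


def is_within_major_range(version, lower, next_major):
    """Bounding: accept anything from `lower` up to, but excluding, `next_major`.0."""
    return parse_version(lower) <= parse_version(version) < (next_major,)
```

A bounded range accepts later releases within the same major version, which is only safe if the dependency does not introduce breaking changes there; a pin avoids that risk at the cost of flexibility for users who share the dependency with other libraries.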