Re: Artifact staging in cross-language pipelines

2019-04-23 Thread Heejong Lee
On Tue, Apr 23, 2019 at 2:07 AM Robert Bradshaw wrote: > I've been out, so coming a bit late to the discussion, but here's my > thoughts. > > The expansion service absolutely needs to be able to provide the > dependencies for the transform(s) it expands. It seems the default, > foolproof way of doing

Enable security for data channels in portability

2019-04-23 Thread Hai Lu
Hi, This is Hai from LinkedIn. Daniel and I have been working on productionizing Samza portable runner. BTW, Daniel didn't mention in his previous email that he has enabled and validated Python 3 for Samza runner and it worked smoothly. Kudos to the team! Here I have a few security related

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Valentyn Tymofieiev
I think we should also leverage/invest in the automation for RC validation. We have some validation scripts, but last time I looked at them they worked only partially and had several usability issues. On Tue, Apr 23, 2019 at 3:24 PM Ahmet Altay wrote: > > > On Tue, Apr 23, 2019 at 3:21 PM

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Ahmet Altay
On Tue, Apr 23, 2019 at 3:21 PM Kenneth Knowles wrote: > What can we do to make this part of day-to-day workflow instead of finding > out during release validation? Was this just a failing test that was missed? > > Kenn > > On Tue, Apr 23, 2019 at 3:02 PM Andrew Pilloud > wrote: > >> It looks

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Reuven Lax
I mistakenly thought that Java PostCommit would run these tests, and I merged based on PostCommit passing. That's how the bug got into master. On Tue, Apr 23, 2019 at 3:21 PM Kenneth Knowles wrote: > What can we do to make this part of day-to-day workflow instead of finding > out during release

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Kenneth Knowles
What can we do to make this part of day-to-day workflow instead of finding out during release validation? Was this just a failing test that was missed? Kenn On Tue, Apr 23, 2019 at 3:02 PM Andrew Pilloud wrote: > It looks like Java Nexmark tests are on the validation sheet but we've > missed

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Andrew Pilloud
It looks like Java Nexmark tests are on the validation sheet but we've missed it the last few releases. Thanks for checking it Etienne! Does the current release process require everything to be tested before making the release final? I fully agree with you on point 2. All of these issues were in

Re: Hello from Hannah Jiang

2019-04-23 Thread Ahmet Altay
Welcome! I could not find your user name in JIRA. Have you registered? You need to register first, then we can add you as a contributor to Beam. On Tue, Apr 23, 2019 at 1:31 PM Hannah Jiang wrote: > Thanks Aizhamal. > Here is my user name: jiangxuehua1...@gmail.com > > Thanks, > Hannah > > > >

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Reuven Lax
-1 we need to cherry pick pr/8325 and pr/8385 to fix the above issue On Tue, Apr 23, 2019 at 1:48 PM Andrew Pilloud wrote: > I believe the breakage of Nexmark on Dataflow is > https://issues.apache.org/jira/browse/BEAM-7002, which went in before the > release was cut. It looks like this might

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Andrew Pilloud
Please consider the vote for RC4 canceled. I'll quickly follow up with a new RC. Thanks for the complete testing everyone! Andrew On Tue, Apr 23, 2019 at 2:06 PM Reuven Lax wrote: > -1 > > we need to cherry pick pr/8325 and pr/8385 to fix the above issue > > On Tue, Apr 23, 2019 at 1:48 PM

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Andrew Pilloud
I believe the breakage of Nexmark on Dataflow is https://issues.apache.org/jira/browse/BEAM-7002, which went in before the release was cut. It looks like this might be a release blocker based on the fix: https://github.com/apache/beam/pull/8325. The degraded performance is after the release is

Re: Hello from Hannah Jiang

2019-04-23 Thread Hannah Jiang
Thanks Aizhamal. Here is my user name: jiangxuehua1...@gmail.com Thanks, Hannah On Tue, Apr 23, 2019 at 1:25 PM Aizhamal Nurmamat kyzy wrote: > Welcome Hannah! Looking forward to your contributions :) > > Could you also provide your Jira username to the admins? > > Thank you, > > On Tue, Apr

Re: Hello from Hannah Jiang

2019-04-23 Thread Aizhamal Nurmamat kyzy
Welcome Hannah! Looking forward to your contributions :) Could you also provide your Jira username to the admins? Thank you, On Tue, Apr 23, 2019 at 12:59 PM Hannah Jiang wrote: > Hi everyone > > I joined Google recently and would work on Python portability part. I am > happy to be part of

Hello from Hannah Jiang

2019-04-23 Thread Hannah Jiang
Hi everyone, I joined Google recently and will work on the Python portability part. I am happy to be part of the community. Looking forward to working with all of you together. I have a minor request: can an admin please give me access to JIRA? Thanks, Hannah

Re: [DISCUSS] Turn `WindowedValue` into `T` in the FnDataService and BeamFnDataClient interface definition

2019-04-23 Thread Kenneth Knowles
It seems to me that the most valuable code to share and keep up with is the Python/Go/etc SDK harness; they would need to be enhanced with new primitive operations. So you would want to depend directly and share the original proto-generated classes too, which Beam publishes as separate artifacts

Re: [DISCUSS] FLIP-38 Support python language in flink TableAPI

2019-04-23 Thread Stephan Ewen
Hi all! Below are my notes on the discussion last week on how to collaborate between Beam and Flink. The discussion was between Tyler, Kenn, Luke, Ahmed, Xiaowei, Shaoxuan, Jincheng, and me. This represents my understanding of the discussion, please augment this where I missed something or where

questions about BigQuery temp dataset

2019-04-23 Thread Chengxuan Wang
Hi, I am using Apache Beam python sdk (apache-beam==2.11.0) to run a dataflow job with BigQuerySource. Even though I checked the code, BigQueryReader will delete the temporary dataset after the query is done.

Re: Integration of python/portable runner tests for Samza runner

2019-04-23 Thread Boyuan Zhang
Hi Daniel, If you are interested in portable python pipeline validation, I think fn_api_runner_test would also help. On Tue, Apr 23, 2019 at 10:19 AM Pablo Estrada wrote: > This is

Re: Projects Can Apply Individually for Google Season of Docs

2019-04-23 Thread Pablo Estrada
That's excellent. Thanks Max! Let's hope we'll be selected. Best -P. On Tue, Apr 23, 2019, 5:36 AM Maximilian Michels wrote: > Both proposals for doc improvements sound great. Portability is an > obvious one and the capability matrix needs an update as well. > > I might be a bit late to the

Re: SNAPSHOTS have not been updated since february

2019-04-23 Thread Yifan Zou
We could send an email alert once the snapshots publishing fails. Alternatively, we have a job_PostRelease_NightlySnapshot to validate the health of the snapshots. It pulls the latest snapshots

Re: Python SDK timestamp precision

2019-04-23 Thread Robert Bradshaw
On Tue, Apr 23, 2019 at 4:20 PM Kenneth Knowles wrote: > > On Tue, Apr 23, 2019 at 5:48 AM Robert Bradshaw wrote: >> >> On Thu, Apr 18, 2019 at 12:23 AM Kenneth Knowles wrote: >> > >> > For Robert's benefit, I want to point out that my proposal is to support >> > femtosecond data, with

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Ismaël Mejía
Etienne, the RC1 vote happened on 04/03 and there have not been any cherry picks on the spark runner afterwards, so if there is a commit that degraded performance around 04/10 it is not part of the release we are voting on, so please consider reverting your -1. However the issue you are reporting looks

Re: CVE audit gradle plugin

2019-04-23 Thread Etienne Chauchot
Hi, should I merge my branch https://github.com/echauchot/beam/tree/cve_audit_plugin to master to include this tool in the build system then? It will not fail the build but will add an audit task to it. Etienne On Friday, April 19, 2019 at 10:54 -0700, Lukasz Cwik wrote: > Common Vulnerabilities and

Re: [VOTE] Release 2.12.0, release candidate #4

2019-04-23 Thread Etienne Chauchot
Hi guys, I will vote -1 (binding) on this RC (although degradation is before RC4 cut date). I took a look at Nexmark graphs for the 3 major runners: - there seem to be functional regressions on Dataflow: https://apache-beam-testing.appspot.com/explore?dashboard=5647201107705856 . 13 queries

Re: Python SDK timestamp precision

2019-04-23 Thread Kenneth Knowles
Another brute force approach that I expect is not really that painful and allows optimal compactness in all cases: Support 3 precisions, with appropriate standard coders. Millis for unix timestamps, micros for compactness in int64, nanos for Java/Spanner/Proto/Pubsub* aka the max precision anyone
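
As a rough illustration of the trade-off between the three precisions named above (this is not code from the thread and not Beam's coders), the sketch below assumes each timestamp is stored as a signed 64-bit tick count relative to the epoch and estimates how many years that count can span at each precision. The names `TICKS_PER_SECOND` and `int64_range_years` are hypothetical.

```python
# Illustrative only: how many years a signed 64-bit integer can span
# at each of the three timestamp precisions mentioned above.
# Assumption: timestamps are stored as an int64 tick count from the epoch.
INT64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year, an approximation

# Ticks per second for each precision (hypothetical names).
TICKS_PER_SECOND = {
    "millis": 10**3,
    "micros": 10**6,
    "nanos": 10**9,
}


def int64_range_years(ticks_per_second):
    """Approximate number of years representable on one side of the epoch."""
    return INT64_MAX / ticks_per_second / SECONDS_PER_YEAR


for name, ticks in TICKS_PER_SECOND.items():
    years = int64_range_years(ticks)
    print(f"{name}: about {years:,.0f} years either side of the epoch")
```

Under these assumptions, each thousandfold increase in precision shrinks the representable range by the same factor, which is the tension any choice among the three has to balance.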

Re: Python SDK timestamp precision

2019-04-23 Thread Kenneth Knowles
On Tue, Apr 23, 2019 at 5:48 AM Robert Bradshaw wrote: > On Thu, Apr 18, 2019 at 12:23 AM Kenneth Knowles wrote: > > > > For Robert's benefit, I want to point out that my proposal is to support > femtosecond data, with femtosecond-scale windows, even if watermarks/event > timestamps/holds are

Re: Contributing Beam Kata (Java & Python)

2019-04-23 Thread Ismaël Mejía
Thanks for answering Lars, The 'interesting' part is that the tutorial has a full IDE integrated experience based on the Jetbrains edu platform [1]. So maybe interesting to see if it could make sense to have projects like this in the new trainings incubator project or if they became too platform

Re: Contributing Beam Kata (Java & Python)

2019-04-23 Thread Lars Francke
Thanks Ismaël. I must admit I'm a tad confused. What has JetBrains got to do with this? This looks pretty cool and specific to Beam though, or is this more generic? But yeah something along those lines could be interesting for hands-on type things in training. On Fri, Apr 19, 2019 at 12:10 PM

Re: Python SDK timestamp precision

2019-04-23 Thread Robert Bradshaw
On Thu, Apr 18, 2019 at 12:23 AM Kenneth Knowles wrote: > > For Robert's benefit, I want to point out that my proposal is to support > femtosecond data, with femtosecond-scale windows, even if watermarks/event > timestamps/holds are only millisecond precision. > > So the workaround once I have

Re: Projects Can Apply Individually for Google Season of Docs

2019-04-23 Thread Maximilian Michels
Both proposals for doc improvements sound great. Portability is an obvious one and the capability matrix needs an update as well. I might be a bit late to the party, but I'd like to help with the mentoring. I've filled out the mentor form. Thanks, Max On 22.04.19 23:32, Pablo Estrada wrote:

Re: [DISCUSS] Turn `WindowedValue` into `T` in the FnDataService and BeamFnDataClient interface definition

2019-04-23 Thread Maximilian Michels
Hi Jincheng, Copying code is a solution for the short term. In the long run I'd like the Fn services to be a library not only for the Beam portability layer but also for other projects which want to leverage it. We should thus make an effort to make it more generic/extensible where necessary

Re: [DISCUSS] Turn `WindowedValue` into `T` in the FnDataService and BeamFnDataClient interface definition

2019-04-23 Thread jincheng sun
Hi Reuven, I think you have provided an optional solution for other communities that want to take advantage of Beam's existing achievements. Thank you very much! I think the Flink community can choose to copy from Beam's code or choose to rely directly on Beam's class library. The Flink

Re: SNAPSHOTS have not been updated since february

2019-04-23 Thread Ismaël Mejía
Thanks a lot Yifan, just pulled them and everything is working as expected +1 I wonder still if we may need some extra verification task to detect when those have not been updated (which can happen for either INFRA or other reasons). On Tue, Apr 23, 2019 at 4:44 AM Yifan Zou wrote: > > The Infra is

Re: Integration of python/portable runner tests for Samza runner

2019-04-23 Thread Maximilian Michels
Hi Daniel, Note that there is also Portable Validates Runner which runs Java portability tests. I don't know if you have integrated with that one already. Thanks, Max On 23.04.19 02:28, Ankur Goenka wrote: Hi Daniel, We use flinkCompatibilityMatrix [1] to check the Flink compatibility

Re: Beam Summit at ApacheCon

2019-04-23 Thread Maximilian Michels
Hi Austin, Thanks for the heads-up! I just want to highlight that this is a great chance for Beam. There will be a _dedicated_ Beam track which means that there is potential for lots of new people to learn about Beam. Of course, there will also be many people already involved in Beam. -Max

Re: Artifact staging in cross-language pipelines

2019-04-23 Thread Robert Bradshaw
I've been out, so coming a bit late to the discussion, but here's my thoughts. The expansion service absolutely needs to be able to provide the dependencies for the transform(s) it expands. It seems the default, foolproof way of doing this is via the environment, which can be a docker image with