Re: [PROPOSAL] Preparing for Beam 2.23.0 release

2020-06-23 Thread Valentyn Tymofieiev
Friendly reminder that the release cut is slated for next week. If you are aware of *release-blocking* issues, please open a JIRA and set the "Fix version" to 2.23.0. Please do not set "Fix version" for open non-blocking issues; instead, set "Fix version" once the issue is actually resolved.

Re: Request for Java PR review

2020-06-23 Thread Chamikara Jayalath
Thanks. I'm taking a look. On Tue, Jun 23, 2020 at 3:07 AM Niel Markwick wrote: > Hey devs... > > I have 3 PRs sitting waiting for a code review to fix potential bugs (and > improve memory use) in SpannerIO. 2 small, and one quite large -- I would > really like these to be in 2.23... > >

Re: JIRA contributor permissions

2020-06-23 Thread Robert Burke
Welcome! I was going to suggest this when we finished up with your current PR. One of the PMC members will grant your permissions. As you've already discovered, please mention me (@lostluck) on GitHub for Go SDK changes, as I have the context for Go at present, and can merge PRs when they're ready.

JIRA contributor permissions

2020-06-23 Thread Brian Michalski
Greetings! I'm wading my way through a few small Go SDK tickets. Can I have contributor permissions on JIRA? My username is bamnet. Thanks, ~Brian M

Re: Running Beam pipeline using Spark on YARN

2020-06-23 Thread Kyle Weaver
> So hopefully setting --spark-master-url to be yarn will work too. This is not supported. On Tue, Jun 23, 2020 at 2:58 PM Xinyu Liu wrote: > I am doing some prototyping on this too. I used spark-submit script > instead of the rest api. In my simple setup, I ran > SparkJobServerDriver.main()

Re: Canceling Jenkins builds when the update to PR makes prior build irrelevant

2020-06-23 Thread Kenneth Knowles
+1 to Andrew's analysis On Tue, Jun 23, 2020 at 12:13 PM Ahmet Altay wrote: > Would it be possible to cancel any running _Phrase or _Commit variants, if > either one of them is triggered? > > On Tue, Jun 23, 2020 at 10:41 AM Andrew Pilloud > wrote: > >> I believe we split _Commit and _Phrase

Re: Running Beam pipeline using Spark on YARN

2020-06-23 Thread Xinyu Liu
I am doing some prototyping on this too. I used spark-submit script instead of the rest api. In my simple setup, I ran SparkJobServerDriver.main() directly in the AM as a spark job, which will submit the python job to the default spark master url pointing to "local". I also use --files in the

Re: Running Beam pipeline using Spark on YARN

2020-06-23 Thread Kyle Weaver
Hi Kamil, there is a JIRA for this: https://issues.apache.org/jira/browse/BEAM-8970 It's theoretically possible but remains untested as far as I know :) As I indicated in a comment, you can set --output_executable_path to create a jar that you can then submit to yarn via spark-submit. If you can
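
A rough sketch of the two-step flow described in the message above: first have Beam write an executable jar, then hand that jar to YARN via spark-submit. It assumes the Python SDK's `--runner=SparkRunner` and `--output_executable_path` pipeline options and a `spark-submit` binary on the PATH. The helper names, the `my_pipeline.py` script, and the jar path are hypothetical, and as the message says, this route is untested. The script only builds the argument lists; nothing is launched.

```python
import shlex


def beam_pipeline_args(jar_path):
    """Pipeline args asking Beam to write an executable jar instead of running the job."""
    return [
        "--runner=SparkRunner",
        f"--output_executable_path={jar_path}",
    ]


def spark_submit_command(jar_path):
    """spark-submit invocation that hands the generated jar to YARN (assumed cluster mode)."""
    return ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster", jar_path]


if __name__ == "__main__":
    jar = "/tmp/beam_pipeline.jar"  # hypothetical output location
    # Step 1: run the (hypothetical) pipeline script so that it emits the jar.
    print("python my_pipeline.py " + " ".join(beam_pipeline_args(jar)))
    # Step 2: submit the jar to YARN.
    print(shlex.join(spark_submit_command(jar)))
```

Keeping the two steps as plain argument lists makes it easy to inspect the exact commands before trying them against a real cluster.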

Re: Canceling Jenkins builds when the update to PR makes prior build irrelevant

2020-06-23 Thread Ahmet Altay
Would it be possible to cancel any running _Phrase or _Commit variants, if either one of them is triggered? On Tue, Jun 23, 2020 at 10:41 AM Andrew Pilloud wrote: > I believe we split _Commit and _Phrase to work around a bug with job > filtering. For example, when you make a python change only

Re: Seasons of Technical Communications Project

2020-06-23 Thread Kyle Weaver
Hi Vikas, Thank you for the introduction and your interest in working on Apache Beam documentation with Season of Docs. To participate in the program you need to follow the guides here [1] [2]. If you are new to the program, we suggest: 1. Start by studying our proposed project ideas and

Seasons of Technical Communications Project

2020-06-23 Thread Vikas Wadhwa
Hi, Aizhamal: Good Morning! Through Google's initiative of Seasons of Docs 2020, I would like to take this opportunity to introduce myself and my intent to take up this 'Technical Communications' project with your organization. At a high level, I went through the initial details about your

Watermark-based trigger doesn't fire for 10+ minutes after message is received from Pub/Sub source

2020-06-23 Thread Alex Mordkovich
Hi Beam folks! I'm running a simple Java Beam pipeline on DirectRunner. The pipeline reads in messages from a Pub/Sub topic and aggregates them into windows: by processing time and by event time. The custom timestamp option isn't used, so the event time should be

Re: Canceling Jenkins builds when the update to PR makes prior build irrelevant

2020-06-23 Thread Andrew Pilloud
I believe we split _Commit and _Phrase to work around a bug with job filtering. For example, when you make a python change only the python tests are run based on the commit. We still want to be able to run the java jobs by trigger phrase if needed. There are also performance tests (Nexmark for

Re: Canceling Jenkins builds when the update to PR makes prior build irrelevant

2020-06-23 Thread Tyson Hamilton
+1, the ability to cancel in-flight jobs is worth deduplicating _Phrase and _Commit. I don't see a benefit in having both. On Tue, Jun 23, 2020 at 9:02 AM Luke Cwik wrote: > I think this is a great improvement to prevent the Jenkins queue from > growing too large and has been suggested in the

Running Beam pipeline using Spark on YARN

2020-06-23 Thread Kamil Wasilewski
Hi all, I'm trying to run a Beam pipeline using Spark on YARN. My pipeline is written in Python, so I need to use a portable runner. Does anybody know how I should configure job server parameters, especially --spark-master-url? Is there anything else I need to be aware of while using such setup?

Re: Match_Recognize Design Documentation

2020-06-23 Thread Rui Wang
Thank you Qihang. I have been hearing some confusion offline, so to highlight: Qihang's design doc is a commentable doc available at https://s.apache.org/beam-sql-pattern-recognization. -Rui On Wed, Jun 17, 2020 at 5:39 AM Qihang Zeng wrote: > Dear Beam development community, > > Hi! I am

Re: Canceling Jenkins builds when the update to PR makes prior build irrelevant

2020-06-23 Thread Luke Cwik
I think this is a great improvement to prevent the Jenkins queue from growing too large. It has been suggested in the past, but we were unable to do so due to difficulty with the version of the ghprb plugin that was used at the time. I know that we created different variants of the tests because we

Re: On Auto-creating GCS buckets on behalf of users

2020-06-23 Thread David Cavazos
I like the idea of simplifying the user experience by automating part of the initial setup. On the other hand, I see why silently creating billed resources like a GCS bucket could be an issue. I don't think creating an empty bucket is an issue since it doesn't incur any charges yet, but at least

Canceling Jenkins builds when the update to PR makes prior build irrelevant

2020-06-23 Thread Tobiasz Kędzierski
Hi everyone, I was investigating the possibility of canceling Jenkins builds when the update to PR makes prior build irrelevant (related to https://issues.apache.org/jira/browse/BEAM-3105). In the GitHub Pull Request Builder Jenkins plugin [ghprb-plugin] there is a hidden option `Cancel build on

Request for Java PR review

2020-06-23 Thread Niel Markwick
Hey devs... I have 3 PRs sitting waiting for a code review to fix potential bugs (and improve memory use) in SpannerIO. 2 small, and one quite large -- I would really like these to be in 2.23... https://github.com/apache/beam/pulls/nielm Would someone be willing to have a look? Thanks! --