Re: Automation for Jira

2020-10-19 Thread Brian Hulette
Can we revisit #5 and #6? We have 8 open unassigned P0s [1] and 96 open
unassigned P1s [2]. Auto-assigning to a pool of people could be a good way
to get them triaged. A couple of questions:
1) Can we actually do this with rules in Jira?
2) Who would be in the pool of stuckees and how could it be maintained? All
committers seems too broad. Maybe there could be a "beam-triagers" group
that people can opt-in to?
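
As a rough illustration of what auto-assignment to an opt-in pool could look like (a sketch only; whether Jira rules can express this is exactly question 1), issues could be handed out round-robin. The function name, issue keys, and usernames below are hypothetical:

```python
from itertools import cycle


def assign_round_robin(issue_keys, triagers):
    """Map each unassigned issue key to a triager, cycling through the pool."""
    if not triagers:
        raise ValueError("triager pool is empty")
    pool = cycle(triagers)
    # Issues are assigned in the order given; the pool wraps around as needed.
    return {key: next(pool) for key in issue_keys}


if __name__ == "__main__":
    # Hypothetical issue keys and usernames, for illustration only.
    print(assign_round_robin(["BEAM-1", "BEAM-2", "BEAM-3"], ["alice", "bob"]))
```

Round-robin keeps the load even without any per-person state beyond the pool membership itself.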

Brian

[1] https://issues.apache.org/jira/browse/BEAM-10270?filter=12349888
[2] https://issues.apache.org/jira/browse/BEAM-9154?filter=12349889

On Wed, Jun 17, 2020 at 8:43 PM Kenneth Knowles  wrote:

>
>
> On Wed, Jun 17, 2020 at 8:40 PM Kenneth Knowles  wrote:
>
>>
>> On Wed, Jun 17, 2020 at 3:13 PM Robert Burke  wrote:
>>
>>> It may be very easy, but commenting in addition to self assignment is
>>> also not yet part of the official process, while the
>>> assignment/unassignment is already visible to watchers, based on the deluge of
>>> emails I've been getting as older issues have been unassigned from me.
>>>
>>> I'd say assignment needs to be a signal to the automation based on that
>>> experience, if it isn't already.
>>>
>>
>> Very good point. The clock should start when someone is assigned an
>> issue, and reset when it changes hands. Just did a minute of quick research
>> and it does look like this may not be expressible in JQL, so we might have
>> to go with time since any change.
>>
>
> I've changed the "stale-assigned" rule from last public comment to last
> update. TBH I don't know what falls under the category of "update" but I
> expect it is the greatest overapproximation available.
>
> Kenn
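
The two clock definitions discussed above (time since the last assignee change versus time since any update) can be compared with a small sketch. This is plain Python rather than JQL, and the 60-day limit and all names are assumptions made for illustration, not the values used by the actual rule:

```python
from datetime import datetime, timedelta


def clock_start(created, assignee_changes):
    """Start of the staleness clock: the latest assignee change,
    or the creation time if the issue never changed hands."""
    return max(assignee_changes, default=created)


def is_stale(created, assignee_changes, now, limit=timedelta(days=60)):
    """True if more than `limit` has passed since the clock started.
    The 60-day default is a hypothetical value."""
    return now - clock_start(created, assignee_changes) > limit


if __name__ == "__main__":
    created = datetime(2020, 1, 1)
    now = datetime(2020, 6, 17)
    print(is_stale(created, [], now))                      # never reassigned
    print(is_stale(created, [datetime(2020, 6, 1)], now))  # recently reassigned
```

Using "last update" instead simply means feeding every change timestamp, not only assignee changes, into `clock_start`, which is why it overapproximates activity.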
>
>
>>> That said, I agree with Kenn that the automatic touch by the bot isn't
>>> sufficient to determine if the person assigned the issue is actually
>>> working on it or not. The bot is only looking for a matching JIRA ID in the
>>> PR title, and it's not checking whether said PR is authored by the person to
>>> whom the JIRA is assigned.
>>>
>>> I'm personally bad at closing issues after a resolution PR, but that
>>> can't be made automatic anyway. I've been reminding authors to resolve the
>>> associated JIRA as appropriate to build the habit.
>>>
>>
>> Another issue in vanilla JQL: cannot search for Jiras in this state and
>> at least ping them. You need a human to decide the PR really finished it,
>> for sure. There are add-ons we could use to do more, but I wanted to get
>> some more experience.
>>
>> Kenn
>>
>>
>>> On Wed, Jun 17, 2020, 1:57 PM Kenneth Knowles  wrote:
>>>
 I'm sure this is possible. I made a (personal) call to not do it. I
 think using words in a comment to communicate is the most important thing.
 I don't want automated updates to reset things. I definitely don't want to
 reset it unless the action causes a notification to watchers. Silently
 grabbing and fixing a bug will very rarely take long enough that it gets
 unassigned or downgraded. And anyhow removing the label and commenting
 "working on it" is very easy.

 These are just my thoughts. I'm happy to do whatever the community
 wants.

 Kenn

 On Tue, Jun 16, 2020 at 12:21 PM Brian Hulette 
 wrote:

> Sorry if you already looked into this, but is it possible to reset the
> counter based on anything in the "activity" tab? It looks like ASF GitHub
> Bot is always doing things whenever there's activity on a linked PR. So we
> wouldn't need to call out to GitHub to check for PR activity.
>
> On Tue, Jun 16, 2020 at 11:49 AM Kenneth Knowles 
> wrote:
>
>> Yes, only public comments reset the counter.
>>
>> On Mon, Jun 15, 2020 at 6:57 PM Udi Meiri  wrote:
>>
>>> Interesting: you could consider the JIRA as active as long as the
>>> linked PRs are open.
>>>
>>> On Mon, Jun 15, 2020 at 2:28 PM Luke Cwik  wrote:
>>>
 One thing I noticed is that links being added to issues automatically
 (e.g. a PR is opened that tags something) don't reset the activity
 counter, so things are marked stale even though PRs were opened for
 the issue recently.

 On Thu, Jun 11, 2020 at 10:37 AM Kenneth Knowles 
 wrote:

> Yes, my inbox is hit as well. I'm enjoying going through some old
> bugs actually. One takeaway is that we have a lot of early Jiras that
> are still relevant, and also that there are a lot of duplicates. I
> think some automation to help find duplicates might be helpful.
>
> Also, some accidental automation humor:
> https://issues.apache.org/jira/browse/BEAM-6414
>
> Kenn
>
> On Tue, Jun 2, 2020 at 8:39 AM Brian Hulette 
> wrote:
>
>> RIP my inbox :)
>> This is overwhelming, but I think it will be very good. Thanks
>> for setting this up Kenn.
>>
>> Brian
>>

Re: [VOTE] Release 2.25.0, release candidate #1

2020-10-19 Thread Kyle Weaver
> We should update the release guide to make this explicit for the person
preparing the release so this does not happen again and eventually include
some validation for this in the build.

Instructions in the release guide can be easily missed. We should
prioritize adding a version check to all the relevant build scripts so it
becomes a hard failure if Java 11 is used.
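
A minimal sketch of such a check, written in Python for illustration and assuming the build script can obtain the JDK version string (for example the value reported by `java -version`). The function names are hypothetical and this is not the check that exists in the Beam build:

```python
import re


def java_major_version(version_string):
    """Map a Java version string such as '1.8.0_265' or '11.0.8' to its major version."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognized Java version: {version_string!r}")
    first, second = match.group(1), match.group(2)
    # Releases before Java 9 report themselves as 1.x (1.8.0_265 is Java 8).
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def require_java_8(version_string):
    """Fail hard unless the release is being built with Java 8."""
    major = java_major_version(version_string)
    if major != 8:
        raise RuntimeError(
            f"release artifacts must be built with Java 8, found Java {major}"
        )


if __name__ == "__main__":
    require_java_8("1.8.0_265")  # passes silently
    require_java_8("11.0.8")     # raises RuntimeError
```

Raising instead of warning is the point of the proposal: a release build on the wrong JDK stops before any artifact is produced.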

On Mon, Oct 19, 2020 at 3:11 PM Robin Qiu  wrote:

> Thank you all for the feedback! I will work on a RC2 to address these
> problems.
>
> On Mon, Oct 19, 2020 at 7:38 AM Ismaël Mejía  wrote:
>
>> -1
>>
>> > * Java artifacts were built with Maven 3.5.3 and OpenJDK/Oracle JDK
>> 11.0.8.
>>
>> As discussed for 2.24.0 RC1, we MUST build Java artifacts with Java 8;
>> otherwise we cannot guarantee compatibility with Java 8.
>> We should update the release guide to make this explicit for the person
>> preparing the release so this does not happen again and eventually include
>> some validation for this in the build.
>>
>> I validated that this is broken the same way as before by running a
>> pipeline with the Direct runner using the 2.25.0 jars inside a Java 8 Docker
>> container. The exception is the same.
>>
>> 2020-10-19 16:14:23,427 [direct-runner-worker] ERROR
>> org.apache.beam.runners.direct.DirectTransformExecutor  - Error occurred
>> within org.apache.beam.runners.direct.DirectTransformExecutor@6babef80
>> java.lang.NoSuchMethodError:
>> java.nio.ByteBuffer.clear()Ljava/nio/ByteBuffer;
>> at
>> org.apache.beam.sdk.util.BufferedElementCountingOutputStream.outputBuffer(BufferedElementCountingOutputStream.java:197)
>> at
>> org.apache.beam.sdk.util.BufferedElementCountingOutputStream.flush(BufferedElementCountingOutputStream.java:180)
>> at
>> org.apache.beam.sdk.util.BufferedElementCountingOutputStream.finish(BufferedElementCountingOutputStream.java:119)
>> at
>> org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:127)
>> at
>> org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:60)
>> at org.apache.beam.sdk.coders.Coder.encode(Coder.java:136)
>>
>>
>> On Sat, Oct 17, 2020 at 3:57 AM Ahmet Altay  wrote:
>>
>>> I verified python quickstarts. There is a minor issue and I will update
>>> my vote after that.
>>>
>>> Python batch pipelines on Dataflow are failing with the following error:
>>> "RuntimeError: Beam SDK base version 2.25.0 does not match Dataflow Python
>>> worker version 2.25.0.dev. Please check Dataflow worker startup logs
>>> and make sure that correct version of Beam SDK is installed."
>>>
>>> The same issue happened during 2.24.0 and was fixed quickly. We may need to
>>> update the release guide to prevent this error in the future. (/cc +Daniel
>>> Oliveira and +Valentyn Tymofieiev, who fixed the issue for 2.24.0.)
>>>
>>> Ahmet
>>>
>>> On Fri, Oct 16, 2020 at 2:36 PM Robin Qiu  wrote:
>>>
 Hi everyone,
 Please review and vote on the release candidate #1 for the version
 2.25.0, as follows:
 [ ] +1, Approve the release
 [ ] -1, Do not approve the release (please provide specific comments)


 The complete staging area is available for your review, which includes:
 * JIRA release notes [1],
 * the official Apache source release to be deployed to dist.apache.org
 [2], which is signed with the key with fingerprint
 AD70476B9D1AF3EFEC2208165952E71AACAF911D [3],
 * all artifacts to be deployed to the Maven Central Repository [4],
 * source code tag "v2.25.0-RC1" [5],
 * website pull request listing the release [6], publishing the API
 reference manual [7], and the blog post [8].
 * Java artifacts were built with Maven 3.5.3 and OpenJDK/Oracle JDK
 11.0.8.
 * Python artifacts are deployed along with the source release to the
 dist.apache.org [2].
 * Validation sheet with a tab for 2.25.0 release to help with
 validation [9].
 * Docker images published to Docker Hub [10].

 The vote will be open for at least 72 hours. It is adopted by majority
 approval, with at least 3 PMC affirmative votes.

 Thanks,
 Robin

 [1]
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12347147
 [2] https://dist.apache.org/repos/dist/dev/beam/2.25.0/
 [3] https://dist.apache.org/repos/dist/release/beam/KEYS
 [4]
 https://repository.apache.org/content/repositories/orgapachebeam-1139/
 [5] https://github.com/apache/beam/tree/v2.25.0-RC1
 [6] https://github.com/apache/beam/pull/13130
 [7] https://github.com/apache/beam-site/pull/608
 [8] https://github.com/apache/beam/pull/13131
 [9]
 https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1494345946
 [10] https://hub.docker.com/search?q=apache%2Fbeam=image

>>>


Re: [VOTE] Release 2.25.0, release candidate #1

2020-10-19 Thread Robin Qiu
Thank you all for the feedback! I will work on a RC2 to address these
problems.

On Mon, Oct 19, 2020 at 7:38 AM Ismaël Mejía  wrote:

> -1
>
> > * Java artifacts were built with Maven 3.5.3 and OpenJDK/Oracle JDK
> 11.0.8.
>
> As discussed for 2.24.0 RC1, we MUST build Java artifacts with Java 8;
> otherwise we cannot guarantee compatibility with Java 8.
> We should update the release guide to make this explicit for the person
> preparing the release so this does not happen again and eventually include
> some validation for this in the build.
>
> I validated that this is broken the same way as before by running a
> pipeline with the Direct runner using the 2.25.0 jars inside a Java 8 Docker
> container. The exception is the same.
>
> 2020-10-19 16:14:23,427 [direct-runner-worker] ERROR
> org.apache.beam.runners.direct.DirectTransformExecutor  - Error occurred
> within org.apache.beam.runners.direct.DirectTransformExecutor@6babef80
> java.lang.NoSuchMethodError:
> java.nio.ByteBuffer.clear()Ljava/nio/ByteBuffer;
> at
> org.apache.beam.sdk.util.BufferedElementCountingOutputStream.outputBuffer(BufferedElementCountingOutputStream.java:197)
> at
> org.apache.beam.sdk.util.BufferedElementCountingOutputStream.flush(BufferedElementCountingOutputStream.java:180)
> at
> org.apache.beam.sdk.util.BufferedElementCountingOutputStream.finish(BufferedElementCountingOutputStream.java:119)
> at
> org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:127)
> at
> org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:60)
> at org.apache.beam.sdk.coders.Coder.encode(Coder.java:136)
>
>
> On Sat, Oct 17, 2020 at 3:57 AM Ahmet Altay  wrote:
>
>> I verified python quickstarts. There is a minor issue and I will update
>> my vote after that.
>>
>> Python batch pipelines on Dataflow are failing with the following error:
>> "RuntimeError: Beam SDK base version 2.25.0 does not match Dataflow Python
>> worker version 2.25.0.dev. Please check Dataflow worker startup logs and
>> make sure that correct version of Beam SDK is installed."
>>
>> The same issue happened during 2.24.0 and was fixed quickly. We may need to
>> update the release guide to prevent this error in the future. (/cc +Daniel
>> Oliveira and +Valentyn Tymofieiev, who fixed the issue for 2.24.0.)
>>
>> Ahmet
>>
>> On Fri, Oct 16, 2020 at 2:36 PM Robin Qiu  wrote:
>>
>>> Hi everyone,
>>> Please review and vote on the release candidate #1 for the version
>>> 2.25.0, as follows:
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>
>>>
>>> The complete staging area is available for your review, which includes:
>>> * JIRA release notes [1],
>>> * the official Apache source release to be deployed to dist.apache.org
>>> [2], which is signed with the key with fingerprint
>>> AD70476B9D1AF3EFEC2208165952E71AACAF911D [3],
>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> * source code tag "v2.25.0-RC1" [5],
>>> * website pull request listing the release [6], publishing the API
>>> reference manual [7], and the blog post [8].
>>> * Java artifacts were built with Maven 3.5.3 and OpenJDK/Oracle JDK
>>> 11.0.8.
>>> * Python artifacts are deployed along with the source release to the
>>> dist.apache.org [2].
>>> * Validation sheet with a tab for 2.25.0 release to help with validation
>>> [9].
>>> * Docker images published to Docker Hub [10].
>>>
>>> The vote will be open for at least 72 hours. It is adopted by majority
>>> approval, with at least 3 PMC affirmative votes.
>>>
>>> Thanks,
>>> Robin
>>>
>>> [1]
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12347147
>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.25.0/
>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>> [4]
>>> https://repository.apache.org/content/repositories/orgapachebeam-1139/
>>> [5] https://github.com/apache/beam/tree/v2.25.0-RC1
>>> [6] https://github.com/apache/beam/pull/13130
>>> [7] https://github.com/apache/beam-site/pull/608
>>> [8] https://github.com/apache/beam/pull/13131
>>> [9]
>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1494345946
>>> [10] https://hub.docker.com/search?q=apache%2Fbeam=image
>>>
>>


Re: Please add me to the mailing list

2020-10-19 Thread Pablo Estrada
Adding Mike - as Luke said, send an e-mail to dev-subscr...@beam.apache.org to
subscribe as per https://beam.apache.org/community/contact-us/


On Mon, Oct 19, 2020 at 9:36 AM Luke Cwik  wrote:

> Send an e-mail to dev-subscr...@beam.apache.org to subscribe as per
> https://beam.apache.org/community/contact-us/
>
> On Mon, Oct 19, 2020 at 9:24 AM Mike Lo  wrote:
>
>> Thanks!
>>
>> Best,
>> Mike
>>
>> PhD, Bioengineering
>> San Francisco Bay Area
>> Mobile: 510-710-4906
>>
>


Re: [RESULT] [VOTE] JupyterLab Sidepanel extension release v1.0.0 for BEAM-10545 RC #1

2020-10-19 Thread Ning Kang
The package has been published to NPM:
https://www.npmjs.com/package/apache-beam-jupyterlab-sidepanel

Verified that JupyterLab v2+ users can use `jupyter labextension install
apache-beam-jupyterlab-sidepanel` to install it now.

Cheers!

On Mon, Oct 19, 2020 at 11:19 AM Ning Kang  wrote:

> I'm happy to announce that we have unanimously approved this release.
>
> There are 3 approving votes, all of which are binding:
> * Ahmet Altay
> * Pablo Estrada
> * Robert Bradshaw
>
> There are no disapproving votes.
>
> Thanks everyone!
>
>
> On Fri, Oct 16, 2020 at 2:05 PM Robert Bradshaw 
> wrote:
>
>> Thanks, Ning and Ahmet.
>>
>> +1 (binding) Approve the release.
>>
>> On Fri, Oct 16, 2020 at 1:34 PM Ning Kang  wrote:
>>
>>> Sorry, if you cannot see the missing thread history in the previous
>>> thread, here is another copy:
>>>
>>> On Fri, Oct 16, 2020 at 9:55 AM Robert Bradshaw 
>>> wrote:
>>>
 Thanks.

 +1 (binding) to this release.

 On Thu, Oct 15, 2020 at 7:06 PM Ahmet Altay  wrote:

> Here you go:
> https://dist.apache.org/repos/dist/dev/beam/extensions/jupyterlab-sidepanel/v1.0.0/
>
> On Thu, Oct 15, 2020 at 5:11 PM Robert Bradshaw 
> wrote:
>
>> If we can stage the sources to dist/dev that sounds good to me.
>>
>> On Thu, Oct 15, 2020 at 4:57 PM Ning Kang  wrote:
>>
>>> +1 to Ahmet's suggestion.
>>>
>>> I've taken a look at the process used by vendored artifacts and summarized
>>> the commands below to stage the source code to dist/dev:
>>>
>>> extension=jupyterlab-sidepanel
>>>
>>> version=v1.0.0
>>>
>>> tag=${extension}-${version}
>>>
>>>
>>> svn co https://dist.apache.org/repos/dist/dev/beam
>>>
>>> mkdir -p beam/extensions/${extension}/${version}
>>>
>>> pushd beam/extensions/${extension}/${version}
>>>
>>> curl -L -o apache-beam-${tag}-source-release.zip https://github.com/apache/beam/archive/${tag}.zip
>>>
>>> gpg --armor --detach-sig apache-beam-${tag}-source-release.zip
>>>
>>> sha512sum apache-beam-${tag}-source-release.zip > apache-beam-${tag}-source-release.zip.sha512
>>>
>>> # If the sha512sum command is not found: on mac, run brew install coreutils;
>>>
>>> # on linux, run apt-get install coreutils
>>>
>>> popd
>>>
>>> pushd beam
>>>
>>> # For the first time adding the directory with its contents
>>>
>>> svn add extensions
>>>
>>> # For future versions, use below to add
>>>
>>> # svn add extensions/${extension}/${version}
>>>
>>> svn commit
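
For reviewers who want to double-check the staged `.sha512` file, the checksum line can be recomputed independently. The sketch below uses Python's standard `hashlib` and assumes the default `sha512sum` output layout of hex digest, two spaces, then the file name; the function name is hypothetical:

```python
import hashlib


def sha512_line(path):
    """Return a line in the default `sha512sum` layout: '<hex digest>  <path>'."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return f"{digest.hexdigest()}  {path}"
```

Comparing the returned line with the contents of the `.sha512` file staged next to the archive confirms that the download was not corrupted.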
>>>
>>> Please feel free to comment on the directory structure.
>>>
>>> Ahmet, if everything looks good, could you please help me execute
>>> the commands with your GPG key to stage the source to dist/dev?
>>> And once we publish the extension to NPM, we'll move the source from
>>> dist/dev to dist/release, following the same process as vendored artifact
>>> releases.
>>>
>>> I'll document the release process with release history in the Beam
>>> repo once the release is done.
>>>
>>> Thanks!
>>>
>>> On Thu, Oct 15, 2020 at 1:06 PM Ahmet Altay 
>>> wrote:
>>>
 This is similar to the vendored dependencies release. For that we
 vote on the artifacts, commit hash, and staged source distribution on
 dist/dev [1]. And then the same source distribution is promoted to
 dist/release [2]. We can follow the same process and stage a source
 distribution to dist.

 [1]
 https://lists.apache.org/thread.html/rea4a27c47529a27936ab2c51162c8e532b8b625c4d70c4f7f485c7cd%40%3Cdev.beam.apache.org%3E
 [2] https://dist.apache.org/repos/dist/release/beam/vendor/

 On Thu, Oct 15, 2020 at 12:17 PM Robert Bradshaw <
 rober...@google.com> wrote:

> I'm thinking specifically of
>
>
> https://incubator.apache.org/guides/distribution.html#release_platforms
>
> In addition to the Apache mirror system incubating projects may
> distribute artifacts on other platforms as long as they follow these
> general guidelines:
> * Source releases must be placed in the Apache mirror system.
>
>
> On Thu, Oct 15, 2020 at 12:05 PM Ning Kang 
> wrote:
>
>> Thanks Robert, I didn't know this document existed.
>>
>> Looks like the only thing potentially missing is the incubation
>> disclaimer.
>> NPM should be the only 

[RESULT] [VOTE] JupyterLab Sidepanel extension release v1.0.0 for BEAM-10545 RC #1

2020-10-19 Thread Ning Kang
I'm happy to announce that we have unanimously approved this release.

There are 3 approving votes, all of which are binding:
* Ahmet Altay
* Pablo Estrada
* Robert Bradshaw

There are no disapproving votes.

Thanks everyone!


On Fri, Oct 16, 2020 at 2:05 PM Robert Bradshaw  wrote:

> Thanks, Ning and Ahmet.
>
> +1 (binding) Approve the release.
>
> On Fri, Oct 16, 2020 at 1:34 PM Ning Kang  wrote:
>
>> Sorry, if you cannot see the missing thread history in the previous
>> thread, here is another copy:
>>
>> On Fri, Oct 16, 2020 at 9:55 AM Robert Bradshaw 
>> wrote:
>>
>>> Thanks.
>>>
>>> +1 (binding) to this release.
>>>
>>> On Thu, Oct 15, 2020 at 7:06 PM Ahmet Altay  wrote:
>>>
 Here you go:
 https://dist.apache.org/repos/dist/dev/beam/extensions/jupyterlab-sidepanel/v1.0.0/

 On Thu, Oct 15, 2020 at 5:11 PM Robert Bradshaw 
 wrote:

> If we can stage the sources to dist/dev that sounds good to me.
>
> On Thu, Oct 15, 2020 at 4:57 PM Ning Kang  wrote:
>
>> +1 to Ahmet's suggestion.
>>
>> I've taken a look at the process used by vendored artifacts and summarized
>> the commands below to stage the source code to dist/dev:
>>
>> extension=jupyterlab-sidepanel
>>
>> version=v1.0.0
>>
>> tag=${extension}-${version}
>>
>>
>> svn co https://dist.apache.org/repos/dist/dev/beam
>>
>> mkdir -p beam/extensions/${extension}/${version}
>>
>> pushd beam/extensions/${extension}/${version}
>>
>> curl -L -o apache-beam-${tag}-source-release.zip https://github.com/apache/beam/archive/${tag}.zip
>>
>> gpg --armor --detach-sig apache-beam-${tag}-source-release.zip
>>
>> sha512sum apache-beam-${tag}-source-release.zip > apache-beam-${tag}-source-release.zip.sha512
>>
>> # If the sha512sum command is not found: on mac, run brew install coreutils;
>>
>> # on linux, run apt-get install coreutils
>>
>> popd
>>
>> pushd beam
>>
>> # For the first time adding the directory with its contents
>>
>> svn add extensions
>>
>> # For future versions, use below to add
>>
>> # svn add extensions/${extension}/${version}
>>
>> svn commit
>>
>> Please feel free to comment on the directory structure.
>>
>> Ahmet, if everything looks good, could you please help me execute the
>> commands with your GPG key to stage the source to dist/dev?
>> And once we publish the extension to NPM, we'll move the source from
>> dist/dev to dist/release, following the same process as vendored artifact
>> releases.
>>
>> I'll document the release process with release history in the Beam
>> repo once the release is done.
>>
>> Thanks!
>>
>> On Thu, Oct 15, 2020 at 1:06 PM Ahmet Altay  wrote:
>>
>>> This is similar to the vendored dependencies release. For that we
>>> vote on the artifacts, commit hash, and staged source distribution on
>>> dist/dev [1]. And then the same source distribution is promoted to
>>> dist/release [2]. We can follow the same process and stage a source
>>> distribution to dist.
>>>
>>> [1]
>>> https://lists.apache.org/thread.html/rea4a27c47529a27936ab2c51162c8e532b8b625c4d70c4f7f485c7cd%40%3Cdev.beam.apache.org%3E
>>> [2] https://dist.apache.org/repos/dist/release/beam/vendor/
>>>
>>> On Thu, Oct 15, 2020 at 12:17 PM Robert Bradshaw <
>>> rober...@google.com> wrote:
>>>
 I'm thinking specifically of


 https://incubator.apache.org/guides/distribution.html#release_platforms

 In addition to the Apache mirror system incubating projects may
 distribute artifacts on other platforms as long as they follow these
 general guidelines:
 * Source releases must be placed in the Apache mirror system.


 On Thu, Oct 15, 2020 at 12:05 PM Ning Kang 
 wrote:

> Thanks Robert, I didn't know this document existed.
>
> Looks like the only thing potentially missing is the incubation
> disclaimer.
> NPM should be the only channel for distribution. And normally, a
> jupyter user would install extensions through `jupyter` commands. They
> wouldn't even use the `npm` command directly.
>
> Looking at
> https://incubator.apache.org/guides/branding.html#disclaimers,
> since this extension is part of Beam and we are not incubating
> something new, we might not even need an incubation disclaimer.

Re: [DISCUSS][BEAM-10670] Migrating BoundedSource/UnboundedSource to execute as a Splittable DoFn for non-portable Java runners

2020-10-19 Thread Robert Burke
+1 to that. The programming guide is generally assumed to be up to date,
which can't be said for arbitrary blog posts. It is likely more discoverable too.

On Mon, Oct 19, 2020, 10:17 AM Luke Cwik  wrote:

> +Rose Nguyen suggested that instead of just a blog,
> we should add the majority of the current blog's content to the core
> programming guide and either drop the blog or have a much smaller blog
> that links to the docs.
>
> I think this is a great idea, what do others think?
>
> On Wed, Oct 14, 2020 at 10:51 AM Luke Cwik  wrote:
>
>> Thanks Alexey, that is correct.
>>
>> On Wed, Oct 14, 2020 at 10:33 AM Alexey Romanenko <
>> aromanenko@gmail.com> wrote:
>>
>>> Thanks Luke, though I guess the proper link should be this one:
>>>
>>> https://docs.google.com/document/d/1kpn0RxqZaoacUPVSMYhhnfmlo8fGT-p50fEblaFr2HE
>>>
>>> On 13 Oct 2020, at 00:23, Luke Cwik  wrote:
>>>
>>> I have a draft[1] of the blog ready. Please take a look.
>>>
>>> 1:
>>> http://doc/1kpn0RxqZaoacUPVSMYhhnfmlo8fGT-p50fEblaFr2HE#heading=h.tbab2n97o3eo
>>>
>>> On Mon, Oct 5, 2020 at 4:28 PM Luke Cwik  wrote:
>>>


 On Mon, Oct 5, 2020 at 3:45 PM Kenneth Knowles  wrote:

>
>
> On Mon, Oct 5, 2020 at 2:44 PM Luke Cwik  wrote:
>
>> For the 2.25 release the Java Direct, Flink, Jet, Samza, and Twister2
>> runners will use SDF-powered Read transforms. Users can opt out
>> with --experiments=use_deprecated_read.
>>
>
> Huzzah! In our release notes maybe be clear about the expectations for
> users:
>
> Done in https://github.com/apache/beam/pull/13015


>  - semantics are expected to be the same: file bugs for any change in
> results
>  - perf may vary: file bugs or write to user@
>
>> I was unable to get Spark done for 2.25 as I found out that Spark
>> streaming doesn't support watermark holds[1]. If someone knows more about
>> the watermark system in Spark I could use some guidance here, as I
>> believe I have a version of unbounded SDF support written for Spark (I get
>> all the expected output from tests, just that watermarks aren't being held
>> back so PAssert fails).
>>
>
> Spark's watermarks are not comparable to Beam's. The rule as I
> understand it is that any data that is later than `max(seen timestamps) -
> allowedLateness` is dropped. One difference is that dropping is relative
> to the watermark instead of expiring windows, like early versions of Beam.
> The other difference is that it tracks the latest event (some call it a
> "high water mark" because it is the highest datetime value seen) where
> Beam's watermark is an approximation of the earliest (some call it a "low
> water mark" because it is a guarantee that it will not dip lower). When I
> chatted about this with Amit in the early days, it was necessary to
> implement a Beam-style watermark using Spark state. I think that may still
> be the case, for correct results.
>
>
 In the Spark implementation I saw that watermark holds weren't wired at
 all to control Spark's watermarks and this was causing triggers to fire too
 early.


>> Also, I started a doc[2] to produce an updated blog post since the
>> original SplittableDoFn blog from 2017 is out of date[3]. I was thinking
>> of making this a new blog post and having the old blog post point to it.
>> We could also remove the old blog post and/or update it. Any thoughts?
>>
>
> New blog post w/ pointer from the old one.
>
> Finally, I have a clean-up PR[4] that pushes the Read -> primitive
>> Read expansion into each of the runners instead of having it within Read
>> transform within beam-sdks-java-core.
>>
>
> Approved! I did CC a bunch of runner authors already. I think the
> important thing is if a default changes we should be sure everyone is OK
> with the perf changes, and everyone is confident that no incorrect results
> are produced. The abstractions between sdk-core, runners-core-*, and
> individual runners is important to me:
>
>  - The SDK's job is to produce a portable, un-tweaked pipeline so
> moving flags out of SDK core (and IOs) ASAP is super important.
>  - The runner's job is to execute that pipeline, if they can, however
> they want. If a runner wants to run Read transforms differently/directly
> that is fine. If a runner is incapable of supporting SDF, then Read is
> better than nothing. Etc.
>  - The runners-core-* job is to just be internal libraries for runner
> authors to share code, and should not make any decisions about the Beam
> model, etc.
>
> Kenn
>
> 1: https://github.com/apache/beam/pull/12603
>> 2: http://doc/1kpn0RxqZaoacUPVSMYhhnfmlo8fGT-p50fEblaFr2HE
>> 3: https://beam.apache.org/blog/splittable-do-fn/
>> 4: 

Re: [DISCUSS][BEAM-10670] Migrating BoundedSource/UnboundedSource to execute as a Splittable DoFn for non-portable Java runners

2020-10-19 Thread Luke Cwik
+Rose Nguyen suggested that instead of just a blog,
we should add the majority of the current blog's content to the core
programming guide and either drop the blog or have a much smaller blog
that links to the docs.

I think this is a great idea, what do others think?

On Wed, Oct 14, 2020 at 10:51 AM Luke Cwik  wrote:

> Thanks Alexey, that is correct.
>
> On Wed, Oct 14, 2020 at 10:33 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
>
>> Thanks Luke, though I guess the proper link should be this one:
>>
>> https://docs.google.com/document/d/1kpn0RxqZaoacUPVSMYhhnfmlo8fGT-p50fEblaFr2HE
>>
>> On 13 Oct 2020, at 00:23, Luke Cwik  wrote:
>>
>> I have a draft[1] of the blog ready. Please take a look.
>>
>> 1:
>> http://doc/1kpn0RxqZaoacUPVSMYhhnfmlo8fGT-p50fEblaFr2HE#heading=h.tbab2n97o3eo
>>
>> On Mon, Oct 5, 2020 at 4:28 PM Luke Cwik  wrote:
>>
>>>
>>>
>>> On Mon, Oct 5, 2020 at 3:45 PM Kenneth Knowles  wrote:
>>>


 On Mon, Oct 5, 2020 at 2:44 PM Luke Cwik  wrote:

> For the 2.25 release the Java Direct, Flink, Jet, Samza, and Twister2
> runners will use SDF-powered Read transforms. Users can opt out
> with --experiments=use_deprecated_read.
>

 Huzzah! In our release notes maybe be clear about the expectations for
 users:

 Done in https://github.com/apache/beam/pull/13015
>>>
>>>
  - semantics are expected to be the same: file bugs for any change in
 results
  - perf may vary: file bugs or write to user@

> I was unable to get Spark done for 2.25 as I found out that Spark
> streaming doesn't support watermark holds[1]. If someone knows more about
> the watermark system in Spark I could use some guidance here, as I
> believe I have a version of unbounded SDF support written for Spark (I get
> all the expected output from tests, just that watermarks aren't being held
> back so PAssert fails).
>

 Spark's watermarks are not comparable to Beam's. The rule as I
 understand it is that any data that is later than `max(seen timestamps) -
 allowedLateness` is dropped. One difference is that dropping is relative to
 the watermark instead of expiring windows, like early versions of Beam. The
 other difference is that it tracks the latest event (some call it a "high
 water mark" because it is the highest datetime value seen) where Beam's
 watermark is an approximation of the earliest (some call it a "low water
 mark" because it is a guarantee that it will not dip lower). When I chatted
 about this with Amit in the early days, it was necessary to implement a
 Beam-style watermark using Spark state. I think that may still be the case,
 for correct results.
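
The contrast described above can be sketched with plain numbers standing in for timestamps. This is an illustration only: it reads "later than" as "event time older than the cutoff", it models the low water mark as the minimum timestamp among elements still pending (a simplification made here, not Beam's actual watermark computation), and all function names are hypothetical:

```python
def high_water_mark_cutoff(seen_timestamps, allowed_lateness):
    """Cutoff derived from the latest event seen so far (a "high water mark")."""
    return max(seen_timestamps) - allowed_lateness


def is_dropped(event_timestamp, seen_timestamps, allowed_lateness):
    """An event is dropped if it is older than max(seen timestamps) - allowedLateness."""
    return event_timestamp < high_water_mark_cutoff(seen_timestamps, allowed_lateness)


def low_water_mark(pending_timestamps):
    """Simplified "low water mark": the earliest timestamp among pending elements,
    so nothing earlier than this value is still expected."""
    return min(pending_timestamps)


if __name__ == "__main__":
    seen = [10, 50, 30]
    print(high_water_mark_cutoff(seen, 15))  # 50 - 15
    print(is_dropped(30, seen, 15))
    print(is_dropped(40, seen, 15))
    print(low_water_mark([30, 40]))
```

The high water mark only moves forward as newer events arrive, while the low water mark is held back by the oldest pending element, which is what a watermark hold relies on.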


>>> In the Spark implementation I saw that watermark holds weren't wired at
>>> all to control Spark's watermarks and this was causing triggers to fire too
>>> early.
>>>
>>>
> Also, I started a doc[2] to produce an updated blog post since the
> original SplittableDoFn blog from 2017 is out of date[3]. I was thinking
> of making this a new blog post and having the old blog post point to it.
> We could also remove the old blog post and/or update it. Any thoughts?
>

 New blog post w/ pointer from the old one.

 Finally, I have a clean-up PR[4] that pushes the Read -> primitive Read
> expansion into each of the runners instead of having it within Read
> transform within beam-sdks-java-core.
>

 Approved! I did CC a bunch of runner authors already. I think the
 important thing is if a default changes we should be sure everyone is OK
 with the perf changes, and everyone is confident that no incorrect results
 are produced. The abstractions between sdk-core, runners-core-*, and
 individual runners is important to me:

  - The SDK's job is to produce a portable, un-tweaked pipeline so
 moving flags out of SDK core (and IOs) ASAP is super important.
  - The runner's job is to execute that pipeline, if they can, however
 they want. If a runner wants to run Read transforms differently/directly
 that is fine. If a runner is incapable of supporting SDF, then Read is
 better than nothing. Etc.
  - The runners-core-* job is to just be internal libraries for runner
 authors to share code, and should not make any decisions about the Beam
 model, etc.

 Kenn

 1: https://github.com/apache/beam/pull/12603
> 2: http://doc/1kpn0RxqZaoacUPVSMYhhnfmlo8fGT-p50fEblaFr2HE
> 3: https://beam.apache.org/blog/splittable-do-fn/
> 4: https://github.com/apache/beam/pull/13006
>
>
> On Fri, Aug 28, 2020 at 1:45 AM Maximilian Michels 
> wrote:
>
>> Thanks Luke! I've had a pass.
>>
>> -Max
>>
>> On 28.08.20 01:22, Luke Cwik wrote:
>> > As an update.
>> >
>> > Direct and Twister2 are done.
>> > Samza: is ready for review[1].
>> > Flink: 

Re: [Proposal] Website Revamp project

2020-10-19 Thread Robert Bradshaw
Welcome. Looking forward to the website improvements.

I would like to call out that it'd be a good idea to have a plan for how
this can be developed/reviewed incrementally. We were able to get the last
major website change in, but huge monolithic PRs that change the world are
difficult to review and integrate, so it'd be good to have a plan upfront
(learning from our experience last time).

On Fri, Oct 16, 2020 at 1:24 PM Alexey Romanenko 
wrote:

> Welcome to Beam!
>
> Regards,
> Alexey
>
> On 15 Oct 2020, at 16:06, Nam Bui  wrote:
>
> Hi everyone,
>
> I'm Nam. I used to work on Apache Beam website migration. From now on, I
> will be responsible for website development. Thus, I will implement new
> designs and some of the functionalities on our website. Great to be working
> with you!
>
> Best regards,
> Nam
>
>
>
> On Thu, Oct 15, 2020 at 8:42 PM Kasia Zając  wrote:
>
>> Hi all,
>>
>> I am Kasia and I will be responsible for the UX side of things in the
>> project. I will be preparing the wireframes and also asking you for
>> feedback later in the process. Happy to join the community at least for a
>> short amount of time :)
>>
>> Have a good day,
>> Kasia Zając
>> Product Owner
>> +48 664 115 440 <+48664115440>
>> ka...@utilodesign.com
>>
>> www.utilodesign.com
>>
>>
>> On Thu, Oct 15, 2020 at 2:12 PM Agnieszka Sell <
>> agnieszka.s...@polidea.com> wrote:
>>
>>> Hi Everyone,
>>>
>>> I'm Agnieszka Sell, Project Manager in the Beam website revamp project.
>>>
>>> I'm looking forward to getting your feedback on design and development
>>> progress we'll be sharing with you in upcoming weeks.
>>>
>>> Best,
>>>
>>> Agnieszka
>>>
>>> On Wed, Oct 14, 2020 at 7:06 PM Gris Cuevas  wrote:
>>>
 Hi Everyone,

 We're ready to start work on the revamp of the website, we'll use the
 PRD shared in this thread previously.

 Polidea will be the team working on this revamp and we'll be bringing
 designs and proposals to the community for review as the project
 progresses.

 Thank you!
 Gris

 On 2020/09/09 18:14:18, Gris Cuevas  wrote:
 > Hi Beam community,
 >
 > In a previous thread [1] I mentioned that I was going to work on
 a product requirements document (PRD) for a project to address some of the
 requests and ideas collected by Aizhamal, Rose and David in previous
 efforts.
 >
 > The PRD is ready [2], and I'd like to get your feedback before moving
 forward into implementation. Please add your comments by Sunday, September
 13, 2020.
 >
 > We're looking to start work on this project in around 2 weeks.
 >
 > Thank you!
 > Gris
 >
 > [1]
 https://lists.apache.org/thread.html/r1a4cea1e8b53bef73c49f75df13956024d8d78bc81b36e54ef71f8a9%40%3Cdev.beam.apache.org%3E
 >
 > [2] https://s.apache.org/beam-site-revamp
 >

>>>
>>>
>>> --
>>> Agnieszka Sell
>>> Polidea  | Project Manager
>>> M: *+48 504 901 334* <+48504901334>
>>> E: agnieszka.s...@polidea.com
>>>
>>
>


Re: Please add me to the mailing list

2020-10-19 Thread Luke Cwik
Send an e-mail to dev-subscr...@beam.apache.org to subscribe as per
https://beam.apache.org/community/contact-us/

On Mon, Oct 19, 2020 at 9:24 AM Mike Lo  wrote:

> Thanks!
>
> Best,
> Mike
>
> PhD, Bioengineering
> San Francisco Bay Area
> Mobile: 510-710-4906 <(510)%20710-4906>
>


Please add me to the mailing list

2020-10-19 Thread Mike Lo
Thanks!

Best,
Mike

PhD, Bioengineering
San Francisco Bay Area
Mobile: 510-710-4906



Re: [VOTE] Release 2.25.0, release candidate #1

2020-10-19 Thread Ismaël Mejía
-1

> * Java artifacts were built with Maven 3.5.3 and OpenJDK/Oracle JDK
> 11.0.8.

As per the discussion on 2.24.0 RC1, we MUST build Java artifacts with Java 8;
otherwise we will not have guaranteed compatibility with Java 8.
We should update the release guide to make this explicit for the person
preparing the release so this does not happen again and eventually include
some validation for this in the build.

I validated that this is broken the same way as before by running a
pipeline with the Direct runner using the 2.25.0 jars inside a Java 8 Docker
container. The exception is the same:

2020-10-19 16:14:23,427 [direct-runner-worker] ERROR org.apache.beam.runners.direct.DirectTransformExecutor - Error occurred within org.apache.beam.runners.direct.DirectTransformExecutor@6babef80
java.lang.NoSuchMethodError: java.nio.ByteBuffer.clear()Ljava/nio/ByteBuffer;
    at org.apache.beam.sdk.util.BufferedElementCountingOutputStream.outputBuffer(BufferedElementCountingOutputStream.java:197)
    at org.apache.beam.sdk.util.BufferedElementCountingOutputStream.flush(BufferedElementCountingOutputStream.java:180)
    at org.apache.beam.sdk.util.BufferedElementCountingOutputStream.finish(BufferedElementCountingOutputStream.java:119)
    at org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:127)
    at org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:60)
    at org.apache.beam.sdk.coders.Coder.encode(Coder.java:136)
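
For context, a common cause of this particular NoSuchMethodError is compiling on JDK 9 or later without restricting the build to the Java 8 API. Since Java 9, ByteBuffer overrides clear() with a ByteBuffer return type, so the compiled call site refers to a method that a Java 8 runtime does not have. Besides building with Java 8 as proposed above (or using javac's --release 8 option), a source-level workaround is to make the call through java.nio.Buffer. The sketch below shows that workaround only; the class and method names are hypothetical, and it is not necessarily how Beam handles this.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

/** Hypothetical helper illustrating the Buffer-cast workaround; not Beam code. */
final class BufferCompatSketch {

    /**
     * Resets the buffer for reuse. The cast makes javac link the call against
     * Buffer.clear(), which returns Buffer and exists on Java 8 as well as later
     * versions, instead of the ByteBuffer-returning override added in Java 9.
     */
    static ByteBuffer resetForReuse(ByteBuffer buffer) {
        ((Buffer) buffer).clear();
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putInt(42);          // advances the position by four bytes
        resetForReuse(buffer);      // position back to zero, limit back to capacity
        System.out.println("position=" + buffer.position() + " remaining=" + buffer.remaining());
    }
}
```

The same cast applies to the other methods that gained ByteBuffer-returning overrides in Java 9, such as flip(), rewind(), position(int), and limit(int).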


On Sat, Oct 17, 2020 at 3:57 AM Ahmet Altay  wrote:

> I verified python quickstarts. There is a minor issue and I will update my
> vote after that.
>
> Python batch pipelines on Dataflow are failing with the following error:
> "RuntimeError: Beam SDK base version 2.25.0 does not match Dataflow Python
> worker version 2.25.0.dev. Please check Dataflow worker startup logs and
> make sure that correct version of Beam SDK is installed."
>
> Same issue happened during 2.24.0 and was fixed quickly. We may need to
> update the release guide to prevent this error in the future. (/cc +Daniel
> Oliveira and +Valentyn Tymofieiev, who fixed the issue for 2.24.0.)
>
> Ahmet
>
> On Fri, Oct 16, 2020 at 2:36 PM Robin Qiu  wrote:
>
>> Hi everyone,
>> Please review and vote on the release candidate #1 for the version
>> 2.25.0, as follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>>
>>
>> The complete staging area is available for your review, which includes:
>> * JIRA release notes [1],
>> * the official Apache source release to be deployed to dist.apache.org
>> [2], which is signed with the key with fingerprint
>> AD70476B9D1AF3EFEC2208165952E71AACAF911D [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "v2.25.0-RC1" [5],
>> * website pull request listing the release [6], publishing the API
>> reference manual [7], and the blog post [8].
>> * Java artifacts were built with Maven 3.5.3 and OpenJDK/Oracle JDK
>> 11.0.8.
>> * Python artifacts are deployed along with the source release to the
>> dist.apache.org [2].
>> * Validation sheet with a tab for 2.25.0 release to help with validation
>> [9].
>> * Docker images published to Docker Hub [10].
>>
>> The vote will be open for at least 72 hours. It is adopted by majority
>> approval, with at least 3 PMC affirmative votes.
>>
>> Thanks,
>> Robin
>>
>> [1]
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12347147
>> [2] https://dist.apache.org/repos/dist/dev/beam/2.25.0/
>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>> [4]
>> https://repository.apache.org/content/repositories/orgapachebeam-1139/
>> [5] https://github.com/apache/beam/tree/v2.25.0-RC1
>> [6] https://github.com/apache/beam/pull/13130
>> [7] https://github.com/apache/beam-site/pull/608
>> [8] https://github.com/apache/beam/pull/13131
>> [9]
>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1494345946
>> [10] https://hub.docker.com/search?q=apache%2Fbeam=image
>>
>


Beam Dependency Check Report (2020-10-19)

2020-10-19 Thread Apache Jenkins Server

High Priority Dependency Updates Of Beam Python SDK:

Dependency Name | Current Version | Latest Version | Release Date Of The Current Used Version | Release Date Of The Latest Release | JIRA Issue
chromedriver-binary | 86.0.4240.22.0 | 87.0.4280.20.0 | 2020-09-07 | 2020-10-19 | BEAM-10426
google-cloud-bigquery | 1.28.0 | 2.1.0 | 2020-10-05 | 2020-10-12 | BEAM-5537
google-cloud-dlp | 1.0.0 | 2.0.0 | 2020-06-29 | 2020-10-05 | BEAM-10344
google-cloud-pubsub | 1.7.0 | 2.1.0 | 2020-07-20 | 2020-10-05 | BEAM-5539
google-cloud-vision | 1.0.0 | 2.0.0 | 2020-03-24 | 2020-10-05 | BEAM-9581
mock | 2.0.0 | 4.0.2 | 2019-05-20 | 2020-10-05 | BEAM-7369
mypy-protobuf | 1.18 | 1.23 | 2020-03-24 | 2020-06-29 | BEAM-10346
nbconvert | 5.6.1 | 6.0.7 | 2020-10-05 | 2020-10-05 | BEAM-11007
Pillow | 7.2.0 | 8.0.0 | None | 2020-10-19 | BEAM-11071
pyarrow | 0.17.1 | 1.0.1 | 2020-07-27 | 2020-08-24 | BEAM-10582
PyHamcrest | 1.10.1 | 2.0.2 | 2020-01-20 | 2020-07-08 | BEAM-9155
pytest | 4.6.11 | 6.1.1 | 2020-07-08 | 2020-10-05 | BEAM-8606
pytest-xdist | 1.34.0 | 2.1.0 | 2020-08-17 | 2020-08-28 | BEAM-10713
tenacity | 5.1.5 | 6.2.0 | 2019-11-11 | 2020-06-29 | BEAM-8607

High Priority Dependency Updates Of Beam Java SDK:

Dependency Name | Current Version | Latest Version | Release Date Of The Current Used Version | Release Date Of The Latest Release | JIRA Issue
com.datastax.cassandra:cassandra-driver-core | 3.10.2 | 4.0.0 | 2020-08-26 | 2019-03-18 | BEAM-8674
com.esotericsoftware:kryo | 4.0.2 | 5.0.0 | 2018-03-20 | 2020-10-18 | BEAM-5809
com.esotericsoftware.kryo:kryo | 2.21 | 2.24.0 | 2013-02-27 | 2014-05-04 | BEAM-5574
com.github.ben-manes.versions:com.github.ben-manes.versions.gradle.plugin | 0.29.0 | 0.33.0 | 2020-07-20 | 2020-09-14 | BEAM-6645
com.google.api.grpc:grpc-google-cloud-pubsub-v1 | 1.85.1 | 1.90.4 | 2020-03-09 | 2020-10-12 | BEAM-8677
com.google.api.grpc:grpc-google-common-protos | 1.12.0 | 2.0.0 | 2018-06-29 | 2020-10-15 | BEAM-8633
com.google.api.grpc:proto-google-cloud-bigquerystorage-v1beta1 | 0.85.1 | 0.105.5 | 2020-01-08 | 2020-10-09 | BEAM-8678
com.google.api.grpc:proto-google-cloud-bigtable-v2 | 1.9.1 | 1.16.2 | 2020-01-10 | 2020-10-14 | BEAM-8679
com.google.api.grpc:proto-google-cloud-datastore-v1 | 0.85.0 | 0.88.0 | 2019-12-05 | 2020-09-17 | BEAM-8680
com.google.api.grpc:proto-google-cloud-pubsub-v1 | 1.85.1 | 1.90.4 | 2020-03-09 | 2020-10-12 | BEAM-8681
com.google.api.grpc:proto-google-cloud-spanner-admin-database-v1 | 1.59.0 | 2.0.2 | 2020-07-16 | 2020-10-02 | BEAM-8682
com.google.api.grpc:proto-google-common-protos | 1.17.0 | 2.0.0 | 2019-10-04 | 2020-10-15 | BEAM-6899
com.google.apis:google-api-services-bigquery | v2-rev20200719-1.30.10 | v2-rev20201007-1.30.10 | 2020-07-26 | 2020-10-15 | BEAM-8684
com.google.apis:google-api-services-clouddebugger | v2-rev20200501-1.30.10 | v2-rev20200807-1.30.10 | 2020-07-14 | 2020-08-17 | BEAM-8750
com.google.apis:google-api-services-cloudresourcemanager | v1-rev20200720-1.30.10 | v2-rev20200831-1.30.10 | 2020-07-25 | 2020-09-03 | BEAM-8751
com.google.apis:google-api-services-dataflow | v1b3-rev20200713-1.30.10 | v1beta3-rev12-1.20.0 | 2020-07-25 | 2015-04-29 | BEAM-8752
com.google.apis:google-api-services-healthcare | v1beta1-rev20200713-1.30.10 | v1-rev20201006-1.30.10 | 2020-07-24 | 2020-10-14 | BEAM-10349
com.google.apis:google-api-services-pubsub | v1-rev20200713-1.30.10 | v1-rev20200909-1.30.10 | 2020-07-25 | 2020-09-18 | BEAM-8753
com.google.apis:google-api-services-storage | v1-rev20200611-1.30.10 | v1-rev20200927-1.30.10 | 2020-07-10 | 2020-10-03 | BEAM-8754
com.google.auth:google-auth-library-credentials | 0.19.0 | 0.22.0 | 2019-12-13 | 2020-10-14 | BEAM-6478
com.google.auth:google-auth-library-oauth2-http | 0.19.0 | 0.22.0 | 2019-12-13 | 2020-10-14 | BEAM-8685
com.google.auto.service:auto-service | 1.0-rc6 | 1.0-rc7 | 2019-07-16 | 2020-05-13 | BEAM-5541