Re: Structured Logging in python

2024-04-11 Thread Udi Meiri
Hi,

I believe this wasn't implemented for Python (only Java). You can try adding 
structured data (extra keyword) under the key "custom_data" and that might work.
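
A minimal sketch of that suggestion, using only the standard library logging module. The "custom_data" key comes from the message above; the logger name, the field names, and the capturing handler are illustrative assumptions, and whether Dataflow actually surfaces the field in Logs Explorer is not confirmed here.

```python
import logging


class CaptureHandler(logging.Handler):
    """Collects emitted records so the attached fields can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("structured_logging_example")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example's output out of the root logger
capture = CaptureHandler()
logger.addHandler(capture)

# `extra` copies each key onto the LogRecord as an attribute.
logger.info(
    "processed element",
    extra={"custom_data": {"order_id": "1234", "status": "ok"}},
)

record = capture.records[0]
print(record.getMessage(), record.custom_data)
```

The structured payload travels on the LogRecord itself, so any handler that knows about the "custom_data" attribute can serialize it; a handler that does not know about it will silently ignore it.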

On 2024/04/11 17:49:43 Valentyn Tymofieiev wrote:
> Thanks for reaching out. There was a proposal a while back:
> https://s.apache.org/beam-structured-logging
> 
> /cc: @u...@apache.org - do you know the current status?
> 
> Thanks a lot!
> 
> On Thu, Apr 11, 2024 at 8:29 AM Geddy Schellevis 
> wrote:
> 
> > Hi all,
> >
> > I would like to know if it is possible to have structured logging in
> > Dataflow.
> > In the attached file, you can find the code that I am trying to do.
> >
> > I see the logs are appearing in gcp log explorer, but I cannot see the
> > extra fields.
> >
> > Best regards,
> >
> 


Re: Proposal: Structured Logging

2023-03-22 Thread Udi Meiri
The document has been updated.
Please take another look and let me know if there are any objections.

Thanks!

On Wed, Mar 8, 2023 at 1:59 PM Udi Meiri  wrote:

> Hi all,
> I have written a proposal for Structured Logging in Beam:
> https://s.apache.org/beam-structured-logging
>
> Please LMK what you think. Any comments welcome here or in the doc.
>
> - Udi
>


Proposal: Structured Logging

2023-03-08 Thread Udi Meiri
Hi all,
I have written a proposal for Structured Logging in Beam:
https://s.apache.org/beam-structured-logging

Please LMK what you think. Any comments welcome here or in the doc.

- Udi


[ANNOUNCE] Beam 2.33.0 Release

2021-10-13 Thread Udi Meiri
The Apache Beam team is pleased to announce the release of version 2.33.0.

Apache Beam is an open source unified programming model to define and
execute data processing pipelines, including ETL, batch and stream
(continuous) processing.
See https://beam.apache.org

You can download the release here:
https://beam.apache.org/get-started/downloads/

This release includes bug fixes, features, and improvements detailed
on the Beam blog: https://beam.apache.org/blog/beam-2.33.0/

Thank you to everyone who contributed to this release, and we hope you
enjoy using Beam 2.33.0

- Udi, on behalf of the Apache Beam community.


[RESULT] [VOTE] Release 2.33.0, release candidate #2

2021-10-07 Thread Udi Meiri
I'm happy to announce that we have unanimously approved this release.

There are 8 approving votes, 5 of which are binding:
* Ahmet Altay
* Alexey Romanenko
* Robert Bradshaw
* Kenneth Knowles
* Chamikara Jayalath

There are no disapproving votes.

Thanks everyone!


[VOTE] Release 2.33.0, release candidate 2

2021-10-04 Thread Udi Meiri
Hi everyone,
Please review and vote on the release candidate #2 for the version 2.33.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


Reviewers are encouraged to test their own use cases with the release
candidate, and vote +1 if
no issues are found.

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint 587B049C36DAAFE6 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.33.0-RC2" [5],
* website pull request listing the release [6], the blog post [6], and
publishing the API reference manual [7].
* Java artifacts were built with Maven 3.6.3 and OpenJDK 1.8.0_181.
* Python artifacts are deployed along with the source release to the
dist.apache.org [2] and PyPI [8].
* Validation sheet with a tab for 2.33.0 release to help with validation
[9].
* Docker images published to Docker Hub [10].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.
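
The adoption rule above can be sketched as a small check. This is only an illustration: the function name is hypothetical, and counting only binding (PMC) votes toward the majority is an assumption, not something stated in this message.

```python
def release_vote_passes(binding_plus_ones: int, binding_minus_ones: int) -> bool:
    """Return True if a vote meets the rule quoted in the message above.

    Assumption: only binding (PMC) votes count toward the majority.
    """
    has_minimum = binding_plus_ones >= 3  # at least 3 PMC affirmative votes
    has_majority = binding_plus_ones > binding_minus_ones  # majority approval
    return has_minimum and has_majority


print(release_vote_passes(5, 0))
print(release_vote_passes(2, 0))
print(release_vote_passes(3, 4))
```

Non-binding votes are still worth casting and reporting; under the assumption above they just do not change the outcome of the check.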

For guidelines on how to try the release in your projects, check out our
blog post at https://beam.apache.org/blog/validate-beam-release/.

Thanks,
Release Manager

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12350404
[2] https://dist.apache.org/repos/dist/dev/beam/2.33.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1235/
[5] https://github.com/apache/beam/tree/v2.33.0-RC2
[6] https://github.com/apache/beam/pull/15543
[7] https://github.com/apache/beam-site/pull/619
[8] https://pypi.org/project/apache-beam/2.33.0rc2/
[9]
https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1705275493
[10] https://hub.docker.com/search?q=apache%2Fbeam&type=image


Re: [RESULT] [VOTE] Release 2.33.0, release candidate #1

2021-09-27 Thread Udi Meiri
I spoke too soon. We will be doing an RC2.

On Mon, Sep 27, 2021 at 1:29 PM Udi Meiri  wrote:

> I'm happy to announce that we have unanimously approved this release.
>
> There are 8 approving votes, 4 of which are binding:
> * Ahmet Altay
> * Alexey Romanenko
> * Robert Bradshaw
> * Chamikara Jayalath
>
> There are no disapproving votes.
>
> Thanks everyone!
>
>


[RESULT] [VOTE] Release 2.33.0, release candidate #1

2021-09-27 Thread Udi Meiri
I'm happy to announce that we have unanimously approved this release.

There are 8 approving votes, 4 of which are binding:
* Ahmet Altay
* Alexey Romanenko
* Robert Bradshaw
* Chamikara Jayalath

There are no disapproving votes.

Thanks everyone!


[VOTE] Release 2.33.0, release candidate 1

2021-09-21 Thread Udi Meiri
Hi everyone,
Please review and vote on the release candidate #1 for the version 2.33.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


Reviewers are encouraged to test their own use cases with the release
candidate, and vote +1 if
no issues are found.

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint 587B049C36DAAFE6 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.33.0-RC1" [5],
* website pull request listing the release [6], the blog post [6], and
publishing the API reference manual [7].
* Java artifacts were built with Maven 3.6.3 and OpenJDK 1.8.0_181.
* Python artifacts are deployed along with the source release to the
dist.apache.org [2] and PyPI [8].
* Validation sheet with a tab for 2.33.0 release to help with validation
[9].
* Docker images published to Docker Hub [10].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

For guidelines on how to try the release in your projects, check out our
blog post at https://beam.apache.org/blog/validate-beam-release/.

Thanks,
Release Manager

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12350404
[2] https://dist.apache.org/repos/dist/dev/beam/2.33.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1234/
[5] https://github.com/apache/beam/tree/v2.33.0-RC1
[6] https://github.com/apache/beam/pull/15543
[7] https://github.com/apache/beam-site/pull/619
[8] https://pypi.org/project/apache-beam/2.33.0rc1/
[9]
https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1705275493
[10] https://hub.docker.com/search?q=apache%2Fbeam&type=image


Re: [PROPOSAL] Preparing for Beam 2.33.0 Release

2021-09-10 Thread Udi Meiri
The release branch is now passing Jenkins tests:
https://github.com/apache/beam/pull/15431

Thank you all so far for help with release blockers. There are 5 left
<https://issues.apache.org/jira/projects/BEAM/versions/12350404>.

On Thu, Aug 26, 2021 at 7:54 AM Alexey Romanenko 
wrote:

> See my comment below.
>
> On 26 Aug 2021, at 03:44, Ahmet Altay  wrote:
>
> On Wed, Aug 25, 2021 at 1:50 PM Udi Meiri  wrote:
>
>> I've found 2 performance regressions in IO. Could someone triage these?
>>
>
>
>> https://issues.apache.org/jira/browse/BEAM-12801
>>
>
> @Alexey Romanenko  @Tim Robertson
>  - Could you comment on this? It is related
> to HadoopFormatIO.
>
>
>
> I don’t see any code changes for “sdks/java/io/hadoop-format” package
> since June 2nd while the Write performance dropped at Aug 10th for the
> first time.
> I’ll try to reproduce it locally with a direct runner but seems it was
> caused by something else. Any other ideas about this?
>
>
>
>> https://issues.apache.org/jira/browse/BEAM-12800
>>
>
> @Chamikara Jayalath  @Yifan Mai
>  - Do you know who could comment on this?
>
>
>>
>>
>> +14 other blockers currently
>>
>
> I have not looked at the other blockers. It might help to check with the
> assignees. They may not know that they are release blockers.
>
>
>>
>> On Wed, Aug 25, 2021 at 12:22 PM Udi Meiri  wrote:
>>
>>> I'll try cutting the release today. From what I can tell [1], the only
>>> perma-reds are python postcommits.
>>>
>>
> I believe this is fixed now. (Thanks to @Kyle Weaver)
>
>
>>
>>> [1]
>>> https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md
>>>
>>> On Tue, Aug 17, 2021 at 10:12 AM Udi Meiri  wrote:
>>>
>>>> Please mark 2.33.0 release blockers by setting the "Fix version" field
>>>> to 2.33.0. Otherwise, it doesn't appear in the dashboard
>>>> <https://issues.apache.org/jira/projects/BEAM/versions/12350404>.
>>>>
>>>> On Tue, Aug 17, 2021 at 9:32 AM Luke Cwik  wrote:
>>>>
>>>>> +Rui Wang  as the current build monitor.
>>>>>
>>>>> At the time I did not mean to add any additional tests as it was a
>>>>> copy paste error.
>>>>>
>>>>> I do have another fix for the failing Python PreCommit which I fixed
>>>>> since it seemed like a trivial one:
>>>>> https://issues.apache.org/jira/browse/BEAM-12768
>>>>> - test_df_agg_method_invalid_kwarg_raises too strict on error checking.
>>>>> Fix in https://github.com/apache/beam/pull/15341
>>>>>
>>>>> https://issues.apache.org/jira/browse/BEAM-12733  (RecommendationAIIT
>>>>> Java Version)
>>>>> +Matthias Baetens  for the above issue.
>>>>>
>>>>> On Mon, Aug 16, 2021 at 6:54 PM Ahmet Altay  wrote:
>>>>>
>>>>>> Thank you Udi and +1 to the release cut on schedule.
>>>>>>
>>>>>> On Mon, Aug 16, 2021 at 3:53 PM Luke Cwik  wrote:
>>>>>>
>>>>>>> It seems as though several tests were broken due to commits during
>>>>>>> the issue with some existing flakes. It looks like the Python PostCommit
>>>>>>> tests are perma-red due to these failures. So far I'm tracking:
>>>>>>>
>>>>>>
>>>>>> Thank you very much Luke. Adding bug owners here to explicitly call
>>>>>> out release blocking issues.
>>>>>>
>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/BEAM-12764 (new_block method
>>>>>>> missing in pandas)
>>>>>>>
>>>>>>
>>>>>> +Sam Rohde  for the above issue.
>>>>>>
>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/BEAM-12683
>>>>>>> (RecommendationAIIT)
>>>>>>>
>>>>>>
>>>>>> +Matthias Baetens  for the above issue.
>>>>>>
>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/BEAM-12765 (cannot access
>>>>>>> field fruit)
>>>>>>> https://issues.apache.org/jira/browse/BEAM-12766 (Already Exists:
>>>>>>> Dataset apache-beam-testing:python_bq_file_loads_)
>>>>>>>
>>>>>>

Re: One Pager - Test Command Line Discoverability in Beam

2021-05-25 Thread Udi Meiri
My first place to go would be here:
https://cwiki.apache.org/confluence/display/BEAM/Java+Tips (although it
doesn't document your use-case)

You are right that finding the correct gradle task or jenkins job is not
straightforward.


On Tue, May 25, 2021 at 12:48 PM Alex Amato  wrote:

> Friendly ping. I'll wait for more suggestions by the end of the week. Then
> close it out.
>
> ---------- Forwarded message ---------
> From: Alex Amato 
> Date: Fri, May 21, 2021 at 2:54 PM
> Subject: One Pager - Test Command Line Discoverability in Beam
> To: dev 
>
>
> Hi, I have had some issues determining how to run Beam tests. I have
> written a one pager for review and would like your feedback, to solve the
> problem:
>
> "A Beam developer is looking at a test file, such as
> “BigQueryTornadoesIT.java” and wants to run this test. But they do not know
> the command line they need to type to run this test."
>
> I would like your feedback, to get toward a more concrete proposal. A few
> solutions are possible for this, mentioned in the proposal. But any
> solution that makes it very easy to understand how to run the test is a
> viable option as well.
>
> Cheers,
> Alex
>


Re: BEAM-3713: Moving from nose to pytest

2021-04-26 Thread Udi Meiri
I'm about to merge https://github.com/apache/beam/pull/14481, which
converts 4 postcommit suites to pytest.

job_PostCommit_Python
job_PostCommit_Python_ValidatesContainer_Dataflow
job_PostCommit_Python_ValidatesRunner_Dataflow
job_PostCommit_Python_ValidatesRunner_Flink

Please report on the bug if you have issues (tests not running, missing
logs, missing results in jenkins).
If you have an open PR that adds a new IT that should be running in one of
the above suites, please convert your decorators.

Examples (see PR for more):
@attr('IT') -> @pytest.mark.it_postcommit
@attr('ValidatesRunner') -> @pytest.mark.it_validatesrunner
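
A minimal sketch of what the conversion looks like on a test function. The marker names are the ones listed above; the test names and bodies are hypothetical placeholders, and the last lines rely on pytest storing applied marks on the function's `pytestmark` attribute, which is how current pytest versions behave.

```python
import pytest


# Before (nose):  @attr('IT')
# After (pytest):
@pytest.mark.it_postcommit
def test_example_pipeline_it():
    # Hypothetical placeholder body; a real IT would run a pipeline here.
    assert True


# Before (nose):  @attr('ValidatesRunner')
# After (pytest):
@pytest.mark.it_validatesrunner
def test_example_validates_runner():
    assert True


# pytest records applied marks on the function's `pytestmark` attribute.
print([mark.name for mark in test_example_pipeline_it.pytestmark])
print([mark.name for mark in test_example_validates_runner.pytestmark])
```

Markers used this way normally also need to be registered (for example in pytest.ini) so that pytest does not warn about unknown marks during a test run.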


On Thu, Mar 25, 2021 at 10:45 AM Udi Meiri  wrote:

> Hi Benjamin,
>
> AFAIK nose is only used for integration tests (unit tests were converted
> to pytest a while back).
> These ITs should all be running periodically (except maybe the release
> related ones?).
>
> I would start with selecting one of the Jenkins jobs and converting the
> ITs in it to pytest.
> Good place to start:
> https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/
> I would prioritize converting the Python jobs listed here:
> https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md
>
> There's a fairly old abandoned PR with some ideas:
> https://github.com/apache/beam/pull/7949/files
> Have a look at:
> sdks/python/scripts/run_integration_test.sh
> sdks/python/pytest.ini
> sdks/python/conftest.py
>
> My idea in that PR was to replace the nose @attr('IT') decorators with 1
> or more:
> @pytest.mark.it_postcommit,
> @pytest.mark.no_direct,
> etc.
> These decorators tell nose/pytest which tests to run.
> So if I wanted to run post-commit tests on direct runner I would use this
> pytest flag:
> "-m 'it_postcommit and not no_direct'".
>
>
> On Wed, Mar 24, 2021 at 5:41 PM Ahmet Altay  wrote:
>
>> All PRs look either merged or closed.
>>
>> +Udi Meiri  might have more information about the
>> remaining work.
>>
>> On Wed, Mar 24, 2021 at 5:29 PM Benjamin Gonzalez Delgado <
>> benjamin.gonza...@wizeline.com> wrote:
>>
>>> Hi team,
>>> I am planning to work in BEAM-3713
>>> <https://issues.apache.org/jira/browse/BEAM-3713>, but I see there are
>>> PRs related to the task.
>>> Could someone guide me on the work that remains missing regarding the
>>> migration from nose to pytest?
>>> Any guidance on this would be appreciated.
>>>
>>> Thanks!
>>> Benjamin
>>>
>>> *This email and its contents (including any attachments) are being sent
>>> to you on the condition of confidentiality and may be protected by
>>> legal privilege. Access to this email by anyone other than the intended
>>> recipient is unauthorized. If you are not the intended recipient, please
>>> immediately notify the sender by replying to this message and delete the
>>> material immediately from your system. Any further use, dissemination,
>>> distribution or reproduction of this email is strictly prohibited. Further,
>>> no representation is made with respect to any content contained in this
>>> email.*
>>
>>


Re: Codecov Bash Uploader Security Notice

2021-04-15 Thread Udi Meiri
From the notice: "We strongly recommend affected users immediately re-roll
all of their credentials, tokens, or keys located in the environment
variables in their CI processes that used one of Codecov’s Bash Uploaders."


On Thu, Apr 15, 2021 at 11:35 AM Udi Meiri  wrote:

> I got this email: https://about.codecov.io/security-update/
>
> This is where we use codecov:
>
> https://github.com/apache/beam/blob/39923d8f843ecfd3d89443dccc359c14aea8f26f/sdks/python/tox.ini#L105
>
> I'm not sure if this runs the "bash uploader", but we do set
> a CODECOV_TOKEN environment variable.
>


Codecov Bash Uploader Security Notice

2021-04-15 Thread Udi Meiri
I got this email: https://about.codecov.io/security-update/

This is where we use codecov:
https://github.com/apache/beam/blob/39923d8f843ecfd3d89443dccc359c14aea8f26f/sdks/python/tox.ini#L105

I'm not sure if this runs the "bash uploader", but we do set
a CODECOV_TOKEN environment variable.


Re: BEAM-3713: Moving from nose to pytest

2021-03-25 Thread Udi Meiri
Hi Benjamin,

AFAIK nose is only used for integration tests (unit tests were converted to
pytest a while back).
These ITs should all be running periodically (except maybe the release
related ones?).

I would start with selecting one of the Jenkins jobs and converting the ITs
in it to pytest.
Good place to start:
https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/
I would prioritize converting the Python jobs listed here:
https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md

There's a fairly old abandoned PR with some ideas:
https://github.com/apache/beam/pull/7949/files
Have a look at:
sdks/python/scripts/run_integration_test.sh
sdks/python/pytest.ini
sdks/python/conftest.py

My idea in that PR was to replace the nose @attr('IT') decorators with 1 or
more:
@pytest.mark.it_postcommit,
@pytest.mark.no_direct,
etc.
These decorators tell nose/pytest which tests to run.
So if I wanted to run post-commit tests on direct runner I would use this
pytest flag:
"-m 'it_postcommit and not no_direct'".
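
The effect of that marker expression can be modelled in a few lines. This is a toy model of the selection logic, not pytest's own implementation; the test names and their marker sets are hypothetical.

```python
def selected_for_direct_postcommit(marks):
    """Toy model of the expression 'it_postcommit and not no_direct'."""
    return "it_postcommit" in marks and "no_direct" not in marks


# Hypothetical test names with the marks each one carries.
tests = {
    "test_runs_everywhere": {"it_postcommit"},
    "test_needs_dataflow": {"it_postcommit", "no_direct"},
    "test_unit_only": set(),
}

chosen = [
    name for name, marks in tests.items()
    if selected_for_direct_postcommit(marks)
]
print(chosen)
```

Only tests that carry it_postcommit and do not carry no_direct are selected, so a runner-specific exclusion is expressed by adding a mark rather than by maintaining a separate list of tests.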


On Wed, Mar 24, 2021 at 5:41 PM Ahmet Altay  wrote:

> All PRs look either merged or closed.
>
> +Udi Meiri  might have more information about the
> remaining work.
>
> On Wed, Mar 24, 2021 at 5:29 PM Benjamin Gonzalez Delgado <
> benjamin.gonza...@wizeline.com> wrote:
>
>> Hi team,
>> I am planning to work in BEAM-3713
>> <https://issues.apache.org/jira/browse/BEAM-3713>, but I see there are
>> PRs related to the task.
>> Could someone guide me on the work that remains missing regarding the
>> migration from nose to pytest?
>> Any guidance on this would be appreciated.
>>
>> Thanks!
>> Benjamin
>>
>
>


Re: [ANNOUNCE] New PMC Member: Chamikara Jayalath

2021-01-21 Thread Udi Meiri
Congrats Cham!

On Thu, Jan 21, 2021 at 4:25 PM Griselda Cuevas  wrote:

> Congratulations Cham!!! Well deserved :)
>
> On Thu, 21 Jan 2021 at 15:23, Connell O'Callaghan 
> wrote:
>
>> Well done Cham!!! Thank you for all your contributions to date!!!
>>
>>
>> On Thu, Jan 21, 2021 at 3:18 PM Rui Wang  wrote:
>>
>>> Congratulations, Cham!
>>>
>>> -Rui
>>>
>>> On Thu, Jan 21, 2021 at 3:15 PM Robert Bradshaw 
>>> wrote:
>>>
>>>> Congratulations, Cham!
>>>>
>>>> On Thu, Jan 21, 2021 at 3:13 PM Brian Hulette
>>>> wrote:
>>>>
>>>>> Great news, congratulations Cham!
>>>>>
>>>>> On Thu, Jan 21, 2021 at 3:08 PM Robin Qiu  wrote:
>>>>>
>>>>>> Congratulations, Cham!
>>>>>>
>>>>>> On Thu, Jan 21, 2021 at 3:05 PM Tyson Hamilton
>>>>>> wrote:
>>>>>>
>>>>>>> Woo! Congrats Cham!
>>>>>>>
>>>>>>> On Thu, Jan 21, 2021 at 3:02 PM Robert Burke
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Congratulations! That's fantastic news.
>>>>>>>>
>>>>>>>> On Thu, Jan 21, 2021, 2:59 PM Reza Rokni  wrote:
>>>>>>>>
>>>>>>>>> Congratulations!
>>>>>>>>>
>>>>>>>>> On Fri, Jan 22, 2021 at 6:58 AM Ankur Goenka
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Congrats Cham!
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 21, 2021 at 2:57 PM Ahmet Altay
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Please join me and the rest of Beam PMC in welcoming Chamikara
>>>>>>>>>>> Jayalath as our newest PMC member.
>>>>>>>>>>>
>>>>>>>>>>> Cham has been part of the Beam community from its early days and
>>>>>>>>>>> contributed to the project in significant ways, including
>>>>>>>>>>> contributing new features and improvements especially related
>>>>>>>>>>> Beam IOs, advocating for users, and mentoring new community
>>>>>>>>>>> members.
>>>>>>>>>>>
>>>>>>>>>>> Congratulations Cham! And thanks for being a part of Beam!
>>>>>>>>>>>
>>>>>>>>>>> Ahmet
>>>>>>>>>>>
>>>>>>>>>>


Re: Jenkins trigger phrase "run seed job" not working?

2020-11-09 Thread Udi Meiri
Yeah, I've had that issue recently:
https://github.com/apache/beam/pull/13213 (seed job did not trigger)

I've personally never seen where this is configured, but it sounds like it.
What happens if you edit a file in the PR? It should make you a co-author.

On Mon, Nov 9, 2020 at 3:36 AM Kamil Wasilewski <
kamil.wasilew...@polidea.com> wrote:

> Has anyone noticed problems with running "run seed job" by committers when
> the author of the Pull Request is NOT a committer? For example:
> https://github.com/apache/beam/pull/13242. Neither I nor Valentyn could
> trigger the job. Does it mean that it's an author's username that really
> matters, not a commentator's username?
>


Re: [PROPOSAL] Preparing for Beam 2.26.0 release

2020-10-27 Thread Udi Meiri
+1 sg!

On Tue, Oct 27, 2020 at 10:02 AM Robert Burke  wrote:

> Hello everyone!
>
> The next Beam release (2.26.0) is scheduled to be cut on November 4th
> according to the release calendar [1].
>
> I'd like to volunteer myself to handle this release. I plan on cutting the
> branch on November 5th (since I've had November 4th booked off for months
> now) and cherry-picking in release-blocking fixes
> afterwards. So unresolved release blocking JIRA issues should have
> their "Fix Version/s" marked as "2.26.0".
>
> Any comments or objections?
>
> Thanks,
> Robert Burke
> @lostluck
> [1]
> https://calendar.google.com/calendar/u/0/embed?src=0p73sl034k80oob7seouani...@group.calendar.google.com
>


Re: Shutting down Perfkit Explorer

2020-09-22 Thread Udi Meiri
Thanks, Tyson!

On Tue, Sep 22, 2020 at 11:11 AM Tyson Hamilton  wrote:

> I re-enabled the AppEngine app. Today that app has both the required
> datastore app and the perfkit app baked into the container image. What
> should happen, is that the perfkit app is removed from that image, but the
> datastore related stuff remains functional.
>
> On Tue, Sep 22, 2020 at 10:37 AM Udi Meiri  wrote:
>
>> Is it possible to create a simple "hello world" application instead?
>>
>> On Tue, Sep 22, 2020 at 10:35 AM Udi Meiri  wrote:
>>
>>> Disabling this broke our Datastore ITs. Apparently you must have an
>>> application for Datastore to work. From the Datastore dashboard:
>>> The project apache-beam-testing does not exist or it does not contain an
>>> active Cloud Datastore or Cloud Firestore database. Please visit
>>> http://console.cloud.google.com to create a project or
>>> https://console.cloud.google.com/datastore/setup?project=apache-beam-testing
>>> to add a Cloud Datastore or Cloud Firestore database. Note that Cloud
>>> Datastore or Cloud Firestore always have an associated App Engine app and
>>> this app must not be disabled.
>>> New failure:
>>> https://ci-beam.apache.org/job/beam_PostCommit_Python36/2959/
>>>
>>>
>>>
>>> On Tue, Sep 22, 2020 at 2:16 AM Kamil Wasilewski <
>>> kamil.wasilew...@polidea.com> wrote:
>>>
>>>> Thanks. The application has been disabled.
>>>>
>>>> On Fri, Sep 18, 2020 at 8:46 PM Ahmet Altay  wrote:
>>>>
>>>>> +1. Thank you for the cleanup.
>>>>>
>>>>> On Fri, Sep 18, 2020 at 8:24 AM Tyson Hamilton 
>>>>> wrote:
>>>>>
>>>>>> +1 to removing, thank you Kamil.
>>>>>>
>>>>>> On Fri, Sep 18, 2020 at 6:05 AM Kamil Wasilewski <
>>>>>> kamil.wasilew...@polidea.com> wrote:
>>>>>>
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> Beam support for Python 2 is coming to an end. Consequently, we
>>>>>>> should make sure no Python 2 applications are running as a part of
>>>>>>> Beam's infrastructure. As you may know, Beam is still hosting a
>>>>>>> Python 2 application on Google App Engine. This application is
>>>>>>> Perfkit Explorer [1].
>>>>>>>
>>>>>>> Perfkit Explorer has been used as a dashboarding tool for a long
>>>>>>> time. A few months ago, it was deprecated in favour of new Grafana
>>>>>>> dashboards [2].
>>>>>>>
>>>>>>> Perfkit Explorer doesn't support Python 3, so a viable solution to
>>>>>>> the problem is to shut down Perfkit Explorer completely. That would also
>>>>>>> reduce costs, because at the moment we have two similar applications
>>>>>>> doing the same thing.
>>>>>>>
>>>>>>> What do you think?
>>>>>>> If nobody has any objections, I'd like to shut down Perfkit Explorer
>>>>>>> next week.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Kamil
>>>>>>>
>>>>>>> [1] https://apache-beam-testing.appspot.com
>>>>>>> [2] http://metrics.beam.apache.org/
>>>>>>>
>>>>>>


Re: JIRA components

2020-09-18 Thread Udi Meiri
The component would be for Python type related code. Mostly the stuff under
sdks/python/apache_beam/typehints.

Thanks, I'll use labels.

On Fri, Sep 18, 2020 at 10:00 AM Kyle Weaver  wrote:

> Why not use the existing sdk-py-core component and make Python types a
> label?
>
> On Thu, Sep 17, 2020 at 7:17 PM Kenneth Knowles  wrote:
>
>> What does it mean?
>>
>> Kenn
>>
>> On Thu, Sep 17, 2020 at 6:18 PM Udi Meiri  wrote:
>>
>>> Hi, I was going to create a Python types component in our JIRA.
>>> (sdk-py-types)
>>> Any objections?
>>>
>>


JIRA components

2020-09-17 Thread Udi Meiri
Hi, I was going to create a Python types component in our JIRA.
(sdk-py-types)
Any objections?


Re: JIRA - can't set resolution?

2020-08-26 Thread Udi Meiri
Thanks, I wonder if there's a workaround in the meantime to manually set
resolution.

On Wed, Aug 26, 2020 at 10:09 AM Kenneth Knowles  wrote:

> I reviewed the guidance and raised the priority. I will follow up more at
> least to get an acknowledgment.
>
> Kenn
>
> On Tue, Aug 25, 2020 at 2:37 PM Brian Hulette  wrote:
>
>> Yeah this is still broken as described in
>> https://lists.apache.org/thread.html/r68924a0317a75d7858914b914e1d95fe36e0a9bf1794ef6861df7118%40%3Cdev.beam.apache.org%3E
>>
>> Currently blocked on https://issues.apache.org/jira/browse/INFRA-20563
>>
>> On Tue, Aug 25, 2020 at 10:34 AM Udi Meiri  wrote:
>>
>>> Example: https://issues.apache.org/jira/browse/BEAM-10751
>>> When I click "resolve issue" the status changes to "resolved" but
>>> resolution is still "unresolved" and I can't change it.
>>>
>>> I believe there was a change a while back to JIRA?
>>>
>>


JIRA - can't set resolution?

2020-08-25 Thread Udi Meiri
Example: https://issues.apache.org/jira/browse/BEAM-10751
When I click "resolve issue" the status changes to "resolved" but
resolution is still "unresolved" and I can't change it.

I believe there was a change a while back to JIRA?


Re: Memory Issue When Running Beam On Flink

2020-08-10 Thread Udi Meiri
Hi David,
I'm not familiar with Flink, but assuming there aren't any memory
management issues in the runner or SDK, try reducing
processing_time_duration (currently 30 minutes) to 60 seconds and see how
long it takes for memory usage to reach the limit. Could you also say how
long it currently takes for memory to reach the limit?

On Thu, Aug 6, 2020 at 4:23 PM David Gogokhiya  wrote:

> Hi,
>
> We recently started using Apache Beam version 2.20.0 running on Flink
> version 1.9 deployed on kubernetes to process unbounded streams of data.
> However, we noticed that the memory consumed by stateful Beam is steadily
> increasing over time with no drops no matter what the current bandwidth is.
> We were wondering if this is expected and if not what would be the best way
> to resolve it.
> More Context
>
> We have the following pipeline that consumes messages from the unbounded
> stream of data. Later we deduplicate the messages based on unique message
> id using the deduplicate function.
> Since we are using Beam version 2.20.0, we copied the source code of the
> deduplicate function from version 2.22.0. After that we unmap the tuple,
> retrieve the necessary data from message payload and dump the
> corresponding data into the log.
>
> Pipeline:
>
>
> Flink configuration:
>
>
> As we mentioned before, we noticed that the memory usage of the jobmanager
> and taskmanager pod are steadily increasing with no drops no matter what
> the current bandwidth is. We tried allocating more memory but it seems like
> no matter how much memory we allocate it eventually reaches its limit and
> then it tries to restart itself.
>
>
> Sincerely, David
>
>


Re: Broken links in code velocity dashboard

2020-08-10 Thread Udi Meiri
There's this old bug about it:
https://issues.apache.org/jira/browse/BEAM-8389

On Fri, Aug 7, 2020 at 4:45 AM Damian Gadomski 
wrote:

> Unfortunately, I'm not aware of any recent changes.
>
> On Thu, Aug 6, 2020 at 10:00 PM Ahmet Altay  wrote:
>
>> Damian, or anyone else, do you know if there were other changes to the
>> dashboard?
>>
>> I started to see closed PRs in the currently open PRs list (e.g.
>> https://github.com/apache/beam/pull/12349,
>> https://github.com/apache/beam/pull/12374). Not sure what is causing it,
>> but it seems like a new issue.
>>
>> On Fri, Jul 31, 2020 at 10:13 AM Ahmet Altay  wrote:
>>
>>> Looks fixed. Thank you for the quick response!
>>>
>>> On Fri, Jul 31, 2020 at 7:05 AM Damian Gadomski <
>>> damian.gadom...@polidea.com> wrote:
>>>
>>>> Oops. Sorry about that. Everything should be fixed now. URLs are
>>>> correct and I've also removed wrong entries from the DB.
>>>>
>>>> If you're curious, you were right, it was mistakenly deployed from my
>>>> fork. Actually, my private Jenkins instance did it. Will be more cautious
>>>> with the jobs.
>>>>
>>>> Regards,
>>>> Damian
>>>>
>>>> On Fri, Jul 31, 2020 at 1:32 AM Ahmet Altay  wrote:
>>>>
>>>>> Currently the open PRs section of the dashboard [1] seems to be
>>>>> broken. PRs are linking to
>>>>> https://github.com/damgadbot/beam/issues/<NUMBER> instead of
>>>>> https://github.com/apache/beam/pull/<NUMBER>.
>>>>> And the PR list is showing PRs from the
>>>>> (https://github.com/damgadbot/beam) fork in addition to the main repo.
>>>>>
>>>>> Dashboard was working normally last week, so probably something
>>>>> changed recently. I do not see a code change in the dashboard [2]. I am
>>>>> not sure what happened, maybe the dashboard was deployed from a fork?
>>>>>
>>>>> +Damian Gadomski  - based on the url
>>>>> changes :)
>>>>>
>>>>> Thank you,
>>>>> Ahmet
>>>>>
>>>>> [1]
>>>>> http://metrics.beam.apache.org/d/code_velocity/code-velocity?orgId=1
>>>>> [2]
>>>>> https://github.com/apache/beam/blob/master/.test-infra/metrics/grafana/dashboards/code_velocity.json
>>>>>



Re: Git commit history: "fixup" commits

2020-08-04 Thread Udi Meiri
https://github.com/marketplace/actions/gs-commit-message-checker
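
As a rough idea of what such a check could look like in Python, here is a sketch of the kind of commit-subject check discussed in this thread. It is not the linked action and not an existing Beam script: the function name and the exact patterns are assumptions. "fix" and "address comments" are the examples named in the thread; the other fixup subjects are guesses.

```python
import re

# Default subject GitHub generates for merge commits, per the contributor
# guide reminder quoted in this thread.
DEFAULT_MERGE_SUBJECT = re.compile(r"^Merge pull request #\d+ from \S+$")

# Subjects that usually indicate an unsquashed fixup commit. "fix" and
# "address comments" come from the thread; the others are assumptions.
FIXUP_SUBJECTS = {"fix", "fixup", "address comments", "review comments"}


def check_commit_subject(subject):
    """Return a list of problems found in a commit subject line."""
    problems = []
    stripped = subject.strip()
    normalized = stripped.lower().rstrip(".")
    if DEFAULT_MERGE_SUBJECT.match(stripped):
        problems.append("default GitHub merge message")
    if normalized in FIXUP_SUBJECTS or normalized.startswith("fixup!"):
        problems.append("looks like a fixup commit")
    return problems


for subject in [
    "Merge pull request #1234 from some_user/transient_branch_name",
    "Merge pull request #1234: [BEAM-7873] Fix the foo bizzle bazzle",
    "address comments",
]:
    print(subject, "->", check_commit_subject(subject))
```

A precommit job could run a function like this over the subjects of the commits in a PR and emit a warning instead of failing, which matches the "give a warning" idea in the quoted message.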

On Tue, Aug 4, 2020 at 10:25 AM Robert Bradshaw  wrote:

> +1, thanks for the reminder.
>
> This should be really easy to automate, using
> https://developer.github.com/webhooks/event-payloads/#pull_request to
> give a warning when the change history is not sufficiently "clean."
> I'm not sure where to host this though (or if it could be integrated
> into jenkins--basically I'd just want to run a Python script with the
> PR number (or better, just point to the local git repo and have the
> master's commit handy) as another precommit).
>
> On Tue, Aug 4, 2020 at 10:10 AM Rui Wang  wrote:
> >
> > +1 thanks Alexey.
> >
> > My apologies that I merged such a case recently (but not intentionally).
> I tried to use the "squash and merge" button with a consolidated commit
> message. After clicking the button, github showed "failed to merge" and
> gave a retry button, and after clicking that retry button, github magically
> switched to "create merge commit" approach thus merged some fixup commits
> to the main branch.
> >
> > This is a rare case (I have only encountered it once). But I will pay more
> attention next time. I could ask PR authors to squash their commits before
> merging when it is possible.
> >
> >
> > -Rui
> >
> > On Tue, Aug 4, 2020 at 9:40 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
> >>
> >> Yes, good point, thanks Valentyn.
> >>
> >> On 4 Aug 2020, at 18:29, Valentyn Tymofieiev 
> wrote:
> >>
> >> +1, thanks, Alexey.
> >>
> >> Also a reminder from the contributor guide: do not use the default
> GitHub commit message for merge commits, which looks like:
> >>
> >> Merge pull request #1234 from some_user/transient_branch_name
> >>
> >> Instead, add the commit message into the subject line, for example:
> "Merge pull request #1234: [BEAM-7873] Fix the foo bizzle bazzle".
> >>
> >> On Tue, Aug 4, 2020 at 7:13 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> I’d like to draw your attention to our Git commit history
> and a related issue. A while ago I noticed that it had become less
> clear and quite verbose compared to how it was before. We have a quite
> significant number of recent commits like “fix”, “address comments”,
> “typo”, “spotless”, etc. Most of them also don’t contain a Jira tag as a
> prefix and are actually just supplementary commits to the “main”, initial
> commit of a PR, added after several rounds of PR review.
> >>>
> >>> AFAIR, we have already had several discussions about this topic in the past
> and we agreed that we should avoid such commits in a final merge and have
> only one (in most cases) or several (if necessary) logical commits that
> should be atomic and properly explain what they do.
> >>>
> >>> Why are these “tiny” commits bad practice? A few main reasons:
> >>> - They pollute our git repository history and don’t add any
> useful information;
> >>> - They are not atomic, and we can’t easily revert (roll back) such a
> supplementary commit since the state of the build before it was likely broken
> or had incorrect behaviour. So, in this case, the whole set of the PR's commits
> has to be reverted, which is inconvenient and error-prone. It’s also
> expected that all checks were green before merging a PR (apart from flaky
> tests).
> >>> - Their commit messages are not informative, so they make it
> harder to understand git-annotated code and how the lines of code are
> related to each other.
> >>>
> >>> Following this, I just want to briefly remind everyone of our committer rules
> regarding PR merging [1].
> >>> Every commit:
> >>> - should do one thing and reflect it in the commit message;
> >>> - should contain a Jira tag;
> >>> - all “fixup” and “address comments” types of commits should be
> squashed by the author or committer before merging.
> >>>
> >>> Please pay attention to what is finally committed and merged into our
> repository. It should help to keep our commit history clear, which will
> save other developers time in the end.
> >>>
> >>> [1]
> https://beam.apache.org/contribute/committer-guide/#finishing-touches
> >>>
> >>> Regards,
> >>> Alexey
> >>
> >>
>
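
A rough sketch of the kind of Python check suggested in the thread above,
assuming commit subjects are collected separately (for example from
`git log --format=%s master..HEAD`). The pattern lists and the function
names check_subject/check_history are illustrative assumptions, not an
existing Beam tool:

```python
import re

# Hypothetical patterns for low-value subjects; adjust to the project's taste.
FIXUP_PATTERNS = [
    re.compile(r"^(fixup|squash)!", re.IGNORECASE),
    re.compile(r"^(fix|fixes|typo|spotless|lint|wip)\W*$", re.IGNORECASE),
    re.compile(r"^address(ed)?\s+(review\s+)?comments?\W*$", re.IGNORECASE),
]
# Default GitHub merge subject: "Merge pull request #1234 from user/branch"
DEFAULT_MERGE = re.compile(r"^Merge pull request #\d+ from \S+$")
JIRA_TAG = re.compile(r"\[BEAM-\d+\]")


def check_subject(subject):
    """Return a list of problems found in one commit subject line."""
    subject = subject.strip()
    problems = []
    if any(p.search(subject) for p in FIXUP_PATTERNS):
        problems.append("looks like a fixup commit; squash before merging")
    if DEFAULT_MERGE.match(subject):
        problems.append("default GitHub merge message; describe the change")
    elif not JIRA_TAG.search(subject):
        problems.append("missing Jira tag such as [BEAM-1234]")
    return problems


def check_history(subjects):
    """Map each offending subject to its problems; clean subjects are omitted."""
    report = {}
    for subject in subjects:
        problems = check_subject(subject)
        if problems:
            report[subject] = problems
    return report
```

Run as a precommit, a non-empty report from check_history could be turned
into a warning rather than a hard failure, since some PRs legitimately
carry several logical commits.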




Re: No space left on device - beam-jenkins 1 and 7

2020-07-27 Thread Udi Meiri
What about the workspaces, which can take up 175GB in some cases (see
above)?
I'm working on getting them cleaned up automatically:
https://github.com/apache/beam/pull/12326

My opinion is that we would get more mileage out of fixing the jobs that
leave behind files in /tmp and images/containers in Docker.
This would also help keep development machines clean.


On Mon, Jul 27, 2020 at 5:31 PM Tyson Hamilton  wrote:

> Here is a summary of how I understand things:
>
>   - /tmp and /var/lib/docker are the culprits for filling up disks
>   - inventory Jenkins job runs every 12 hours and runs a docker prune to
> clean up images older than 24hr
>   - crontab on each machine cleans up /tmp files older than three days
> weekly
>
> This doesn't seem to be working since we're still running out of disk
> periodically and requiring manual intervention. Knobs and options we have
> available:
>
>   1. increase frequency of deleting files
>   2. decrease the number of days required to delete a file (e.g. older
> than 2 days)
>
> The execution methods we have available are:
>
>   A. cron
> - pro: runs even if a job gets stuck in Jenkins due to full disk
> - con: config baked into VM which is tough to update, not discoverable
> or documented well
>   B. inventory job
> - pro: easy to update, runs every 12h already
> - con: could get stuck if Jenkins agent runs out of disk or is
> otherwise stuck, tied to all other inventory job frequency
>   C. configure startup scripts for the VMs that set up the cron job
> anytime the VM is restarted
> - pro: similar to A. and easy to update
> - con: similar to A.
>
> Between the three I prefer B. because it is consistent with other
> inventory jobs. If it ends up that stuck jobs prohibit scheduling of the
> inventory job often we could further investigate C to avoid having to
> rebuild the VM images repeatedly.
>
> Any objections or comments? If not, we'll go forward with B. and reduce
> the date check from 3 days to 2 days.
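
For illustration only, the age check under discussion could look roughly
like this in Python. The function name find_stale_entries and its
parameters are assumptions made for this sketch; the actual cleanup is
done by the crontab entry and the inventory job, not by this code:

```python
import os
import time


def find_stale_entries(root, max_age_days, now=None):
    """Return top-level entries of `root` not modified for `max_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    stale = []
    with os.scandir(root) as entries:
        for entry in entries:
            # Do not follow symlinks, so only the entry itself is inspected.
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                stale.append(entry.path)
    return sorted(stale)
```

Listing the candidates separately from deleting them makes it easy to dry
run a change of the threshold (for example from 3 days to 2) before
anything is removed.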
>
>
> On 2020/07/24 20:13:29, Ahmet Altay  wrote:
> > Tests may not be doing docker cleanup. Inventory job runs a docker prune
> > every 12 hours for images older than 24 hrs [1]. Randomly looking at one of
> > the recent runs [2], it cleaned up a long list of containers consuming
> > 30+GB space. That should be just 12 hours worth of containers.
> >
> > [1]
> >
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_Inventory.groovy#L69
> > [2]
> >
> https://ci-beam.apache.org/job/beam_Inventory_apache-beam-jenkins-14/501/console
> >
> > On Fri, Jul 24, 2020 at 1:07 PM Tyson Hamilton 
> wrote:
> >
> > > Yes, these are on the same volume in the /var/lib/docker directory. I'm
> > > unsure if they clean up leftover images.
> > >
> > > On Fri, Jul 24, 2020 at 12:52 PM Udi Meiri  wrote:
> > >
> > >> I forgot Docker images:
> > >>
> > >> ehudm@apache-ci-beam-jenkins-3:~$ sudo docker system df
> > >> TYPE            TOTAL   ACTIVE  SIZE      RECLAIMABLE
> > >> Images          88      9       125.4GB   124.2GB (99%)
> > >> Containers      40      4       7.927GB   7.871GB (99%)
> > >> Local Volumes   47      0       3.165GB   3.165GB (100%)
> > >> Build Cache     0       0       0B        0B
> > >>
> > >> There are about 90 images on that machine, with all but 1 less than 48
> > >> hours old.
> > >> I think the docker test jobs need to try harder at cleaning up their
> > >> leftover images. (assuming they're already doing it?)
> > >>
> > >> On Fri, Jul 24, 2020 at 12:31 PM Udi Meiri  wrote:
> > >>
> > >>> The additional slots (@3 directories) take up even more space now than
> > >>> before.
> > >>>
> > >>> I'm testing out https://github.com/apache/beam/pull/12326 which could
> > >>> help by cleaning up workspaces after a run (just started a seed job).
> > >>>
> > >>> On Fri, Jul 24, 2020 at 12:13 PM Tyson Hamilton 
> > >>> wrote:
> > >>>
> > >>>> 664M    beam_PreCommit_JavaPortabilityApi_Commit
> > >>>> 656M    beam_PreCommit_JavaPortabilityApi_Commit@2
> > >>>> 611M    beam_PreCommit_JavaPortabilityApi_Cron
> > >>>> 616Mb

Re: [VOTE] Make Apache Beam 2.24.0 the final release supporting Python 2.

2020-07-24 Thread Udi Meiri
+1

On Fri, Jul 24, 2020 at 10:47 AM Maximilian Michels  wrote:

> +1
>
> On 24.07.20 18:54, Pablo Estrada wrote:
> > +1 - thank you Valentyn!
> > -P.
> >
> > On Thu, Jul 23, 2020 at 1:29 PM Chamikara Jayalath wrote:
> >
> > +1
> >
> > On Thu, Jul 23, 2020 at 1:15 PM Brian Hulette wrote:
> >
> > +1
> >
> > On Thu, Jul 23, 2020 at 1:05 PM Robert Bradshaw
> > <rober...@google.com> wrote:
> >
> > [X] +1: Remove Python 2 support in Apache Beam 2.25.0.
> >
> > According to our six-week release cadence, 2.24.0 (the last
> > release to support Python 2) will be cut mid-August, and the
> > first release not supporting Python 2 would be expected to
> > land sometime in October. This seems a reasonable timeline
> > to me.
> >
> >
> > On Thu, Jul 23, 2020 at 12:53 PM Valentyn Tymofieiev
> > <valen...@google.com> wrote:
> >
> > Hi everyone,
> >
> > Please vote whether to make Apache Beam 2.24.0 the final
> > release supporting Python 2 as follows.
> >
> > [ ] +1: Remove Python 2 support in Apache Beam 2.25.0.
> > [ ] -1: Continue to support Python 2 in Apache Beam, and
> > reconsider at a later date.
> >
> > The Beam community has pledged to sunset Python 2
> > support at some point in 2020[1,2]. A recent
> > discussion[3] on dev@  proposes to outline a specific
> > version after which Beam developers no longer have to
> > maintain Py2 support, which is a motivation for this vote.
> >
> > If this vote is approved we will announce Apache Beam
> > 2.24.0 as our final release to support Python 2 and
> > discontinue Python 2 support starting from 2.25.0
> > (inclusive).
> >
> > This is a procedural vote [4] that will follow the
> > majority approval rules and will be open for at least 72
> > hours.
> >
> > Thanks,
> > Valentyn
> >
> > [1]
> >
> https://lists.apache.org/thread.html/634f7346b607e779622d0437ed0eca783f474dea8976adf41556845b%40%3Cdev.beam.apache.org%3E
> > [2] https://python3statement.org/
> > [3]
> >
> https://lists.apache.org/thread.html/r0d5c309a7e3107854f4892ccfeb1a17c0cec25dfce188678ab8df072%40%3Cdev.beam.apache.org%3E
> > [4] https://www.apache.org/foundation/voting.html
> > 
> >
>




Re: Jenkins trigger phrase "run seed job" not working?

2020-07-24 Thread Udi Meiri
I made my membership public and now the phrase works. (page:
https://github.com/orgs/apache/people?query=udim)

On Fri, Jul 24, 2020 at 1:04 PM Kenneth Knowles  wrote:

>
>
> On Thu, Jul 23, 2020 at 1:11 PM Damian Gadomski <
> damian.gadom...@polidea.com> wrote:
>
>> Yes, I thought that whitelisting the apache organization would do the trick,
>> but apparently it doesn't. Actually, it makes sense, as we want to allow
>> only beam committers and not all apache committers. I don't know the
>> implications of membership in the apache github organization, but you, for
>> instance, are not there :) Neither is Ahmet.
>>
>
> This may have to do with registering alternative email addresses and
> GitHub accounts via whimsy.apache.org. If you are able to commit, then
> you are set up via gitbox.apache.org.
>
> Kenn
>
>
>> Therefore there's nothing wrong with the Ghprb plugin; it correctly
>> forbade triggering. From my investigation, the "beam-committers" GitHub
>> team (which is under the apache org) is the list of people that should be
>> allowed. But firstly, you can't whitelist a team with Ghprb. There's a
>> ticket for that, open for 5 years
>> <https://github.com/jenkinsci/ghprb-plugin/issues/160>. I could
>> implement that but, secondly, the team is secret. I can't even see it. Even
>> asfbot doesn't have permission to see it.
>>
>> You may ask how it worked before, because on builds.apache.org
>> somehow only committers were allowed to trigger PR builds. It turns out that
>> Infra created a webhook relay. It's configured here
>> <https://github.com/apache/infrastructure-puppet/blob/deployment/modules/gitbox/files/conf/relay.yaml>
>> and it filters out all the non-committer events. I wish I had known that
>> before, as it was also the reason for various issues during the migration.
>> Anyway, it would be hard to use that mechanism in our case, as we want to
>> configure it depending on the job.
>>
>>
>> There's a publicly available source of committers list - it's LDAP. I've
>> tested it and it allows anonymous connection and provides the list of the
>> committers as well as the github usernames. My current idea is to read this
>> from LDAP as a part of the seed job and configure the jobs with the apache
>> committers present on the ghprb whitelist.
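
As a sketch of the filtering step only (the LDAP query itself is left
out), turning committer records into a whitelist might look like the
following. The record layout, with 'groups' and 'github' keys, and the
name build_whitelist are hypothetical; the real LDAP attribute names are
not given in this thread:

```python
def build_whitelist(entries, group="beam"):
    """Collect GitHub usernames of committers belonging to `group`.

    `entries` is assumed to be an iterable of dicts with the hypothetical
    keys 'groups' (project names) and 'github' (a list of usernames).
    """
    users = set()
    for entry in entries:
        if group in entry.get("groups", ()):
            # Skip blank usernames and drop duplicates across records.
            users.update(u.strip() for u in entry.get("github", ()) if u.strip())
    return sorted(users, key=str.lower)
```

The sorted, de-duplicated result is what a seed job could then pass to
the per-job whitelist setting.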
>>
>>
>> Hope that I didn't miss anything ;) It isn't that easy to investigate
>> this kind of issue with my poor privileges ;)
>>
>>
>> Regards,
>>
>> Damian
>>
>>
>> On Thu, Jul 23, 2020 at 6:52 PM Udi Meiri  wrote:
>>
>>> Thanks Damian! I saw that the config also has this:
>>>   orgWhitelist(['apache'])
>>> Shouldn't that be enough to allow all Apache committers?
>>>
>>> I traced the code for the membership check here:
>>>
>>> https://github.com/jenkinsci/ghprb-plugin/blob/4e86ed47a96a01eeaa51a479ff604252109635f6/src/main/java/org/jenkinsci/plugins/ghprb/GhprbGitHub.java#L27
>>> Is there a way to see these logs?
>>>
>>>
>>> On Thu, Jul 23, 2020 at 7:08 AM Damian Gadomski <
>>> damian.gadom...@polidea.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> You are right, the current behavior is wrong, I'm currently working to
>>>> fix it asap. Our intention was to disable that only for non-committers.
>>>>
>>>> As a workaround, as a committer, you could manually add yourself (your
>>>> GitHub username) to the whitelist of the SeedJob configuration:
>>>> https://ci-beam.apache.org/job/beam_SeedJob/configure
>>>> Then, your comment "Run Seed Job" will trigger the build. I've already
>>>> manually triggered it for you that way.
>>>>
>>>> Of course, it will only work until the seed job gets executed - it will
>>>> then override the whitelist with an empty one.
>>>>
>>>> [image: Selection_408.png]
>>>>
>>>> As a target solution, I'm planning to fetch the list of beam committers
>>>> from LDAP and automatically add them to the whitelist above as a part of
>>>> the seed job. I'll keep you updated about the progress.
>>>>
>>>> Regards,
>>>> Damian
>>>>
>>>>
>>>> On Wed, Jul 22, 2020 at 11:03 PM Ahmet Altay  wrote:
>>>>
>>>>> +Damian Gadomski , it might be related
>>>>> to this change: https://github.com/apache/beam/pull/12319.
>>>>>
>>>>> /cc +Tyson Hamilton 
>>>>>
>>>>> On Wed, Jul 22, 2020 at 1:17 PM Udi Meiri  wrote:
>>>>>
>>>>>> Hi,
>>>>>> I'm trying to test a groovy change but I can't seem to trigger the
>>>>>> seed job. It worked yesterday so I'm not sure what changed.
>>>>>>
>>>>>> https://github.com/apache/beam/pull/12326
>>>>>>
>>>>>>




Re: No space left on device - beam-jenkins 1 and 7

2020-07-24 Thread Udi Meiri
I forgot Docker images:

ehudm@apache-ci-beam-jenkins-3:~$ sudo docker system df
TYPE            TOTAL   ACTIVE  SIZE      RECLAIMABLE
Images          88      9       125.4GB   124.2GB (99%)
Containers      40      4       7.927GB   7.871GB (99%)
Local Volumes   47      0       3.165GB   3.165GB (100%)
Build Cache     0       0       0B        0B

There are about 90 images on that machine, with all but 1 less than 48
hours old.
I think the docker test jobs need to try harder at cleaning up their
leftover images. (assuming they're already doing it?)

On Fri, Jul 24, 2020 at 12:31 PM Udi Meiri  wrote:

> The additional slots (@3 directories) take up even more space now than
> before.
>
> I'm testing out https://github.com/apache/beam/pull/12326 which could
> help by cleaning up workspaces after a run (just started a seed job).
>
> On Fri, Jul 24, 2020 at 12:13 PM Tyson Hamilton 
> wrote:
>
>> 664M    beam_PreCommit_JavaPortabilityApi_Commit
>> 656M    beam_PreCommit_JavaPortabilityApi_Commit@2
>> 611M    beam_PreCommit_JavaPortabilityApi_Cron
>> 616M    beam_PreCommit_JavaPortabilityApiJava11_Commit
>> 598M    beam_PreCommit_JavaPortabilityApiJava11_Commit@2
>> 662M    beam_PreCommit_JavaPortabilityApiJava11_Cron
>> 2.9G    beam_PreCommit_Portable_Python_Commit
>> 2.9G    beam_PreCommit_Portable_Python_Commit@2
>> 1.7G    beam_PreCommit_Portable_Python_Commit@3
>> 3.4G    beam_PreCommit_Portable_Python_Cron
>> 1.9G    beam_PreCommit_Python2_PVR_Flink_Commit
>> 1.4G    beam_PreCommit_Python2_PVR_Flink_Cron
>> 1.3G    beam_PreCommit_Python2_PVR_Flink_Phrase
>> 6.2G    beam_PreCommit_Python_Commit
>> 7.5G    beam_PreCommit_Python_Commit@2
>> 7.5G    beam_PreCommit_Python_Cron
>> 1012M   beam_PreCommit_PythonDocker_Commit
>> 1011M   beam_PreCommit_PythonDocker_Commit@2
>> 1011M   beam_PreCommit_PythonDocker_Commit@3
>> 1002M   beam_PreCommit_PythonDocker_Cron
>> 877M    beam_PreCommit_PythonFormatter_Commit
>> 988M    beam_PreCommit_PythonFormatter_Cron
>> 986M    beam_PreCommit_PythonFormatter_Phrase
>> 1.7G    beam_PreCommit_PythonLint_Commit
>> 2.1G    beam_PreCommit_PythonLint_Cron
>> 7.5G    beam_PreCommit_Python_Phrase
>> 346M    beam_PreCommit_RAT_Commit
>> 341M    beam_PreCommit_RAT_Cron
>> 338M    beam_PreCommit_Spotless_Commit
>> 339M    beam_PreCommit_Spotless_Cron
>> 5.5G    beam_PreCommit_SQL_Commit
>> 5.5G    beam_PreCommit_SQL_Cron
>> 5.5G    beam_PreCommit_SQL_Java11_Commit
>> 750M    beam_PreCommit_Website_Commit
>> 750M    beam_PreCommit_Website_Commit@2
>> 750M    beam_PreCommit_Website_Cron
>> 764M    beam_PreCommit_Website_Stage_GCS_Commit
>> 771M    beam_PreCommit_Website_Stage_GCS_Cron
>> 336M    beam_Prober_CommunityMetrics
>> 693M    beam_python_mongoio_load_test
>> 339M    beam_SeedJob
>> 333M    beam_SeedJob_Standalone
>> 334M    beam_sonarqube_report
>> 556M    beam_SQLBigQueryIO_Batch_Performance_Test_Java
>> 175G    total
>>
>> On Fri, Jul 24, 2020 at 12:04 PM Tyson Hamilton 
>> wrote:
>>
>>> Ya looks like something in the workspaces is taking up room:
>>>
>>> @apache-ci-beam-jenkins-8:/home/jenkins$ sudo du -shc .
>>> 191G    .
>>> 191G    total
>>>
>>>
>>> On Fri, Jul 24, 2020 at 11:44 AM Tyson Hamilton 
>>> wrote:
>>>
>>>> Node 8 is also full. The partition that /tmp is on is here:
>>>>
>>>> Filesystem  Size  Used Avail Use% Mounted on
>>>> /dev/sda1   485G  482G  2.9G 100% /
>>>>
>>>> however after cleaning up tmp with the crontab command, there is only
>>>> 8G usage yet it still remains 100% full:
>>>>
>>>> @apache-ci-beam-jenkins-8:/tmp$ sudo du -shc /tmp
>>>> 8.0G    /tmp
>>>> 8.0G    total
>>>>
>>>> The workspaces are in the /home/jenkins/jenkins-slave/workspace
>>>> directory. When I run a du on that, it takes really long. I'll let it keep
>>>> running for a while to see if it ever returns a result but so far this
>>>> seems suspect.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 24, 2020 at 11:19 AM Tyson Hamilton 
>>>> wrote:
>>>>
>>>>> Everything I've been looking at is in the /tmp dir. Where are the
>>>>> workspaces, or what are they named?
>>>>>
>>>>>
>>>

Re: No space left on device - beam-jenkins 1 and 7

2020-07-24 Thread Udi Meiri
The additional slots (@3 directories) take up even more space now than
before.

I'm testing out https://github.com/apache/beam/pull/12326 which could help
by cleaning up workspaces after a run (just started a seed job).
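
To put a number on how much the extra slots add, a `du -sh` style
listing like the one quoted below can be grouped by job name. This is a
minimal sketch that assumes "SIZE<whitespace>NAME" lines; parse_size and
usage_by_job are names made up for this example:

```python
from collections import defaultdict

# Binary multipliers for the suffixes printed by `du -h`.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a `du -h` size such as '2.9G' or '1012M' to bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def usage_by_job(du_lines):
    """Sum sizes per job, folding 'job@2', 'job@3' slots into 'job'."""
    totals = defaultdict(float)
    for line in du_lines:
        size, name = line.split(None, 1)
        job = name.strip().split("@", 1)[0]
        totals[job] += parse_size(size)
    return dict(totals)
```

With the three beam_PreCommit_Portable_Python_Commit entries from the
listing (2.9G, 2.9G and 1.7G), the job adds up to about 7.5G across its
slots.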

On Fri, Jul 24, 2020 at 12:13 PM Tyson Hamilton  wrote:

> 664M    beam_PreCommit_JavaPortabilityApi_Commit
> 656M    beam_PreCommit_JavaPortabilityApi_Commit@2
> 611M    beam_PreCommit_JavaPortabilityApi_Cron
> 616M    beam_PreCommit_JavaPortabilityApiJava11_Commit
> 598M    beam_PreCommit_JavaPortabilityApiJava11_Commit@2
> 662M    beam_PreCommit_JavaPortabilityApiJava11_Cron
> 2.9G    beam_PreCommit_Portable_Python_Commit
> 2.9G    beam_PreCommit_Portable_Python_Commit@2
> 1.7G    beam_PreCommit_Portable_Python_Commit@3
> 3.4G    beam_PreCommit_Portable_Python_Cron
> 1.9G    beam_PreCommit_Python2_PVR_Flink_Commit
> 1.4G    beam_PreCommit_Python2_PVR_Flink_Cron
> 1.3G    beam_PreCommit_Python2_PVR_Flink_Phrase
> 6.2G    beam_PreCommit_Python_Commit
> 7.5G    beam_PreCommit_Python_Commit@2
> 7.5G    beam_PreCommit_Python_Cron
> 1012M   beam_PreCommit_PythonDocker_Commit
> 1011M   beam_PreCommit_PythonDocker_Commit@2
> 1011M   beam_PreCommit_PythonDocker_Commit@3
> 1002M   beam_PreCommit_PythonDocker_Cron
> 877M    beam_PreCommit_PythonFormatter_Commit
> 988M    beam_PreCommit_PythonFormatter_Cron
> 986M    beam_PreCommit_PythonFormatter_Phrase
> 1.7G    beam_PreCommit_PythonLint_Commit
> 2.1G    beam_PreCommit_PythonLint_Cron
> 7.5G    beam_PreCommit_Python_Phrase
> 346M    beam_PreCommit_RAT_Commit
> 341M    beam_PreCommit_RAT_Cron
> 338M    beam_PreCommit_Spotless_Commit
> 339M    beam_PreCommit_Spotless_Cron
> 5.5G    beam_PreCommit_SQL_Commit
> 5.5G    beam_PreCommit_SQL_Cron
> 5.5G    beam_PreCommit_SQL_Java11_Commit
> 750M    beam_PreCommit_Website_Commit
> 750M    beam_PreCommit_Website_Commit@2
> 750M    beam_PreCommit_Website_Cron
> 764M    beam_PreCommit_Website_Stage_GCS_Commit
> 771M    beam_PreCommit_Website_Stage_GCS_Cron
> 336M    beam_Prober_CommunityMetrics
> 693M    beam_python_mongoio_load_test
> 339M    beam_SeedJob
> 333M    beam_SeedJob_Standalone
> 334M    beam_sonarqube_report
> 556M    beam_SQLBigQueryIO_Batch_Performance_Test_Java
> 175G    total
>
> On Fri, Jul 24, 2020 at 12:04 PM Tyson Hamilton 
> wrote:
>
>> Ya looks like something in the workspaces is taking up room:
>>
>> @apache-ci-beam-jenkins-8:/home/jenkins$ sudo du -shc .
>> 191G    .
>> 191G    total
>>
>>
>> On Fri, Jul 24, 2020 at 11:44 AM Tyson Hamilton 
>> wrote:
>>
>>> Node 8 is also full. The partition that /tmp is on is here:
>>>
>>> Filesystem  Size  Used Avail Use% Mounted on
>>> /dev/sda1   485G  482G  2.9G 100% /
>>>
>>> however after cleaning up tmp with the crontab command, there is only 8G
>>> usage yet it still remains 100% full:
>>>
>>> @apache-ci-beam-jenkins-8:/tmp$ sudo du -shc /tmp
>>> 8.0G    /tmp
>>> 8.0G    total
>>>
>>> The workspaces are in the /home/jenkins/jenkins-slave/workspace
>>> directory. When I run a du on that, it takes really long. I'll let it keep
>>> running for a while to see if it ever returns a result but so far this
>>> seems suspect.
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 24, 2020 at 11:19 AM Tyson Hamilton 
>>> wrote:
>>>
>>>> Everything I've been looking at is in the /tmp dir. Where are the
>>>> workspaces, or what are they named?
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 24, 2020 at 11:03 AM Udi Meiri  wrote:
>>>>
>>>>> I'm curious what you find. Was it /tmp or the workspaces using up
>>>>> the space?
>>>>>
>>>>> On Fri, Jul 24, 2020 at 10:57 AM Tyson Hamilton 
>>>>> wrote:
>>>>>
>>>>>> Bleck. I just realized that it is 'offline' so that won't work. I'll
>>>>>> clean up manually on the machine using the cron command.
>>>>>>
>>>>>> On Fri, Jul 24, 2020 at 10:56 AM Tyson Hamilton 
>>>>>> wrote:
>>>>>>
>>>>>>> Something isn't working with the current set up because node 15
>>>>>>> appears to be out of space and is currently 'offline' according to
>>>>>>> Jenkins. Can someone run the cleanup job? The machine is full,
>>>>>>>
>>>>>>> @apache-ci-beam-jenkins-15:/tmp$ df -h
>>>>&

Re: No space left on device - beam-jenkins 1 and 7

2020-07-24 Thread Udi Meiri
In the "jenkins" user home directory.

On Fri, Jul 24, 2020, 11:19 Tyson Hamilton  wrote:

> Everything I've been looking at is in the /tmp dir. Where are the
> workspaces, or what are they named?
>
>
>
>
> On Fri, Jul 24, 2020 at 11:03 AM Udi Meiri  wrote:
>
>> I'm curious what you find. Was it /tmp or the workspaces using up the
>> space?
>>
>> On Fri, Jul 24, 2020 at 10:57 AM Tyson Hamilton 
>> wrote:
>>
>>> Bleck. I just realized that it is 'offline' so that won't work. I'll
>>> clean up manually on the machine using the cron command.
>>>
>>> On Fri, Jul 24, 2020 at 10:56 AM Tyson Hamilton 
>>> wrote:
>>>
>>>> Something isn't working with the current set up because node 15 appears
>>>> to be out of space and is currently 'offline' according to Jenkins. Can
>>>> someone run the cleanup job? The machine is full,
>>>>
>>>> @apache-ci-beam-jenkins-15:/tmp$ df -h
>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>> udev             52G     0   52G   0% /dev
>>>> tmpfs            11G  265M   10G   3% /run
>>>> */dev/sda1      485G  484G  880M 100% /*
>>>> tmpfs            52G     0   52G   0% /dev/shm
>>>> tmpfs           5.0M     0  5.0M   0% /run/lock
>>>> tmpfs            52G     0   52G   0% /sys/fs/cgroup
>>>> tmpfs            11G     0   11G   0% /run/user/1017
>>>> tmpfs            11G     0   11G   0% /run/user/1037
>>>>
>>>> apache-ci-beam-jenkins-15:/tmp$ sudo du -ah --time . | sort -rhk 1,1 |
>>>> head -n 20
>>>> 20G     2020-07-24 17:52  .
>>>> 580M    2020-07-22 17:31  ./junit1031982597110125586
>>>> 517M    2020-07-22 17:31  ./junit1031982597110125586/junit8739924829337821410/heap_dump.hprof
>>>> 517M    2020-07-22 17:31  ./junit1031982597110125586/junit8739924829337821410
>>>> 263M    2020-07-22 12:23  ./pip-install-2GUhO_
>>>> 263M    2020-07-20 09:30  ./pip-install-sxgwqr
>>>> 263M    2020-07-17 13:56  ./pip-install-bWSKIV
>>>> 242M    2020-07-21 20:25  ./beam-pipeline-tempmByU6T
>>>> 242M    2020-07-21 20:21  ./beam-pipeline-tempV85xeK
>>>> 242M    2020-07-21 20:15  ./beam-pipeline-temp7dJROJ
>>>> 236M    2020-07-21 20:25  ./beam-pipeline-tempmByU6T/tmpOWj3Yr
>>>> 236M    2020-07-21 20:21  ./beam-pipeline-tempV85xeK/tmppbQHB3
>>>> 236M    2020-07-21 20:15  ./beam-pipeline-temp7dJROJ/tmpgOXPKW
>>>> 111M    2020-07-23 00:57  ./pip-install-1JnyNE
>>>> 105M    2020-07-23 00:17  ./beam-artifact1374651823280819755
>>>> 105M    2020-07-23 00:16  ./beam-artifact5050755582921936972
>>>> 105M    2020-07-23 00:16  ./beam-artifact1834064452502646289
>>>> 105M    2020-07-23 00:15  ./beam-artifact682561790267074916
>>>> 105M    2020-07-23 00:15  ./beam-artifact4691304965824489394
>>>> 105M    2020-07-23 00:14  ./beam-artifact4050383819822604421
>>>>
>>>> On Wed, Jul 22, 2020 at 12:03 PM Robert Bradshaw 
>>>> wrote:
>>>>
>>>>> On Wed, Jul 22, 2020 at 11:57 AM Tyson Hamilton 
>>>>> wrote:
>>>>>
>>>>>> Ah I see, thanks Kenn. I found some advice from the Apache infra wiki
>>>>>> that also suggests using a tmpdir inside the workspace [1]:
>>>>>>
>>>>>> Procedures Projects can take to clean up disk space
>>>>>>
>>>>>> Projects can help themselves and Infra by taking some basic steps to
>>>>>> help clean up their jobs after themselves on the build nodes.
>>>>>>
>>>>>>
>>>>>>
>>>>>>1. Use a ./tmp dir in your jobs workspace. That way it gets
>>>>>>cleaned up when job workspaces expire.
>>>>>>
>>>>>>
>>>>> Tests should be (able to be) written to use the standard temporary
>>>>> file mechanisms, and the environment set up on Jenkins such that that
>>>>> falls into the respective workspaces. Ideally this should be as simple as
>>>>> setting the TMPDIR (or similar) environment variable (and making sure it exists/is
>>>>> writable).
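
For Python tests, a minimal sketch of that idea: point the standard
tempfile module at a ./tmp directory inside the job workspace. The helper
name use_workspace_tmp and the "tmp" subdirectory name are assumptions
for this example; how Jenkins would export the variable is not shown:

```python
import os
import tempfile


def use_workspace_tmp(workspace):
    """Point the standard temp-file APIs at <workspace>/tmp and return it."""
    tmp_dir = os.path.join(os.path.abspath(workspace), "tmp")
    os.makedirs(tmp_dir, exist_ok=True)  # make sure it exists
    os.environ["TMPDIR"] = tmp_dir       # inherited by child processes too
    tempfile.tempdir = None              # drop the cached value so TMPDIR is re-read
    return tmp_dir
```

After this call, tempfile.mkstemp() and tempfile.mkdtemp() create their
files under the workspace, so they disappear when the workspace is wiped.
Code that hard-codes /tmp would still need to be fixed separately.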
>>>>>
>>>>>>
>>>>>>1. Configure your jobs to wipe workspaces 

Re: No space left on device - beam-jenkins 1 and 7

2020-07-24 Thread Udi Meiri
>>>>>> @apache-ci-beam-jenkins-4:/tmp$ sudo du -ah --time . | sort -rhk 1,1
>>>>>> | head -n 20
>>>>>> 1.6G    2020-07-21 02:25  .
>>>>>> 242M    2020-07-17 18:48  ./beam-pipeline-temp3ybuY4
>>>>>> 242M    2020-07-17 18:46  ./beam-pipeline-tempuxjiPT
>>>>>> 242M    2020-07-17 18:44  ./beam-pipeline-tempVpg1ME
>>>>>> 242M    2020-07-17 18:42  ./beam-pipeline-tempJ4EpyB
>>>>>> 242M    2020-07-17 18:39  ./beam-pipeline-tempepea7Q
>>>>>> 242M    2020-07-17 18:35  ./beam-pipeline-temp79qot2
>>>>>> 236M    2020-07-17 18:48  ./beam-pipeline-temp3ybuY4/tmpy_Ytzz
>>>>>> 236M    2020-07-17 18:46  ./beam-pipeline-tempuxjiPT/tmpN5_UfJ
>>>>>> 236M    2020-07-17 18:44  ./beam-pipeline-tempVpg1ME/tmpxSm8pX
>>>>>> 236M    2020-07-17 18:42  ./beam-pipeline-tempJ4EpyB/tmpMZJU76
>>>>>> 236M    2020-07-17 18:39  ./beam-pipeline-tempepea7Q/tmpWy1vWX
>>>>>> 236M    2020-07-17 18:35  ./beam-pipeline-temp79qot2/tmpvN7vWA
>>>>>> 3.7M    2020-07-17 18:48  ./beam-pipeline-temp3ybuY4/tmprlh_di
>>>>>> 3.7M    2020-07-17 18:46  ./beam-pipeline-tempuxjiPT/tmpLmVWfe
>>>>>> 3.7M    2020-07-17 18:44  ./beam-pipeline-tempVpg1ME/tmpvrxbY7
>>>>>> 3.7M    2020-07-17 18:42  ./beam-pipeline-tempJ4EpyB/tmpLTb6Mj
>>>>>> 3.7M    2020-07-17 18:39  ./beam-pipeline-tempepea7Q/tmptYF1v1
>>>>>> 3.7M    2020-07-17 18:35  ./beam-pipeline-temp79qot2/tmplfV0Rg
>>>>>> 2.7M    2020-07-17 20:10  ./pip-install-q9l227ef
>>>>>>
>>>>>>
>>>>>> @apache-ci-beam-jenkins-11:/tmp$ sudo du -ah --time . | sort -rhk 1,1
>>>>>> | head -n 20
>>>>>> 817M    2020-07-21 02:26  .
>>>>>> 242M    2020-07-19 12:14  ./beam-pipeline-tempUTXqlM
>>>>>> 242M    2020-07-19 12:11  ./beam-pipeline-tempx3Yno3
>>>>>> 242M    2020-07-19 12:05  ./beam-pipeline-tempyCrMYq
>>>>>> 236M    2020-07-19 12:14  ./beam-pipeline-tempUTXqlM/tmpstXoL0
>>>>>> 236M    2020-07-19 12:11  ./beam-pipeline-tempx3Yno3/tmpnnVn65
>>>>>> 236M    2020-07-19 12:05  ./beam-pipeline-tempyCrMYq/tmpRF0iNs
>>>>>> 3.7M    2020-07-19 12:14  ./beam-pipeline-tempUTXqlM/tmpbJjUAQ
>>>>>> 3.7M    2020-07-19 12:11  ./beam-pipeline-tempx3Yno3/tmpsmmzqe
>>>>>> 3.7M    2020-07-19 12:05  ./beam-pipeline-tempyCrMYq/tmp5b3ZvY
>>>>>> 2.0M    2020-07-19 12:14  ./beam-pipeline-tempUTXqlM/tmpoj3orz
>>>>>> 2.0M    2020-07-19 12:11  ./beam-pipeline-tempx3Yno3/tmptng9sZ
>>>>>> 2.0M    2020-07-19 12:05  ./beam-pipeline-tempyCrMYq/tmpWp6njc
>>>>>> 1.2M    2020-07-19 12:14  ./beam-pipeline-tempUTXqlM/tmphgdj35
>>>>>> 1.2M    2020-07-19 12:11  ./beam-pipeline-tempx3Yno3/tmp8ySXpm
>>>>>> 1.2M    2020-07-19 12:05  ./beam-pipeline-tempyCrMYq/tmpNVEJ4e
>>>>>> 992K    2020-07-12 12:00  ./junit642086915811430564
>>>>>> 988K    2020-07-12 12:00  ./junit642086915811430564/beam
>>>>>> 984K    2020-07-12 12:00  ./junit642086915811430564/beam/nodes
>>>>>> 980K    2020-07-12 12:00  ./junit642086915811430564/beam/nodes/0
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 20, 2020 at 6:46 PM Udi Meiri  wrote:
>>>>>>
>>>>>>> You're right, job workspaces should be hermetic.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 20, 2020 at 1:24 PM Kenneth Knowles 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm probably late to this discussion and missing something, but why
>>>>>>>> are we writing to /tmp at all? I would expect TMPDIR to point somewhere
>>>>>>>> inside the job directory that will be wiped by Jenkins, and I would
>>>>>>>> expect code to always create temp files via APIs that respect this. Is
>>>>>>>> Jenkins not cleaning up? Do we not have the ability to set this up? Do we have
>>>>>>>> bugs in
>>&

Re: Jenkins trigger phrase "run seed job" not working?

2020-07-23 Thread Udi Meiri
I have the same issue with Jenkins privileges. There's usually no insight
into the test triggering logic.
For instance I happen to know that tests won't be started right now because
Infra is restarting Jenkins to install a plugin, but that's only because I
opened the ticket.

I think fetching the list of allowed user IDs as part of the seed job is
okay. Even if this disables phrases we can always manually trigger the seed
job from the Jenkins UI.

On Thu, Jul 23, 2020 at 1:11 PM Damian Gadomski 
wrote:

> Yes, I thought that whitelisting the apache organization would do the trick,
> but apparently it doesn't. Actually, it makes sense, as we want to allow
> only beam committers and not all apache committers. I don't know the
> implications of membership in the apache github organization, but you, for
> instance, are not there :) Neither is Ahmet.
>
>
> Therefore there's nothing wrong with the Ghprb plugin; it correctly
> forbade triggering. From my investigation, the "beam-committers" GitHub
> team (which is under the apache org) is the list of people that should be
> allowed. But firstly, you can't whitelist a team with Ghprb. There's a
> ticket for that, open for 5 years
> <https://github.com/jenkinsci/ghprb-plugin/issues/160>. I could implement
> that but, secondly, the team is secret. I can't even see it. Even asfbot
> doesn't have permission to see it.
>
> You may ask how it worked before, because on builds.apache.org
> somehow only committers were allowed to trigger PR builds. It turns out that
> Infra created a webhook relay. It's configured here
> <https://github.com/apache/infrastructure-puppet/blob/deployment/modules/gitbox/files/conf/relay.yaml>
> and it filters out all the non-committer events. I wish I had known that
> before, as it was also the reason for various issues during the migration.
> Anyway, it would be hard to use that mechanism in our case, as we want to
> configure it depending on the job.
>
>
> There's a publicly available source of committers list - it's LDAP. I've
> tested it and it allows anonymous connection and provides the list of the
> committers as well as the github usernames. My current idea is to read this
> from LDAP as a part of the seed job and configure the jobs with the apache
> committers present on the ghprb whitelist.
>
>
> Hope that I didn't miss anything ;) It isn't that easy to investigate that
> kind of issues with my poor privileges ;)
>
>
> Regards,
>
> Damian
>
>
> On Thu, Jul 23, 2020 at 6:52 PM Udi Meiri  wrote:
>
>> Thanks Damian! I saw that the config also has this:
>>   orgWhitelist(['apache'])
>> Shouldn't that be enough to allow all Apache committers?
>>
>> I traced the code for the membership check here:
>>
>> https://github.com/jenkinsci/ghprb-plugin/blob/4e86ed47a96a01eeaa51a479ff604252109635f6/src/main/java/org/jenkinsci/plugins/ghprb/GhprbGitHub.java#L27
>> Is there a way to see these logs?
>>
>>
>> On Thu, Jul 23, 2020 at 7:08 AM Damian Gadomski <
>> damian.gadom...@polidea.com> wrote:
>>
>>> Hi,
>>>
>>> You are right, the current behavior is wrong, I'm currently working to
>>> fix it asap. Our intention was to disable that only for non-committers.
>>>
>>> As a workaround, as a committer, you could manually add yourself (your
>>> GitHub username) to the whitelist of the SeedJob configuration:
>>> https://ci-beam.apache.org/job/beam_SeedJob/configure
>>> Then, your comment "Run Seed Job" will trigger the build. I've already
>>> manually triggered it for you that way.
>>>
>>> Of course, it will only work until the seed job gets executed - it will
>>> then override the whitelist with an empty one.
>>>
>>> [image: Selection_408.png]
>>>
>>> As a target solution, I'm planning to fetch the list of beam committers
>>> from LDAP and automatically add them to the whitelist above as a part of
>>> the seed job. I'll keep you updated about the progress.
>>>
>>> Regards,
>>> Damian
>>>
>>>
>>> On Wed, Jul 22, 2020 at 11:03 PM Ahmet Altay  wrote:
>>>
>>>> +Damian Gadomski , it might be related to
>>>> this change: https://github.com/apache/beam/pull/12319.
>>>>
>>>> /cc +Tyson Hamilton 
>>>>
>>>> On Wed, Jul 22, 2020 at 1:17 PM Udi Meiri  wrote:
>>>>
>>>>> Hi,
>>>>> I'm trying to test a groovy change but I can't seem to trigger the
>>>>> seed job. It worked yesterday so I'm not sure what changed.
>>>>>
>>>>> https://github.com/apache/beam/pull/12326
>>>>>
>>>>>




Re: Jenkins trigger phrase "run seed job" not working?

2020-07-23 Thread Udi Meiri
Thanks Damian! I saw that the config also has this:
  orgWhitelist(['apache'])
Shouldn't that be enough to allow all Apache committers?

I traced the code for the membership check here:
https://github.com/jenkinsci/ghprb-plugin/blob/4e86ed47a96a01eeaa51a479ff604252109635f6/src/main/java/org/jenkinsci/plugins/ghprb/GhprbGitHub.java#L27
Is there a way to see these logs?


On Thu, Jul 23, 2020 at 7:08 AM Damian Gadomski 
wrote:

> Hi,
>
> You are right, the current behavior is wrong, I'm currently working to fix
> it asap. Our intention was to disable that only for non-committers.
>
> As a workaround, as a committer, you could manually add yourself (your
> GitHub username) to the whitelist of the SeedJob configuration:
> https://ci-beam.apache.org/job/beam_SeedJob/configure
> Then, your comment "Run Seed Job" will trigger the build. I've already
> manually triggered it for you that way.
>
> Of course, it will only work until the seed job gets executed - it will
> then override the whitelist with an empty one.
>
> [image: Selection_408.png]
>
> As a target solution, I'm planning to fetch the list of beam committers
> from LDAP and automatically add them to the whitelist above as a part of
> the seed job. I'll keep you updated about the progress.
>
> Regards,
> Damian
>
>
> On Wed, Jul 22, 2020 at 11:03 PM Ahmet Altay  wrote:
>
>> +Damian Gadomski , it might be related to
>> this change: https://github.com/apache/beam/pull/12319.
>>
>> /cc +Tyson Hamilton 
>>
>> On Wed, Jul 22, 2020 at 1:17 PM Udi Meiri  wrote:
>>
>>> Hi,
>>> I'm trying to test a groovy change but I can't seem to trigger the seed
>>> job. It worked yesterday so I'm not sure what changed.
>>>
>>> https://github.com/apache/beam/pull/12326
>>>
>>>




Jenkins trigger phrase "run seed job" not working?

2020-07-22 Thread Udi Meiri
Hi,
I'm trying to test a groovy change but I can't seem to trigger the seed
job. It worked yesterday so I'm not sure what changed.

https://github.com/apache/beam/pull/12326




Re: No space left on device - beam-jenkins 1 and 7

2020-07-20 Thread Udi Meiri
You're right, job workspaces should be hermetic.
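
As a rough illustration of the TMPDIR point quoted below, here is a small
Python sketch. The helper names (use_job_scoped_tmpdir, write_scratch_file)
and the "tmp" subdirectory are assumptions made up for this example, not
existing Beam or Jenkins code. It only shows that files created through the
tempfile API follow TMPDIR, so they land inside the job workspace and go
away when the workspace is wiped.

```python
import os
import tempfile


def use_job_scoped_tmpdir(workspace):
    """Point TMPDIR at <workspace>/tmp and return the directory tempfile uses."""
    tmp_dir = os.path.join(workspace, "tmp")
    os.makedirs(tmp_dir, exist_ok=True)
    os.environ["TMPDIR"] = tmp_dir
    # tempfile caches its choice; clearing the cache makes it re-read TMPDIR.
    tempfile.tempdir = None
    return tempfile.gettempdir()


def write_scratch_file(data):
    """Create a temp file through the tempfile API, so it honors TMPDIR."""
    with tempfile.NamedTemporaryFile(
            "w", suffix=".scratch", delete=False) as handle:
        handle.write(data)
        return handle.name
```

Code that hardcodes "/tmp" would bypass this, which is the kind of bug that
running tests without write permission to /tmp should surface.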



On Mon, Jul 20, 2020 at 1:24 PM Kenneth Knowles  wrote:

> I'm probably late to this discussion and missing something, but why are we
> writing to /tmp at all? I would expect TMPDIR to point somewhere inside the
> job directory that will be wiped by Jenkins, and I would expect code to
> always create temp files via APIs that respect this. Is Jenkins not
> cleaning up? Do we not have the ability to set this up? Do we have bugs in
> our code (that we could probably find by setting TMPDIR to somewhere
> not-/tmp and running the tests without write permission to /tmp, etc)
>
> Kenn
>
> On Mon, Jul 20, 2020 at 11:39 AM Ahmet Altay  wrote:
>
>> Related to workspace directory growth, +Udi Meiri  filed
>> a relevant issue previously (
>> https://issues.apache.org/jira/browse/BEAM-9865) for cleaning up
>> workspace directory after successful jobs. Alternatively, we can consider
>> periodically cleaning up the /src directories.
>>
>> I would suggest moving the cron task from internal cron scripts to the
>> inventory job (
>> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_Inventory.groovy#L51).
>> That way, we can see all the cron jobs as part of the source tree, adjust
>> frequencies and clean up codes with PRs. I do not know how internal cron
>> scripts are created, maintained, and how would they be recreated for new
>> worker instances.
>>
>> /cc +Tyson Hamilton 
>>
>> On Mon, Jul 20, 2020 at 4:50 AM Damian Gadomski <
>> damian.gadom...@polidea.com> wrote:
>>
>>> Hey,
>>>
>>> I've recently created a solution for the growing /tmp directory. Part of
>>> it is the job mentioned by Tyson: *beam_Clean_tmp_directory*. It's
>>> intentionally not triggered by cron and should be a last resort solution
>>> for some strange cases.
>>>
>>> Along with that job, I've also updated every worker with an internal
>>> cron script. It's being executed once a week and deletes all the files (and
>>> only files) that were not accessed for at least three days. That's designed
>>> to be as safe as possible for the running jobs on the worker (not to delete
>>> the files that are still in use), and also to be insensitive to the current
>>> workload on the machine. The cleanup will always happen, even if some
>>> long-running/stuck jobs are blocking the machine.
>>>
>>> I also think that currently the "No space left" errors may be a
>>> consequence of growing workspace directory rather than /tmp. I didn't do
>>> any detailed analysis but e.g. currently, on apache-beam-jenkins-7 the
>>> workspace directory size is 158 GB while /tmp is only 16 GB. We should
>>> either guarantee the disk size to hold workspaces for all jobs (because
>>> eventually, every worker will execute each job) or clear also the
>>> workspaces in some way.
>>>
>>> Regards,
>>> Damian
>>>
>>>
>>> On Mon, Jul 20, 2020 at 10:43 AM Maximilian Michels 
>>> wrote:
>>>
>>>> +1 for scheduling it via a cron job if it won't lead to test failures
>>>> while running. Not a Jenkins expert but maybe there is the notion of
>>>> running exclusively while no other tasks are running?
>>>>
>>>> -Max
>>>>
>>>> On 17.07.20 21:49, Tyson Hamilton wrote:
>>>> > FYI there was a job introduced to do this in Jenkins:
>>>> beam_Clean_tmp_directory
>>>> >
>>>> > Currently it needs to be run manually. I'm seeing some out of disk
>>>> related errors in precommit tests currently, perhaps we should schedule
>>>> this job with cron?
>>>> >
>>>> >
>>>> > On 2020/03/11 19:31:13, Heejong Lee  wrote:
>>>> >> Still seeing no space left on device errors on jenkins-7 (for
>>>> example:
>>>> >> https://builds.apache.org/job/beam_PreCommit_PythonLint_Commit/2754/
>>>> )
>>>> >>
>>>> >>
>>>> >> On Fri, Mar 6, 2020 at 7:11 PM Alan Myrvold 
>>>> wrote:
>>>> >>
>>>> >>> Did a one time cleanup of tmp files owned by jenkins older than 3
>>>> days.
>>>> >>> Agree that we need a longer term solution.
>>>> >>>
>>>> >>> Passing recent tests on all executors except jenkins-12, which has
>>>> not
>>>> >>> scheduled recent builds for the pas

Re: [PROPOSAL] Make PBegin and PDone public in the Python SDK

2020-07-14 Thread Udi Meiri
So it sounds like we should:
- Make PBegin public
- Deprecate PDone return type in favor of None
- Update the programming guide's Composite Transforms section.
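
A minimal sketch of what these hints might look like once PBegin is public
and sinks return None. To keep it runnable without Beam installed, PBegin
and PCollection below are stand-in classes (assumptions for illustration,
not the real apache_beam.pvalue types), and classify() is a hypothetical
helper, not an existing SDK function.

```python
from typing import Generic, TypeVar, get_type_hints

T = TypeVar("T")


class PBegin:
    """Stand-in for apache_beam.pvalue.PBegin (marks a pipeline root)."""


class PCollection(Generic[T]):
    """Stand-in for apache_beam.pvalue.PCollection."""


class ReadNumbers:
    # A source-like transform: takes PBegin, produces a PCollection.
    def expand(self, pbegin: PBegin) -> PCollection[int]:
        raise NotImplementedError("illustration only")


class WriteNumbers:
    # A sink-like transform: no outputs, so it is annotated with None.
    def expand(self, pcoll: PCollection[int]) -> None:
        raise NotImplementedError("illustration only")


def classify(transform_cls):
    """Label a transform as root, sink, or intermediate from its hints."""
    hints = get_type_hints(transform_cls.expand)
    returns = hints.pop("return", None)
    if any(hint is PBegin for hint in hints.values()):
        return "root"
    # get_type_hints() reports a "-> None" annotation as NoneType.
    if returns is type(None):
        return "sink"
    return "intermediate"
```

Here classify(ReadNumbers) gives "root" and classify(WriteNumbers) gives
"sink", which is the distinction the type hints would carry.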


On Tue, Jul 14, 2020 at 5:13 PM Robert Burke  wrote:

> For contrast, the Go SDK provides an Impulse transform directly (analogous
> to PBegin, part of the model) and has a ParDo0 (which, like PDone, has no
> output PCollections). The numeral suffixing the Go ParDo functions
> indicates the number of output PCollections expected from the passed-in
> DoFn.
>
> On Tue, Jul 14, 2020, 5:03 PM Robert Bradshaw  wrote:
>
>> Yes, PBegin and PDone are used in the SDKs, but are not part of the model.
>>
>> I would be supportive of making PBegin more public to denote that a
>> transform is a "root" of the pipeline. PDone was required for Java,
>> however I don't think there's any use for it in the Python SDK: a
>> transform can simply not return any value (which is equivalent to
>> returning None) if it has no outputs.
>>
>> On Mon, Jul 13, 2020 at 8:17 PM Udi Meiri  wrote:
>> >
>> > Details:
>> > One item of interest that came up during the implementation of
>> BEAM-10258 [1] is how to treat PTransforms that act like sources or sinks.
>> > These transforms have either no input or output PCollections,
>> respectively.
>> >
>> > Internally, we use PBegin and PDone to denote this. (ex: [2])
>> > IIUC, PBegin and PDone aren't part of the Beam model, and in the
>> pipeline description they manifest as empty input and output lists.
>> >
>> > To support type hinting, I propose making these types public.
>> > They are not currently listed in [3] and the documentation implies
>> they're internal.
>> > Java SDK already supports these types and makes them public. [4]
>> >
>> >
>> > [1] https://github.com/apache/beam/pull/12009
>> > [2]
>> https://github.com/apache/beam/blob/e73e1d1cce93930fa3d85046b9bbae7c724926bf/sdks/python/apache_beam/io/gcp/experimental/spannerio.py#L471-L506
>> > [3]
>> https://github.com/apache/beam/blob/e73e1d1cce93930fa3d85046b9bbae7c724926bf/sdks/python/apache_beam/pvalue.py#L61
>> > [4]
>> https://github.com/apache/beam/blob/e73e1d1cce93930fa3d85046b9bbae7c724926bf/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PTransform.java#L53-L56
>>
>




[PROPOSAL] Make PBegin and PDone public in the Python SDK

2020-07-13 Thread Udi Meiri
Details:
One item of interest that came up during the implementation of BEAM-10258
[1] is how to treat PTransforms that act like sources or sinks.
These transforms have either no input or output PCollections, respectively.

Internally, we use PBegin and PDone to denote this. (ex: [2])
IIUC, PBegin and PDone aren't part of the Beam model, and in the pipeline
description they manifest as empty input and output lists.

To support type hinting, I propose making these types public.
They are not currently listed in [3] and the documentation implies they're
internal.
Java SDK already supports these types and makes them public. [4]


[1] https://github.com/apache/beam/pull/12009
[2]
https://github.com/apache/beam/blob/e73e1d1cce93930fa3d85046b9bbae7c724926bf/sdks/python/apache_beam/io/gcp/experimental/spannerio.py#L471-L506
[3]
https://github.com/apache/beam/blob/e73e1d1cce93930fa3d85046b9bbae7c724926bf/sdks/python/apache_beam/pvalue.py#L61
[4]
https://github.com/apache/beam/blob/e73e1d1cce93930fa3d85046b9bbae7c724926bf/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PTransform.java#L53-L56




Re: Monitoring performance for releases

2020-07-10 Thread Udi Meiri
On Thu, Jul 9, 2020 at 12:48 PM Maximilian Michels  wrote:

> Not yet, I just learned about the migration to a new frontend, including
> a new backend (InfluxDB instead of BigQuery).
>
> >  - Are the metrics available on metrics.beam.apache.org?
>
> Is http://metrics.beam.apache.org online? I was never able to access it.
>

It doesn't support https. I had to add an exception to the HTTPS Everywhere
extension for "metrics.beam.apache.org".


>
> >  - What is the feature delta between using metrics.beam.apache.org
> (much better UI) and using apache-beam-testing.appspot.com?
>
> AFAIK it is an ongoing migration and the delta appears to be high.
>
> >  - Can we notice regressions faster than release cadence?
>
> Absolutely! A report with the latest numbers including statistics about
> the growth of metrics would be useful.
>
> >  - Can we get automated alerts?
>
> I think we could setup a Jenkins job to do this.
>
> -Max
>
> On 09.07.20 20:26, Kenneth Knowles wrote:
> > Questions:
> >
> >   - Are the metrics available on metrics.beam.apache.org?
> >   - What is the feature delta between using metrics.beam.apache.org
> >     (much better UI) and using apache-beam-testing.appspot.com?
> >   - Can we notice regressions faster than release cadence?
> >   - Can we get automated alerts?
> >
> > Kenn
> >
> > On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels wrote:
> >
> > Hi,
> >
> > We recently saw an increase in latency migrating from Beam 2.18.0 to
> > 2.21.0 (Python SDK with Flink Runner). This proved very hard to debug
> > and it looks like each version in between the two versions led to
> > increased latency.
> >
> > This is not the first time we saw issues when migrating; another time we
> > had a decline in checkpointing performance and thus added a
> > checkpointing test [1] and dashboard [2] (see checkpointing widget).
> >
> > That makes me wonder if we should monitor performance (throughput /
> > latency) for basic use cases as part of the release testing. Currently,
> > our release guide [3] mentions running examples but not evaluating the
> > performance. I think it would be good practice to check relevant charts
> > with performance measurements as part of the release process. The
> > release guide should reflect that.
> >
> > WDYT?
> >
> > -Max
> >
> > PS: Of course, this requires tests and metrics to be available. This PR
> > adds latency measurements to the load tests [4].
> >
> >
> > [1] https://github.com/apache/beam/pull/11558
> > [2]
> >
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
> > [3] https://beam.apache.org/contribute/release-guide/
> > [4] https://github.com/apache/beam/pull/12065
> >
>




Re: [Proposal] Supporting Python Annotation Typehints on PTransform

2020-07-07 Thread Udi Meiri
Thank you Saavan, bumping this in case people missed it.


On Mon, Jun 29, 2020 at 3:22 PM Saavan Nanavati  wrote:

> Hi all!
>
> Currently, in the Python SDK, we don't support annotation-style type hints
> for PTransforms.
>
> This email includes a proposal to support PEP 484 annotations on
> PTransform's expand() function, and would allow you to write something like
> the following:
>
> class MapStrToInt(beam.PTransform):
>     def expand(self, pcoll: PCollection[str]) -> PCollection[int]:
>         return pcoll | beam.Map(lambda elem: int(elem))
>
> You can view the proposal here
> <https://docs.google.com/document/d/1B91mV_KSJvkQopvIqa1LhV0pkPfMEeavKHb6Ql9GFaM/edit?usp=sharing>,
> the WIP PR here <https://github.com/apache/beam/pull/12009>, or the full
> text of the proposal below.
>
> Let me know what you think, or if you have any suggestions for
> improvements. Thanks!
>
> Best,
> Saavan
>
> ---
>
> Supporting Python Annotations on PTransforms
>
> https://issues.apache.org/jira/browse/BEAM-10258
>
> Author: Saavan Nanavati (saa...@google.com)
>
> Reviewer: Udi Meiri (eh...@google.com)
>
> Overview
>
> For any developer working in a dynamically-typed language such as Python,
> type safety is an important consideration that can help reduce runtime
> bugs, speed up the development process with IDE-assisted code completion
> and type hints using static type-checkers such as MyPy, and aid
> investigations into otherwise complex error messages, with regards to Coder
> objects, for instance.
>
> Due to the complex nature of pipelines in Beam, an internal type-checking
> system is employed by Beam to fill in the gaps that static type-checkers
> have (such as failing to match type hints across PTransforms or
> type-checking at runtime). These type hints are used to raise errors during
> pipeline construction and, optionally, at runtime as well (but at the
> expense of pipeline performance).
>
> This design document outlines a proposed improvement that would allow
> Beam’s internal typing system to integrate and utilize type hint
> annotations (that conform to PEP 484
> <https://www.python.org/dev/peps/pep-0484/>) on PTransform’s primary
> method, expand(...).
>
> Table of Contents
>
> Background
> <https://docs.google.com/document/d/1B91mV_KSJvkQopvIqa1LhV0pkPfMEeavKHb6Ql9GFaM/edit#heading=h.drvuif6t4bls>
>
> Goals
> <https://docs.google.com/document/d/1B91mV_KSJvkQopvIqa1LhV0pkPfMEeavKHb6Ql9GFaM/edit#heading=h.wlzpbos00ni>
>
> Design Details
> <https://docs.google.com/document/d/1B91mV_KSJvkQopvIqa1LhV0pkPfMEeavKHb6Ql9GFaM/edit#heading=h.qds1titcbwpu>
>
> Estimate of Work
> <https://docs.google.com/document/d/1B91mV_KSJvkQopvIqa1LhV0pkPfMEeavKHb6Ql9GFaM/edit#heading=h.7bp0u7a4fgs>
>
> Background
>
> There are currently 3 ways to declare type hints in Beam pipelines.
>
>
>1.
>
>Inline
>1.
>
>   with_input_types(...)
>   2.
>
>   with_input_types(...)
>   2.
>
>Decorators
>1.
>
>   @beam.typehints.with_input_types(...)
>   2.
>
>   @beam.typehints.with_output_types(...)
>   3.
>
>Annotations
>1.
>
>   def func(elem: int) -> str:
>   ...
>
>
> Since inline hints are not associated with PTransforms themselves,
> decorators and annotations are the preferred methods for code that is going
> to be reused, however annotations are not currently supported for
> PTransforms. This proposed feature would fix that, allowing you to write
> code akin to the following:
>
> import apache_beam as beam
> from apache_beam.pvalue import PCollection
>
> class MapStrToInt(beam.PTransform):
>     def expand(self, pcoll: PCollection[str]) -> PCollection[int]:
>         return pcoll | beam.Map(lambda elem: int(elem))
>
> This is in contrast to the recommended way of type-hinting PTransforms in
> the status quo, shown below:
>
> import apache_beam as beam
>
> @beam.typehints.with_input_types(str)
> @beam.typehints.with_output_types(int)
> class MapStrToInt(beam.PTransform):
>     def expand(self, pcoll):
>         return pcoll | beam.Map(lambda elem: int(elem))
>
> In addition to providing the end-user a number of approaches for how they
> can type-hint their pipelines, thereby expanding coverage for Beam’s
> internal typing system, there is potentially an additional benefit for
> static type checkers like MyPy to be able to catch errors, since they are
> only capable of working with PEP 484 annotations.
>
>
> Goals
>
>-
>
>Support PEP 484 annotations on the input/output PCollection of the
>expand(...) me

Re: [Proposal] Adding Python Coverage Reports To CI/CD

2020-07-06 Thread Udi Meiri
The idea was to add coverage to one of the already run precommit tox tasks.
This should keep the additional overhead to a minimum.
py37-cloud is the current candidate, since it contains the most
dependencies so fewer tests are skipped.
Saavan, do you have an estimate for the overhead?

Once coverage information is available, we could make use of it for review
purposes.


On Mon, Jul 6, 2020 at 3:06 PM Mikhail Gryzykhin <
gryzykhin.mikh...@gmail.com> wrote:

> I wouldn't consider build time as a blocker to add report. Even if build
> time is rather slower, we can run coverage report periodically as a
> separate job and still get use of it.
>
> On Mon, Jul 6, 2020, 2:38 PM Robert Bradshaw  wrote:
>
>> This sounds useful to me, and as it's purely informational would be a
>> low cost to try out. The one question is how it would impact build
>> runtimes--do you have an estimate for what the cost is here?
>>
>> On Sun, Jul 5, 2020 at 1:14 PM Saavan Nanavati  wrote:
>> >
>> > Hey everyone,
>> >
>> > Currently, during the Jenkins build process, we don't generate any code
>> coverage reports for the Python SDK. This email includes a proposal to
>> generate python coverage reports during the pre-commit build, upload them
>> to codecov.io for analysis and visualization, and automatically post the
>> resulting stats back to GitHub PRs to help developers decide whether their
>> tests need revision.
>> >
>> > You can view/comment on the proposal here, or the full text of the
>> proposal at the end of this email. Let me know what you think, or if there
>> are any suggestions for improvements. Thanks!
>> >
>> > Python Coverage Reports For CI/CD
>> >
>> > Author: Saavan Nanavati (saa...@google.com)
>> >
>> > Reviewer: Udi Meiri (eh...@google.com)
>> >
>> >
>> > Overview
>> >
>> >
>> > This is a proposal for generating code coverage reports for the Python
>> SDK during Jenkins’ pre-commit phase, and uploading them to codecov.io
>> for analysis, with integration back into GitHub using the service’s sister
>> app.
>> >
>> >
>> > This would extend the pre-commit build time but provide valuable
>> information for developers to revise and improve their tests before their
>> PR is merged, rather than after when it’s less likely developers will go
>> back to improve their coverage numbers.
>> >
>> >
>> > This particular 3rd party service has a litany of awesome benefits:
>> >
>> > It’s free for open-source projects
>> >
>> > It seamlessly integrates into GitHub via a comment-bot (example here)
>> >
>> > It overlays coverage report information directly onto GitHub code using
>> Sourcegraph
>> >
>> > It requires no changes to Jenkins, thereby reducing the risk of
>> breaking the live test-infra
>> >
>> > It’s extensible and can later be used for the Java & Go SDKs if it
>> proves to be awesome
>> >
>> > It has an extremely responsive support team that’s happy to help
>> open-source projects
>> >
>> >
>> > A proof-of-concept can be seen here and here.
>> >
>> >
>> > Goals
>> >
>> >
>> > Provide coverage stats for the Python SDK that update with every
>> pre-commit run
>> >
>> > Integrate these reports into GitHub so developers can take advantage of
>> the information
>> >
>> > Open a discussion for how these coverage results can be utilized in
>> code reviews
>> >
>> >
>> > Non-Goals
>> >
>> > Calculate coverage statistics using external tests located outside of
>> the Python SDK
>> >
>> >
>> > This is ideal, but would require not only merging multiple coverage
>> reports together but, more importantly, waiting for these tests to be
>> triggered in the first place. The main advantage of calculating coverage
>> during pre-commit is that developers can revise their PRs before merging,
>> which is not guaranteed if this is a goal.
>> >
>> >
>> > However, it could be something to explore for the future.
>> >
>> > Background
>> >
>> >
>> > Providing code coverage for the Python SDK has been a problem since at
>> least 2017 (BEAM-2762) with the previous solution being to calculate
>> coverage in post-commit with coverage.py, and then sending the report to
>> coveralls.io which would post to GitHub. At some point, this solution
>>

Re: Commands to detect style issues quickly before sending PR

2020-06-26 Thread Udi Meiri
Another tip for Python:
After "pip install pre-commit" and "pre-commit install" in the repo, yapf
and pylint will run on changed files during "git commit".


On Fri, Jun 26, 2020 at 4:03 PM Valentyn Tymofieiev 
wrote:

> See also:
> https://cwiki.apache.org/confluence/display/BEAM/Python+Tips#PythonTips-Formatting
>
> > Task :sdks:python:test-suites:tox:py38:setupVirtualenv FAILED
> https://github.com/apache/beam/pull/12109 should fix that.
>
> On Fri, Jun 26, 2020 at 2:51 PM Alex Amato  wrote:
>
>> I sent out some PRs a few days ago, and quickly discovered a bunch of
>> errors and have been spending most of my time playing whack-a-mole without
>> knowing how to repro them all locally.
>>
>> I asked this a few years ago, and wanted to make sure I have something up
>> to date to work with. Ideally, I'd like a single command line for
>> simplicity. Here is what I've been using. I'm not sure if we have a script
>> or gradle target which already covers this or not
>>
>> *Java*
>> time ./gradlew spotlessApply && ./gradlew checkstyleMain checkstyleTest
>> javadoc spotbugsMain compileJava compileTestJava
>>
>> *Python *
>> ./gradlew  :sdks:python:test-suites:tox:py2:lintPy27_3 && ./gradlew  
>> :sdks:python:test-suites:tox:py37:lintPy37
>> && ./gradlew :sdks:python:test-suites:tox:py38:formatter
>>
>> (I think this might be correct, maybe there is a faster way to run it
>> directly with tox as well)
>>
>> 
>> Though the python command is failing for me, perhaps I need to install
>> another python version. I think we have setup steps for those in the wiki...
>>
>>
>> creating
>> build/temp.linux-x86_64-3.8/third_party/protobuf/src/google/protobuf/util/internal
>>
>> creating
>> build/temp.linux-x86_64-3.8/third_party/protobuf/src/google/protobuf/stubs
>>
>> creating
>> build/temp.linux-x86_64-3.8/third_party/protobuf/src/google/protobuf/io
>>
>> x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare
>> -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat
>> -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat
>> -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC
>> -DHAVE_PTHREAD=1 -I. -Igrpc_root -Igrpc_root/include
>> -Ithird_party/protobuf/src -I/usr/include/python3.8
>> -I/usr/local/google/home/ajamato/beam/build/gradleenv/-1227304282/include/python3.8
>> -c grpc_tools/_protoc_compiler.cpp -o
>> build/temp.linux-x86_64-3.8/grpc_tools/_protoc_compiler.o -std=c++11
>> -fno-wrapv -frtti
>>
>> grpc_tools/_protoc_compiler.cpp:216:10: fatal error: Python.h: No
>> such file or directory
>>
>>   216 | #include "Python.h"
>>
>>   |  ^~
>>
>> compilation terminated.
>>
>> error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
>>
>> 
>>
>> ERROR: Command errored out with exit status 1:
>> /usr/local/google/home/ajamato/beam/build/gradleenv/-1227304282/bin/python3.8
>> -u -c 'import sys, setuptools, tokenize; sys.argv[0] =
>> '"'"'/tmp/pip-install-xmf_k_sy/grpcio-tools/setup.py'"'"';
>> __file__='"'"'/tmp/pip-install-xmf_k_sy/grpcio-tools/setup.py'"'"';f=getattr(tokenize,
>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>> install --record /tmp/pip-record-9bhhuq55/install-record.txt
>> --single-version-externally-managed --compile --install-headers
>> /usr/local/google/home/ajamato/beam/build/gradleenv/-1227304282/include/site/python3.8/grpcio-tools
>> Check the logs for full command output.
>>
>>
>> *> Task :sdks:python:test-suites:tox:py38:setupVirtualenv* FAILED
>>
>




Re: [ANNOUNCE] New PMC Member: Alexey Romanenko

2020-06-16 Thread Udi Meiri
Congratulations!

On Tue, Jun 16, 2020 at 9:33 AM rahul patwari 
wrote:

> Congrats Alexey!
>
> On Tue, Jun 16, 2020 at 9:27 PM Ismaël Mejía  wrote:
>
>> Please join me and the rest of Beam PMC in welcoming Alexey Romanenko as
>> our
>> newest PMC member.
>>
>> Alexey has significantly contributed to the project in different ways: new
>> features and improvements in the Spark runner(s) as well as maintenance of
>> multiple IO connectors including some of our most used ones (Kafka and
>> Kinesis/Aws). Alexey is also quite active helping new contributors and
>> our user
>> community in the mailing lists / slack and Stack overflow.
>>
>> Congratulations Alexey!  And thanks for being a part of Beam!
>>
>> Ismaël
>>
>




Re: Automation for Jira

2020-06-15 Thread Udi Meiri
Interesting: you could consider the JIRA as active as long as the linked
PRs are open.
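
A rough sketch of that rule, in case it helps the discussion. The Issue and
LinkedPr classes and the is_stale() helper are hypothetical (this is not
the Jira automation API), and the 60-day threshold is borrowed from the
stale-P2 rule quoted below.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class LinkedPr:
    number: int
    is_open: bool


@dataclass
class Issue:
    key: str
    last_activity: datetime
    linked_prs: List[LinkedPr] = field(default_factory=list)


def is_stale(issue, now, max_idle=timedelta(days=60)):
    """An issue with any open linked PR counts as active, however idle."""
    if any(pr.is_open for pr in issue.linked_prs):
        return False
    return now - issue.last_activity > max_idle
```

Under this rule an issue idle for 90 days is still active while one of its
PRs is open, and becomes stale only after the PRs are closed or merged.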

On Mon, Jun 15, 2020 at 2:28 PM Luke Cwik  wrote:

> One thing I noticed is that links being added to issues automatically
> (e.g. a PR is opened that tags something) doesn't reset the activity
> counter so things are marked stale even though there are PRs opened for the
> issue recently.
>
> On Thu, Jun 11, 2020 at 10:37 AM Kenneth Knowles  wrote:
>
>> Yes, my inbox is hit as well. I'm enjoying going through some old bugs
>> actually. One takeaway is that we have a lot of early Jiras that are still
>> relevant, and also that there are a lot of duplicates. I think some
>> automation to help find duplicates might be helpful.
>>
>> Also, some accidental automation humor:
>> https://issues.apache.org/jira/browse/BEAM-6414
>>
>> Kenn
>>
>> On Tue, Jun 2, 2020 at 8:39 AM Brian Hulette  wrote:
>>
>>> RIP my inbox :)
>>> This is overwhelming, but I think it will be very good. Thanks for
>>> setting this up Kenn.
>>>
>>> Brian
>>>
>>> On Mon, Jun 1, 2020 at 9:57 PM Kenneth Knowles  wrote:
>>>
 I have now added modified 4:

 4a. labeling stale-P2 for unassigned 60 day old jiras
 4b. after 14 days downgrading stale-P2 labeled jiras to P3

 On Mon, Jun 1, 2020 at 9:06 PM Kenneth Knowles  wrote:

> I just added 3a and 3b. The comments will appear to be coming from me.
> That is a misconfiguration that I have now fixed. In the future they will
> come from the "Beam Jira Bot". There were 1119 stale-assigned issues.
>
> Kenn
>
> On Fri, May 1, 2020 at 1:41 PM Kenneth Knowles 
> wrote:
>
>> Based on the mild consensus and my availability, I just did #1. I
>> have not done any others. It seems #2 may be infeasible [1] and I am
>> convinced that we should not auto-close. I'll update again in a bit...
>>
>> Kenn
>>
>> [1] https://jira.atlassian.com/browse/JRACLOUD-28064
>>
>> On Wed, Apr 29, 2020 at 2:54 PM Ahmet Altay  wrote:
>>
>>> +1 for the automations. I agree with concerns related to #4. Auto
>>> closing issues is not a good experience. A person goes through the work
>>> of reporting an issue. This might very well be their first contribution.
>>> Automatically closing these issues with no human comments might make the
>>> reporter feel ignored. Auto-lowering the priority is a good suggestion.
>>>
>>> I wonder if we can also do a spring cleaning, reviewing Jira
>>> components and their default owners. If we can break the Jira into more
>>> components, we could have more people as component owners, triaging
>>> smaller per-component backlogs.
>>> On Wed, Apr 29, 2020 at 11:17 AM Tyson Hamilton 
>>> wrote:
>>>
 +1 for automation.

 Regarding #4, what about adding the constraint that this rule only
 applies to issues that are incomplete and require more information 
 from the
 reporter?

 Unfortunately it would require a human to triage issues to
 determine this and apply an appropriate label. Triage should happen
 regularly anyways, ideally even periodically for old issues, though 
 this
 may be asking a bit too much.

>>>
>>> Even with automation, manual triaging would be a valuable action. If
>>> the automation can reduce the backlog for manual reviewers, doing manual
>>> triage would be easier to do, incremental work.
>>>
>>>

>>>> Regarding #5 & #6, having some SLO for P0/P1 issues for both
>>>> updates and closures would be helpful in setting expectations. A daily P0
>>>> violation email to dev@ sounds right, for P1 weekly. What would
>>>> the Slack notification look like? It would be neat if it could ping the
>>>> assignee directly. What group would be victims for the auto-assigner?

>>>
>>> I agree with this. Email, or a dashboard would work equally well.
>>> (We need to first agree on SLOs though.)
>>>
>>>
>>>

>>>> On 2020/04/29 17:15:48, Brian Hulette wrote:
>>>> > Agree I think this all sounds good except for 4.
>>>> >
>>>> > I like the idea of using automation to help tame the backlog of jiras, but
>>>> > I worry that 4 could lead to a bad experience for users. Say they file a
>>>> > jira and maybe get it assigned, and then watch as it bounces all the way
>>>> > down to closed as obsolete because it was ignored.
>>>> > The status quo (the bug just gets ignored anyway) isn't great, but at least
>>>> > the user doesn't have automation working against them.
>>>> >
>>>> > Is there something else we can do to make sure these bugs get attention?
>>>> >
>>>> > Brian
>>>> >
>>>> >
>>>> > On Wed, Apr 29, 2020 at 10:00 AM Robert Bradshaw <
 

Re: Python2.7 Beam End-of-Life Date

2020-06-15 Thread Udi Meiri
+1

On Mon, Jun 15, 2020 at 4:27 PM Ahmet Altay  wrote:

> As a concrete proposal, could we commit to removing python 2 support by
> 2.24? In other words, mark the next release 2.23 as the last python 2
> compatible Beam version.
>
> On Mon, Jun 15, 2020 at 2:09 PM Valentyn Tymofieiev 
> wrote:
>
>> Another input here:
>>
>> If you opened a Python PR in the last few days, you probably noticed that
>> our test suites were broken by a transitive dependency of Beam that dropped
>> python 2 support, but did not declare python_requires>=3 in its setup.py
>> [1]. This temporarily broke a subset of Beam Py2 users (who did not
>> explicitly pin the 'rsa' dependency), and still affects Beam
>> development[2].
>>
>> This is the second time[3] Beam has been affected by an issue of this kind,
>> so supporting Python 2 is starting to slow down our development and add toil
>> for maintainers of packages we depend on (both directly and transitively).
>>
>> [1] https://github.com/sybrenstuvel/python-rsa/issues/152
>> [2]
>> https://lists.apache.org/thread.html/r9993b40b0c1cb8682ce56013165d4b80fdde0ee469a73bcb9466ddfb%40%3Cdev.beam.apache.org%3E
>> [3] https://github.com/hamcrest/PyHamcrest/issues/131
>>
>> On Tue, Jun 9, 2020 at 4:06 PM Ahmet Altay  wrote:
>>
>>> Thank you for re-opening this Valentyn. I am in favor of EOLing py2
>>> support sooner than later. The reality is that we will not be effectively
>>> supporting beam python 2 for a long time while the ecosystem already EOLed
>>> python 2. That said, a significant chunk (but no longer a majority) of our
>>> users are still using python 2. Upgrades are painful, it might be
>>> especially painful nowadays. It would be good to hear counter view points,
>>> user voices related to this.
>>>
>>> On Thu, Jun 4, 2020 at 4:53 PM Valentyn Tymofieiev 
>>> wrote:
>>>
 Back at the end of February we decided to revisit this conversation in
 3 months. Do folks on this thread have any new input or perspective
 regarding us balancing "user pain/contributor pain/our ability to
 continuously test with python 2 in a shifting environment"?

 Some new information on my end is that we have been seeing steady
 adoption of Python 3 among Beam Python users in Dataflow, particularly
 strong adoption among streaming users, and Dataflow is sunsetting Python 2
 support for all released Beam SDKs later this year [1]. We will have to
 remove Python 2 Beam test suites that use Dataflow  when Dataflow runner
 disables Py2 support if this happens before Beam Py2 EOL (when we have to
 remove all Py2 suites), including performance tests that still use Dataflow
 on Python 3.

 I am curious how much motivation there is in the community at this
 moment to continue Py2 support in Beam,  whether any previous Py3 migration
 blockers were resolved or any new blockers discovered among Beam users.

 [1] https://cloud.google.com/python/docs/python2-sunset/#dataflow

 On Fri, May 8, 2020 at 3:52 PM Valentyn Tymofieiev 
 wrote:

> That's good news! Thanks for sharing.
>
> Another datapoint, here are a few of Beam's dependencies that no
> longer release new py2 artifacts (I looked at REQUIRED_PACKAGES +  aws,
> gcp, and interactive extras):
>
> hdfs
> numpy
> pyarrow
> ipython
>
> There are more if we include transitive dependencies and test-only
> packages. I also remember encountering one issue last month that was broken
> only on Py2, which we had to go back and fix.
>
> If others have noticed frictions related to ongoing Py2 support or
> have updates on previously mentioned Py3 migration blockers, feel free to
> post them.
>
> On Fri, May 8, 2020 at 9:19 AM Robert Bradshaw 
> wrote:
>
>> It hasn't been 3 months yet, but I wanted to call out a milestone that
>> Python 3 downloads crossed the 50% threshold on pypi, if just briefly.
>>
>> On Thu, Feb 13, 2020 at 12:40 AM Ismaël Mejía 
>> wrote:
>> >
>> > > I would suggest re-evaluating this within the next 3 months
>> > > again. We need to balance between user pain/contributor pain/our ability to
>> > > continuously test with python 2 in a shifting environment.
>> >
>> > Good idea to re-evaluate in 3 months; by that point distributions
>> > will probably be phasing out python2 by default, which will
>> > definitely help in this direction.
>> > Thanks for updating the roadmap Ahmet
>> >
>> >
>> > On Thu, Feb 13, 2020 at 2:49 AM Ahmet Altay 
>> wrote:
>> >>
>> >>
>> >>
>> >> On Wed, Feb 12, 2020 at 1:29 AM Ismaël Mejía 
>> wrote:
>> >>>
>> >>> I am with Chad on this, we should probably extend it a bit more,
>> even if it
>> >>> makes us struggle a bit at least we have some workarounds as
>> Robert suggests,
>> >>> and as Chad said there are still many people playing the 

Re: DISCUSS: FnAPI proto stabilization

2020-06-12 Thread Udi Meiri
I'm not very familiar with this effort.
Were there ITs / POCs created for these changes? (to surface any obvious
bugs)
Are these changes usable in DirectRunner?


On Fri, Jun 12, 2020 at 8:50 AM Luke Cwik  wrote:

> A few months back there was a discussion[1] about performing work to
> stabilize the protos used for pipeline execution looking forward to cross
> language pipelines and runners who want to use them across SDK versions
> (Dataflow).
>
> All the proposed incompatible clean-up tasks were done and made it into
> 2.21 (there are some left related to documentation and cleaning up some
> stuff that can be removed in a backwards compatible way and general
> re-organization within the files to delineate what is stable and what
> isn't).
>
> Beyond documenting the versioning story (sketch below) in a more durable
> location than this ML, performing these last clean-up tasks and general
> re-organization within the files, is there anything else that should be
> done before we can vote and consider the protos to be stable (which would
> mean that 2.21 would contain the first stable version assuming no other
> incompatible changes are suggested)?
>
> The versioning story is around 3 parts and effectively occurs whenever
> there is an incompatible change such as:
> * adding a new field that didn't exist where it semantically changes what
> is to be done
> * removing a field that was effectively required
> * requiring an SDK or runner to behave differently (e.g. support large
> iterables, support a new API (such as a future map state for StatefulDoFns))
> The three ways of handling versioning for incompatible changes are:
> * many protos have URNs, when there is an incompatible change the URN
> should be changed. If it is effectively the same thing then this should
> lead to a version bump and update of the documentation reflecting what the
> requirements of the new version are.
> * there is a capabilities section on each environment, this should
> enumerate everything the SDK can support, protocols (e.g. large iterables,
> ...), coders, well known transforms, ...
> * there is a requirements section on the pipeline proto, this is an
> enumeration of everything the SDK needs the runner to know to be able to
> interpret the pipeline (e.g. splittable dofn, requires time sorted input,
> ...).
>
> Updating the URN of the transform/coder is typically the easiest way to
> handle incompatible changes followed by using the capabilities list to
> enable new things (used like an allowlist) and the requirements list to
> prevent runners from doing things they shouldn't (used like a denylist).
> Many features/APIs that are part of the initial version are implicitly not
> in either the capabilities or requirements lists to prevent a huge
> definition list and can be disabled in the future by relying on adding
> requirements that disable these currently unnamed features/APIs if it is
> ever necessary.
>
> 1:
> https://lists.apache.org/thread.html/rdf247cfa3a509f80578f03b2454ea1e50474ee3576a059486d58fdf4%40%3Cdev.beam.apache.org%3E
>
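
To check my own understanding of the allowlist/denylist split described above, here is a minimal Python sketch. It is not taken from the Beam protos or any SDK: the function names and the "example:..." URNs are made up for illustration, and the real fields may be shaped differently. The assumption is only that requirements are things a runner must understand or else reject the pipeline, while capabilities are things a runner may use only if the SDK environment advertises them.

```python
def unmet_requirements(pipeline_requirements, runner_understands):
    """Requirements the runner does not recognise (denylist-style check)."""
    return sorted(set(pipeline_requirements) - set(runner_understands))


def validate_pipeline(pipeline_requirements, runner_understands):
    """Reject the pipeline if any requirement is unknown to the runner."""
    missing = unmet_requirements(pipeline_requirements, runner_understands)
    if missing:
        raise ValueError("runner does not understand: " + ", ".join(missing))


def usable_features(wanted_by_runner, environment_capabilities):
    """Features the runner may turn on (allowlist-style check)."""
    return sorted(set(wanted_by_runner) & set(environment_capabilities))


if __name__ == "__main__":
    # Hypothetical URNs, for illustration only.
    runner_understands = {"example:requirement:splittable_dofn:v1"}
    pipeline_requirements = [
        "example:requirement:splittable_dofn:v1",
        "example:requirement:time_sorted_input:v1",
    ]
    print(unmet_requirements(pipeline_requirements, runner_understands))

    sdk_capabilities = {"example:protocol:large_iterables:v1"}
    runner_wants = [
        "example:protocol:large_iterables:v1",
        "example:protocol:map_state:v1",
    ]
    print(usable_features(runner_wants, sdk_capabilities))
```

If that reading is right, a new incompatible feature only needs a new entry in one of the two lists (or a new URN version), and older runners fail loudly instead of silently misinterpreting the pipeline.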


smime.p7s
Description: S/MIME Cryptographic Signature


Re: Beam Jenkins Migration

2020-06-12 Thread Udi Meiri
This is great! Looking forward to it.

Would any metrics need to be migrated over to the new Jenkins?
http://metrics.beam.apache.org/



On Fri, Jun 12, 2020 at 9:57 AM Tyson Hamilton  wrote:

> Very exciting! Thanks for the advance notice Damian.
>
> On Fri, Jun 12, 2020 at 7:58 AM Damian Gadomski <
> damian.gadom...@polidea.com> wrote:
>
>> Hello,
>>
>> During the last few days, I was preparing for the Beam Jenkins migration
>> from builds.apache.org to ci-beam.apache.org. The new Jenkins Master
>> will be dedicated only for Beam related jobs, all Beam Committers will have
>> build configure access, and Beam PMC will have Admin (GUI) Access.
>>
>> We (in cooperation with Infra) are almost ready for the migration itself
>> and I want to share with you the details of our plan. We are planning to
>> start the migration next week, most likely on Tuesday. I'll keep you
>> updated on the progress. We do not expect any issues or outages of the
>> CI services; everything should be more or less unnoticeable. Just don't be
>> surprised that the Jenkins URL will change to https://ci-beam.apache.org
>>
>> If you are curious, here are the steps that we are going to take:
>>
>> 1. Create 16 new CI nodes that will be connected to the new CI. We will
>> then have simultaneously running two CI servers.
>> 2. Verify that new builds work as expected on the new instance (compare
>> results of cron builds). (a day or two would be sufficient)
>> 3. Move the responsibility of Phrase/PR/Commit builds to the new CI,
>> disable on the old one.
>> 4. Modify the .test-infra/jenkins/README.md to point to the new instance
>> and replace Post-commit tests status in README.md and
>> .github/PULL_REQUEST_TEMPLATE.md
>> 5. Disable the jobs on the old Jenkins and add a description to each job
>> with the URL to the corresponding one on the new CI.
>> 6. Turn off VM instances of the old nodes.
>> 7. Remove VM instances of the old nodes.
>>
>> In case of any questions or doubts feel free to ask :)
>>
>> Regards,
>> Damian
>>
>




Re: python precommit error - google-auth dependency?

2020-06-11 Thread Udi Meiri
BTW, the new pip resolver seems to do the right thing by installing
rsa==4.0 instead of 4.2 in this case (upgrade pip to 20.1.1):

pip --unstable-feature=resolver install apache-beam[gcp]

https://discuss.python.org/t/an-update-on-pip-and-dependency-resolution/1898/4
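
To illustrate the difference, here is a toy Python sketch. It is a deliberate simplification and not pip's implementation: the function names are made up, the version list is just the rsa versions mentioned in this thread, and the "first requirement wins" behaviour is my rough description of the old resolver.

```python
def newest_allowed(candidates, constraints):
    """Return the newest candidate accepted by every constraint, or None."""
    allowed = [v for v in candidates if all(check(v) for check in constraints)]
    return max(allowed) if allowed else None


def oauth2client_wants(version):
    # Lower bound only, no upper limit.
    return version >= (3, 1, 4)


def google_auth_wants(version):
    # rsa<4.1,>=3.1.4, as in the "pip check" error.
    return (3, 1, 4) <= version < (4, 1)


# Versions as tuples of ints; illustrative data only.
RSA_RELEASES = [(4, 0), (4, 1), (4, 2)]

# Old behaviour, roughly: only the first requirement seen is honoured.
first_wins = newest_allowed(RSA_RELEASES, [oauth2client_wants])

# New resolver, roughly: every requirement on the package must hold.
all_hold = newest_allowed(RSA_RELEASES, [oauth2client_wants, google_auth_wants])

print("first requirement only:", first_wins)
print("all requirements:", all_hold)
```

With these inputs the first call picks the newest release and the second falls back to 4.0, which matches what I saw with the resolver flag.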

On Thu, Jun 11, 2020 at 2:40 PM Valentyn Tymofieiev 
wrote:

> > In python 2 oauth2client's rsa>3.14 requirement will resolve to
> > latest python2 supporting version of rsa (4.0?)
>
> Unfortunately rsa 4.1 didn't set a python_requires stanza to prevent the
> breakage of Py2 users, so I opened:
> https://github.com/sybrenstuvel/python-rsa/issues/152.
>
> On Wed, Jun 10, 2020 at 7:14 PM Ahmet Altay  wrote:
>
>>
>>
>> On Wed, Jun 10, 2020 at 7:11 PM Bu Sun Kim  wrote:
>>
>>> Hi,
>>>
>>> google-auth has been released (with the wider pin on rsa:
>>> <https://github.com/googleapis/google-auth-library-python/blob/6350834ee25295e0754b6611fdff257668a0b0c4/setup.py#L24>).
>>>
>>
>> Thank you! Much appreciated!
>>
>>
>>>
>>> On Wed, Jun 10, 2020 at 6:07 PM Ahmet Altay  wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jun 10, 2020 at 4:07 PM Kyle Weaver 
>>>> wrote:
>>>>
>>>>> The fix to google-auth has been merged. Is the plan just to wait until
>>>>> a new version of google-auth is released and ignore the failing tests until
>>>>> then? (btw I filed a JIRA for this before I realized it was already being
>>>>> discussed here: https://issues.apache.org/jira/browse/BEAM-10232)
>>>>>
>>>>
>>>> Could we add it as a test dependency? Or if that is not possible, add
>>>> it but remove it before next release?
>>>>
>>>> It seems like there is a release PR on google-auth (
>>>> https://github.com/googleapis/google-auth-library-python/pull/525). I
>>>> asked +Bu Sun Kim  on the PR, they usually
>>>> release pretty quickly.
>>>>
>>>>
>>>>>
>>>>> On Wed, Jun 10, 2020 at 3:21 PM Udi Meiri  wrote:
>>>>>
>>>>>> Yes you're right, Py2 envs are still using 4.0.
>>>>>>
>>>>>> On Wed, Jun 10, 2020 at 3:03 PM Ahmet Altay  wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 10, 2020 at 2:25 PM Udi Meiri  wrote:
>>>>>>>
>>>>>>>> 4.1 drops Python 2 support, so I'm not sure if we're ready for that
>>>>>>>> yet.
>>>>>>>>
>>>>>>>
>>>>>>> Wouldn't that work by default? In python 2 oauth2client's rsa>3.14
>>>>>>> requirement will resolve to latest python2 supporting version of rsa (4.0?)
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 10, 2020 at 2:20 PM Ahmet Altay 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Looks like there is an attempt to fix this:
>>>>>>>>> https://github.com/googleapis/google-auth-library-python/pull/524
>>>>>>>>>
>>>>>>>>> On Wed, Jun 10, 2020 at 2:07 PM Udi Meiri 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 10, 2020 at 1:59 PM Ahmet Altay 
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jun 10, 2020 at 1:29 PM Kenneth Knowles 
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> You may be interested in following
>>>>>>>>>>>> https://github.com/pypa/pip/issues/988 if you are not already.
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 10, 2020 at 12:17 PM Udi Meiri 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Seems like manually installing rsa==4.0 satisfies deps

Re: python precommit error - google-auth depenedency?

2020-06-10 Thread Udi Meiri
Yes you're right, Py2 envs are still using 4.0.

On Wed, Jun 10, 2020 at 3:03 PM Ahmet Altay  wrote:

>
>
> On Wed, Jun 10, 2020 at 2:25 PM Udi Meiri  wrote:
>
>> 4.1 drops Python 2 support, so I'm not sure if we're ready for that yet.
>>
>
> Wouldn't that work by default? In python 2 oauth2client's rsa>3.14
> requirement will resolve to latest python2 supporting version of rsa (4.0?)
>
>
>>
>> On Wed, Jun 10, 2020 at 2:20 PM Ahmet Altay  wrote:
>>
>>> Looks like there is an attempt to fix this:
>>> https://github.com/googleapis/google-auth-library-python/pull/524
>>>
>>> On Wed, Jun 10, 2020 at 2:07 PM Udi Meiri  wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jun 10, 2020 at 1:59 PM Ahmet Altay  wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 10, 2020 at 1:29 PM Kenneth Knowles 
>>>>> wrote:
>>>>>
>>>>>> You may be interested in following
>>>>>> https://github.com/pypa/pip/issues/988 if you are not already.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Wed, Jun 10, 2020 at 12:17 PM Udi Meiri  wrote:
>>>>>>
>>>>>>> Seems like manually installing rsa==4.0 satisfies deps, but pip
>>>>>>> doesn't do transitive deps well.
>>>>>>>
>>>>>>> Would it be right to put a direct dependency on rsa<4.1,>=3.1.4 in
>>>>>>> setup.py?
>>>>>>>
>>>>>>
>>>>> Did you find where the google-auth dependency is coming from? We might
>>>>> try to fix the problem at the source of that dependency instead of adding
>>>>> rsa to beam's setup.py.
>>>>>
>>>>
>>>> oauth2client depends on rsa>=3.14 with no upper limit. rsa 4.1 was
>>>> released today.
>>>> The places that require rsa<4.1 are deeper in the dependency tree. For
>>>> example:
>>>>
>>>> google-cloud-bigquery==1.24.0
>>>>   - google-api-core [required: >=1.15.0,<2.0dev, installed: 1.20.0]
>>>>     - google-auth [required: >=1.14.0,<2.0dev, installed: 1.16.1]
>>>>       - rsa [required: >=3.1.4,<4.1, installed: 4.1]
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>> On Wed, Jun 10, 2020 at 11:48 AM Udi Meiri  wrote:
>>>>>>>
>>>>>>>> Thanks, that helped in an unexpected way. :)
>>>>>>>> I should have used the "gcp" extra instead of "cloud" in my pip
>>>>>>>> install command above.
>>>>>>>>
>>>>>>>> On Wed, Jun 10, 2020 at 11:37 AM Valentyn Tymofieiev <
>>>>>>>> valen...@google.com> wrote:
>>>>>>>>
>>>>>>>>> > Any ideas on how to debug where this requirement is coming from?
>>>>>>>>> You could try installing and calling pipdeptree [1] from a Jenkins
>>>>>>>>> job, and see if it helps.
>>>>>>>>>
>>>>>>>>> [1] https://pypi.org/project/pipdeptree/
>>>>>>>>> On Wed, Jun 10, 2020 at 11:00 AM Udi Meiri 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> I'm trying to understand these "pip check" failures:
>>>>>>>>>>
>>>>>>>>>> ERROR: google-auth 1.16.1 has requirement rsa<4.1,>=3.1.4, but you'll have rsa 4.1 which is incompatible
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2860/console
>>>>>>>>>>
>>>>>>>>>> However, when I do
>>>>>>>>>> pip install dist/apache-beam-2.23.0.dev0.tar.gz[test,cloud]
>>>>>>>>>>
>>>>>>>>>> locally, the google-auth package is not installed at all.
>>>>>>>>>> Any ideas on how to debug where this requirement is coming from?
>>>>>>>>>>
>>>>>>>>>




Re: python precommit error - google-auth dependency?

2020-06-10 Thread Udi Meiri
4.1 drops Python 2 support, so I'm not sure if we're ready for that yet.

On Wed, Jun 10, 2020 at 2:20 PM Ahmet Altay  wrote:

> Looks like there is an attempt to fix this:
> https://github.com/googleapis/google-auth-library-python/pull/524
>
> On Wed, Jun 10, 2020 at 2:07 PM Udi Meiri  wrote:
>
>>
>>
>> On Wed, Jun 10, 2020 at 1:59 PM Ahmet Altay  wrote:
>>
>>>
>>>
>>> On Wed, Jun 10, 2020 at 1:29 PM Kenneth Knowles  wrote:
>>>
>>>> You may be interested in following
>>>> https://github.com/pypa/pip/issues/988 if you are not already.
>>>>
>>>> Kenn
>>>>
>>>> On Wed, Jun 10, 2020 at 12:17 PM Udi Meiri  wrote:
>>>>
>>>>> Seems like manually installing rsa==4.0 satisfies deps, but pip
>>>>> doesn't do transitive deps well.
>>>>>
>>>>> Would it be right to put a direct dependency on rsa<4.1,>=3.1.4 in
>>>>> setup.py?
>>>>>
>>>>
>>> Did you find where the google-auth dependency is coming from? We might
>>> try to fix the problem at the source of that dependency instead of adding
>>> rsa to beam's setup.py.
>>>
>>
>> oauth2client depends on rsa>=3.14 with no upper limit. rsa 4.1 was
>> released today.
>> The places that require rsa<4.1 are deeper in the dependency tree. For
>> example:
>>
>> google-cloud-bigquery==1.24.0
>>   - google-api-core [required: >=1.15.0,<2.0dev, installed: 1.20.0]
>>     - google-auth [required: >=1.14.0,<2.0dev, installed: 1.16.1]
>>       - rsa [required: >=3.1.4,<4.1, installed: 4.1]
>>
>>
>>>
>>>>
>>>>> On Wed, Jun 10, 2020 at 11:48 AM Udi Meiri  wrote:
>>>>>
>>>>>> Thanks, that helped in an unexpected way. :)
>>>>>> I should have used the "gcp" extra instead of "cloud" in my pip
>>>>>> install command above.
>>>>>>
>>>>>> On Wed, Jun 10, 2020 at 11:37 AM Valentyn Tymofieiev <
>>>>>> valen...@google.com> wrote:
>>>>>>
>>>>>>> > Any ideas on how to debug where this requirement is coming from?
>>>>>>> You could try installing and calling pipdeptree [1] from a Jenkins
>>>>>>> job, and see if it helps.
>>>>>>>
>>>>>>> [1] https://pypi.org/project/pipdeptree/
>>>>>>> On Wed, Jun 10, 2020 at 11:00 AM Udi Meiri  wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I'm trying to understand these "pip check" failures:
>>>>>>>>
>>>>>>>> ERROR: google-auth 1.16.1 has requirement rsa<4.1,>=3.1.4, but you'll have rsa 4.1 which is incompatible
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2860/console
>>>>>>>>
>>>>>>>> However, when I do
>>>>>>>> pip install dist/apache-beam-2.23.0.dev0.tar.gz[test,cloud]
>>>>>>>>
>>>>>>>> locally, the google-auth package is not installed at all.
>>>>>>>> Any ideas on how to debug where this requirement is coming from?
>>>>>>>>
>>>>>>>




Re: python precommit error - google-auth dependency?

2020-06-10 Thread Udi Meiri
On Wed, Jun 10, 2020 at 1:59 PM Ahmet Altay  wrote:

>
>
> On Wed, Jun 10, 2020 at 1:29 PM Kenneth Knowles  wrote:
>
>> You may be interested in following https://github.com/pypa/pip/issues/988 if
>> you are not already.
>>
>> Kenn
>>
>> On Wed, Jun 10, 2020 at 12:17 PM Udi Meiri  wrote:
>>
>>> Seems like manually installing rsa==4.0 satisfies deps, but pip doesn't
>>> do transitive deps well.
>>>
>>> Would it be right to put a direct dependency on rsa<4.1,>=3.1.4 in
>>> setup.py?
>>>
>>
> Did you find where the google-auth dependency is coming from? We might try
> to fix the problem at the source of that dependency instead of adding rsa
> to beam's setup.py.
>

oauth2client depends on rsa>=3.14 with no upper limit. rsa 4.1 was released
today.
The places that require rsa<4.1 are deeper in the dependency tree. For
example:

google-cloud-bigquery==1.24.0
  - google-api-core [required: >=1.15.0,<2.0dev, installed: 1.20.0]
    - google-auth [required: >=1.14.0,<2.0dev, installed: 1.16.1]
      - rsa [required: >=3.1.4,<4.1, installed: 4.1]
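
For anyone who wants to answer the "who requires rsa?" question without pipdeptree, here is a minimal Python sketch. The helper names are made up for illustration; the only library call it relies on is importlib.metadata from the standard library (Python 3.8+), and the sample requirement strings are the ones quoted in this thread rather than output from a real environment.

```python
import re
from importlib import metadata  # standard library since Python 3.8


def requirement_name(requirement):
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError("cannot parse requirement: %r" % requirement)
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def who_requires(target, declared):
    """Map each distribution to the requirement strings it declares on target."""
    wanted = requirement_name(target)
    hits = {}
    for dist_name, requirements in declared.items():
        for requirement in requirements:
            if requirement_name(requirement) == wanted:
                hits.setdefault(dist_name, []).append(requirement)
    return hits


def installed_requirements():
    """Collect declared requirements for every distribution in this environment."""
    declared = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            declared[name] = list(dist.requires or [])
    return declared


if __name__ == "__main__":
    # Sample data copied from this thread, not from a real install.
    sample = {
        "google-auth": ["rsa<4.1,>=3.1.4"],
        "oauth2client": ["rsa>=3.14"],
        "google-api-core": ["google-auth>=1.14.0,<2.0dev"],
    }
    for dist_name, requirements in sorted(who_requires("rsa", sample).items()):
        print(dist_name, requirements)
```

Calling who_requires("rsa", installed_requirements()) in the failing virtualenv should list the same packages as the tree above, assuming their metadata is installed.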


>
>>
>>> On Wed, Jun 10, 2020 at 11:48 AM Udi Meiri  wrote:
>>>
>>>> Thanks, that helped in an unexpected way. :)
>>>> I should have used the "gcp" extra instead of "cloud" in my pip install
>>>> command above.
>>>>
>>>> On Wed, Jun 10, 2020 at 11:37 AM Valentyn Tymofieiev <
>>>> valen...@google.com> wrote:
>>>>
>>>>> > Any ideas on how to debug where this requirement is coming from?
>>>>> You could try installing and calling pipdeptree [1] from a Jenkins
>>>>> job, and see if it helps.
>>>>>
>>>>> [1] https://pypi.org/project/pipdeptree/
>>>>> On Wed, Jun 10, 2020 at 11:00 AM Udi Meiri  wrote:
>>>>>
>>>>>> Hi,
>>>>>> I'm trying to understand these "pip check" failures:
>>>>>>
>>>>>> ERROR: google-auth 1.16.1 has requirement rsa<4.1,>=3.1.4, but you'll have rsa 4.1 which is incompatible
>>>>>>
>>>>>>
>>>>>> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2860/console
>>>>>>
>>>>>> However, when I do
>>>>>> pip install dist/apache-beam-2.23.0.dev0.tar.gz[test,cloud]
>>>>>>
>>>>>> locally, the google-auth package is not installed at all.
>>>>>> Any ideas on how to debug where this requirement is coming from?
>>>>>>
>>>>>




Re: python precommit error - google-auth dependency?

2020-06-10 Thread Udi Meiri
Seems like manually installing rsa==4.0 satisfies deps, but pip doesn't do
transitive deps well.

Would it be right to put a direct dependency on rsa<4.1,>=3.1.4 in setup.py?
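
If we did go that route, a minimal sketch could look like the following. The names (TEMPORARY_PINS, with_temporary_pins) and the placeholder base list are made up for illustration and are not taken from Beam's actual setup.py; the idea is only to keep the pin in one clearly marked place so it is easy to drop again.

```python
# Hypothetical helper, not Beam's real setup.py.
TEMPORARY_PINS = [
    # Same range that google-auth 1.16.1 declares for rsa; remove once a
    # google-auth release no longer conflicts with the latest rsa.
    "rsa>=3.1.4,<4.1",
]


def with_temporary_pins(required_packages):
    """Return a copy of the requirement list with the temporary pins appended."""
    pinned = list(required_packages)
    for pin in TEMPORARY_PINS:
        if pin not in pinned:
            pinned.append(pin)
    return pinned


if __name__ == "__main__":
    # Placeholder base list for illustration only.
    print(with_temporary_pins(["example-package>=1.0"]))
```

The result would be what gets passed as install_requires, so the direct pin constrains rsa even when pip meets oauth2client's open-ended requirement first.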

On Wed, Jun 10, 2020 at 11:48 AM Udi Meiri  wrote:

> Thanks, that helped in an unexpected way. :)
> I should have used the "gcp" extra instead of "cloud" in my pip install
> command above.
>
> On Wed, Jun 10, 2020 at 11:37 AM Valentyn Tymofieiev 
> wrote:
>
>> > Any ideas on how to debug where this requirement is coming from?
>> You could try installing and calling pipdeptree [1] from a Jenkins job,
>> and see if it helps.
>>
>> [1] https://pypi.org/project/pipdeptree/
>> On Wed, Jun 10, 2020 at 11:00 AM Udi Meiri  wrote:
>>
>>> Hi,
>>> I'm trying to understand these "pip check" failures:
>>>
>>> ERROR: google-auth 1.16.1 has requirement rsa<4.1,>=3.1.4, but you'll have rsa 4.1 which is incompatible
>>>
>>>
>>> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2860/console
>>>
>>> However, when I do
>>> pip install dist/apache-beam-2.23.0.dev0.tar.gz[test,cloud]
>>>
>>> locally, the google-auth package is not installed at all.
>>> Any ideas on how to debug where this requirement is coming from?
>>>
>>




Re: python precommit error - google-auth dependency?

2020-06-10 Thread Udi Meiri
Thanks, that helped in an unexpected way. :)
I should have used the "gcp" extra instead of "cloud" in my pip install
command above.

On Wed, Jun 10, 2020 at 11:37 AM Valentyn Tymofieiev 
wrote:

> > Any ideas on how to debug where this requirement is coming from?
> You could try installing and calling pipdeptree [1] from a Jenkins job,
> and see if it helps.
>
> [1] https://pypi.org/project/pipdeptree/
> On Wed, Jun 10, 2020 at 11:00 AM Udi Meiri  wrote:
>
>> Hi,
>> I'm trying to understand these "pip check" failures:
>>
>> ERROR: google-auth 1.16.1 has requirement rsa<4.1,>=3.1.4, but you'll have rsa 4.1 which is incompatible
>>
>>
>> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2860/console
>>
>> However, when I do
>> pip install dist/apache-beam-2.23.0.dev0.tar.gz[test,cloud]
>>
>> locally, the google-auth package is not installed at all.
>> Any ideas on how to debug where this requirement is coming from?
>>
>




python precommit error - google-auth dependency?

2020-06-10 Thread Udi Meiri
Hi,
I'm trying to understand these "pip check" failures:

ERROR: google-auth 1.16.1 has requirement rsa<4.1,>=3.1.4, but you'll have rsa 4.1 which is incompatible


https://builds.apache.org/job/beam_PreCommit_Python_Cron/2860/console

However, when I do
pip install dist/apache-beam-2.23.0.dev0.tar.gz[test,cloud]

locally, the google-auth package is not installed at all.
Any ideas on how to debug where this requirement is coming from?
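
For reference, the comparison behind that "pip check" message can be sketched in a few lines of Python. This is a simplification and not pip's actual code: the helper names (parse_version, satisfies) are made up, and it only handles plain numeric versions with the basic comparison operators.

```python
import operator

# Two-character operators come first so "<=" is not mistaken for "<".
_OPERATORS = [
    ("<=", operator.le),
    (">=", operator.ge),
    ("==", operator.eq),
    ("!=", operator.ne),
    ("<", operator.lt),
    (">", operator.gt),
]


def parse_version(text):
    """Turn a purely numeric version such as '3.1.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def _padded(left, right):
    """Pad the shorter tuple with zeros so that 4.0 and 4.0.0 compare equal."""
    width = max(len(left), len(right))
    return left + (0,) * (width - len(left)), right + (0,) * (width - len(right))


def satisfies(version, specifier):
    """Check a version against a comma-separated specifier like '<4.1,>=3.1.4'."""
    installed = parse_version(version)
    for clause in specifier.split(","):
        clause = clause.strip()
        for symbol, compare in _OPERATORS:
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                left, right = _padded(installed, bound)
                if not compare(left, right):
                    return False
                break
        else:
            raise ValueError("unsupported specifier clause: %r" % clause)
    return True


if __name__ == "__main__":
    requirement = "<4.1,>=3.1.4"
    for candidate in ("4.0", "4.1"):
        verdict = "ok" if satisfies(candidate, requirement) else "incompatible"
        print("rsa %s against rsa%s: %s" % (candidate, requirement, verdict))
```

The upper bound is exclusive, so the installed rsa 4.1 fails the check even though 4.0 would pass.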




Kafka IO performance tests leaving behind disks on GCP

2020-06-04 Thread Udi Meiri
Hi,
I opened a bug on what seems to be leftover GKE disk images:
https://issues.apache.org/jira/browse/BEAM-10145

Can anyone familiar with these take a look?

Thanks!




Re: [RESULT][VOTE] Accept the Firefly design donation as Beam Mascot - Deadline Mon April 6

2020-06-04 Thread Udi Meiri
That's great!

Can someone please make custom emojis for our Slack from the expression
sheet images? :) (
https://github.com/apache/beam/blob/master/website/www/site/static/images/mascot/model_sheet.png
)
A smiling firefly and one with heart eyes would be awesome.

On Wed, Jun 3, 2020 at 11:04 AM Aizhamal Nurmamat kyzy 
wrote:

> Thank you Ismael for reviewing and merging the pull request.
>
> Now the mascot files can be found here
> https://beam.apache.org/community/mascot/ on the website and here
> https://github.com/apache/beam/tree/master/website/www/site/static/images/mascot
> in the repo.
>
> This project is complete now, therefore closing this thread.
>
> Thanks everyone!
>
> On Thu, May 21, 2020 at 7:07 PM Aizhamal Nurmamat kyzy <
> aizha...@apache.org> wrote:
>
>> Hello everyone,
>> Julian and I have created this pr[1] to create a page with the model
>> sheet, and links to the mascot image files. Can someone please review?
>> Thanks!
>> Aizhamal
>>
>> [1] https://github.com/apache/beam/pull/11780
>>
>> On Mon, May 11, 2020 at 10:14 AM Aizhamal Nurmamat kyzy <
>> aizha...@apache.org> wrote:
>>
>>> @Ismael, this is in my work items for as soon as we complete the
>>> migration of the website.
>>> @Kyle, thanks for filing Jira, I assigned it to myself.
>>>
>>> On Mon, May 11, 2020 at 10:03 AM Kyle Weaver 
>>> wrote:
>>>
 > Now that the vote has passed maybe we should add the images somewhere
 > in the website so people can easily find the Firefly to use it

 +1 Maybe something to revisit after the website overhaul is complete. I
 filed https://jira.apache.org/jira/browse/BEAM-9948 if anyone wants to
 take it.

 On Mon, May 11, 2020 at 12:57 PM Ismaël Mejía 
 wrote:

> Now that the vote has passed maybe we should add the images somewhere
> in the website so people can easily find the Firefly to use it.
> Something like what we do with our logos
> https://beam.apache.org/community/logos/
>
> WDYT? any taker?
>
> On Tue, Apr 28, 2020 at 7:43 PM Pablo Estrada 
> wrote:
> >
> > I'll be happy to as well!
> >
> > On Sun, Apr 26, 2020 at 4:18 AM Maximilian Michels 
> wrote:
> >>
> >> Hey Maria,
> >>
> >> I can testify :)
> >>
> >> Cheers,
> >> Max
> >>
> >> On 23.04.20 20:49, María Cruz wrote:
> >> > Hi everyone!
> >> > It is amazing to see how this process developed to collaboratively
> >> > create Apache Beam's mascot. Thank you to everyone who got involved!
> >> > I would like to write a blogpost for the Beam website, and I wanted to
> >> > ask you: would anyone like to offer their testimony about the process of
> >> > creating the Beam mascot, and what this means to you? Everyone's
> >> > testimony is welcome! If you witnessed the development of a mascot for
> >> > another open source project, even better =)
> >> >
> >> > Please feel free to express interest on this thread, and I'll reach out
> >> > to you off-list.
> >> >
> >> > Thanks,
> >> >
> >> > María
> >> >
> >> > On Fri, Apr 17, 2020 at 6:19 AM Jeff Klukas wrote:
> >> >
> >> > I personally like the sound of "Datum" as a name. I also like the
> >> > idea of not assigning them a gender.
> >> >
> >> > As a counterpoint on the naming side, one of the slide decks
> >> > provided while iterating on the design mentions:
> >> >
> >> > > Mascot can change colors when it is “full of data” or has a “batch
> >> > > of data” to process. Yellow is supercharged and ready to process!
> >> >
> >> > Based on that, I'd argue that the mascot maps to the concept of a
> >> > bundle in the beam execution model and we should consider a name
> >> > that's a play on "bundle" or perhaps a play on "checkpoint".
> >> >
> >> > On Thu, Apr 16, 2020 at 3:44 PM Julian Bruno
> >> > <juliangbr...@gmail.com> wrote:
> >> >
> >> > Hi all,
> >> >
> >> > While working on the design of our Mascot, some ideas showed up
> >> > and I wish to share them, in regard to Alex Van Boxel's question
> >> > about the name of our Mascot.
> >> >
> >> > I was thinking about this last night and feel it could be a
> >> > great idea to name the Mascot "*Data*" or "*Datum*". Both names
> >> > sound cute and make sense to me. I prefer the latter. Datum means
> >> > a single piece of information. The Mascot is the first piece of
> >> > information and its job is to collect batches of data and
> >> > process them. Datum is in charge of linking information together.
> >> >
> >> > In 

Re: latest release on Github

2020-06-01 Thread Udi Meiri
Thanks Nathan and Kyle!

On Mon, Jun 1, 2020 at 12:22 PM Kyle Weaver  wrote:

> Filed https://jira.apache.org/jira/browse/BEAM-10168
>
> On Mon, Jun 1, 2020 at 3:20 PM Kyle Weaver  wrote:
>
>> I fixed it. There was a discussion on the mailing list recently about
>> automating Github releases [1], short of that however we can just use the
>> UI.
>>
>> [1]
>> https://lists.apache.org/thread.html/re367d262333078501c47856e9b8a0fc3fd7db60c2d2ebb181275481a%40%3Cdev.beam.apache.org%3E
>>
>>
>> On Mon, Jun 1, 2020 at 3:07 PM Udi Meiri  wrote:
>>
>>> Hi,
>>> Trying out the new Github layout (feature preview) I've noticed that the
>>> latest release is featured more prominently and is at v2.16.0 for some
>>> reason. https://github.com/apache/beam/releases/tag/v2.16.0
>>>
>>> Any idea how to fix this, and what instructions should be added to the
>>> release guide?
>>>
>>




latest release on Github

2020-06-01 Thread Udi Meiri
Hi,
Trying out the new Github layout (feature preview) I've noticed that the
latest release is featured more prominently and is at v2.16.0 for some
reason. https://github.com/apache/beam/releases/tag/v2.16.0

Any idea how to fix this, and what instructions should be added to the
release guide?




Re: [ANNOUNCE] Beam 2.21.0 Released

2020-05-28 Thread Udi Meiri
Woohoo!

On Thu, May 28, 2020 at 4:16 AM Kyle Weaver  wrote:

> The Apache Beam team is pleased to announce the release of version 2.21.0.
>
> Apache Beam is an open source unified programming model to define and
> execute data processing pipelines, including ETL, batch and stream
> (continuous) processing. See https://beam.apache.org
>
> You can download the release here:
>
> https://beam.apache.org/get-started/downloads/
>
> This release includes bug fixes, features, and improvements detailed on
> the Beam blog: https://beam.apache.org/blog/beam-2.21.0/
>
> Thanks to everyone who contributed to this release, and we hope you enjoy
> using Beam 2.21.0.
> -- Kyle Weaver, on behalf of The Apache Beam team
>
>




Re: [ANNOUNCE] New committer: Robin Qiu

2020-05-19 Thread Udi Meiri
Congratulations Robin!

On Tue, May 19, 2020, 10:15 Valentyn Tymofieiev  wrote:

> Congratulations, Robin!
>
> On Tue, May 19, 2020 at 9:10 AM Yichi Zhang  wrote:
>
>> Congrats Robin!
>>
>> On Tue, May 19, 2020 at 8:56 AM Kamil Wasilewski <
>> kamil.wasilew...@polidea.com> wrote:
>>
>>> Congrats!
>>>
>>> On Tue, May 19, 2020 at 5:33 PM Jan Lukavský  wrote:
>>>
>>>> Congrats Robin!
>>>> On 5/19/20 5:01 PM, Tyson Hamilton wrote:
>>>>
>>>> Congratulations!
>>>>
>>>> On Tue, May 19, 2020 at 6:10 AM Omar Ismail wrote:
>>>>
>>>>> Congrats!
>>>>>
>>>>> On Tue, May 19, 2020 at 5:00 AM Gleb Kanterov wrote:
>>>>>
>>>>>> Congratulations!
>>>>>>
>>>>>> On Tue, May 19, 2020 at 7:31 AM Aizhamal Nurmamat kyzy <
>>>>>> aizha...@apache.org> wrote:
>>>>>>
>>>>>>> Congratulations, Robin! Thank you for your contributions!
>>>>>>>
>>>>>>> On Mon, May 18, 2020, 7:18 PM Boyuan Zhang wrote:
>>>>>>>
>>>>>>>> Congrats~~
>>>>>>>>
>>>>>>>> On Mon, May 18, 2020 at 7:17 PM Reza Rokni wrote:
>>>>>>>>
>>>>>>>>> Congratulations!
>>>>>>>>>
>>>>>>>>> On Tue, May 19, 2020 at 10:06 AM Ahmet Altay wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>>>>>>> committer: Robin Qiu.
>>>>>>>>>>
>>>>>>>>>> Robin has been active in the community for close to 2 years,
>>>>>>>>>> worked on HyperLogLog++ [1], SQL [2], improved documentation, and
>>>>>>>>>> helped with releases(*).
>>>>>>>>>>
>>>>>>>>>> In consideration of his contributions, the Beam PMC trusts him
>>>>>>>>>> with the responsibilities of a Beam committer [3].
>>>>>>>>>>
>>>>>>>>>> Thank you for your contributions Robin!
>>>>>>>>>>
>>>>>>>>>> -Ahmet, on behalf of the Apache Beam PMC
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://www.meetup.com/Zurich-Apache-Beam-Meetup/events/265529665/
>>>>>>>>>> [2]
>>>>>>>>>> https://www.meetup.com/Belgium-Apache-Beam-Meetup/events/264933301/
>>>>>>>>>> [3] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>>>>>> (*) And maybe he will be a release manager soon :)
>>
>> --
>
> Omar Ismail |  Technical Solutions Engineer |  omarism...@google.com |
>
>





Re: Beam 2.21 release update

2020-05-08 Thread Udi Meiri
+Chad Dombrova, who added _find_protoc_gen_mypy.

I'm guessing that the code
in _install_grpcio_tools_and_generate_proto_files creates a kind of
virtualenv, but it only works well for staging Python modules and not
binaries like protoc-gen-mypy.
(I assume there's a reason why it doesn't invoke virtualenv, probably since
the list of things setup.py can expect to be installed is very minimal
(setuptools).)

One solution would be to make these setup.py dependencies explicit in
pyproject.toml, such that pip installs them before running setup.py:
https://pip.pypa.io/en/stable/reference/pip/#pep-517-and-518-support
It would help when using tools like pip ("pip wheel"), but I'm not sure
what the alternative for "python setup.py sdist" is.
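
For context, the error quoted below comes from a search for the
protoc-gen-mypy executable in a list of bin directories. Here is a minimal
sketch of that kind of lookup, using only the standard library. The helper
name find_plugin is hypothetical and this is not the code in gen_protos.py;
I'm also assuming the executable is the one installed by mypy-protobuf.

```python
import os
import shutil


def find_plugin(name, extra_dirs=()):
    """Return the full path of an executable called `name`, or None.

    Searches `extra_dirs` first, then the directories on PATH.
    `find_plugin` is a hypothetical helper, not part of Beam.
    """
    search_path = os.pathsep.join(
        list(extra_dirs) + [os.environ.get("PATH", "")])
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    location = find_plugin("protoc-gen-mypy")
    if location is None:
        # Assumption: the script ships with the mypy-protobuf package.
        print("protoc-gen-mypy not found; is mypy-protobuf installed "
              "in the active environment?")
    else:
        print("found protoc-gen-mypy at", location)
```

Running this in the same environment that runs setup.py should show whether
the binary is visible there at all.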


On Thu, May 7, 2020 at 10:40 PM Thomas Weise  wrote:

> No additional stacktraces. Full error output below.
>
> It's not clear what is going wrong.
>
> There isn't any exception from the subprocess execution since the
> "WARNING:root:Installing grpcio-tools took 305.39 seconds." is printed.
>
> Also, the time it takes to perform the install is equivalent to
> successfully running the pip command.
>
> I will report back if I find anything else. Currently doing the
> explicit install via pip install -r sdks/python/build-requirements.txt
>
> Thanks,
> Thomas
>
> WARNING:root:Installing grpcio-tools took 269.27 seconds.
> INFO:gen_protos:Regenerating Python proto definitions (no output files).
> Process Process-1:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in
> _bootstrap
> self.run()
>   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
> self._target(*self._args, **self._kwargs)
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 378, in _install_grpcio_tools_and_generate_proto_files
> generate_proto_files(force=force)
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 315, in generate_proto_files
> protoc_gen_mypy = _find_protoc_gen_mypy()
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 233, in _find_protoc_gen_mypy
> (fname, ', '.join(search_paths)))
> RuntimeError: Could not find protoc-gen-mypy in /code/venvs/venv2/bin,
> /code/venvs/venv2/bin, /code/venvs/venv3/bin, /usr/local/sbin,
> /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin
> Traceback (most recent call last):
>   File "setup.py", line 311, in <module>
> 'mypy': generate_protos_first(mypy),
>   File
> "/code/venvs/venv2/local/lib/python2.7/site-packages/setuptools/__init__.py",
> line 129, in setup
> return distutils.core.setup(**attrs)
>   File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
> dist.run_commands()
>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
> self.run_command(cmd)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File
> "/code/venvs/venv2/local/lib/python2.7/site-packages/wheel/bdist_wheel.py",
> line 204, in run
> self.run_command('build')
>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> self.distribution.run_command(command)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
> self.run_command(cmd_name)
>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> self.distribution.run_command(command)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
>     cmd_obj.run()
>   File "setup.py", line 235, in run
> gen_protos.generate_proto_files()
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 310, in generate_proto_files
> raise ValueError("Proto generation failed (see log for details).")
> ValueError: Proto generation failed (see log for details).
>
>
> On Thu, May 7, 2020 at 2:25 PM Udi Meiri  wrote:
>
>> It's hard to say without more details what's going on. Ahmet you're right
>> that it installs build-requirements.txt and retries calling
>> generate_proto_files().
>>
>> Thomas, were there additional stacktraces? (after a "During handling of
>> the above exception, another exception occurred:" message?)
>>
>>
>> On Thu, May 7, 2020 at 11:59 AM Ahmet Altay  wrote:
>>
>>>
>>>
>>> On Thu, May

Re: Beam 2.21 release update

2020-05-07 Thread Udi Meiri
It's hard to say without more details what's going on. Ahmet you're right
that it installs build-requirements.txt and retries calling
generate_proto_files().

Thomas, were there additional stacktraces? (after a "During handling of the
above exception, another exception occurred:" message?)
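
For reference, a minimal Python 3 sketch of where that message comes from:
when a second exception is raised inside an except block, the traceback
prints both, separated by that line. The function names are made up for
illustration and are not the ones in gen_protos.py.

```python
import traceback


def generate():
    # Stand-in for a first attempt that fails (hypothetical).
    raise ImportError("No module named 'grpc_tools'")


def generate_with_fallback():
    try:
        generate()
    except ImportError:
        # Raising inside the handler chains the new exception to the
        # original one (implicit chaining, Python 3 only).
        raise RuntimeError("fallback failed")


try:
    generate_with_fallback()
except RuntimeError as exc:
    # Format the full chained traceback as a single string.
    report = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))
    print(report)
```

Note that Python 2 does not chain exceptions this way, so a Python 2 build
would only show the last traceback.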


On Thu, May 7, 2020 at 11:59 AM Ahmet Altay  wrote:

>
>
> On Thu, May 7, 2020 at 11:56 AM Thomas Weise  wrote:
>
>> Thanks Udi! This is the issue. I'm trying to upgrade from 2.18 where
>> build-requirements.txt didn't exist.
>>
>> Is there a reason why this cannot happen automatically when
>> running python3.6 setup.py sdist bdist_wheel ?
>>
>
> I _believe_ this should happen automatically here:
> https://github.com/apache/beam/blob/master/sdks/python/gen_protos.py#L365.
> Maybe there is a problem there?
>
>
>>
>> Thomas
>>
>>
>> On Thu, May 7, 2020 at 11:07 AM Udi Meiri  wrote:
>>
>>> Probably not the issue, but double checking: are you running "pip
>>> install -r sdks/python/build-requirements.txt" first?
>>>
>>> On Wed, May 6, 2020 at 7:22 PM Thomas Weise  wrote:
>>>
>>>> I'm working on rebasing our fork to 2.21.0 and ran into a problem
>>>> installing grpcio-tools that leads to *ModuleNotFoundError: No module
>>>> named 'grpc_tools'* (see details below).
>>>>
>>>> I cannot reproduce this locally.
>>>>
>>>> Any suggestions on what to look for?
>>>>
>>>> Thanks,
>>>> Thomas
>>>>
>>>> Building wheels for collected packages: future, gr
>>>>   Running setup.py bdist_wheel for future ...
>>>>   Stored in directory:
>>>> /root/.cache/pip/wheels/bf/c9/a3/c538d90ef17cf7823fa51fc701a7a7a910a80f6a405bf1
>>>>   Running setup.py bdist_wheel for grpcio ...
>>>>   Stored in directory:
>>>> /root/.cache/pip/wheels/00/4d/5f/07d0d4283911d2b917b867a11b1622d9d2cc8c286eefd1
>>>> Successfully built future grpcio
>>>> Installing collected packages: six, grpcio, setuptools, protobuf,
>>>> grpcio-tools, future, mypy-protobuf
>>>> Successfully installed future-0.16.0 grpcio-1.28.1 grpcio-tools-1.14.2
>>>> mypy-protobuf-1.18 protobuf-3.11.3 setuptools-46.1.3 six-1.14.0
>>>> WARNING:root:Installing grpcio-tools took 305.39 seconds.
>>>> INFO:gen_protos:Regenerating Python proto definitions (no output files).
>>>> Process Process-1:
>>>> Traceback (most recent call last):
>>>>   File
>>>> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
>>>> 292, in generate_proto_files
>>>> from grpc_tools import protoc
>>>> ModuleNotFoundError: No module named 'grpc_tools'
>>>>
>>>> On Fri, Apr 10, 2020 at 10:01 AM Kyle Weaver 
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> Just a heads up that the Beam 2.21 release branch [1] is cut.
>>>>> - If you find any important issues that you think should be addressed
>>>>> in the release, please tag the jira with fix version 2.21.0 and cc me
>>>>> (username `ibzib`).
>>>>> - Make sure to update the change log [2] with any significant changes
>>>>> if you haven't already. Send a PR with the change and tag me. (I imagine
>>>>> I'm not the only one who forgot to do this :).)
>>>>>
>>>>> Thanks,
>>>>> Kyle
>>>>>
>>>>> [1] https://github.com/apache/beam/blob/release-2.21.0
>>>>> [2] https://github.com/apache/beam/blob/master/CHANGES.md
>>>>>
>>>>




Re: Beam 2.21 release update

2020-05-07 Thread Udi Meiri
Probably not the issue, but double checking: are you running "pip install
-r sdks/python/build-requirements.txt" first?
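
A quick way to confirm whether the install took effect for the interpreter
that runs setup.py is to ask it directly. This is only a sketch using the
standard library; the helper name is_importable is hypothetical.

```python
import importlib.util


def is_importable(module_name):
    """True if the current interpreter can locate top-level `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Run this with the same python binary that runs setup.py.
    print("grpc_tools importable:", is_importable("grpc_tools"))
```

If this prints False right after a successful pip install, pip and setup.py
are probably using different environments.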

On Wed, May 6, 2020 at 7:22 PM Thomas Weise  wrote:

> I'm working on rebasing our fork to 2.21.0 and ran into a problem
> installing grpcio-tools that leads to *ModuleNotFoundError: No module
> named 'grpc_tools'* (see details below).
>
> I cannot reproduce this locally.
>
> Any suggestions on what to look for?
>
> Thanks,
> Thomas
>
> Building wheels for collected packages: future, gr
>   Running setup.py bdist_wheel for future ...
>   Stored in directory:
> /root/.cache/pip/wheels/bf/c9/a3/c538d90ef17cf7823fa51fc701a7a7a910a80f6a405bf1
>   Running setup.py bdist_wheel for grpcio ...
>   Stored in directory:
> /root/.cache/pip/wheels/00/4d/5f/07d0d4283911d2b917b867a11b1622d9d2cc8c286eefd1
> Successfully built future grpcio
> Installing collected packages: six, grpcio, setuptools, protobuf,
> grpcio-tools, future, mypy-protobuf
> Successfully installed future-0.16.0 grpcio-1.28.1 grpcio-tools-1.14.2
> mypy-protobuf-1.18 protobuf-3.11.3 setuptools-46.1.3 six-1.14.0
> WARNING:root:Installing grpcio-tools took 305.39 seconds.
> INFO:gen_protos:Regenerating Python proto definitions (no output files).
> Process Process-1:
> Traceback (most recent call last):
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 292, in generate_proto_files
> from grpc_tools import protoc
> ModuleNotFoundError: No module named 'grpc_tools'
>
> On Fri, Apr 10, 2020 at 10:01 AM Kyle Weaver  wrote:
>
>> Hi everyone,
>>
>> Just a heads up that the Beam 2.21 release branch [1] is cut.
>> - If you find any important issues that you think should be addressed in
>> the release, please tag the jira with fix version 2.21.0 and cc me
>> (username `ibzib`).
>> - Make sure to update the change log [2] with any significant changes if
>> you haven't already. Send a PR with the change and tag me. (I imagine I'm
>> not the only one who forgot to do this :).)
>>
>> Thanks,
>> Kyle
>>
>> [1] https://github.com/apache/beam/blob/release-2.21.0
>> [2] https://github.com/apache/beam/blob/master/CHANGES.md
>>
>




Re: Python 3.7 docker container fails to build

2020-04-30 Thread Udi Meiri
I summarized my idea here: https://issues.apache.org/jira/browse/BEAM-9865


On Thu, Apr 30, 2020 at 2:01 PM Maximilian Michels  wrote:

> On 30.04.20 21:48, Hannah Jiang wrote:
> > --info tag was passed to docker image build commands with PythonDocker
> > Precommit to capture more logs. Without the tag, errors from
> > DockerFile step are not printed out to the console.
>
> Thanks for the info (pun intended).
>
> On 30.04.20 21:48, Hannah Jiang wrote:
> > Indeed, I can see the no space left on device in the following but
> > not in the log above:
> >
> > --info tag was passed to docker image build commands with PythonDocker
> > Precommit to capture more logs. Without the tag, errors from DockerFile
> > step are not printed out to the console.
> >
> > On Thu, Apr 30, 2020 at 11:19 AM Udi Meiri <eh...@google.com> wrote:
> >
> > I checked node 8 and it had over 40GB space available. Does your job
> > require more than that?
> >
> > Long term, I'm thinking we could clean up workspaces for successful
> > jobs. This should free up additional space (I guess at least 100GB).
> > https://plugins.jenkins.io/ws-cleanup/ - we already use this plugin
> > to clean workspaces at job start.
> >
> >
> > On Thu, Apr 30, 2020, 07:33 Maximilian Michels <m...@apache.org> wrote:
> >
> > *It's working again, probably because it's running on a different
> > machine now.
> >
> > Who can check the disk space of the Jenkins hosts?
> >
> > Thanks,
> > Max
> >
> > On 30.04.20 11:55, Maximilian Michels wrote:
> > > Sorry, I meant to include the Jenkins log:
> > >
> >
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console
> > >
> > > Thanks for investigating Hannah! Indeed, I can see the no space left
> > > on device in the following but not in the log above:
> > >
> > > https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/473/console
> > >
> > > I'm going to try running the build again. Do you think we could add
> > > more storage to our Jenkins hosts or delete old build data?
> > >
> > > Thanks,
> > > Max
> > >
> > > On 30.04.20 08:43, Hannah Jiang wrote:
> > >> Max, I found a link from your PR and noticed the errors below.
> > >> This would be the true error.
> > >>
> > >> *07:57:03* >*Task :sdks:python:container:py37:docker*
> > >> *07:57:03*  ERROR: Could not install packages due to an EnvironmentError: [Errno 28] No space left on device
> > >> *07:57:03*
> > >> *07:57:03* >*Task :sdks:python:container:py35:docker*
> > >> *07:57:03*  ERROR: Could not install packages due to an EnvironmentError: [Errno 28] No space left on device
> > >>
> > >>
> > >>
> > >> On Wed, Apr 29, 2020 at 5:59 PM Hannah Jiang <hannahji...@google.com> wrote:
> > >>
> > >> There is a PythonDocker Precommit test running for PRs with Python
> > >> changes. It seems to be running well. [1]
> > >> Max, can you please give me a link so I can check more details? Do
> > >> other images with different Python versions fail as well?
> > >>
> > >>
> >  1.
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/
> > >>
> > >>
> > >> On Wed, Apr 29, 2020 at 2:44 PM Ahmet Altay <al...@google.com> wrote:
> > >>
> > >> +Valentyn Tymofieiev <valen...@google.com> +Hannah Jiang
> > >> <hannahji...@google.com> --

Re: Python 3.7 docker container fails to build

2020-04-30 Thread Udi Meiri
I checked node 8 and it had over 40GB space available. Does your job
require more than that?

Long term, I'm thinking we could clean up workspaces for successful jobs.
This should free up additional space (I guess at least 100GB).
https://plugins.jenkins.io/ws-cleanup/ - we already use this plugin to
clean workspaces at job start.
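
A job could also check free space itself before starting a large install.
Below is a minimal sketch using only the standard library. The function
names and any threshold are hypothetical, and this is illustrative only,
not a description of what our Jenkins jobs do.

```python
import shutil


def free_gib(path="/"):
    """Return free space on the filesystem containing `path`, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


def has_room(path, required_gib):
    """True if at least `required_gib` GiB are free at `path`."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    print("%.1f GiB free in the current directory" % free_gib("."))
```

A pre-build step could call has_room on the workspace and fail fast with a
clear message instead of surfacing Errno 28 in the middle of a pip install.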


On Thu, Apr 30, 2020, 07:33 Maximilian Michels  wrote:

> *It's working again, probably because it's running on a different
> machine now.
>
> Who can check the disk space of the Jenkins hosts?
>
> Thanks,
> Max
>
> On 30.04.20 11:55, Maximilian Michels wrote:
> > Sorry, I meant to include the Jenkins log:
> >
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console
> >
> > Thanks for investigating Hannah! Indeed, I can see the no space left on
> > device in the following but not in the log above:
> >
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/473/console
> >
> > I'm going to try running the build again. Do you think we could add more
> > storage to our Jenkins hosts or delete old build data?
> >
> > Thanks,
> > Max
> >
> > On 30.04.20 08:43, Hannah Jiang wrote:
> >> Max, I found a link from your PR and noticed below errors. This would be
> >> the true error.
> >>
> >> *07:57:03* >*Task :sdks:python:container:py37:docker*
> >> *07:57:03*  ERROR: Could not install packages due to an EnvironmentError: [Errno 28] No space left on device
> >> *07:57:03*
> >> *07:57:03* >*Task :sdks:python:container:py35:docker*
> >> *07:57:03*  ERROR: Could not install packages due to an EnvironmentError: [Errno 28] No space left on device
> >>
> >>
> >>
> >> On Wed, Apr 29, 2020 at 5:59 PM Hannah Jiang wrote:
> >>
> >> There is a PythonDocker Precommit test running for PRs with Python
> >> changes. It seems to be running well. [1]
> >> Max, can you please give me a link so I can check more details? Do
> >> other images with different Python versions fail as well?
> >>
> >> 1.
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/
> >>
> >>
> >> On Wed, Apr 29, 2020 at 2:44 PM Ahmet Altay wrote:
> >>
> >> +Valentyn Tymofieiev +Hannah Jiang -- in case they have relevant
> >> information.
> >>
> >> On Wed, Apr 29, 2020 at 12:35 PM Maximilian Michels <m...@apache.org> wrote:
> >>
> >> Hi,
> >>
> >> has anyone noticed the Python 3.7 Docker container fails to build? I
> >> haven't been able to build the Python 3.7 container, either locally
> >> or on Jenkins.
> >>
> >> I get:
> >>
> >> 17:48:10 > Task :sdks:python:container:py37:docker
> >> 17:49:36 The command '/bin/sh -c pip install -r
> >> /tmp/base_image_requirements.txt && python -c "from
> >> google.protobuf.internal import api_implementation; assert
> >> api_implementation._default_implementation_type == 'cpp'; print
> >> ('Verified fast protobuf used.')" && rm -rf /root/.cache/pip'
> >> returned a non-zero code: 1
> >> 17:49:36
> >> 17:49:36 > Task :sdks:python:container:py37:docker FAILED
> >>
> >>
> >> Cheers,
> >> Max
> >>
>




Jenkins test triggers

2020-04-28 Thread Udi Meiri
Hi,
Has anyone noticed any changes today (since ~6h ago) to how tests are
triggered on PRs?

Are they triggering always, some of the time, not at all?
Are phrase comment triggers working?

Thanks.
(background: https://issues.apache.org/jira/browse/INFRA-19836)




Re: How to submit PRs for dependant changes?

2020-04-28 Thread Udi Meiri
(a) or (c) should work. (c) is preferred if you want faster reviews.

For multiple JIRAs, I've seen both [BEAM-123,BEAM-456] and
[BEAM-123][BEAM-456] formats. One of them works but I'm not sure which. :D
You can always manually add a PR to a JIRA.
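
I don't know how the JIRA integration actually parses PR titles, so this is
only a sketch of a parser that would accept both formats. The regex and the
function name issue_keys are my own assumptions, and the example titles are
made up.

```python
import re

# Matches issue keys such as BEAM-123.
ISSUE_KEY = re.compile(r"\bBEAM-\d+\b")


def issue_keys(pr_title):
    """Return the BEAM issue keys mentioned in a PR title, in order."""
    return ISSUE_KEY.findall(pr_title)


if __name__ == "__main__":
    print(issue_keys("[BEAM-123,BEAM-456] Fix SpannerIO batching"))
    print(issue_keys("[BEAM-123][BEAM-456] Fix SpannerIO batching"))
```

Both calls yield the same two keys, so a parser written like this would not
care which bracket style is used.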



On Sun, Apr 26, 2020 at 2:49 PM Reuven Lax  wrote:

> For c), I don't think you need merge resolutions. You can submit each
> commit in a separate PR, and rebase your branch after each one.
>
> On Sun, Apr 26, 2020 at 10:25 AM Niel Markwick  wrote:
>
>>
>> Hey Beam devs...
>>
>> I have 4 changes to submit as PRs to fix 4 independent issues in the
>> io.gcp.SpannerIO class.
>>
>> The PRs are notionally independent, but will cause merge conflicts if
>> submitted separately, as the fix for each issue will change code related to
>> the fix for some of the others.
>>
>> How do you prefer the PRs to be submitted?
>>
>> a) one single PR with 4 sequential commits within it
>> b) one single PR with all changes squashed.
>> c) 4 separate conflicting PRs which will have to be merged separately,
>> and a merge conflict resolution after each one.
>>
>> a) is how it is in my repo.
>> b) would be easy, but less clear what the changes were for.
>> c) I guess would be clearest in the Beam changelog.
>>
>> If the answer is a) or b), how would I specify multiple JIRA tickets in
>> the PR title?
>>
>> Thanks!
>>
>> --
>> 
>> Niel Markwick
>> Cloud Solutions Architect
>> Google Belgium
>> ni...@google.com
>> +32 2 894 6771
>>
>>
>> Google Belgium NV/SA, Steenweg op Etterbeek 180, 1040 Brussel, Belgie. RPR: 
>> 0878.065.378
>>
>> If you have received this communication by mistake, please don't forward
>> it to anyone else (it may contain confidential or privileged information),
>> please erase all copies of it, including all attachments, and please let
>> the sender know it went to the wrong person. Thanks
>>
>




Re: Jenkins jobs not running for my PR 10438

2020-04-28 Thread Udi Meiri
Alexey, what you're doing should be working (commits should trigger tests,
as should "retest this please" and other phrases).

https://issues.apache.org/jira/browse/INFRA-19836 tracks this issue

On Tue, Apr 28, 2020 at 10:04 AM Alexey Romanenko 
wrote:

> Does anyone know the “golden rule” how to trigger Jenkins tests?
>
> For example:
> https://github.com/apache/beam/pull/11341
> I tried several times and it’s still not triggered.
>
> On 28 Apr 2020, at 13:33, Ismaël Mejía  wrote:
>
> done
>
> On Tue, Apr 28, 2020 at 12:47 PM Shoaib Zafar <
> shoaib.za...@venturedive.com> wrote:
>
>> Hello Beam Committers,
>>
>> I would appreciate if you could trigger precommit checks for the PR:
>> https://github.com/apache/beam/pull/11210 along with the python
>> post-commit check (Run Python 3.5 PostCommit).
>>
>> Thanks and Regards.
>>
>> *Shoaib Zafar*
>> Software Engineering Lead
>> Mobile: +92 333 274 6242
>> Skype: live:shoaibzafar_1
>>
>> 
>>
>>
>> On Wed, Apr 22, 2020 at 9:40 PM Rehman Murad Ali <
>> rehman.murad...@venturedive.com> wrote:
>>
>>> Hello Beam Committers.
>>>
>>> Would you please trigger basic tests as well as all *validatesRunner*
>>> test on this PR:
>>> https://github.com/apache/beam/pull/11154 
>>> 
>>>
>>>
>>> *Thanks & Regards*
>>>
>>>
>>>
>>> *Rehman Murad Ali*
>>> Software Engineer
>>> Mobile: +92 3452076766 <+92%20345%202076766>
>>> Skype: rehman.muradali
>>>
>>>
>>> On Wed, Apr 22, 2020 at 9:25 PM Yoshiki Obata 
>>> wrote:
>>>
>>>> Hello Beam Committers,
>>>>
>>>> I would appreciate it if you could trigger precommit checks for these PRs:
>>>> https://github.com/apache/beam/pull/11493
>>>> https://github.com/apache/beam/pull/11494
>>>>
>>>> Regards
>>>> yoshiki
>>>>
>>>> On Tue, Apr 21, 2020 at 1:11 Luke Cwik wrote:

> The precommits started and I provided the comments for the postcommits
> as you have requested but they have yet to start.
>
> On Mon, Apr 20, 2020 at 8:31 AM Shoaib Zafar <
> shoaib.za...@venturedive.com> wrote:
>
>> Hello Beam Committers.
>>
>> Would you please trigger the pre-commit checks on the PR:
>> https://github.com/apache/beam/pull/11210 along with the python
>> post-commit checks (Run Python PostCommit, Run Python 3.5 PostCommit)?
>>
>> Thanks! Regards,
>>
>> *Shoaib Zafar*
>> Software Engineering Lead
>> Mobile: +92 333 274 6242
>> Skype: live:shoaibzafar_1
>>
>> 
>>
>>
>> On Fri, Apr 17, 2020 at 1:19 PM Ismaël Mejía 
>> wrote:
>>
>>> done
>>>
>>> On Thu, Apr 16, 2020 at 4:32 PM Rehman Murad Ali <
>>> rehman.murad...@venturedive.com> wrote:
>>>
 Hello Beam Committers.

 Would you please trigger basic tests as well as validatesRunner
 test on this PR:

 
 https://github.com/apache/beam/pull/11350


 *Thanks & Regards*



 *Rehman Murad Ali*
 Software Engineer
 Mobile: +92 3452076766 <+92%20345%202076766>
 Skype: rehman.muradali


 On Mon, Apr 13, 2020 at 10:16 PM Ahmet Altay 
 wrote:

> Done.
>
> On Mon, Apr 13, 2020 at 8:52 AM Shoaib Zafar <
> shoaib.za...@venturedive.com> wrote:
>
>> Hello Beam Committers.
>>
>> Would you please trigger the pre-commit checks on the PR:
>> https://github.com/apache/beam/pull/11210 along with the python
>> post-commit checks (Run Python PostCommit, Run Python 3.5 
>> PostCommit)?
>>
>> Thanks!
>>
>> *Shoaib Zafar*
>> Software Engineering Lead
>> Mobile: +92 333 274 6242
>> Skype: live:shoaibzafar_1
>>
>> 
>>
>>
>> On Mon, Apr 13, 2020 at 4:00 PM Ismaël Mejía 
>> wrote:
>>
>>> done
>>>
>>> On Mon, Apr 13, 2020 at 12:42 PM Rehman Murad Ali
>>>  wrote:
>>> >
>>> > Hi Beam Committers!
>>> >
>>> > Thanks( Ismael )
>>> >
>>> > I appreciate if someone could trigger these tests on this PR
>>> https://github.com/apache/beam/pull/11154
>>> >
>>> > run dataflow validatesrunner
>>> > run flink validatesrunner
>>> > Run Java Flink PortableValidatesRunner Streaming
>>> >
>>> > Thanks
>>> >
>>> >
>>> >
>>> > Rehman Murad Ali
>>> > Software Engineer
>>> > Mobile: +92 3452076766 <+92%20345%202076766>
>>> > Skype: rehman.muradali
>>> >
>>> >
>>> >
>>> > On Wed, Apr 1, 2020 at 1:19 PM Ismaël Mejía 
>>> wrote:
>>> >>
>>> >> done
>>> >>
>>> >> On Wed, 

Re: Jira PR links not being generated?

2020-04-27 Thread Udi Meiri
I think it's a recent change. The page https://s.apache.org/asfyaml-notify
was updated last week but I didn't see an announcement.

On Mon, Apr 27, 2020 at 3:34 PM Kyle Weaver  wrote:

> I made a PR for this, though I still haven't found sufficient explanation
> as to why we did not need this file last week, and now we do this week.
> https://github.com/apache/beam/pull/11541
>
> On Mon, Apr 27, 2020 at 6:24 PM Udi Meiri  wrote:
>
>> We had such a file for a short while but it was removed:
>> https://github.com/apache/beam/pull/10645
>> I don't believe it contained any PR link settings though
>> +Pablo Estrada 
>>
>> On Mon, Apr 27, 2020 at 1:56 PM Kyle Weaver  wrote:
>>
>>> I went ahead and filed https://issues.apache.org/jira/browse/BEAM-9833 since
>>> it looks like this is how things will be done from now on. Which raises the
>>> question, does anyone know how Beam managed these settings before? Or were
>>> there previously no project-level controls?
>>>
>>> On Mon, Apr 27, 2020 at 4:39 PM Kyle Weaver  wrote:
>>>
>>>> Thanks for the pointer Kenn. I searched existing INFRA issues and found
>>>> [1] (among others). Looks like we may need to add a .asf.yaml file [2]. I
>>>> guess infra must have changed this recently without us picking up on it?
>>>>
>>>> [1] https://issues.apache.org/jira/browse/INFRA-20171
>>>> [2]
>>>> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Notificationsettingsforrepositories
>>>>
>>>> On Mon, Apr 27, 2020 at 4:25 PM Kenneth Knowles 
>>>> wrote:
>>>>
>>>>> I suggest filing an issue with INFRA.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Fri, Apr 24, 2020 at 10:12 AM Kyle Weaver 
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I've noticed links from Jira issues to related Github PRs have not
>>>>>> been generated the past few days. Does anyone know why?
>>>>>>
>>>>>> Kyle
>>>>>>
>>>>>




Re: JIRA Committer Permissions

2020-04-27 Thread Udi Meiri
Should this step be added to our new committer guide?

On Fri, Apr 24, 2020 at 6:21 PM Luke Cwik  wrote:

> I noticed that several committers only had contributor level permissions
> and I went and updated your account permissions for the Beam project to be
> committer level. Feel free to let me know if you run into any issues.
>
> There were about 25 accounts like this.
>




Re: Jira PR links not being generated?

2020-04-27 Thread Udi Meiri
We had such a file for a short while but it was removed:
https://github.com/apache/beam/pull/10645
I don't believe it contained any PR link settings though
+Pablo Estrada 

On Mon, Apr 27, 2020 at 1:56 PM Kyle Weaver  wrote:

> I went ahead and filed https://issues.apache.org/jira/browse/BEAM-9833 since
> it looks like this is how things will be done from now on. Which raises the
> question, does anyone know how Beam managed these settings before? Or were
> there previously no project-level controls?
>
> On Mon, Apr 27, 2020 at 4:39 PM Kyle Weaver  wrote:
>
>> Thanks for the pointer Kenn. I searched existing INFRA issues and found
>> [1] (among others). Looks like we may need to add a .asf.yaml file [2]. I
>> guess infra must have changed this recently without us picking up on it?
>> 
>>
>> [1] https://issues.apache.org/jira/browse/INFRA-20171
>> [2]
>> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Notificationsettingsforrepositories
>>
>> On Mon, Apr 27, 2020 at 4:25 PM Kenneth Knowles  wrote:
>>
>>> I suggest filing an issue with INFRA.
>>>
>>> Kenn
>>>
>>> On Fri, Apr 24, 2020 at 10:12 AM Kyle Weaver 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I've noticed links from Jira issues to related Github PRs have not been
>>>> generated the past few days. Does anyone know why?
>>>>
>>>> Kyle
>>>>
>>>




Re: [ANNOUNCE] Beam 2.20.0 Released

2020-04-24 Thread Udi Meiri
You'll need to add --tags

On Fri, Apr 24, 2020 at 11:53 AM Jan Lukavský  wrote:

> Hm, that is strange:
>
> ~/git/apache/beam$ git status
> On branch master
> Your branch is up to date with 'origin/master'.
> ~/git/apache/beam$ git pull
> Already up to date.
> ~/git/apache/beam$ git tag | grep v2.19.0
> v2.19.0
> v2.19.0-RC1
> ~/git/apache/beam$ git tag | grep v2.20.0
> ~/git/apache/beam$
>
> I'm obviously missing something.
>
> Jan
> On 4/24/20 7:01 PM, Thomas Weise wrote:
>
> Here is the release tag:
> https://github.com/apache/beam/releases/tag/v2.20.0
>
>
> On Fri, Apr 24, 2020 at 9:28 AM Kyle Weaver  wrote:
>
>> > Is it possible we are missing a git tag for this release? I cannot find
>> > it.
>>
>> You mean https://github.com/apache/beam/tree/release-2.20.0?
>>
>> On Fri, Apr 24, 2020 at 9:04 AM Jan Lukavský  wrote:
>>
>>> Hi Rui,
>>>
>>> thanks for making this release! Is it possible we are missing a git tag
>>> for this release? I cannot find it.
>>>
>>> Thanks,
>>>
>>>  Jan
>>> On 4/16/20 8:47 PM, Rui Wang wrote:
>>>
>>> Note that due to a bug on infrastructure, the website change failed to
>>> publish. But 2.20.0 artifacts are available to use right now.
>>>
>>>
>>>
>>> -Rui
>>>
>>> On Thu, Apr 16, 2020 at 11:45 AM Rui Wang  wrote:
>>>
>>>> The Apache Beam team is pleased to announce the release of version 2.20.0.
>>>>
>>>> Apache Beam is an open source unified programming model to define and
>>>> execute data processing pipelines, including ETL, batch and stream
>>>> (continuous) processing. See https://beam.apache.org
>>>>
>>>> You can download the release here:
>>>>
>>>> https://beam.apache.org/get-started/downloads/
>>>>
>>>> This release includes bug fixes, features, and improvements detailed on
>>>> the Beam blog: https://beam.apache.org/blog/2020/04/15/beam-2.20.0.html
>>>>
>>>> Thanks to everyone who contributed to this release, and we hope you
>>>> enjoy using Beam 2.20.0.
>>>> -- Rui Wang, on behalf of The Apache Beam team
>>>>
>>>




Re: Website publish jobs fail recently

2020-04-16 Thread Udi Meiri
I did get a response but I didn't see it because I'm not subscribed to
builds@.

This is the original announcement of the breaking change:

https://lists.apache.org/thread.html/r00c669dd82bbde47958e81ecb330116de131d774e2d4df26a06fe92f@%3Cbuilds.apache.org%3E


On Thu, Apr 16, 2020 at 11:14 AM Udi Meiri  wrote:

> I emailed them yes
>
> On Thu, Apr 16, 2020, 11:09 Ahmet Altay  wrote:
>
>> Did we end up emailing builds@ or should this be filed as a ticket to
>> infra ?
>>
>> +Rui Wang  -- This is preventing Rui from updating
>> the website post 2.20 release.
>>
>> On Tue, Apr 14, 2020 at 9:01 PM Kenneth Knowles  wrote:
>>
>>> Indeed, publish jobs have write access to things, which normal builds do
>>> not.
>>>
>>> I suggest reaching out to bui...@apache.org
>>>
>>> Kenn
>>>
>>> On Tue, Apr 14, 2020 at 5:58 PM Udi Meiri  wrote:
>>>
>>>> Hey, I was looking at this today but could not figure it out.
>>>> The machines we run the publish jobs on probably differ from our regular
>>>> apache-beam-testing Jenkins ones.
>>>> I tried researching all the reasons why this might be happening but
>>>> came up empty.
>>>>
>>>> On Tue, Apr 14, 2020 at 10:19 AM Kyle Weaver 
>>>> wrote:
>>>>
>>>>> I think Udi is fixing it. Jira:
>>>>> https://issues.apache.org/jira/browse/BEAM-9737
>>>>>
>>>>> On Tue, Apr 14, 2020 at 1:11 PM Mikhail Gryzykhin 
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Has anyone seen the following error in the website publish job?
>>>>>> <https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Website_Publish/6018/console>
>>>>>>
>>>>>> *16:33:47* jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir -
>>>>>> /repo/build/website/generated-local-content/security
>>>>>>
>>>>>> I tried the failed target locally and it succeeded. Seems there's
>>>>>> some issue with jenkins configuration.
>>>>>>
>>>>>> Regards,
>>>>>> Mikhail.
>>>>>>
>>>>>>
>>>>>>
>>>>>>




Re: Website publish jobs fail recently

2020-04-16 Thread Udi Meiri
I emailed them yes

On Thu, Apr 16, 2020, 11:09 Ahmet Altay  wrote:

> Did we end up emailing builds@ or should this be filed as a ticket to
> infra ?
>
> +Rui Wang  -- This is preventing Rui from updating the
> website post 2.20 release.
>
> On Tue, Apr 14, 2020 at 9:01 PM Kenneth Knowles  wrote:
>
>> Indeed, publish jobs have write access to things, which normal builds do
>> not.
>>
>> I suggest reaching out to bui...@apache.org
>>
>> Kenn
>>
>> On Tue, Apr 14, 2020 at 5:58 PM Udi Meiri  wrote:
>>
>>> Hey, I was looking at this today but could not figure it out.
>>> The machines we run the publish jobs on probably differ from our regular
>>> apache-beam-testing Jenkins ones.
>>> I tried researching all the reasons why this might be happening but came
>>> up empty.
>>>
>>> On Tue, Apr 14, 2020 at 10:19 AM Kyle Weaver 
>>> wrote:
>>>
>>>> I think Udi is fixing it. Jira:
>>>> https://issues.apache.org/jira/browse/BEAM-9737
>>>>
>>>> On Tue, Apr 14, 2020 at 1:11 PM Mikhail Gryzykhin 
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Has anyone seen the following error in the website publish job?
>>>>> <https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Website_Publish/6018/console>
>>>>>
>>>>> *16:33:47* jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir -
>>>>> /repo/build/website/generated-local-content/security
>>>>>
>>>>> I tried the failed target locally and it succeeded. Seems there's some
>>>>> issue with jenkins configuration.
>>>>>
>>>>> Regards,
>>>>> Mikhail.
>>>>>
>>>>>
>>>>>
>>>>>




Re: sdks:java:container:generateThirdPartyLicenses effect on build time / stability

2020-04-15 Thread Udi Meiri
If this process is used in releases, we would benefit from running it
regularly to make sure it isn't broken. A broken job would delay releases
(and add work for the release manager).
Does it make sense to put it in postcommit?

On Wed, Apr 15, 2020 at 2:30 PM Kyle Weaver  wrote:

> Looks like the same error as this Jira:
> https://issues.apache.org/jira/browse/BEAM-9764
>
> Even if/when we are able to fix this particular issue, I agree it is best
> not to run this job except for releases because of the inherent network
> cost and possible reliability issues. +Hannah Jiang
>  What do you think?
>
> On Wed, Apr 15, 2020 at 5:20 PM Thomas Weise  wrote:
>
>> The new feature to assemble licenses is very useful but appears to add
>> several minutes (7-8?) of build time to jobs that need to build a container.
>>
>> Does it also seem to cause occasional build failures?
>>
>> https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Phrase/131/
>>
>> Would it be possible to perform this task only during release builds?
>>
>> Thanks,
>> Thomas
>>
>>




Re: Website publish jobs fail recently

2020-04-14 Thread Udi Meiri
Hey, I was looking at this today but could not figure it out.
The machines we run the publish jobs on probably differ from our regular
apache-beam-testing Jenkins ones.
I tried researching all the reasons why this might be happening but came up
empty.

On Tue, Apr 14, 2020 at 10:19 AM Kyle Weaver  wrote:

> I think Udi is fixing it. Jira:
> https://issues.apache.org/jira/browse/BEAM-9737
>
> On Tue, Apr 14, 2020 at 1:11 PM Mikhail Gryzykhin 
> wrote:
>
>> Hi all,
>>
>> Has anyone seen the following error in the website publish job?
>> 
>>
>> *16:33:47* jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir -
>> /repo/build/website/generated-local-content/security
>>
>> I tried the failed target locally and it succeeded. Seems there's some
>> issue with jenkins configuration.
>>
>> Regards,
>> Mikhail.
>>
>>
>>
>>




Re: Implementing type hints on multi-output PTransforms

2020-04-13 Thread Udi Meiri
On Tue, Mar 31, 2020 at 10:16 AM Joshua B. Harrison 
wrote:

> Ok - that makes sense. My specific workaround was to remove the
> with_output_types for now, so advising the user on this in the error
> message would be nice. I was just worried about silently passing.
>
> As for the formalization:
>
>1. I am a little confused on how this is different than passing
>multiple tagged inputs to a PTransform that does a CoGroupBy*. In this
>case, with_input_types seems to expect a union of all the types for the
>keyed values. Why would the same not work for output types?
>
> Since the outputs are distinct PCollections (as in the example
in BEAM-4132) they each have their own element type. If one of these
PCollections is then passed as an input to a transform, our type checking
is more precise if we use the type of just that pcoll instead of the union
of all pcoll types returned.
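
To make the precision argument concrete, here is a minimal sketch in plain
Python. It is not Beam's type-checking code: the tag names ("main",
"errors") and the accepts() helper are made up for illustration. It only
shows why a consumer that handles ints can be validated against the hint of
one output but not against the union of all outputs.

```python
from typing import Union, get_args

# Hypothetical per-output hints: one element type per tagged PCollection.
OUTPUT_TYPES = {"main": int, "errors": str}

# The alternative: a single union covering every output.
UNION_TYPE = Union[int, str]


def accepts(expected, produced):
    """Return True if a consumer expecting `expected` may receive `produced`.

    `produced` is either a plain class or a typing.Union of classes.
    This helper is an illustration only, not a Beam API.
    """
    candidates = get_args(produced) or (produced,)
    # Every type the producer may emit has to be acceptable to the consumer.
    return all(issubclass(candidate, expected) for candidate in candidates)


# A downstream transform that only handles ints, fed from the 'main' output.
per_output_ok = accepts(int, OUTPUT_TYPES["main"])
union_ok = accepts(int, UNION_TYPE)
```

With the per-output hint the check passes; with the union it fails, because
the checker can no longer rule out a str reaching the int-only consumer.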

>
>   2. What is the process for proposing a formalized solution? Should I
>start a document, or does one already exist? Or does this kind of thing get
>tracked via Jira issues?
>
> In this case I think an email titled "[PROPOSAL] ..." to this mailing list
describing what you want to change should be enough. A document could also
work; I'm not aware of one that touches on this.
Your PR would need to have an associated JIRA (feel free to take over any
of the ones I've mentioned).


> Best,
> Joshua
>
> On Tue, Mar 31, 2020 at 11:07 AM Udi Meiri  wrote:
>
>> Hi Joshua,
>> I've been working on type hints recently.
>> Your issue is similar to this:
>> https://issues.apache.org/jira/browse/BEAM-8782 (exactly the same if
>> tags are passed to with_outputs() in the example).
>> There's also this related bug about type inference:
>> https://issues.apache.org/jira/browse/BEAM-4132
>>
>> I agree with Luke that it would be helpful to point to a workaround in
>> the error message (such as removing with_output_types).
>>
>> From what I remember, we'll need to formalize how multi-output type hints
>> are provided to Beam.
>> For example, by passing keywords to with_output_types: main=type,
>> TAG=type, etc.
>>
>> On Tue, Mar 31, 2020 at 9:55 AM Luke Cwik  wrote:
>>
>>> I can see that argument but what does a user need to do in this case if
>>> we raise NotImplementedError? Would they need to disable type checking
>>> everywhere?
>>>
>>> Over the long term users will need to deal with improvements to type
>>> checking and will need to fix typing errors when they change Apache Beam
>>> versions.
>>>
>>>
>>> On Tue, Mar 31, 2020 at 9:34 AM Joshua B. Harrison <
>>> josh.harri...@gmail.com> wrote:
>>>
>>>> The current code errors out with a cryptic message around tag types in
>>>> the multi-output. Adding a NotImplementedError was just an attempt to make
>>>> the failure reason more clear.
>>>>
>>>> I would be worried about trivially passing because then the user might
>>>> think they have type checking safety when they don't, which could cause
>>>> failures at later stages and might be hard to debug. Do you agree?
>>>>
>>>> Best,
>>>> Joshua
>>>>
>>>> On Tue, Mar 31, 2020 at 10:16 AM Luke Cwik  wrote:
>>>>
>>>>> Would the NotImplementedError cause users' pipelines to fail, or is that a
>>>>> signal to the type checking mechanism to ignore it?
>>>>> If this would cause failures I would rather make the unsupported case
>>>>> return something that would be trivially true.
>>>>>
>>>>> On Mon, Mar 30, 2020 at 12:01 PM Joshua B. Harrison <
>>>>> josh.harri...@gmail.com> wrote:
>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> I brought up an issue recently on the user forums noting issues
>>>>>> around type hints and multi-output PTransforms:
>>>>>> https://lists.apache.org/thread.html/r94bf2e43f09a290dbe87d5a8d7eedb34ea215e0bea861521cbdb0c1c%40%3Cuser.beam.apache.org%3E
>>>>>>
>>>>>> As mentioned there, I think that a NotImplementedError should be
>>>>>> raised when attaching type hints to multi-output PTransforms while the
>>>>>> correct implementation is figured out. And that a 'correct'
>>>>>> implementation would look something like the Union typehints that are
>>>>>> expected on multi-input PTransforms.
>>>>>>
>>>>>> I am happy to help out and wanted to get the discussion started
>>>>>> around what the community would like to see here. Thank you all for a
>>>>>> great product.
>>>>>>
>>>>>> Best,
>>>>>> Joshua
>>>>>>
>>>>>> --
>>>>>> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>>>>>>  |  404-433-0242 <(404)%20433-0242>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>>>>  |  404-433-0242 <(404)%20433-0242>
>>>>
>>>
>
> --
> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>  |  404-433-0242 <(404)%20433-0242>
>




Re: [DISCUSS] Let's establish a guideline for using Python type annotations in Beam codebase

2020-04-13 Thread Udi Meiri
I agree with Robert to only put Any where it really can be any type.

I'm not sure how much typing we should add. At minimum: external APIs and
wherever mypy complains.
Ideally I would like to have annotations everywhere, because this reduces
uncertainty when modifying existing code.
You are increasing test coverage (e.g. via mypy) when you add type
annotations.

OTOH, the tradeoff is that adding types changes structural (duck) typing to
nominal.
To keep using structural typing you can define a custom Protocol type,
which slightly increases code complexity and may be a barrier to entry for
new developers.
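
As a rough illustration of the Protocol approach, here is a small sketch
using only the standard library (typing.Protocol, Python 3.8+). The class
and function names are made up for the example and do not come from the
Beam codebase.

```python
from typing import Iterable, Protocol


class SupportsClose(Protocol):
    """Structural type: anything with a close() method matches."""

    def close(self) -> None:
        ...


class FileLike:
    """Does not inherit from SupportsClose, yet satisfies it structurally."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def close_all(resources: Iterable[SupportsClose]) -> int:
    """Close every resource and return how many were closed."""
    count = 0
    for resource in resources:
        resource.close()
        count += 1
    return count


files = [FileLike(), FileLike()]
closed_count = close_all(files)
```

A type checker such as mypy accepts FileLike here without any inheritance,
so duck typing is kept at the price of one extra class definition.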


On Mon, Apr 13, 2020 at 12:11 PM Robert Bradshaw 
wrote:

> On Mon, Apr 13, 2020 at 11:48 AM Valentyn Tymofieiev 
> wrote:
>
>>
>> On Mon, Apr 13, 2020 at 10:53 AM Robert Bradshaw 
>> wrote:
>>
>>> On Mon, Apr 13, 2020 at 10:38 AM Valentyn Tymofieiev <
>>> valen...@google.com> wrote:
>>>
>>>> To clarify, I don't suggest that every variable should have a defined
>>>> type that doesn't change. However, I'd like to establish a culture where we
>>>> consistently add type annotations when we write new code. Where type is
>>>> defined gradually, we can use flexible annotations: "# type: (Any) -> Any"
>>>> or something like that.
>>>>
>>>
>>> IMHO we should use such annotations when the inputs/outputs are truly
>>> Any, or at least a wide enough variety of types that it's not worth the
>>> effort to be more explicit.
>>>
>> Agreed.
>>
>>>
>>>
>>>> We can argue that there is a point of diminishing returns as well, and
>>>> this is a valid point too. A possible tradeoff may be to
>>>> require annotations, docstrings or both in *most* functions/methods.
>>>> Possible definition of 'most' - all functions unless they meet three of the
>>>> following criteria[1]:
>>>> - not externally visible
>>>> - very short
>>>> - obvious
>>>>
>>>> [1]
>>>> http://google.github.io/styleguide/pyguide.html#383-functions-and-methods
>>>>
>>>
>>> While I'd be open to this, let's get the type checkers enabled in
>>> presubmit and see what it takes to keep those happy before establishing
>>> more strict criteria.
>>>
>> That's reasonable, thanks for your feedback. Is there a JIRA issue
>> tracking this effort?
>>
>
> https://issues.apache.org/jira/browse/BEAM-7746
>
>
>>> (It does sound like we have consensus on using type comments until 2.7 is
>>> dropped.)
>>>
>>>
>>>> On Fri, Apr 10, 2020 at 4:56 PM Robert Bradshaw
>>>> wrote:
>>>>
>>>>> On Fri, Apr 10, 2020 at 4:00 PM Valentyn Tymofieiev <
>>>>> valen...@google.com> wrote:
>>>>>
>>>>>> My preference is also for type-comments for now.
>>>>>>
>>>>>> Is it possible to configure the type checkers that we use to require
>>>>>> type-comments in new code?
>>>>>>
>>>>>
>>>>> My personal opinion is that there comes a point where there's
>>>>> diminishing return on explicitly typing everything (there's a reason
>>>>> people choose Python over Java) which is one of the big selling points
>>>>> of gradual typing, but before we can consider this the first step is to
>>>>> simply enable the type checkers on presubmit (IIRC we're really close).
>>>>>
>>>>>
>>>>>> On Fri, Apr 10, 2020 at 1:46 PM Robert Bradshaw
>>>>>> wrote:
>>>>>>
>>>>>>> I prefer type-comments, as they can be validated by type checkers.
>>>>>>> Once we drop 2.7, we can go with actual type annotations (and the
>>>>>>> comments can be automatically converted over).
>>>>>>>
>>>>>>> On Fri, Apr 10, 2020 at 11:17 AM Valentyn Tymofieiev <
>>>>>>> valen...@google.com> wrote:
>>>>>>>
>>>>>>>> I am seeing several styles we use to annotate non-pipeline code in
>>>>>>>> Beam codebase:
>>>>>>>>
>>>>>>>> - informal docstring comments:
>>>>>>>> file_pattern (str): the file glob to read,
>>>>>>>> assign_context: Instance of AssignContext,
>>>>>>>> - type comments like # type: (...) -> iobase.RestrictionTracker
>>>>>>>> - pydoc-style annotation: A :class:`PTransform` object .
>>>>>>>>
>>>>>>>> It may be a good idea to create a guideline which style to use
>>>>>>>> when, that we can point at in code reviews, and be more consistent.
>>>>>>>>
>>>>>>>> Please suggest your opinions and preferences.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>
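
For reference, the two styles discussed in this thread look like this side
by side. This is a minimal sketch with made-up function names: the first
form is the PEP 484 type comment, which stays valid in Python 2.7 source,
and the second is the equivalent Python 3 annotation it could be converted
to later.

```python
from typing import Any, Iterable, List


def to_strings(values):
    # type: (Iterable[Any]) -> List[str]
    """Type-comment style: checked by mypy, invisible at runtime."""
    return [str(value) for value in values]


def to_strings_annotated(values: Iterable[Any]) -> List[str]:
    """Annotation style: same contract, Python 3 only."""
    return [str(value) for value in values]
```

Both forms carry the same information for a type checker. Only the
annotated one populates __annotations__ at runtime.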




Re: Implementing type hints on multi-output PTransforms

2020-03-31 Thread Udi Meiri
Hi Joshua,
I've been working on type hints recently.
Your issue is similar to this:
https://issues.apache.org/jira/browse/BEAM-8782 (exactly the same if tags
are passed to with_outputs() in the example).
There's also this related bug about type inference:
https://issues.apache.org/jira/browse/BEAM-4132

I agree with Luke that it would be helpful to point to a workaround in the
error message (such as removing with_output_types).

From what I remember, we'll need to formalize how multi-output type hints
are provided to Beam.
For example, by passing keywords to with_output_types: main=type, TAG=type,
etc.
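
A rough sketch of what collecting such keyword hints could look like, in
plain Python. This is purely hypothetical: collect_output_hints() is not a
Beam function and with_output_types does not accept these keywords today.
It only illustrates the shape of the idea (one hint per output tag).

```python
def collect_output_hints(main=None, **tagged):
    """Hypothetical helper, not a Beam API: gather per-output type hints.

    `main` is the hint for the untagged output; every other keyword
    names a tagged output and maps it to its element type.
    """
    hints = dict(tagged)
    if main is not None:
        hints["main"] = main
    return hints


# One hint for the main output and one for a made-up 'errors' tag.
hints = collect_output_hints(main=int, errors=str)
```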

On Tue, Mar 31, 2020 at 9:55 AM Luke Cwik  wrote:

> I can see that argument but what does a user need to do in this case if we
> raise NotImplementedError? Would they need to disable type checking
> everywhere?
>
> Over the long term users will need to deal with improvements to type
> checking and will need to fix typing errors when they change Apache Beam
> versions.
>
>
> On Tue, Mar 31, 2020 at 9:34 AM Joshua B. Harrison <
> josh.harri...@gmail.com> wrote:
>
>> The current code errors out with a cryptic message around tag types in
>> the multi-output. Adding a NotImplementedError was just an attempt to make
>> the failure reason more clear.
>>
>> I would be worried about trivially passing because then the user might
>> think they have type checking safety when they don't, which could cause
>> failures at later stages and might be hard to debug. Do you agree?
>>
>> Best,
>> Joshua
>>
>> On Tue, Mar 31, 2020 at 10:16 AM Luke Cwik  wrote:
>>
>>> Would the NotImplementedError cause users' pipelines to fail, or is that a
>>> signal to the type checking mechanism to ignore it?
>>> If this would cause failures I would rather make the unsupported case
>>> return something that would be trivially true.
>>>
>>> On Mon, Mar 30, 2020 at 12:01 PM Joshua B. Harrison <
>>> josh.harri...@gmail.com> wrote:
>>>
>>>> Hey all,
>>>>
>>>> I brought up an issue recently on the user forums noting issues around
>>>> type hints and multi-output PTransforms:
>>>> https://lists.apache.org/thread.html/r94bf2e43f09a290dbe87d5a8d7eedb34ea215e0bea861521cbdb0c1c%40%3Cuser.beam.apache.org%3E
>>>>
>>>> As mentioned there, I think that a NotImplementedError should be raised
>>>> when attaching type hints to multi-output PTransforms while the correct
>>>> implementation is figured out. And that a 'correct' implementation would
>>>> look something like the Union typehints that are expected on multi-input
>>>> PTransforms.
>>>>
>>>> I am happy to help out and wanted to get the discussion started around
>>>> what the community would like to see here. Thank you all for a great
>>>> product.
>>>>
>>>> Best,
>>>> Joshua
>>>>
>>>> --
>>>> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>>>>  |  404-433-0242 <(404)%20433-0242>
>>>>
>>>
>>
>> --
>> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>>  |  404-433-0242 <(404)%20433-0242>
>>
>




Re: [BEAM-9322] Python SDK discussion on correct output tag names

2020-03-26 Thread Udi Meiri
On Thu, Mar 26, 2020 at 10:13 AM Luke Cwik  wrote:

> The issue seems to be that a PCollection can have a "tag" associated with
> it and PTransform expansion can return an arbitrary nested dictionary/tuple
> yet we need to figure out what the user wanted as the local name for the
> PCollection from all this information.
>
> Will this break people who rely on the generated PCollection output tags?
> One big question is whether a composite transform cares about the name
> that is used. For primitive transforms such as ParDo, this is very much a
> yes because the pickled code likely references that name in some way. Some
> composites could have the same need where the payload that is stored as
> part of the composite references these local names and hence we have to
> tell people how to instruct the SDK during transform expansion about what
> name will be used unambiguously (as long as we document and have tests
> around this we can choose from many options). Finally, in the XLang world,
> we need to preserve the names that were provided to us and not change them;
> which is more about making the Python SDK handle XLang transform expansion
> carefully.
>
> Am I missing edge cases?
> Concatenation of strings leads to collisions if the delimiter character is
> used within the tags or map keys. You could use an escaping encoding to
> guarantee that the concatenation always generates unique names.
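
A small sketch of the collision and of one escaping scheme that avoids it.
The "." delimiter and backslash escape are arbitrary choices for the
example, not something the SDK defines.

```python
DELIMITER = "."
ESCAPE = "\\"


def escape_part(part):
    """Escape the escape character first, then the delimiter."""
    escaped = part.replace(ESCAPE, ESCAPE + ESCAPE)
    return escaped.replace(DELIMITER, ESCAPE + DELIMITER)


def join_name(parts):
    """Join path parts so that distinct part lists give distinct names."""
    return DELIMITER.join(escape_part(str(part)) for part in parts)


# Naive concatenation: two different paths collapse to the same name.
naive_a = DELIMITER.join(["a.b", "c"])
naive_b = DELIMITER.join(["a", "b.c"])

# Escaped concatenation keeps them apart.
safe_a = join_name(["a.b", "c"])
safe_b = join_name(["a", "b.c"])
```

Only an unescaped delimiter separates parts, so the original list can be
recovered from the name and two different lists cannot collide.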
>
> Some alternatives I thought about were:
> * Don't allow arbitrary nestings returned during expansion, force
> composite transforms to always provide an unambiguous name (either a tuple
> with PCollections with unique tags or a dictionary with untagged
> PCollections or a singular PCollection (Java and Go SDKs do this)).
>

I believe that aligning with Java and Go would be the right way to go here.
I don't know if this would limit expressiveness.


> * Have a "best" effort naming system (note the example I give can have
> many of the "rules" re-ordered) e.g. if all the PCollection tags are unique
> then use only them, followed by if a flat dictionary is returned then use
> only the keys as names, followed by if a flat tuple is returned then use
> indices, and finally fallback to the hierarchical naming scheme.
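
The rule ordering in that example could be sketched roughly as below.
FakePCollection is a stand-in invented for this example (only its tag is
modelled), and returning None to signal "fall back to the hierarchical
scheme" is an assumption of the sketch, not SDK behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FakePCollection:
    """Stand-in for a PCollection; only the tag matters here."""

    tag: Optional[str] = None


def best_effort_names(outputs):
    """Name the outputs of a flat tuple or flat dict of FakePCollections.

    Returns None when no flat rule applies, meaning the caller would
    fall back to the hierarchical naming scheme.
    """
    if isinstance(outputs, dict):
        leaves = list(outputs.values())
    elif isinstance(outputs, tuple):
        leaves = list(outputs)
    else:
        return None
    if not all(isinstance(leaf, FakePCollection) for leaf in leaves):
        return None  # nested result: hierarchical fallback

    tags = [leaf.tag for leaf in leaves]
    # Rule 1: every tag is set and unique, so use the tags alone.
    if None not in tags and len(set(tags)) == len(tags):
        return tags
    # Rule 2: a flat dict is named by its keys.
    if isinstance(outputs, dict):
        return [str(key) for key in outputs]
    # Rule 3: a flat tuple is named by position.
    return [str(index) for index in range(len(leaves))]
```

As noted, the rules could be re-ordered. The point is only that each rule is
tried in a documented, tested order.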
>
>
> On Tue, Mar 24, 2020 at 1:07 PM Sam Rohde  wrote:
>
>> Hi All,
>>
>> *Problem*
>> I would like to discuss BEAM-9322 and the correct way to set the output
>> tags of a transform with nested PCollections, e.g. a dict of PCollections
>> or a tuple of dicts of PCollections. Before BEAM-1833 was fixed, the
>> Python SDK, when applying a PTransform, would auto-generate the output
>> tags for the output PCollections even if they were manually set by the user:
>>
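>> import apache_beam as beam
>> from apache_beam.pvalue import PCollection
>>
>> class MyComposite(beam.PTransform):
>>   def expand(self, pcoll):
>>     # Two outputs derived from the input, each tagged by hand.
>>     a = PCollection.from_(pcoll)
>>     a.tag = 'a'
>>
>>     b = PCollection.from_(pcoll)
>>     b.tag = 'b'
>>     return (a, b)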
>>
>> would yield a PTransform with two output PCollections and output tags
>> 'None' and '0' instead of 'a' and 'b'. This was corrected for simple cases
>> like this. However, this fails when the PCollections share the same output
>> tag (of course). This can happen like so:
>>
>> class MyComposite(beam.PTransform):
>>   def expand(self, pcoll):
>>     partition_1 = beam.Partition(pcoll, ...)
>>     partition_2 = beam.Partition(pcoll, ...)
>>     # Both results are index 0 of their own Partition, so both carry tag '0'.
>>     return (partition_1[0], partition_2[0])
>>
>> With the new code, this leads to an error because both output
>> PCollections have an output tag of '0'.
>>
>> *Proposal*
>> When applying PTransforms to a pipeline (pipeline.py:550) we name the
>> PCollections according to their position in the tree concatenated with the
>> PCollection tag and a delimiter. From the first example, the output
>> PCollections of the applied transform will be: '0.a' and '1.b' because it
>> is a tuple of PCollections. In the second example, the outputs should be:
>> '0.0' and '1.0'. In the case of a dict of PCollections, it should simply be
>> the keys of the dict.
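
The proposed scheme can be sketched in plain Python as follows. Leaf is a
stand-in for a PCollection made up for this example, and the handling of
untagged leaves and of mixed nestings is my own assumption, since the
proposal does not spell those cases out.

```python
class Leaf:
    """Stand-in for a PCollection; only the tag is modelled."""

    def __init__(self, tag=None):
        self.tag = tag


def hierarchical_names(outputs, path=(), in_dict=False):
    """Return one name per leaf, built from its position in the result tree.

    Tuple entries contribute their index and, at the leaf, the tag.
    Dict entries contribute their key only.
    """
    if isinstance(outputs, dict):
        names = []
        for key, value in outputs.items():
            names.extend(
                hierarchical_names(value, path + (str(key),), in_dict=True))
        return names
    if isinstance(outputs, (tuple, list)):
        names = []
        for index, value in enumerate(outputs):
            names.extend(
                hierarchical_names(value, path + (str(index),), in_dict=False))
        return names
    # Leaf: append the tag unless the leaf is a dict value (assumption).
    parts = list(path)
    if not in_dict and outputs.tag is not None:
        parts.append(outputs.tag)
    return [".".join(parts)]
```

Applied to the two examples from the proposal, a tuple of leaves tagged 'a'
and 'b' gives '0.a' and '1.b', and two leaves that both carry tag '0' give
'0.0' and '1.0'.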
>>
>> What do you think? Am I missing edge cases? Will this be unexpected to
>> users? Will this break people who rely on the generated PCollection output
>> tags?
>>
>> Regards,
>> Sam
>>
>




Re: Are docker image tags shared within a jenkins worker?

2020-03-26 Thread Udi Meiri
The Python HDFS IT uses the jenkins BUILD_TAG to create unique names:
PROJECT_NAME=$(echo hdfs_IT-${BUILD_TAG:-non-jenkins})

The BUILD_TAG is unique and easily traced back to the Jenkins job that made
it.
It might need some sanitizing though if it contains any invalid characters.
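
A possible way to do the sanitizing, sketched in Python. The allowed
character set (lowercase letters, digits, '-' and '_', starting with a
letter or digit) is an assumption about what the consumer of the name
accepts, and sanitize_project_name() is a made-up helper, not something in
the Beam repo.

```python
import re


def sanitize_project_name(build_tag, prefix="hdfs_IT-"):
    """Turn a Jenkins BUILD_TAG into a conservative project name.

    Assumption: only lowercase letters, digits, '-' and '_' are allowed,
    and the name must start with a letter or digit.
    """
    raw = (prefix + (build_tag or "non-jenkins")).lower()
    # Replace every disallowed character with an underscore.
    cleaned = re.sub(r"[^a-z0-9_-]", "_", raw)
    # Drop leading separators so the name starts with a letter or digit.
    return cleaned.lstrip("_-") or "non-jenkins"
```

Note that lowercasing and replacing characters can in principle map two
different tags to the same name, so uniqueness should be re-checked if the
job names differ only by case or punctuation.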

On Tue, Mar 24, 2020 at 1:50 PM Hannah Jiang  wrote:

> This can be done by 1). passing "-Pdocker-tag=xxx" to the test and 2).
> make sure to specify the custom tag when using docker images.
> For example, *:sdks:python:test-suites:portable:py35:preCommitPy35
> -Pdocker-tag=20200324 *will create an image with a tag 20200324.
> *--environment_config=path/to/container/image* pipeline option can be
> used for Python pipeline to pass custom docker images.
>
>
>
> On Tue, Mar 24, 2020 at 11:42 AM Brian Hulette 
> wrote:
>
>> Failing run:
>> https://builds.apache.org/job/beam_PostCommit_XVR_Flink_PR/65/
>> Passing run:
>> https://builds.apache.org/job/beam_PostCommit_XVR_Flink_PR/66/
>>
>> On Tue, Mar 24, 2020 at 11:33 AM Hannah Jiang 
>> wrote:
>>
>>> Hi Brian
>>>
>>> I think that's possible if we use the default tag for the Jenkins tests.
>>> To prevent this, we can use a customized tag, for example, timestamp, for
>>> each build.
>>> Can you please point me to the failing tests? I will check more details.
>>>
>>> Thanks,
>>> Hannah
>>>
>>>
>>> On Tue, Mar 24, 2020 at 10:11 AM Brian Hulette 
>>> wrote:
>>>
>>>> I ran into a test failure on the XVR tests in [1] which looked like the
>>>> test was executing with a python docker container that did _not_ include
>>>> the python changes in my PR. The test ran successfully after a second run.
>>>>
>>>> It seems likely that the initial failure occurred because some other
>>>> job was running concurrently on the same jenkins worker and overwrote the
>>>> `apache/beam_python2.7_sdk:2.21.0.dev` image that my run had generated.
>>>> Is this possible? If so, is there something we should do to isolate these
>>>> images?
>>>>
>>>> [1] https://github.com/apache/beam/pull/10055
>>>>
>>>




Re: Hello Beam Community!

2020-03-13 Thread Udi Meiri
Welcome!


On Fri, Mar 13, 2020 at 9:47 AM Yichi Zhang  wrote:

> Welcome!
>
> On Fri, Mar 13, 2020 at 9:40 AM Ahmet Altay  wrote:
>
>> Welcome Brittany!
>>
>> On Thu, Mar 12, 2020 at 6:32 PM Brittany Hermann 
>> wrote:
>>
>>> Hello Beam Community!
>>>
>>> My name is Brittany Hermann and I recently joined the Open Source team
>>> in Data Analytics at Google. As a Program Manager, I will be focusing on
>>> community engagement while getting to work on Apache Beam and Airflow
>>> projects! I have always thrived on creating healthy, diverse, and overall
>>> happy communities and am excited to bring that to the team. For a fun fact,
>>> I am a big Wisconsin Badgers Football fan and have a goldendoodle puppy
>>> named Ollie!
>>>
>>> I look forward to collaborating with you all!
>>>
>>> Kind regards,
>>>
>>> Brittany Hermann
>>>
>>>
>>>




Re: Install Jenkins AnsiColor plugin

2020-03-12 Thread Udi Meiri
https://github.com/apache/beam/pull/2 is out, attempts to colorize
pytest output


On Sun, Mar 8, 2020 at 10:56 AM Chad Dombrova  wrote:

> I don’t believe that it was ever resolved.  I have a PR with a bunch of
> attempts to get it working but I never did figure it out.  IIRC there did
> seem to be some ansi plugin already installed but I couldn’t get it to
> work.
>
> -chad
>
>
> On Sun, Mar 8, 2020 at 10:52 AM Ismaël Mejía  wrote:
>
>> Did this ever happen? If not what is blocking it?
>>
>>
>>
>> On Tue, Oct 22, 2019 at 10:13 PM Udi Meiri  wrote:
>> >
>> > Your proposal will only affect the seed job (which doesn't do color
>> outputs AFAIK).
>> > I think you want to add colorizeOutput() here:
>> >
>> https://github.com/apache/beam/blob/bfebbd0d16361f61fa40bfdec2f0cb6f943f7c9a/.test-infra/jenkins/CommonJobProperties.groovy#L79-L95
>> >
>> > Otherwise no concerns from me.
>> >
>> > On Tue, Oct 22, 2019 at 12:01 PM Chad Dombrova 
>> wrote:
>> >>
>> >> thanks, so IIUC, I’m going to update job_00_seed.groovy like this:
>> >>
>> >>   wrappers {
>> >>     colorizeOutput()
>> >>     timeout {
>> >>       absolute(60)
>> >>       abortBuild()
>> >>     }
>> >>   }
>> >>
>> >> Then add the comment run seed job
>> >>
>> >> Does anyone have any concerns with me trying this out now?
>> >>
>> >> -chad
>> >>
>> >>
>> >> On Tue, Oct 22, 2019 at 11:42 AM Udi Meiri  wrote:
>> >>>
>> >>> Also note that changing the job DSL doesn't take effect until the
>> "seed" job runs. (use the "run seed job" phrase)
>> >>>
>> >>> On Tue, Oct 22, 2019 at 11:06 AM Chad Dombrova 
>> wrote:
>> >>>>
>> >>>> Thanks, I'll look into this.  I have a PR I'm building up with a
>> handful of minor changes related to this.
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Tue, Oct 22, 2019 at 10:45 AM Yifan Zou 
>> wrote:
>> >>>>>
>> >>>>> Thanks, Udi! The ansicolor plugin was applied to ASF Jenkins
>> universally. You might need to explicitly enable the coloroutput in your
>> jenkins dsl.
>> >>>>>
>> >>>>> On Tue, Oct 22, 2019 at 10:33 AM Udi Meiri 
>> wrote:
>> >>>>>>
>> >>>>>> Seems to be already installed:
>> https://issues.apache.org/jira/browse/INFRA-16944
>> >>>>>> Do we just need to enable it somehow?
>> >>>>>> This might work:
>> https://jenkinsci.github.io/job-dsl-plugin/#method/javaposse.jobdsl.dsl.helpers.wrapper.WrapperContext.colorizeOutput
>> >>>>>>
>> >>>>>> BTW, our Jenkins is maintained by ASF's Infrastructure team:
>> https://cwiki.apache.org/confluence/display/INFRA/Jenkins
>> >>>>>>
>> >>>>>> On Tue, Oct 22, 2019 at 10:23 AM Chad Dombrova 
>> wrote:
>> >>>>>>>
>> >>>>>>> Hi all,
>> >>>>>>> As a user trying to grok failures in jenkins I think it would be
>> a huge help to have color output support.  This is something that works out
>> of the box for CI tools like gitlab and travis, and it really helps bring
>> that 21st century feel to your logs :)
>> >>>>>>>
>> >>>>>>> There's a Jenkins plugin for colorizing ansi escape sequences
>> here:
>> >>>>>>> https://plugins.jenkins.io/ansicolor
>> >>>>>>>
>> >>>>>>> I think this is something that has to be deployed by a Jenkins
>> admin.
>> >>>>>>>
>> >>>>>>> -chad
>> >>>>>>>
>>
>




Re: Python Static Typing: Next Steps

2020-03-02 Thread Udi Meiri
Off-topic: Python lint via pre-commit should be much faster. (I wrote my
own modified-file-only lint in the past)

On Mon, Mar 2, 2020 at 2:08 PM Kyle Weaver  wrote:

> > Python lint takes 4-5mins to complete. I think if the mypy analysis is
> really on the order of 10s, the additional time won't matter and could
> always be enabled.
>
> +1 of course it would be nice to make mypy as fast as possible, but I
> don't think speed needs to be a blocker. The productivity gains we'd get
> from reliable type analysis more than offset the cost IMO.
>
> On Mon, Mar 2, 2020 at 2:03 PM Luke Cwik  wrote:
>
>> Python lint takes 4-5mins to complete. I think if the mypy analysis is
>> really on the order of 10s, the additional time won't matter and could
>> always be enabled.
>>
>> On Mon, Mar 2, 2020 at 1:21 PM Chad Dombrova  wrote:
>>
>>>> I believe that mypy via pre-commit hook will be faster than 10s since it
>>>> only applies to modified files.
>>>>
>>>
>>> Correct, with a few caveats:
>>>
>>>- pre-commit can be setup to only run if a python file changes.  so
>>>modifying a java file won't trigger mypy to run.
>>>- if *any* python file changes mypy has to run on the whole
>>>codebase, because a change to one file can affect the others (i.e. a
>>>function arg type changes).  it's not really meaningful to run mypy on a
>>>single file.
>>>- the mypy daemon tracks which files have changed, and runs
>>>incremental updates.  so if we setup the precommit hook to run the
>>>daemon, we should see that get appreciably faster.  I'll do some tests
>>>and report back.
>>>
>>>
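
The caveats above can be sketched as a small hook script. This is only an illustration under stated assumptions: the function names are made up for this note, `apache_beam` is assumed to be the package to check, and nothing here is the hook that was eventually added to Beam. The git and mypy/dmypy command lines themselves are standard.

```python
import subprocess
from typing import List


def staged_files() -> List[str]:
    """Return paths staged for commit (added, copied, or modified)."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def should_run_mypy(changed: List[str]) -> bool:
    """mypy is only worth running when at least one Python file changed."""
    return any(path.endswith((".py", ".pyi")) for path in changed)


def mypy_command(use_daemon: bool, target: str = "apache_beam") -> List[str]:
    """Build the command line; `target` is an assumed package name."""
    # The whole package is checked, not just the changed files, because a
    # signature change in one module can break callers in other modules.
    if use_daemon:
        # The daemon keeps state between runs and rechecks incrementally.
        return ["dmypy", "run", "--", target]
    return ["mypy", target]
```

A hook would call `staged_files()`, skip the check when `should_run_mypy(...)` is false (for example, a Java-only commit), and otherwise run `mypy_command(use_daemon=True)` with `subprocess.run`.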




Re: Python Static Typing: Next Steps

2020-03-02 Thread Udi Meiri
Let's go forward with this and see. I volunteer to help as well.

I believe that mypy via pre-commit hook will be faster than 10s since it
only applies to modified files.

On Mon, Mar 2, 2020 at 10:53 AM Robert Bradshaw  wrote:

> +1
>
> We should enable this on jenkins, plus trivial instructions (ideally a
> one-liner tox command) to run it locally. Hopefully the errors will be
> easy enough for contributors to figure out (in particular local to and
> commensurate in complexity with the code that they're editing), and I
> agree it's the only way to keep them accurate (which is a net positive
> for tooling and developers).
>
> Running it as part of a pre-commit hook could be discussed once we
> have a bit more experience (but 10s is certainly on the long side).
>
> On Mon, Mar 2, 2020 at 10:01 AM Luke Cwik  wrote:
> >
> > +1
> >
> > The typing information has really helped me several times in figuring out
> API contracts and expected types.
> >
> > On Mon, Mar 2, 2020 at 9:54 AM Pablo Estrada  wrote:
> >>
> >> I am in favor of enabling the test, and also am happy to start
> answering questions too.
> >> Thanks so much Chad for leading this.
> >> Best
> >> -P.
> >>
> >> On Mon, Mar 2, 2020 at 9:44 AM Chad Dombrova  wrote:
> >>>
> >>> Good news everyone!
> >>> We nearly have the full beam codebase passing in mypy.
> >>>
> >>> As we are now approaching the zero-error event horizon, I'd like to
> open up a discussion around enabling mypy in the PythonLint job.  Every day
> or so a PR is merged that introduces some new mypy errors, so enabling this
> test is the only way I see to keep the annotations accurate and thus useful.
> >>>
> >>> Developer fatigue is a real concern here, since static typing has a
> steep learning curve, and there are still not a lot of experts to help
> consult on PRs.  Here are some things that I hope will mitigate those
> concerns:
> >>>
> >>> - We have a lot of typing coverage, so that means plenty of examples of
> how to solve different types of problems
> >>> - Running mypy only takes 10 seconds to complete (if you execute it
> outside of gradle / tox), and that will get better when we get to 0
> errors.  Also, running mypy in daemon mode should speed that up even more
> >>> - I have a PR[1] to allow developers to easily (and optionally) set up
> yapf to run in a local git pre-commit hook;  I'd like to do the same for
> mypy.
> >>> - I will make myself and members of my team available to help out with
> typing questions in PRs
> >>>
> >>> Is there anyone else on the list who is knowledgeable about python
> static typing who would like to volunteer to be flagged on typing questions?
> >>>
> >>> What else can we do to make this transition easier?
> >>>
> >>> [1] https://github.com/apache/beam/pull/10810
> >>>
> >>> -chad
> >>>
>




Re: [ANNOUNCE] New Committer: Kamil Wasilewski

2020-02-28 Thread Udi Meiri
Welcome Kamil!

On Fri, Feb 28, 2020 at 12:53 PM Mark Liu  wrote:

> Congrats, Kamil!
>
> On Fri, Feb 28, 2020 at 12:23 PM Ismaël Mejía  wrote:
>
>> Congratulations Kamil!
>>
>> On Fri, Feb 28, 2020 at 7:09 PM Yichi Zhang  wrote:
>>
>>> Congrats, Kamil!
>>>
>>> On Fri, Feb 28, 2020 at 9:53 AM Valentyn Tymofieiev 
>>> wrote:
>>>
>>>> Congratulations, Kamil!
>>>>
>>>> On Fri, Feb 28, 2020 at 9:34 AM Pablo Estrada
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>> committer: Kamil Wasilewski
>>>>>
>>>>> Kamil has contributed to Beam in many ways, including the performance
>>>>> testing infrastructure, and a custom BQ source, along with other
>>>>> contributions.
>>>>>
>>>>> In consideration of his contributions, the Beam PMC trusts him with
>>>>> the responsibilities of a Beam committer[1].
>>>>>
>>>>> Thanks for your contributions Kamil!
>>>>>
>>>>> Pablo, on behalf of the Apache Beam PMC.
>>>>>
>>>>> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>
>>>>>




Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-02-26 Thread Udi Meiri
I agree with having low-frequency tests for low-priority versions.
Low-priority versions could be determined according to least usage.
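
As a rough sketch of "low priority by least usage": rank the supported versions by usage share and give only the top few the full-frequency test suites. The function name and the share values below are placeholders for illustration (only the roughly 20% figure for 3.5 comes from this thread), not measured data or an agreed policy.

```python
from typing import Dict, List, Tuple


def split_by_usage(
    shares: Dict[str, float], keep_full: int
) -> Tuple[List[str], List[str]]:
    """Split versions into (full-frequency, low-frequency) test groups.

    `shares` maps a version string to its usage share; the `keep_full`
    most-used versions keep the full-frequency suites.
    """
    ranked = sorted(shares, key=lambda version: shares[version], reverse=True)
    return ranked[:keep_full], ranked[keep_full:]


# Placeholder shares, for illustration only.
example_shares = {"3.5": 0.20, "3.6": 0.35, "3.7": 0.45}
full, low = split_by_usage(example_shares, keep_full=2)
```

With these placeholder numbers, 3.7 and 3.6 keep the full suites and 3.5 drops to infrequent runs. A usage-based ranking can disagree with the "lowest and highest version" heuristic quoted below, so the two would need to be reconciled.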



On Wed, Feb 26, 2020 at 4:06 PM Robert Bradshaw  wrote:

> On Wed, Feb 26, 2020 at 3:29 PM Kenneth Knowles  wrote:
> >
> > Are these divergent enough that they all need to consume testing
> resources? For example can lower priority versions be daily runs or some
> such?
>
> For the 3.x series, I think we will get the most signal out of the
> lowest and highest version, and can get by with smoke tests +
> infrequent post-commits for the ones between.
>
> > Kenn
> >
> > On Wed, Feb 26, 2020 at 3:25 PM Robert Bradshaw 
> wrote:
> >>
> >> +1 to consulting users. Currently 3.5 downloads sit at 3.7%, or about
> >> 20% of all Python 3 downloads.
> >>
> >> I would propose getting in warnings about 3.5 EoL well ahead of time,
> >> at the very least as part of the 2.7 warning.
> >>
> >> Fortunately, supporting multiple 3.x versions is significantly easier
> >> than spanning 2.7 and 3.x. I would rather not impose an ordering on
> >> dropping 3.5 and adding 3.8 but consider their merits independently.
> >>
> >>
> >> On Wed, Feb 26, 2020 at 3:16 PM Kyle Weaver 
> wrote:
> >> >
> >> > 5 versions is too many IMO. We've had issues with Python precommit
> resource usage in the past, and adding another version would surely
> exacerbate those issues. And we have also already had to leave out certain
> features on 3.5 [1]. Therefore, I am in favor of dropping 3.5 before adding
> 3.8. After dropping Python 2 and adding 3.8, that will leave us with the
> latest three minor versions (3.6, 3.7, 3.8), which I think is closer to the
> "sweet spot." Though I would be interested in hearing if there are any
> users who would prefer we continue supporting 3.5.
> >> >
> >> > [1]
> https://github.com/apache/beam/blob/8658b95545352e51f35959f38334f3c7df8b48eb/sdks/python/apache_beam/runners/portability/flink_runner.py#L55
> >> >
> >> > On Wed, Feb 26, 2020 at 3:00 PM Valentyn Tymofieiev <
> valen...@google.com> wrote:
> >> >>
> >> >> I would like to start a discussion about identifying a guideline for
> answering questions like:
> >> >>
> >> >> 1. When will Beam support a new Python version (say, Python 3.8)?
> >> >> 2. When will Beam drop support for an old Python version (say,
> Python 3.5)?
> >> >> 3. How many Python versions should we aim to support concurrently
> (investigate issues, have continuous integration tests)?
> >> >> 4. What comes first: adding support for a new version (3.8) or
> deprecating older one (3.5)? This may affect the max load our test
> infrastructure needs to sustain.
> >> >>
> >> >> We are already getting requests for supporting Python 3.8 and there
> were some good reasons[1] to drop support for Python 3.5 (at least, early
> versions of 3.5). Answering these questions would help set expectations in
> Beam user community, Beam dev community, and  may help us establish
> resource requirements for test infrastructure and plan efforts.
> >> >>
> >> >> PEP-0602 [2] establishes a yearly release cycle for Python versions
> starting from 3.9. Each release is a long-term support release and is
> supported for 5 years: first 1.5 years allow for general bug fix support,
> remaining 3.5 years have security fix support.
> >> >>
> >> >> At every point, there may be up to 5 Python minor versions that did
> not yet reach EOL, see "Release overlap with 12 month diagram" [3]. We can
> try to support all of them, but that may come at a cost of velocity: we
> will have more tests to maintain, and we will have to develop Beam against
> a lower version for a longer period. Supporting less versions will have
> implications for user experience. It also may be difficult to ensure
> support of the most recent version early, since our  dependencies (e.g.
> picklers) may not be supporting them yet.
> >> >>
> >> >> Currently we support 4 Python versions (2.7, 3.5, 3.6, 3.7).
> >> >>
> >> >> Is 4 versions a sweet spot? Too much? Too little? What do you think?
> >> >>
> >> >> [1] https://github.com/apache/beam/pull/10821#issuecomment-590167711
> >> >> [2] https://www.python.org/dev/peps/pep-0602/
> >> >> [3] https://www.python.org/dev/peps/pep-0602/#id17
>
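
The "up to 5 versions" figure follows from the numbers quoted above: a release every 12 months, each supported for 60 months. A small sketch of that arithmetic, assuming releases land exactly on schedule and using a hypothetical helper name:

```python
def live_versions(month: int, cadence: int = 12, support: int = 60) -> int:
    """Count releases that are out but not yet EOL at `month`.

    Releases happen at months 0, cadence, 2 * cadence, ...; each one is
    supported for `support` months after its own release.
    """
    return sum(
        1
        for released in range(0, month + 1, cadence)
        if month < released + support
    )
```

During the first years the count ramps up (one version at month 0, three at month 24); from month 48 onward it stays at `support / cadence`, which is 5 with these defaults.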




Re: [ANNOUNCE] New committer: Chad Dombrova

2020-02-24 Thread Udi Meiri
Congrats and welcome, Chad!

On Mon, Feb 24, 2020 at 1:21 PM Pablo Estrada  wrote:

> Hi everyone,
>
> Please join me and the rest of the Beam PMC in welcoming a new committer:
> Chad Dombrova
>
> Chad has contributed to the project in multiple ways, including
> improvements to the testing infrastructure, and adding type annotations
> throughout the Python SDK, as well as working closely with the community on
> these improvements.
>
> In consideration of his contributions, the Beam PMC trusts him with the
> responsibilities of a Beam Committer[1].
>
> Thanks Chad for your contributions!
>
> -Pablo, on behalf of the Apache Beam PMC.
>
> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>
>




Re: jira search in chrome omnibox

2020-02-14 Thread Udi Meiri
The JIRA tips page already has the instructions for Chrome, Ismaël. Feel
free to add the same for Firefox.
https://cwiki.apache.org/confluence/display/BEAM/Jira+Tips

On Fri, Feb 14, 2020 at 8:20 AM Ismaël Mejía  wrote:

> For Firefox users:
>
> You can replicate the same behavior but it requires a bit more work:
>
> Firefox uses a format called OpenSearch so you have to generate an
> installable XML via this page.
> https://ready.to/search/en/#
>
> the search name: Beam Issues
> the front search term: https://issues.apache.org/jira/browse/BEAM-
> Click then on Make search plugin to generate an URL:
>
> https://ready.to/search/en/?sna=Beam%20issues=https%3A%2F%2Fissues.apache.org%2Fjira%2Fbrowse%2FBEAM-=utf=ono=pn#
> then you open the URL and click on `Opensearch plug-in Beam Issues` to
> install it
>
> Now you configure the keyword on Firefox preferences
>
> Preferences > Search on the left panel.
> Scroll down to One-Click Search Engines.
> Double-click on the Keyword column for the search engine you want to
> assign a shortcut to.
> Enter @ followed by your search shortcut keyword. For example: @beam
>
> Another user URL (to search on github open PRs):
>
> http://ready.to/search/en/?sna=Beam%20PRs=https%3A%2F%2Fgithub.com%2Fapache%2Fbeam%2Fpulls%3Futf8%3D%E2%9C%93%26amp%3Bq%3Dis%3Apr%2B=utf=ono=pn
>
> Enjoy!
>
> ps. This is so useful that it is probably worth putting in cwiki. Udi, create
> the page and I will add the Firefox section if you agree.
>
>
>
>
> On Fri, Aug 31, 2018 at 2:31 AM Udi Meiri  wrote:
>
>> Correction: this is the correct URL:
>> https://issues.apache.org/jira/secure/QuickSearch.jspa?searchString=%s
>>
>> It uses smart querying. Ex: Searching for "beam open pubsub" will search
>> for open bugs in project BEAM with the keyword "pubsub".
>>
>> On Tue, Aug 28, 2018 at 4:49 PM Valentyn Tymofieiev 
>> wrote:
>>
>>> Thanks for sharing.
>>>
>>> I have also found the following custom search query useful for PRs:
>>> https://github.com/apache/beam/pulls?q=is%3Apr%20%s
>>>
>>> Sample usage: type 'pr', space, type: 'author:tvalentyn'.
>>>
>>> You could also incorporate 'author:' into the query:
>>> https://github.com/apache/beam/pulls?q=is%3Apr%20author%3A
>>>
>>> On Tue, Aug 28, 2018 at 4:26 PM Daniel Oliveira 
>>> wrote:
>>>
>>>> This seems pretty useful. Thanks Udi!
>>>>
>>>> On Mon, Aug 27, 2018 at 3:54 PM Udi Meiri  wrote:
>>>>
>>>>> In case you want to quickly look up JIRA tickets, e.g., typing 'j',
>>>>> space, 'BEAM-4696'.
>>>>> Search URL:
>>>>> https://issues.apache.org/jira/QuickSearch.jspa?searchString=%s
>>>>>
>>>>>
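
For readers unfamiliar with custom search engines: the browser replaces the `%s` placeholder in the search URL with whatever is typed after the keyword. A minimal sketch of that substitution, using the corrected QuickSearch URL from this thread; the `expand` helper is a name made up for this note, and real browsers may encode spaces differently (for example as `+`).

```python
from urllib.parse import quote

# Corrected search URL from the thread.
JIRA_QUICKSEARCH = (
    "https://issues.apache.org/jira/secure/QuickSearch.jspa?searchString=%s"
)


def expand(template: str, typed: str) -> str:
    """Substitute the URL-encoded query for the %s placeholder."""
    return template.replace("%s", quote(typed, safe=""))


ticket_url = expand(JIRA_QUICKSEARCH, "BEAM-4696")
smart_url = expand(JIRA_QUICKSEARCH, "beam open pubsub")
```

The second call corresponds to the smart query described above: open issues in project BEAM that mention "pubsub".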




Re: Sphinx Docs Command Error (:sdks:python:test-suites:tox:pycommon:docs)

2020-02-11 Thread Udi Meiri
For me the difference was about 20s longer (40s -> 60s approx). Not
significant IMO

On Tue, Feb 11, 2020 at 9:59 AM Ahmet Altay  wrote:

> Should we remove the "-j 8" option by default? Sphinx docs says this is an
> experimental option [1]. I do not recall docs generation taking a long
> time, does this increase significantly without this option?
>
> [1] http://www.sphinx-doc.org/en/stable/man/sphinx-build.html
>
> On Tue, Feb 11, 2020 at 1:16 AM Shoaib Zafar 
> wrote:
>
>> Thanks, Udi and Jincheng for the response.
>> The suggested solution worked for me as well.
>>
>> Regards,
>>
>> *Shoaib Zafar*
>> Software Engineering Lead
>> Mobile: +92 333 274 6242
>> Skype: live:shoaibzafar_1
>>
>> <http://venturedive.com/>
>>
>>
>> On Tue, Feb 11, 2020 at 1:17 PM jincheng sun 
>> wrote:
>>
>>> I have verified that this issue could be reproduced in my local
>>> environment (MacOS) and the solution suggested by Udi could work!
>>>
>>> Best,
>>> Jincheng
>>>
>>> Udi Meiri  于2020年2月11日周二 上午8:51写道:
>>>
>>>> I don't have those issues (running on Linux), but a possible workaround
>>>> could be to remove the "-j 8" flags (2 locations) in generate_pydoc.sh.
>>>>
>>>>
>>>> On Mon, Feb 10, 2020 at 11:06 AM Shoaib Zafar <
>>>> shoaib.za...@venturedive.com> wrote:
>>>>
>>>>> Hello Beamers.
>>>>>
>>>>> Just curious: is anyone having trouble running the
>>>>> ':sdks:python:test-suites:tox:pycommon:docs' command locally?
>>>>>
>>>>> After rebasing with master recently, I am facing a sphinx thread fork
>>>>> error on my macOS Mojave, using python 3.7.0.
>>>>> I tried to add the environment variable "export
>>>>> OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES" (which I found on google)
>>>>> but no luck!
>>>>>
>>>>> Any suggestions/help?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Console Log:
>>>>> --
>>>>> 
>>>>> Creating file target/docs/source/apache_beam.utils.proto_utils.rst.
>>>>> Creating file target/docs/source/apache_beam.utils.retry.rst.
>>>>> Creating file
>>>>> target/docs/source/apache_beam.utils.subprocess_server.rst.
>>>>> Creating file
>>>>> target/docs/source/apache_beam.utils.thread_pool_executor.rst.
>>>>> Creating file target/docs/source/apache_beam.utils.timestamp.rst.
>>>>> Creating file target/docs/source/apache_beam.utils.urns.rst.
>>>>> Creating file target/docs/source/apache_beam.utils.rst.
>>>>> objc[8384]: +[__NSCFConstantString initialize] may have been in
>>>>> progress in another thread when fork() was called.
>>>>> objc[8384]: +[__NSCFConstantString initialize] may have been in
>>>>> progress in another thread when fork() was called. We cannot safely call
>>>>> it or ignore it in the fork() child process. Crashing instead. Set a
>>>>> breakpoint on objc_initializeAfterForkError to debug.
>>>>>
>>>>> Traceback (most recent call last):
>>>>>   File
>>>>> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/cmd/build.py",
>>>>> line 304, in build_main
>>>>> app.build(args.force_all, filenames)
>>>>>   File
>>>>> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/application.py",
>>>>> line 335, in build
>>>>> self.builder.build_all()
>>>>>   File
>>>>> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
>>>>> line 305, in build_all
>>>>> self.build(None, summary=__('all source files'), method='all')
>>>>>   File
>>>>> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py"

Re: Sphinx Docs Command Error (:sdks:python:test-suites:tox:pycommon:docs)

2020-02-10 Thread Udi Meiri
I don't have those issues (running on Linux), but a possible workaround
could be to remove the "-j 8" flags (2 locations) in generate_pydoc.sh.
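
To illustrate the workaround: `-j N` is the sphinx-build option that forks parallel worker processes, and the fork is what trips the macOS check in the log below. The sketch assumes a simplified invocation; the helper names, the `-b html` builder, and the idea of switching on the platform are illustrative and do not reflect the actual contents of generate_pydoc.sh.

```python
import sys
from typing import List, Optional


def default_jobs(platform: str = sys.platform) -> Optional[int]:
    """Use 8 parallel jobs except on macOS, where forking workers crashed."""
    return None if platform == "darwin" else 8


def sphinx_build_args(source: str, out: str, jobs: Optional[int]) -> List[str]:
    """Build a sphinx-build command line, adding -j only when parallel."""
    args = ["sphinx-build", "-b", "html"]
    if jobs and jobs > 1:
        args += ["-j", str(jobs)]
    return args + [source, out]
```

Dropping `-j` makes the build serial, trading some build time for a build that does not fork.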


On Mon, Feb 10, 2020 at 11:06 AM Shoaib Zafar 
wrote:

> Hello Beamers.
>
> Just curious: is anyone having trouble running the
> ':sdks:python:test-suites:tox:pycommon:docs' command locally?
>
> After rebasing with master recently, I am facing a sphinx thread fork error
> on my macOS Mojave, using python 3.7.0.
> I tried to add the environment variable "export
> OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES" (which I found on google) but no
> luck!
>
> Any suggestions/help?
>
> Thanks!
>
> Console Log:
> --
> 
> Creating file target/docs/source/apache_beam.utils.proto_utils.rst.
> Creating file target/docs/source/apache_beam.utils.retry.rst.
> Creating file target/docs/source/apache_beam.utils.subprocess_server.rst.
> Creating file
> target/docs/source/apache_beam.utils.thread_pool_executor.rst.
> Creating file target/docs/source/apache_beam.utils.timestamp.rst.
> Creating file target/docs/source/apache_beam.utils.urns.rst.
> Creating file target/docs/source/apache_beam.utils.rst.
> objc[8384]: +[__NSCFConstantString initialize] may have been in progress
> in another thread when fork() was called.
> objc[8384]: +[__NSCFConstantString initialize] may have been in progress
> in another thread when fork() was called. We cannot safely call it or
> ignore it in the fork() child process. Crashing instead. Set a breakpoint
> on objc_initializeAfterForkError to debug.
>
> Traceback (most recent call last):
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/cmd/build.py",
> line 304, in build_main
> app.build(args.force_all, filenames)
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/application.py",
> line 335, in build
> self.builder.build_all()
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
> line 305, in build_all
> self.build(None, summary=__('all source files'), method='all')
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
> line 360, in build
> updated_docnames = set(self.read())
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
> line 466, in read
> self._read_parallel(docnames, nproc=self.app.parallel)
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
> line 521, in _read_parallel
> tasks.join()
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/util/parallel.py",
> line 114, in join
> self._join_one()
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/util/parallel.py",
> line 120, in _join_one
> exc, logs, result = pipe.recv()
>   File
> "/Users/shoaib/.pyenv/versions/3.7.0/lib/python3.7/multiprocessing/connection.py",
> line 250, in recv
> buf = self._recv_bytes()
>   File
> "/Users/shoaib/.pyenv/versions/3.7.0/lib/python3.7/multiprocessing/connection.py",
> line 407, in _recv_bytes
> buf = self._recv(4)
>   File
> "/Users/shoaib/.pyenv/versions/3.7.0/lib/python3.7/multiprocessing/connection.py",
> line 383, in _recv
> raise EOFError
> EOFError
>
> Exception occurred:
>   File
> "/Users/shoaib/.pyenv/versions/3.7.0/lib/python3.7/multiprocessing/connection.py",
> line 383, in _recv
> raise EOFError
> EOFError
> The full traceback has been saved in
> /Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/tmp/sphinx-err-mphtfnei.log,
> if you want to report the issue to the developers.
> Please also report this if it was a user error, so that a better error
> message can be provided next time.
> A bug report can be filed in the tracker at <
> https://github.com/sphinx-doc/sphinx/issues>. Thanks!
> objc[8385]: +[__NSCFConstantString initialize] may have been in progress
> in another thread when fork() was called.
> objc[8385]: +[__NSCFConstantString initialize] may have been in progress
> in another thread when fork() was called. 

Re: Labels on PR

2020-02-10 Thread Udi Meiri
Cool!

On Mon, Feb 10, 2020 at 9:27 AM Robert Burke  wrote:

> +1 to autolabeling
>
> On Mon, Feb 10, 2020, 9:21 AM Luke Cwik  wrote:
>
>> Nice
>>
>> On Mon, Feb 10, 2020 at 2:52 AM Alex Van Boxel  wrote:
>>
>>> Ha, cool. I'll have a look at the autolabeler. The infra stuff is not
>>> something I've looked at... I'll dive into that.
>>>
>>>  _/
>>> _/ Alex Van Boxel
>>>
>>>
>>> On Mon, Feb 10, 2020 at 11:49 AM Ismaël Mejía  wrote:
>>>
>>>> +1
>>>>
>>>> You don't need to write your own action, there is already one
>>>> autolabeler action [1].
>>>> INFRA can easily configure it for Beam (as they did for Avro [2]) if
>>>> we request it.
>>>> The plugin is quite easy to configure and works like a charm [3].
>>>>
>>>> [1] https://github.com/probot/autolabeler
>>>> [2] https://issues.apache.org/jira/browse/INFRA-17367
>>>> [3] https://github.com/apache/avro/blob/master/.github/autolabeler.yml
>>>>
>>>>
>>>> On Mon, Feb 10, 2020 at 11:20 AM Alexey Romanenko <
>>>> aromanenko@gmail.com> wrote:
>>>>
>>>>> Great initiative, thanks Alex! I was thinking to add such labels into
>>>>> PR title but I believe that GitHub labels are better since they can be
>>>>> used easily for filtering, for example.
>>>>>
>>>>> Maybe it could be useful to add more granularity for labels, like
>>>>> “release”, “runners”, “website”, etc but I’m afraid to make the titles too
>>>>> heavy because of this.
>>>>>
>>>>> > On 10 Feb 2020, at 08:35, Alex Van Boxel  wrote:
>>>>> >
>>>>> > I've started putting labels on PR's. I've done the first page for
>>>>> now (as I'm afraid putting them on older ones could affect the stale
>>>>> bot). I hope this is ok.
>>>>> >
>>>>> > For now I'm only focussing on language and I'm going to see if I can
>>>>> write a GitHub action for it. I hope this is useful. Other kind of
>>>>> suggestions for labels, that can be automated, are welcome.
>>>>> >
>>>>> >
>>>>> >  _/
>>>>> > _/ Alex Van Boxel
>




Re: [DISCUSS] Autoformat python code with Black

2020-02-07 Thread Udi Meiri
>> >>>>
>> >>>> Thanks to everyone involved in the discussion.
>> >>>>
>> >>>> I've taken a look at the first 50 recently updated Pull Requests.
>> Only few of them were affected. I hope it wouldn't be too hard to fix them.
>> >>>>
>> >>>> In any case, here you can find instructions on how to run formatter:
>> https://cwiki.apache.org/confluence/display/BEAM/Python+Tips (section
>> "Formatting").
>> >>>>
>> >>>> On Thu, Feb 6, 2020 at 12:42 PM Michał Walenia <
>> michal.wale...@polidea.com> wrote:
>> >>>>>
>> >>>>> Hi,
>> >>>>> the PR is merged, all checks were green :)
>> >>>>> Enjoy prettier Python!
>> >>>>>
>> >>>>> On Thu, Feb 6, 2020 at 11:11 AM Ismaël Mejía 
>> wrote:
>> >>>>>>
>> >>>>>> Agree no need for vote for this because the consensus is clear and
>> the sole
>> >>>>>> impact I can think of are pending PRs that will be broken. In the
>> Java case
>> >>>>>> what we did was to just notice every PR that was affected by the
>> change.
>> >>>>>> And clearly document how to validate and autoformat the code.
>> >>>>>>
>> >>>>>> So the earlier the better, go go autoformat!
>> >>>>>>
>> >>>>>> On Thu, Feb 6, 2020 at 1:38 AM Robert Bradshaw <
>> rober...@google.com> wrote:
>> >>>>>>>
>> >>>>>>> No, perhaps not. I agree there's consensus, just wondering what
>> the
>> >>>>>>> next steps should be to get this in. (The presubmits look like
>> they're
>> >>>>>>> all passing, with the exception of some breakage in java that
>> should
>> >>>>>>> be completely unrelated. Of course there's already merge
>> conflicts...)
>> >>>>>>>
>> >>>>>>> On Wed, Feb 5, 2020 at 3:55 PM Ahmet Altay 
>> wrote:
>> >>>>>>> >
>> >>>>>>> > Do we need a formal vote? There is consensus on this thread and
>> on the PR.
>> >>>>>>> >
>> >>>>>>> > On Wed, Feb 5, 2020 at 3:37 PM Robert Bradshaw <
>> rober...@google.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> The PR is looking good. Should we call a vote?
>> >>>>>>> >>
>> >>>>>>> >> On Mon, Jan 27, 2020 at 11:03 AM Robert Bradshaw <
>> rober...@google.com> wrote:
>> >>>>>>> >> >
>> >>>>>>> >> > Thanks. I commented on the PR. I think if we're going this
>> route we
>> >>>>>>> >> > should add a pre-commit, plus instructions on how to run the
>> tool
>> >>>>>>> >> > (similar to spotless).
>> >>>>>>> >> >
>> >>>>>>> >> > On Mon, Jan 27, 2020 at 10:00 AM Udi Meiri 
>> wrote:
>> >>>>>>> >> > >
>> >>>>>>> >> > > I've done a pass on the PR on code I'm familiar with.
>> >>>>>>> >> > > Please make a pass and add your suggestions on the PR.
>> >>>>>>> >> > >
>> >>>>>>> >> > > On Fri, Jan 24, 2020 at 7:15 AM Ismaël Mejía <
>> ieme...@gmail.com> wrote:
>> >>>>>>> >> > >>
>> >>>>>>> >> > >> Java build fails on any unformatted code so python
>> probably should be like that.
>> >>>>>>> >> > >> We have to ensure however that it fails early on that.
>> >>>>>>> >> > >> As Robert said time to debate the knobs :)
>> >>>>>>> >> > >>
>> >>>>>>> >> > >> On Fri, Jan 24, 2020 at 3:19 PM Kamil Wasilewski <
>> kamil.wasilew...@polidea.com> wrote:
>> >>>>>>> >> > >>>
>> >>>>>>> >> > >>> PR is ready: https://github.com/apache/beam/pull/10684.
>> Please share your comments ;-

Re: [RELEASE VOTE RESULT] Release 2.19.0, release candidate #1

2020-02-03 Thread Udi Meiri
Thank you Boyuan!

On Mon, Feb 3, 2020 at 3:40 PM Ahmet Altay  wrote:

> On Mon, Feb 3, 2020 at 1:22 PM Thomas Weise  wrote:
>
>> Impressive, probably the fastest/smoothest Beam release so far.
>>
>
> I agree! Thank you, Boyuan!
>
>
>>
>> On Mon, Feb 3, 2020 at 10:45 AM Boyuan Zhang  wrote:
>>
>>> I'm happy to announce that we have unanimously approved this release.
>>>
>>> There are 5 approving votes, 4 of which are binding:
>>> * Ahmet Altay
>>> * Ismaël Mejía
>>> * Jean-Baptiste Onofré
>>> * Robert Bradshaw
>>>
>>> There are no disapproving votes.
>>>
>>> Thanks for everyone's help! I'm going to finalize the release and send
>>> out the official release announcement later.
>>>
>>




Re: [DISCUSSION] Improve release notes by adding a change list file

2020-02-03 Thread Udi Meiri
+1 to add this to the checklist

On Mon, Feb 3, 2020 at 4:57 PM Robert Bradshaw  wrote:

> On Mon, Feb 3, 2020 at 4:49 PM Ahmet Altay  wrote:
> >
> > On Mon, Feb 3, 2020 at 2:09 PM Robert Bradshaw 
> wrote:
> >>
> >> I would suggest we start with the simpler single file. If merge
> >> conflicts become an issue, we could look at other options, but I think
> >> it's worth keeping in mind that what we're trying to produce here is a
> >> single, higher-level, cohesive summary of the release rather than a
> >> 1:1 listing of commits, pull request, or jira entries (which we can
> >> link to). While new features often merit their own bullet points, this
> >> will allow for entries such as "Several improvements to portability
> >> including ..."
> >
> > I agree. If there are no objections I will go ahead with the PR I
> proposed. It adds a single change log file to begin with.
> >
> > We would need all committers to help after that by asking PR authors to
> update this file whenever it makes sense.
>
> Yes. Should we add it to the PR template checklist?
>
> >> On Mon, Feb 3, 2020 at 1:55 PM Ahmet Altay  wrote:
> >> >
> >> >
> >> >
> >> > On Sat, Feb 1, 2020 at 9:22 AM Chad Dombrova 
> wrote:
> >> >>
> >> >> In case it's of any use, there's a tool called towncrier[1] to help
> collect changelog fragments and compile them at release time.
> >> >
> >> >
> >> > I would prefer not to have the complexity of multiple files and an
> added tool to the release process. I do not have a strong opinion though.
> If others prefer we can switch to this tool. One nice benefit of this tool
> would be to avoid merge conflicts if many different PRs edit the change log
> file all at the same time in a conflicting way.
> >> >
> >> >>
> >> >>
> >> >> I came across this when working on the python-attrs[2] project,
> which has some good documentation for contributors on how to use it:
> https://www.attrs.org/en/stable/contributing.html#changelog
> >> >>
> >> >>
> >> >> [1] https://github.com/hawkowl/towncrier
> >> >> [2] https://github.com/python-attrs/attrs
> >> >>
> >> >>
> >> >> On Fri, Jan 31, 2020 at 5:09 PM Ahmet Altay 
> wrote:
> >> >>>
> >> >>> Thank you for the quick responses. I sent out
> https://github.com/apache/beam/pull/10743 to make this change. Please
> provide feedback or directly edit the PR.
> >> >>>
> >> >>>
> >> >>> On Fri, Jan 31, 2020 at 3:58 PM Robert Bradshaw <
> rober...@google.com> wrote:
> >> 
> >>  Yes, yes, yes! This is the one model of release notes that I've
> >>  actually seen work well at scale.
> >> 
> >> 
> https://lists.apache.org/thread.html/41e03ace17dbcccf7e267ba6d538736b2a99a8e73e7fb45702766b17%40%3Cdev.beam.apache.org%3E
> >> 
> >>  Let's make it happen.
> >> 
> >>  On Fri, Jan 31, 2020 at 3:47 PM Robert Burke 
> wrote:
> >>  >
> >>  > I like this suggestion, Jira titles and commit summaries don't
> necessarily reflect the user impact for a given change (or set of changes).
> Being able to see the Forest instead of the trees.
> >>  >
> >>  > On Fri, Jan 31, 2020, 3:37 PM Kenneth Knowles 
> wrote:
> >>  >>
> >>  >> +1
> >>  >>
> >>  >> This is a great idea. Hope it can lead to higher-value view of
> relevant changes.
> >>  >>
> >>  >> I like it being in the root of the repo, so it lives next to
> the code.
> >>  >>
> >>  >> Since the website is also markdown, it could be copied over
> directly at release time, so it can be browsed there, too.
> >>  >>
> >>  >> Kenn
> >>  >>
> >>  >> On Fri, Jan 31, 2020 at 3:16 PM Ahmet Altay 
> wrote:
> >>  >>>
> >>  >>> Hi all,
> >>  >>>
> >>  >>> We currently have two major ways to communicate changes in a
> release:
> >>  >>> - A blog post, to highlight major changes in the release.
> (Example for 2.17: [1])
> >>  >>> - JIRA release notes pages listing all issues tagged for a
> specific release. (Example for 2.17 [2]).
> >>  >>>
> >>  >>> There are a few issues with this process:
> >>  >>> - It is difficult for the release manager to know what is
> important, what is a breaking change, what is dependency change etc. For
> example, there were more than 150 Jira issues tagged for 2.17 release.
> >>  >>> - Release blog has many items, and does not necessarily
> communicate important changes. It is difficult for users to discover major
> changes short of going through a large list.
> >>  >>> - People involved in authoring or reviewing a PR usually have
> the most context about the change, and they are not necessarily involved in
> the release process to provide this additional information.
> >>  >>>
> >>  >>> Would it be helpful if we maintain a simple change list file
> and update it as part of the PRs with noteworthy changes? Release managers
> could use this information as is in their blog posts (or link to it). Users
> will have a single place to find highlights from various versions.
> >>  >>>
> >> 

Re: Jenkins jobs not running for my PR 10438

2020-01-31 Thread Udi Meiri
done

On Fri, Jan 31, 2020 at 9:07 AM Tomo Suzuki  wrote:

> Hi Beam committers,
>
> Would you re-trigger the 2 failed checks in
> https://github.com/apache/beam/pull/10714 ?
> Run Java PreCommit
> Run Java_Examples_Dataflow PreCommit
>
>
> On Fri, Jan 31, 2020 at 7:51 AM Rehman Murad Ali <
> rehman.murad...@venturedive.com> wrote:
>
>> Hi,
>>
>> I would appreciate it if someone could trigger the jobs for this PR:
>>
>> https://github.com/apache/beam/pull/10627
>>
>> *Thanks*
>>
>> *Rehman Murad Ali*
>> Software Engineer
>> Mobile: +92 3452076766 <+92%20345%202076766>
>> Skype: rehman.muradali
>>
>>
>> On Thu, Jan 30, 2020 at 10:19 PM Luke Cwik  wrote:
>>
>>> done
>>>
>>> On Thu, Jan 30, 2020 at 12:28 AM Shoaib Zafar <
>>> shoaib.za...@venturedive.com> wrote:
>>>
 Hi Beam Committer,

 I would appreciate it if someone could trigger jobs for
 https://github.com/apache/beam/pull/10712.

 Thanks!

 *Shoaib Zafar*

 Software Engineering Lead
 Mobile: +92 333 274 6242
 Skype: live:shoaibzafar_1

 


 On Thu, Jan 30, 2020 at 9:09 AM Boyuan Zhang 
 wrote:

> Done : )
>
> On Wed, Jan 29, 2020 at 7:52 PM Tomo Suzuki 
> wrote:
>
>> Hi Beam committers:
>> (Thanks, Luke!)
>>
>> Can somebody retrigger the following 2 failed checks for
>> https://github.com/apache/beam/pull/10714 ?
>> Run Java PreCommit
>> Run Java_Examples_Dataflow PreCommit
>>
>> On Wed, Jan 29, 2020 at 4:48 PM Luke Cwik  wrote:
>> >
>> > done
>> >
>> > On Wed, Jan 29, 2020 at 11:07 AM Tomo Suzuki 
>> wrote:
>> >>
>> >> Hi Beam committers,
>> >>
>> >> I would appreciate it if you could trigger the precommit checks for
>> >> https://github.com/apache/beam/pull/10714
>> >>
>> >> with the following 6 additional commands (one command per comment):
>> >>
>> >> Run Java PostCommit
>> >> Run Java HadoopFormatIO Performance Test
>> >> Run BigQueryIO Streaming Performance Test Java
>> >> Run Dataflow ValidatesRunner
>> >> Run Spark ValidatesRunner
>> >> Run SQL Postcommit
>> >>
>> >> Regards,
>> >> Tomo
>> >>
>> >>
>>
>>
>> --
>> Regards,
>> Tomo
>>
>
>
> --
> Regards,
> Tomo
>



Re: [ANNOUNCE] New committer: Hannah Jiang

2020-01-28 Thread Udi Meiri
Welcome and congrats Hannah!

On Tue, Jan 28, 2020 at 4:52 PM Robin Qiu  wrote:

> Congratulations, Hannah!
>
> On Tue, Jan 28, 2020 at 4:50 PM Alan Myrvold  wrote:
>
>> Congrats, Hannah
>>
>> On Tue, Jan 28, 2020 at 4:46 PM Connell O'Callaghan 
>> wrote:
>>
>>> Thank you for sharing Luke!!!
>>>
>>> Well done and congratulations Hannah!!
>>>
>>> On Tue, Jan 28, 2020 at 4:45 PM Heejong Lee  wrote:
>>>
 Congratulations! :)

 On Tue, Jan 28, 2020 at 4:43 PM Yichi Zhang  wrote:

> Congrats Hannah!
>
> On Tue, Jan 28, 2020 at 3:57 PM Yifan Zou  wrote:
>
>> Congratulations Hannah!!
>>
>> On Tue, Jan 28, 2020 at 3:55 PM Boyuan Zhang 
>> wrote:
>>
>>> Thanks for all your contributions! Congratulations~
>>>
>>> On Tue, Jan 28, 2020 at 3:44 PM Pablo Estrada 
>>> wrote:
>>>
 yoooho : D

 On Tue, Jan 28, 2020 at 3:21 PM Luke Cwik  wrote:

> Hi everyone,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Hannah Jiang
>
> Hannah has contributed to Beam in many ways, including work on
> building and releasing the Apache Beam SDK containers.
>
> In consideration of their contributions, the Beam PMC trusts them
> with the responsibilities of a Beam committer[1].
>
> Thanks for your contributions Hannah!
>
> Luke, on behalf of the Apache Beam PMC.
>
> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>




anyone working on updating com.google.cloud:google-cloud-spanner?

2020-01-28 Thread Udi Meiri
It's currently at 1.6.0, which is a year old.
https://github.com/googleapis/java-spanner/releases?after=1.9.0

I would appreciate any help with this.

Tracking issues:
https://issues.apache.org/jira/browse/BEAM-8758
https://issues.apache.org/jira/browse/BEAM-8682



[ANNOUNCE] Beam 2.18.0 Released

2020-01-28 Thread Udi Meiri
The Apache Beam team is pleased to announce the release of version 2.18.0.

Apache Beam is an open source unified programming model to define and
execute data processing pipelines, including ETL, batch and stream
(continuous) processing. See https://beam.apache.org

You can download the release here:

https://beam.apache.org/get-started/downloads/

This release includes bug fixes, features, and improvements detailed on
the Beam blog: https://beam.apache.org/blog/2020/01/13/beam-2.18.0.html

Thanks to everyone who contributed to this release, and we hope you enjoy
using Beam 2.18.0.
-- Udi Meiri, on behalf of The Apache Beam team
