Re: Python staging file weirdness

2019-12-04 Thread Udi Meiri
Thanks!
Another reason to periodically refresh workers.

On Wed, Nov 27, 2019 at 10:37 PM Valentyn Tymofieiev 
wrote:

> The test jobs specify [1] a requirements.txt file that contains two entries:
> pyhamcrest, mock.
>
> We download [2] the sources of the packages specified in the requirements
> file, and of the packages they depend on. While doing so, it appears that we
> use a cache directory on Jenkins to store the sources of the packages [3],
> perhaps to save a trip to PyPI and reduce PyPI flakiness? Then, we stage the
> entire cache directory [4], which includes all packages ever cached. Over
> time, the versions that our requirements packages need change, but I guess we
> don't clean the cache on Jenkins workers.
>
> [1]
> https://github.com/apache/beam/blob/438055c95116f4e6e419e5faa9c42f7d329c421c/sdks/python/scripts/run_integration_test.sh#L197
> [2]
> https://github.com/apache/beam/blob/438055c95116f4e6e419e5faa9c42f7d329c421c/sdks/python/apache_beam/runners/portability/stager.py#L469
> [3]
> https://github.com/apache/beam/blob/438055c95116f4e6e419e5faa9c42f7d329c421c/sdks/python/apache_beam/runners/portability/stager.py#L161
>
> [4]
> https://github.com/apache/beam/blob/438055c95116f4e6e419e5faa9c42f7d329c421c/sdks/python/apache_beam/runners/portability/stager.py#L172
>
> On Wed, Nov 27, 2019 at 11:55 AM Udi Meiri  wrote:
>
>> I was investigating a Dataflow postcommit test failure (endpoints_pb2
>> missing), and saw this in the staging directory:
>>
>> $ gsutil ls 
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/PyHamcrest-1.9.0.tar.gz
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/dataflow-worker.jar
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/dataflow_python_sdk.tar
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/funcsigs-1.0.2.tar.gz
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/mock-3.0.5.tar.gz
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/pipeline.pb
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/requirements.txt
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/setuptools-41.2.0.zip
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/setuptools-41.4.0.zip
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/setuptools-41.5.0.zip
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/setuptools-41.5.1.zip
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/setuptools-41.6.0.zip
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/setuptools-42.0.0.zip
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/setuptools-42.0.1.zip
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/six-1.12.0.tar.gz
>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/six-1.13.0.tar.gz
>>
>>
>> Does anyone know why so many versions of setuptools need to be staged?
>> Shouldn't 1 be enough?
>>
>




Re: Python staging file weirdness

2019-12-05 Thread Udi Meiri
Looking at the source, it seems that it should be using a
os.path.join(tempfile.gettempdir(), 'dataflow-requirements-cache')
to create a different tmp directory on each run.
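A quick check of that expression, as a sketch using only the standard library: `tempfile.gettempdir()` returns the system temp directory rather than a fresh one, so the joined path resolves to the same location on every run on a given machine. `tempfile.mkdtemp()` is shown only for contrast and is not something the stager is known to use; the `requirements-cache-` prefix is made up for the illustration.

```python
import os
import tempfile

# Path built from the expression quoted above.
shared_cache = os.path.join(tempfile.gettempdir(), 'dataflow-requirements-cache')

# gettempdir() returns the same directory on repeated calls,
# so every run on one machine resolves to the same cache path.
assert shared_cache == os.path.join(
    tempfile.gettempdir(), 'dataflow-requirements-cache')

# For contrast only: mkdtemp() creates a new, uniquely named directory per call.
unique_a = tempfile.mkdtemp(prefix='requirements-cache-')
unique_b = tempfile.mkdtemp(prefix='requirements-cache-')

print(shared_cache)
print(unique_a != unique_b)

# Clean up the two empty directories created for the comparison.
os.rmdir(unique_a)
os.rmdir(unique_b)
```

If per-run isolation were the goal, something along the lines of `mkdtemp()` would be needed, at the cost of losing the cache between runs.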

Also, sampling worker no. 2:

jenkins@apache-beam-jenkins-2:~$ ls -l /tmp/dataflow-requirements-cache/
total 7172
-rw-rw-r-- 1 jenkins jenkins  27947 Sep  6 22:46 funcsigs-1.0.2.tar.gz
-rw-rw-r-- 1 jenkins jenkins  28126 Sep  6 21:38 mock-3.0.5.tar.gz
-rw-rw-r-- 1 jenkins jenkins 376623 Sep  6 21:38 PyHamcrest-1.9.0.tar.gz
-rw-rw-r-- 1 jenkins jenkins 851251 Sep  6 21:38 setuptools-41.2.0.zip
-rw-rw-r-- 1 jenkins jenkins 855608 Oct  7 06:03 setuptools-41.4.0.zip
-rw-rw-r-- 1 jenkins jenkins 851068 Oct 28 06:10 setuptools-41.5.0.zip
-rw-rw-r-- 1 jenkins jenkins 851097 Oct 28 19:46 setuptools-41.5.1.zip
-rw-rw-r-- 1 jenkins jenkins 852541 Oct 29 14:06 setuptools-41.6.0.zip
-rw-rw-r-- 1 jenkins jenkins 852125 Nov 24 08:10 setuptools-42.0.0.zip
-rw-rw-r-- 1 jenkins jenkins 852264 Nov 25 20:55 setuptools-42.0.1.zip
-rw-rw-r-- 1 jenkins jenkins 858444 Dec  1 18:12 setuptools-42.0.2.zip
-rw-rw-r-- 1 jenkins jenkins  32725 Sep  6 21:38 six-1.12.0.tar.gz
-rw-rw-r-- 1 jenkins jenkins  33726 Nov  5 19:18 six-1.13.0.tar.gz
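One way to avoid staging every file in a cache like this would be to filter it before upload. The sketch below is an illustration only, not the stager's behavior: `newest_per_package` is a hypothetical helper, and "highest version wins" is an assumption, since the version pip actually resolved for a run may differ.

```python
import re

# File names copied from the listing above.
cached = [
    'funcsigs-1.0.2.tar.gz', 'mock-3.0.5.tar.gz', 'PyHamcrest-1.9.0.tar.gz',
    'setuptools-41.2.0.zip', 'setuptools-41.4.0.zip', 'setuptools-41.5.0.zip',
    'setuptools-41.5.1.zip', 'setuptools-41.6.0.zip', 'setuptools-42.0.0.zip',
    'setuptools-42.0.1.zip', 'setuptools-42.0.2.zip',
    'six-1.12.0.tar.gz', 'six-1.13.0.tar.gz',
]

# Matches "<name>-<dotted numeric version>.tar.gz" or ".zip".
SDIST_RE = re.compile(
    r'^(?P<name>.+)-(?P<version>\d+(?:\.\d+)*)\.(?:tar\.gz|zip)$')


def newest_per_package(filenames):
    """Hypothetical helper: keep only the highest version of each package."""
    best = {}
    for filename in filenames:
        match = SDIST_RE.match(filename)
        if not match:
            continue  # skip anything that does not look like name-version.ext
        name = match.group('name').lower()
        version = tuple(int(p) for p in match.group('version').split('.'))
        if name not in best or version > best[name][0]:
            best[name] = (version, filename)
    return sorted(filename for _, filename in best.values())


print(newest_per_package(cached))
```

A more faithful filter would stage exactly the files produced by the current download step rather than guessing from file names.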


On Wed, Dec 4, 2019 at 8:00 PM Luke Cwik  wrote:

> Can we filter the cache directory only for the artifacts that we want and
> not everything that is there?
>
> On Wed, Dec 4, 2019 at 6:56 PM Valentyn Tymofieiev 
> wrote:
>
>> Luke, I am not sure I understand the question. The caching that happens
>> here is implemented in the SDK for requirements packages:
>> https://github.com/apache/beam/blob/438055c95116f4e6e419e5faa9c42f7d329c421c/sdks/python/apache_beam/runners/portability/stager.py#L161
>>
>>
>> On Wed, Dec 4, 2019 at 6:19 PM Luke Cwik  wrote:
>>
>>> Is there a way to use a cache on disk that is separate from the set of
>>> packages we use as requirements?
>>>
>>> On Wed, Dec 4, 2019 at 5:58 PM Udi Meiri  wrote:
>>>
>>>> Thanks!
>>>> Another reason to periodically refresh workers.
>>>>
>>>> On Wed, Nov 27, 2019 at 10:37 PM Valentyn Tymofieiev <
>>>> valen...@google.com> wrote:
>>>>
>>>>> The test jobs specify [1] a requirements.txt file that contains two
>>>>> entries: pyhamcrest, mock.
>>>>>
>>>>> We download [2] the sources of the packages specified in the
>>>>> requirements file, and of the packages they depend on. While doing so,
>>>>> it appears that we use a cache directory on Jenkins to store the sources
>>>>> of the packages [3], perhaps to save a trip to PyPI and reduce PyPI
>>>>> flakiness? Then, we stage the entire cache directory [4], which includes
>>>>> all packages ever cached. Over time, the versions that our requirements
>>>>> packages need change, but I guess we don't clean the cache on Jenkins
>>>>> workers.
>>>>>
>>>>> [1]
>>>>> https://github.com/apache/beam/blob/438055c95116f4e6e419e5faa9c42f7d329c421c/sdks/python/scripts/run_integration_test.sh#L197
>>>>> [2]
>>>>> https://github.com/apache/beam/blob/438055c95116f4e6e419e5faa9c42f7d329c421c/sdks/python/apache_beam/runners/portability/stager.py#L469
>>>>> [3]
>>>>> https://github.com/apache/beam/blob/438055c95116f4e6e419e5faa9c42f7d329c421c/sdks/python/apache_beam/runners/portability/stager.py#L161
>>>>>
>>>>> [4]
>>>>> https://github.com/apache/beam/blob/438055c95116f4e6e419e5faa9c42f7d329c421c/sdks/python/apache_beam/runners/portability/stager.py#L172
>>>>>
>>>>> On Wed, Nov 27, 2019 at 11:55 AM Udi Meiri  wrote:
>>>>>
>>>>>> I was investigating a Dataflow postcommit test failure (endpoints_pb2
>>>>>> missing), and saw this in the staging directory:
>>>>>>
>>>>>> $ gsutil ls 
>>>>>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882
>>>>>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/PyHamcrest-1.9.0.tar.gz
>>>>>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/dataflow-worker.jar
>>>>>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/dataflow_python_sdk.tar
>>>>>> gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1126202146-314738.1574799706.314882/funcsigs-1.0.2.tar.gz
>>>>>>

Re: Python interactive runner: test dependencies removed

2019-12-05 Thread Udi Meiri
The pytest tasks are there for me (or someone else) to verify that they can
replace the nose ones.
If you make changes to tox environments, please make changes to the
corresponding -pytest env as well.

Regarding extras, go ahead and add "interactive" to the extras option
(both py3x and py3x-pytest targets please).
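For readers unfamiliar with extras, here is a minimal sketch of how optional dependency groups are usually declared for setuptools. The package name and dependency lists are placeholders, not Beam's actual requirements; the dict is built without calling `setup()` so the snippet can run on its own.

```python
# Sketch of setup() keyword arguments; names and dependencies are placeholders.
setup_kwargs = {
    'name': 'example-package',
    'install_requires': ['six'],
    # Optional groups, installed with e.g. `pip install example-package[test]`
    # or requested through the `extras` setting of a tox environment.
    'extras_require': {
        'test': ['pytest', 'pyhamcrest', 'mock'],
        'interactive': ['ipython'],
    },
}

# A real setup.py would end with: setuptools.setup(**setup_kwargs)
print(sorted(setup_kwargs['extras_require']))
```

Unlike `setup_requires` and `tests_require`, extras are resolved by pip at install time, which is why a tox environment has to list the extras it wants.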

On Thu, Dec 5, 2019 at 1:55 PM Ning Kang  wrote:

> Hi Udi,
>
> Are the temporary pytest tasks in use for pre-commit check or anything
> currently?
> I see there is still WIP for BEAM-3713
> <https://issues.apache.org/jira/browse/BEAM-3713>.
>
> There is only one task "pythonPreCommitPytest" depending on the pytest
> tasks using the pytest environment configs.
> And it's invoked here:
>
> PrecommitJobBuilder builderPytest = new PrecommitJobBuilder(
>     scope: this,
>     nameBase: 'Python_pytest',
>     gradleTask: ':pythonPreCommitPytest',
>     commitTriggering: false,
>     timeoutMins: 180,
> )
>
> builderPytest.build {...}
>
>
> On Wed, Dec 4, 2019 at 5:51 PM Ning Kang  wrote:
>
>> Thanks for the heads up! I was wondering why the interactive tests are
>> skipped, lol.
>> So we are moving away from the deprecated pytest-runner (with the changes
>> in setup.py) but still sticking to pytest since it's replacing nosetest.
>>
>> Can I add "interactive" as "extras" to testenv "py37-pytest" and
>> "py36-pytest" in tox.ini
>> <https://github.com/apache/beam/blob/master/sdks/python/tox.ini#L100>
>>  then?
>>
>> @Ahmet Altay  fyi
>>
>> On Wed, Dec 4, 2019 at 5:22 PM Pablo Estrada  wrote:
>>
>>> +Ning Kang  +Sam Rohde  fyi
>>>
>>> On Wed, Nov 27, 2019 at 5:09 PM Udi Meiri  wrote:
>>>
>>>> As part of a move to stop using the deprecated (and racy) setup.py
>>>> keywords setup_requires and tests_require, interactive runner dependencies
>>>> have been removed from tests in
>>>> https://github.com/apache/beam/pull/10227
>>>>
>>>> If this breaks any tests, please let me know.
>>>>
>>>




Re: [RELEASE] Tracking 2.18

2019-12-05 Thread Udi Meiri
Sorry Robert, the release was already cut yesterday.



On Thu, Dec 5, 2019 at 8:37 AM Ismaël Mejía  wrote:

> Colm, I just merged your PR and cherry picked it into 2.18.0
> https://github.com/apache/beam/pull/10296
>
> On Thu, Dec 5, 2019 at 10:54 AM jincheng sun 
> wrote:
>
>> Thanks for the Tracking Udi!
>>
>> I have updated the status of some release blocker issues as follows:
>>
>> - BEAM-8733 closed
>> - BEAM-8620 reset the fix version to 2.19
>> - BEAM-8618 reset the fix version to 2.19
>>
>> Best,
>> Jincheng
>>
>> On Thu, Dec 5, 2019 at 5:38 PM Colm O hEigeartaigh wrote:
>>
>>> Could we get this one in 2.18 as well?
>>> https://issues.apache.org/jira/browse/BEAM-8861
>>>
>>> Colm.
>>>
>>> On Wed, Dec 4, 2019 at 8:02 PM Udi Meiri  wrote:
>>>
>>>> Following the release calendar, I plan on cutting the 2.18 release
>>>> branch today.
>>>>
>>>> There are currently 8 release blockers
>>>> <https://issues.apache.org/jira/projects/BEAM/versions/12346383>.
>>>>
>>>>




Python: pytest migration update

2019-12-09 Thread Udi Meiri
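This PR <https://github.com/apache/beam/pull/10322> (in review) migrates
py27-gcp to using pytest.
It reduces the testPy2Gcp task down to ~13m
<https://scans.gradle.com/s/kj7ogemnd3toe/timeline?details=ancsbov425524>
(from ~45m). This speedup will probably be lower once all 8 tasks are using
pytest.
It also adds 5 previously uncollected tests.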




request for access: pypi and dockerhub

2019-12-09 Thread Udi Meiri
Hi,

I'm following the release guide
<https://beam.apache.org/contribute/release-guide/>, and it says I need
access to a couple of repos:
- apache_beam pypi project, need maintainer or owner grant, user: udim
- DockerHub Push Permission, need to be part of "maintainer team", user:
udim

Thanks!




Re: Python: pytest migration update

2019-12-09 Thread Udi Meiri
I have given this some thought and honestly don't know if splitting into
separate jobs will help.
- I have seen race conditions with running setuptools in parallel, so more
isolation is better. OTOH, if 2 setuptools scripts run at the same time on
the same machine they might still race (same homedir, same /tmp dir).
- Retrying due to flakes will be faster, but if something is broken you'll
need to write 4x the number of "run python precommit" phrases.
- Increased parallelism may also run into quota issues with Dataflow.

What benefits do you see from splitting up the jobs?

On Mon, Dec 9, 2019 at 4:17 PM Chad Dombrova  wrote:

> After this PR goes in should we revisit breaking up the python tests into
> separate jenkins jobs by python version?  One of the problems with that
> plan originally was that we lost the parallelism that gradle provides
> because we were left with only one tox task per jenkins job, and so the
> total time to complete all python jenkins jobs went up a lot.  With
> pytest + xdist we should hopefully be able to keep the parallelism even
> with just one tox task.  This could be a big win.  I feel like I'm spending
> more time monitoring and re-queuing timed-out jenkins jobs lately than I am
> writing code.
>
> On Mon, Dec 9, 2019 at 10:32 AM Udi Meiri  wrote:
>
>> This PR <https://github.com/apache/beam/pull/10322> (in review) migrates
>> py27-gcp to using pytest.
>> It reduces the testPy2Gcp task down to ~13m
>> <https://scans.gradle.com/s/kj7ogemnd3toe/timeline?details=ancsbov425524>
>> (from ~45m). This speedup will probably be lower once all 8 tasks are using
>> pytest.
>> It also adds 5 previously uncollected tests.
>>
>




Re: Python: pytest migration update

2019-12-09 Thread Udi Meiri
Valentyn, the speedup is due to parallelization.

On Mon, Dec 9, 2019 at 6:12 PM Chad Dombrova  wrote:

>
> On Mon, Dec 9, 2019 at 5:36 PM Udi Meiri  wrote:
>
>> I have given this some thought and honestly don't know if splitting into
>> separate jobs will help.
>> - I have seen race conditions with running setuptools in parallel, so
>> more isolation is better.
>>
>
> What race conditions have you seen?  I think if we're doing things right,
> this should not be happening, but I don't think we're doing things right.
> One thing that I've noticed is that we're building into the source
> directory, but I think we're also doing weird things like trying to
> copy the source directory beforehand.  I really think this system is
> tripping over many non-standard choices that have been made along the way.
> I have never seen these sorts of problems in unit tests that use tox, even
> when many are running in parallel.  I got pulled away from it, but I'm
> really hoping to address these issues here:
> https://github.com/apache/beam/pull/10038.
>

This comment
<https://issues.apache.org/jira/browse/BEAM-8481?focusedCommentId=16988369&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16988369>
summarizes what I believe may be the issue (setuptools races).

I believe copying the source directory was done in an effort to isolate the
parallel builds (setuptools, cythonize).


>
>> What benefits do you see from splitting up the jobs?
>>
>
> The biggest problem is that the jobs are doing too much and take too
> long.  This simple fact compounds all of the other problems.  It seems
> pretty obvious that we need to do less in each job, as long as the sum of
> all of these smaller jobs is not substantially longer than the one
> monolithic job.
>
> Benefits:
>
> - failures specific to a particular python version will be easier to spot
> in the jenkins error summary, and cheaper to re-queue.  right now the
> jenkins report mushes all of the failures together in a way that makes it
> nearly impossible to tell which python version they correspond to.  only
> the gradle scan gives you this insight, but it doesn't break the errors by
> test.
>

I agree Jenkins handles duplicate test names pretty badly (reloading will
periodically give you a different result).
With pytest I've been able to set the suite name so that should help with
identification. (I need to add pytest*.xml collection to the Jenkins job
first)
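As a sketch of the suite-name idea, the helper below assembles a pytest command line with a per-environment report name. `pytest_command` and the report file pattern are assumptions made for illustration; `-n` comes from the pytest-xdist plugin, while `--junitxml` and the `junit_suite_name` ini option are standard pytest settings.

```python
def pytest_command(suite_name, workers='auto', report_dir='.'):
    """Hypothetical helper that assembles a pytest command line."""
    return [
        'pytest',
        # pytest-xdist: number of worker processes ('auto' = one per CPU).
        '-n', str(workers),
        # Write a JUnit-style XML report that Jenkins can collect.
        '--junitxml={}/pytest_{}.xml'.format(report_dir, suite_name),
        # Override the suite name recorded inside the XML report.
        '-o', 'junit_suite_name={}'.format(suite_name),
    ]


print(' '.join(pytest_command('py27-gcp')))
```

With a distinct suite name per tox environment, identical test names from different Python versions should no longer collapse into one entry in the report.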


> - failures common to all python versions will be reported to the user
> earlier, at which point they can cancel the other jobs if desired.  *this
> is by far the biggest benefit. * why wait for 2 hours to see the same
> failure reported for 5 versions of python?  if that had run on one version
> of python I could maybe see that error in 30 minutes (while potentially
> other python versions waited in the queue).  Repeat for each change pushed.
> - flaky jobs will be cheaper to requeue (since it will affect a
> smaller/shorter job)
> - if xdist is giving us the parallel boost we're hoping for we should get
> under the 2 hour mark every time
>
> Basically we're talking about getting feedback to users faster.
>

+1


>
> I really don't mind pasting a few more phrases if it means faster feedback.
>
> -chad
>
>
>
>
>>
>> On Mon, Dec 9, 2019 at 4:17 PM Chad Dombrova  wrote:
>>
>>> After this PR goes in should we revisit breaking up the python tests
>>> into separate jenkins jobs by python version?  One of the problems with
>>> that plan originally was that we lost the parallelism that gradle provides
>>> because we were left with only one tox task per jenkins job, and so the
>>> total time to complete all python jenkins jobs went up a lot.  With
>>> pytest + xdist we should hopefully be able to keep the parallelism even
>>> with just one tox task.  This could be a big win.  I feel like I'm spending
>>> more time monitoring and re-queuing timed-out jenkins jobs lately than I am
>>> writing code.
>>>
>>> On Mon, Dec 9, 2019 at 10:32 AM Udi Meiri  wrote:
>>>
>>>> This PR <https://github.com/apache/beam/pull/10322> (in review)
>>>> migrates py27-gcp to using pytest.
>>>> It reduces the testPy2Gcp task down to ~13m
>>>> <https://scans.gradle.com/s/kj7ogemnd3toe/timeline?details=ancsbov425524>
>>>> (from ~45m). This speedup will probably be lower once all 8 tasks are using
>>>> pytest.
>>>> It also adds 5 previously uncollected tests.
>>>>
>>>




Re: [RELEASE] Tracking 2.18

2019-12-10 Thread Udi Meiri
Re: cherrypicks on top of the release-2.18.0 branch
The precommit tests are failing most likely due to some integration tests
(wordcount, etc.) that are expecting the new 2.18 worker on Dataflow.
I'm working on building an initial version of that worker so that the tests
may pass.

On Thu, Dec 5, 2019 at 4:39 PM Robert Bradshaw  wrote:

> Yeah, so I saw...
>
> On Thu, Dec 5, 2019 at 4:31 PM Udi Meiri  wrote:
> >
> > Sorry Robert the release was already cut yesterday.
> >
> >
> >
> > On Thu, Dec 5, 2019 at 8:37 AM Ismaël Mejía  wrote:
> >>
> >> Colm, I just merged your PR and cherry picked it into 2.18.0
> >> https://github.com/apache/beam/pull/10296
> >>
> >> On Thu, Dec 5, 2019 at 10:54 AM jincheng sun 
> wrote:
> >>>
> >>> Thanks for the Tracking Udi!
> >>>
> >>> I have updated the status of some release blocker issues as follows:
> >>>
> >>> - BEAM-8733 closed
> >>> - BEAM-8620 reset the fix version to 2.19
> >>> - BEAM-8618 reset the fix version to 2.19
> >>>
> >>> Best,
> >>> Jincheng
> >>>
> >>> On Thu, Dec 5, 2019 at 5:38 PM Colm O hEigeartaigh wrote:
> >>>>
> >>>> Could we get this one in 2.18 as well?
> https://issues.apache.org/jira/browse/BEAM-8861
> >>>>
> >>>> Colm.
> >>>>
> >>>> On Wed, Dec 4, 2019 at 8:02 PM Udi Meiri  wrote:
> >>>>>
> >>>>> Following the release calendar, I plan on cutting the 2.18 release
> branch today.
> >>>>>
> >>>>> There are currently 8 release blockers.
> >>>>>
>




Re: Hello Beam Developers!

2019-12-10 Thread Udi Meiri
Hi Daniel!
What is your Jira ID?

On Mon, Dec 9, 2019 at 2:15 PM Daniel Collins  wrote:

> My name is Daniel, I'm a developer on the Cloud Pub/Sub Team at Google in
> New York.
>
> I'm looking to make some contributions to the Cloud Pub/Sub integration
> with beam. I was hoping that I could be added to the JIRA as a contributor.
>
> Looking forward to working with everyone!
>
>




Re: Python: pytest migration update

2019-12-10 Thread Udi Meiri
On Mon, Dec 9, 2019 at 9:33 PM Kenneth Knowles  wrote:

>
>
> On Mon, Dec 9, 2019 at 6:34 PM Udi Meiri  wrote:
>
>> Valentyn, the speedup is due to parallelization.
>>
>> On Mon, Dec 9, 2019 at 6:12 PM Chad Dombrova  wrote:
>>
>>>
>>> On Mon, Dec 9, 2019 at 5:36 PM Udi Meiri  wrote:
>>>
>>>> I have given this some thought and honestly don't know if splitting into
>>>> separate jobs will help.
>>>> - I have seen race conditions with running setuptools in parallel, so
>>>> more isolation is better.
>>>>
>>>
>>> What race conditions have you seen?  I think if we're doing things
>>> right, this should not be happening, but I don't think we're doing things
>>> right. One thing that I've noticed is that we're building into the source
>>> directory, but I think we're also doing weird things like trying to
>>> copy the source directory beforehand.  I really think this system is
>>> tripping over many non-standard choices that have been made along the way.
>>> I have never seen these sorts of problems in unit tests that use tox, even
>>> when many are running in parallel.  I got pulled away from it, but I'm
>>> really hoping to address these issues here:
>>> https://github.com/apache/beam/pull/10038.
>>>
>>
>> This comment
>> <https://issues.apache.org/jira/browse/BEAM-8481?focusedCommentId=16988369&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16988369>
>> summarizes what I believe may be the issue (setuptools races).
>>
>> I believe copying the source directory was done in an effort to isolate
>> the parallel builds (setuptools, cythonize).
>>
>
> Peanut gallery: containerized Jenkins builds seem like they would help,
> and they are the current recommended best practice, but we are not there
> yet. Agree/disagree?
>

I'm okay with containerized Jenkins builds as long as using pytest/tox
directly still works.


>
>>>> What benefits do you see from splitting up the jobs?
>>>>
>>>
>>> The biggest problem is that the jobs are doing too much and take too
>>> long.  This simple fact compounds all of the other problems.  It seems
>>> pretty obvious that we need to do less in each job, as long as the sum of
>>> all of these smaller jobs is not substantially longer than the one
>>> monolithic job.
>>>
>>
> For some reason I keep forgetting the answer to this question: are we
> caching pypi immutable artifacts on every Jenkins worker?
>

I don't know.


>
>>
>>> Benefits:
>>>
>>> - failures specific to a particular python version will be easier to
>>> spot in the jenkins error summary, and cheaper to re-queue.  right now the
>>> jenkins report mushes all of the failures together in a way that makes it
>>> nearly impossible to tell which python version they correspond to.  only
>>> the gradle scan gives you this insight, but it doesn't break the errors by
>>> test.
>>>
>>
>> I agree Jenkins handles duplicate test names pretty badly (reloading will
>> periodically give you a different result).
>>
>
> Saw this in Java too w/ ValidatesRunner suites when they ran in one
> Jenkins job. Worthwhile to avoid.
>
> Kenn
>
>
>> With pytest I've been able to set the suite name so that should help with
>> identification. (I need to add pytest*.xml collection to the Jenkins job
>> first)
>>
>>
>>> - failures common to all python versions will be reported to the user
>>> earlier, at which point they can cancel the other jobs if desired.  *this
>>> is by far the biggest benefit. * why wait for 2 hours to see the same
>>> failure reported for 5 versions of python?  if that had run on one version
>>> of python I could maybe see that error in 30 minutes (while potentially
>>> other python versions waited in the queue).  Repeat for each change pushed.
>>> - flaky jobs will be cheaper to requeue (since it will affect a
>>> smaller/shorter job)
>>> - if xdist is giving us the parallel boost we're hoping for we should
>>> get under the 2 hour mark every time
>>>
>>> Basically we're talking about getting feedback to users faster.
>>>
>>
>> +1
>>
>>
>>>
>>> I really don't mind pasting a few more phrases if it means faster
>>> feedback.
>>>
>>> -chad
>>>
>>>
>>>
>>>

Re: Cython unit test suites running without Cythonized sources

2019-12-10 Thread Udi Meiri
To follow up, since I'm trying to run cython-based tests using pytest:
- tox does in fact correctly install apache-beam with cythonized modules in
its virtualenv.
- Since our tests are under apache_beam/, local sources shadow those in the
installed apache_beam package.
- The original issue I raised (BEAM-8572
<https://issues.apache.org/jira/browse/BEAM-8572>) was due to bad usage
of utils.check_compiled().

So I'm going to add an additional "python setup.py build_ext --inplace" to
pyXX-cython-pytest envs.
If we want to remove this additional step we'll probably have to move tests
under a separate directory that is not part of the apache_beam package.
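The shadowing effect can be reproduced in isolation. This is a minimal sketch, assuming that the test runner's working directory ends up ahead of site-packages on `sys.path`; `demo_pkg` and both temporary directories are stand-ins, not Beam paths.

```python
import importlib
import os
import sys
import tempfile


def write_package(root, marker):
    """Create root/demo_pkg/__init__.py that records where it came from."""
    pkg_dir = os.path.join(root, 'demo_pkg')
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
        f.write('ORIGIN = {!r}\n'.format(marker))


source_tree = tempfile.mkdtemp()    # stands in for the source checkout
site_packages = tempfile.mkdtemp()  # stands in for the tox virtualenv
write_package(source_tree, 'local sources')
write_package(site_packages, 'installed package')

# Model the search order: the checkout is consulted before site-packages.
sys.path.insert(0, site_packages)
sys.path.insert(0, source_tree)
importlib.invalidate_caches()

demo_pkg = importlib.import_module('demo_pkg')
print(demo_pkg.ORIGIN)    # the first match on sys.path wins
print(demo_pkg.__file__)  # shows which copy was actually imported
```

Printing `__file__` for a module under test is a cheap way to confirm which copy a test run is really exercising.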

See also:
https://github.com/tox-dev/tox/issues/514
https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure

On Mon, Nov 11, 2019 at 3:21 PM Ahmet Altay  wrote:

> Thank you for spending time on this to clarify it for all of us! Much
> appreciated.
>
> On Sun, Nov 10, 2019 at 3:45 PM Chad Dombrova  wrote:
>
>> Hi all,
>>
>>
>>> The sdist step creates a package that should be installed into each
>>> tox environment. If the tox environment has cython when this apache
>>> beam package is installed, it should be used. Nose (or whatever)
>>> should then run the tests.
>>>
>> I spent some time this weekend trying to understand the Beam python build
>> process, and here’s an overview of what I’ve learned:
>>
>>- the :sdks:python:sdist gradle task creates the source tarball (no
>>surprises there)
>>   - the protobuf stubs are generated during this process
>>- the sdist is provided to tox, which installs it into the
>>virtualenv for that task
>>- for *-cython tasks, tox installs the cython dep and, as Ahmet
>>asserted, python setup.py nosetests performs the cythonization.
>>   - this cythonized build overrides the one installed by tox
>>
>> Here’s what I learned about the current status of tests wrt cython:
>>
>>- cython tox tasks *are* using cython (good!)
>>- non-cython tox tasks *are not* using cython (good!)
>>- none of the GCP or integration tests are using cython (bad?)
>>
>> This is intentional with the recent change to drop base only tests.
> Otherwise we would not have coverage for non-cythonized tests.
>
>>
>>- This is because the build is only cythonized when python setup.py
>>   nosetests is used in conjunction with tox (tox installs cython, python
>>   setup.py nosetests compiles it).
>>   - GCP tests don't install cython.  ITs don't use tox.
>>
>> To confirm my understanding of this, I created a PR [1] to assert that a
>> cythonized or pure-python build is being used.  A cythonized build is
>> expected by default on linux systems unless a special flag is provided to
>> inform the test otherwise.  It appears as though the portable tests passed
>> (i.e. used cython), but I forgot to add the assertion for those; like the
>> other ITs they are not using cython.
>>
>> *Questions:*
>>
>>- Is the lack of cython for ITs expected and/or desired?
>>- Why aren't ITs using tox?  It's quite possible to pass arguments
>>into tox to control its behavior.  For example, it seems reasonable that
>>run_integration_test.sh could be inside tox
>>
>>
> For ITs the primary execution will happen on a remote system. As long as
> those remote workers have cython installed the tests will run in a
> cythonized environment. This is true for any IT tests, that runs in a
> container based on the Dockerfile we have in beam (which installs cython),
> and also true for legacy Dataflow environments that are not yet using this
> Beam provided Dockerfile.
>
>
>>
>> *Next Steps:* There has been some movement in the python community to
>> solve problems around build dependencies [2] and toolchains [3].  I hope to
>> have a proposal for how to simplify this process soon.
>>
> Thank you!
>
> [1] https://github.com/apache/beam/pull/10058
>> [2] https://www.python.org/dev/peps/pep-0517/
>> [3] https://www.python.org/dev/peps/pep-0518/
>>
>> -chad
>>
>




Re: request for access: pypi and dockerhub

2019-12-10 Thread Udi Meiri
Thank you both!

On Tue, Dec 10, 2019 at 1:58 PM Ahmet Altay  wrote:

> Udi, added 'udim' as a maintainer to pypi.
> Pablo, I updated you to be an owner. You should have privileges now.
>
> Ahmet
>
> On Tue, Dec 10, 2019 at 12:01 PM Pablo Estrada  wrote:
>
>> I've added you as a maintainer in docker hub.
>> I don't have privileges to add you in pypi. +Ahmet Altay
>>  can you?
>> -P.
>>
>> On Mon, Dec 9, 2019 at 4:34 PM Udi Meiri  wrote:
>>
>>> Hi,
>>>
>>> I'm following the release guide
>>> <https://beam.apache.org/contribute/release-guide/>, and it says I need
>>> access to a couple of repos:
>>> - apache_beam pypi project, need maintainer or owner grant, user: udim
>>> - DockerHub Push Permission, need to be part of "maintainer team", user:
>>> udim
>>>
>>> Thanks!
>>>
>>




Re: Cython unit test suites running without Cythonized sources

2019-12-10 Thread Udi Meiri
Sorry, I didn't realize you already had a solution for the shadowing issue
and BEAM-8572.

On Tue, Dec 10, 2019 at 6:21 PM Chad Dombrova  wrote:

> Hi Udi, I know you're aware of my PR
> <https://github.com/apache/beam/pull/10038>, but I really encourage you
> to look into pep517 and pep518.  They are the new solution for all of this
> -- declaring build dependencies and creating isolated out-of-source builds
> e.g. using tox.  Another thing I added to my PR is something which asserts
> that cython is *or is not* enabled based on an env var set in the tox
> config.  In other words, we're currently skipping tests when cython is not
> installed or when code is not cythonized, but we're not asserting whether
> we *expect* that code to be cythonized or not.  Also there are a few bugs
> where cython tests don't run at all because we're identifying the wrong
> module name to check for cythonization ("apache_beam.coders" which is a
> package).
>
> -chad
>
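The "assert what we expect" idea described above could look roughly like the following. This is a sketch, not the code from the PR: `is_compiled`, `check_expectation`, and the `EXPECT_COMPILED` variable are hypothetical names, and the standard-library `json` package is used only because it is pure Python.

```python
import importlib
import importlib.machinery
import os


def is_compiled(module_name):
    """True if the module was loaded from a C extension file."""
    module = importlib.import_module(module_name)
    path = getattr(module, '__file__', None) or ''
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


def check_expectation(module_name, env_var='EXPECT_COMPILED'):
    """Hypothetical check: compare the actual state with an env var ('1'/'0')."""
    expected = os.environ.get(env_var)
    if expected is None:
        return None  # no expectation declared, nothing to assert
    actual = is_compiled(module_name)
    assert actual == (expected == '1'), (
        '{} compiled={} but {}={}'.format(
            module_name, actual, env_var, expected))
    return actual


# A tox environment would set the variable; it is set here for the demo.
os.environ['EXPECT_COMPILED'] = '0'
print(check_expectation('json'))  # json is a pure-Python package
```

Note that a package name normally resolves to its `__init__` file, so a check aimed at a package rather than a specific module tends to report "not compiled" even when its submodules are extensions.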
>
> On Tue, Dec 10, 2019 at 6:03 PM Udi Meiri  wrote:
>
>> To follow up, since I'm trying to run cython-based tests using pytest:
>> - tox does in fact correctly install apache-beam with cythonized modules
>> in its virtualenv.
>> - Since our tests are under apache_beam/, local sources shadow those in
>> the installed apache_beam package.
>> - The original issue I raised (BEAM-8572
>> <https://issues.apache.org/jira/browse/BEAM-8572>) was due to bad usage
>> of utils.check_compiled().
>>
>> So I'm going to add an additional "python setup.py build_ext --inplace"
>> to pyXX-cython-pytest envs.
>> If we want to remove this additional step we'll probably have to move
>> tests under a separate directory that is not part of the apache_beam
>> package.
>>
>> See also:
>> https://github.com/tox-dev/tox/issues/514
>> https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure
>>
>> On Mon, Nov 11, 2019 at 3:21 PM Ahmet Altay  wrote:
>>
>>> Thank you for spending time on this to clarify it for all of us! Much
>>> appreciated.
>>>
>>> On Sun, Nov 10, 2019 at 3:45 PM Chad Dombrova  wrote:
>>>
>>>> Hi all,
>>>>
>>>>
>>>>> The sdist step creates a package that should be installed into each
>>>>> tox environment. If the tox environment has cython when this apache
>>>>> beam package is installed, it should be used. Nose (or whatever)
>>>>> should then run the tests.
>>>>>
>>>> I spent some time this weekend trying to understand the Beam python
>>>> build process, and here’s an overview of what I’ve learned:
>>>>
>>>> - the :sdks:python:sdist gradle task creates the source tarball (no
>>>>   surprises there)
>>>>    - the protobuf stubs are generated during this process
>>>> - the sdist is provided to tox, which installs it into the
>>>>   virtualenv for that task
>>>> - for *-cython tasks, tox installs the cython dep and, as Ahmet
>>>>   asserted, python setup.py nosetests performs the cythonization.
>>>>    - this cythonized build overrides the one installed by tox
>>>>
>>>> Here’s what I learned about the current status of tests wrt cython:
>>>>
>>>> - cython tox tasks *are* using cython (good!)
>>>> - non-cython tox tasks *are not* using cython (good!)
>>>> - none of the GCP or integration tests are using cython (bad?)
>>>>
>>>> This is intentional with the recent change to drop base only tests.
>>> Otherwise we would not have coverage for non-cythonized tests.
>>>
>>>>
>>>>    - This is because the build is only cythonized when python setup.py
>>>>      nosetests is used in conjunction with tox (tox installs cython,
>>>>      python setup.py nosetests compiles it).
>>>>    - GCP tests don't install cython.  ITs don't use tox.
>>>>
>>>> To confirm my understanding of this, I created a PR [1] to assert that
>>>> a cythonized or pure-python build is being used.  A cythonized build is
>>>> expected by default on linux systems unless a special flag is provided to
>>>> inform the test otherwise.  It appears as though the portable tests passed
>>>> (i.e. used cython), but I forgot to add the assertion for those; like the
>>>> other ITs they are not using cython.
>>>>
>>>> *Questions:*
>>>>

Re: Cython unit test suites running without Cythonized sources

2019-12-11 Thread Udi Meiri
The `changedir = {envsitepackagesdir}` setting is definitely something I
haven't thought of.
It solves the shadowing issue without needing to split tests and packages
from one another. (though I still think it's unnecessary to include tests
in the published package)

IIUC, isolated_build=True and the removal of setup.py invocation in the
current virtualenv should eliminate any Cython output files in the repo,
and no need for run_tox_cleanup.sh?
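
A minimal sketch of how the two failure modes discussed in this thread could be detected from inside a test run: local sources shadowing the installed package, and a "is it cythonized?" check pointed at a package instead of a module. The helper names below are hypothetical and are not Beam's utils.check_compiled(); the sketch only assumes that a compiled module's __file__ ends in one of the interpreter's extension suffixes.

```python
import importlib
import importlib.machinery
import os


def module_origin(module_name):
    """Return the absolute path a module was imported from (None if it has no file)."""
    module = importlib.import_module(module_name)
    path = getattr(module, "__file__", None)
    return os.path.abspath(path) if path else None


def is_extension_module(module_name):
    """True if the module was loaded from a compiled extension (.so / .pyd)."""
    path = module_origin(module_name)
    if path is None:
        return False  # built-in or namespace package: no file to inspect
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


def is_shadowed_by(module_name, source_root):
    """True if the imported module lives under source_root (e.g. the repo checkout)."""
    path = module_origin(module_name)
    if path is None:
        return False
    root = os.path.abspath(source_root)
    return os.path.commonpath([path, root]) == root
```

Note that a package name resolves to its __init__.py, so is_extension_module() on a package reports False even when the modules inside it are compiled; the check has to name a concrete module.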


On Wed, Dec 11, 2019 at 9:38 AM Chad Dombrova  wrote:

> Hi Udi,
>
>> Sorry I didn't realize you already had a solution for the shadowing issue
>> and BEAM-8572.
>>
>
> No worries at all.  I haven't had much time to invest into that PR lately
> (most of it I did at home on my own time), but I did get past most of the
> major issues.  You've been working on so many of the same problems I was
> trying to solve there, and so far you've been coming to the same
> conclusions independently (e.g. removing pytest-runner and
> setup_requires).   It's great to have that validation, and it's helped
> reduce the scope of my PR.  Moving forward, I would love to team up on
> this.  Happy to answer any questions you have about the approach I took.
>
> -chad
>
>
>


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [RELEASE] Tracking 2.18

2019-12-12 Thread Udi Meiri
Update: I'm accepting cherrypicks with failing tests if the corresponding
PRs have passed them on master.

I recall (without proof) that in the past, even with released worker
containers for the in-process release, ITs against the release branch
still failed.

On Tue, Dec 10, 2019 at 10:58 AM Udi Meiri  wrote:

> Re: cherrypicks on top of the release-2.18.0 branch
> The precommit tests are failing most likely due to some integration tests
> (wordcount, etc.) that are expecting the new 2.18 worker on Dataflow.
> I'm working on building an initial version of that worker so that the
> tests may pass.
>
> On Thu, Dec 5, 2019 at 4:39 PM Robert Bradshaw 
> wrote:
>
>> Yeah, so I saw...
>>
>> On Thu, Dec 5, 2019 at 4:31 PM Udi Meiri  wrote:
>> >
>> > Sorry Robert the release was already cut yesterday.
>> >
>> >
>> >
>> > On Thu, Dec 5, 2019 at 8:37 AM Ismaël Mejía  wrote:
>> >>
>> >> Colm, I just merged your PR and cherry picked it into 2.18.0
>> >> https://github.com/apache/beam/pull/10296
>> >>
>> >> On Thu, Dec 5, 2019 at 10:54 AM jincheng sun 
>> wrote:
>> >>>
>> >>> Thanks for the Tracking Udi!
>> >>>
>> >>> I have updated the status of some release blockers issues as follows:
>> >>>
>> >>> - BEAM-8733 closed
>> >>> - BEAM-8620 reset the fix version to 2.19
>> >>> - BEAM-8618 reset the fix version to 2.19
>> >>>
>> >>> Best,
>> >>> Jincheng
>> >>>
>> >>> On Thu, Dec 5, 2019 at 5:38 PM Colm O hEigeartaigh  wrote:
>> >>>>
>> >>>> Could we get this one in 2.18 as well?
>> https://issues.apache.org/jira/browse/BEAM-8861
>> >>>>
>> >>>> Colm.
>> >>>>
>> >>>> On Wed, Dec 4, 2019 at 8:02 PM Udi Meiri  wrote:
>> >>>>>
>> >>>>> Following the release calendar, I plan on cutting the 2.18 release
>> branch today.
>> >>>>>
>> >>>>> There are currently 8 release blockers.
>> >>>>>
>>
>


Re: [RELEASE] Tracking 2.18

2019-12-12 Thread Udi Meiri
Just merged 6 PRs. :)

On Thu, Dec 12, 2019 at 4:52 PM Udi Meiri  wrote:

> Update: I'm accepting cherrypicks with failing tests if the corresponding
> PRs have passed them on master.
>
> I recall (without proof) that in the past, even with released worker
> containers for the in-process release, ITs against the release branch
> still failed.
>
> On Tue, Dec 10, 2019 at 10:58 AM Udi Meiri  wrote:
>
>> Re: cherrypicks on top of the release-2.18.0 branch
>> The precommit tests are failing most likely due to some integration tests
>> (wordcount, etc.) that are expecting the new 2.18 worker on Dataflow.
>> I'm working on building an initial version of that worker so that the
>> tests may pass.
>>
>> On Thu, Dec 5, 2019 at 4:39 PM Robert Bradshaw 
>> wrote:
>>
>>> Yeah, so I saw...
>>>
>>> On Thu, Dec 5, 2019 at 4:31 PM Udi Meiri  wrote:
>>> >
>>> > Sorry Robert the release was already cut yesterday.
>>> >
>>> >
>>> >
>>> > On Thu, Dec 5, 2019 at 8:37 AM Ismaël Mejía  wrote:
>>> >>
>>> >> Colm, I just merged your PR and cherry picked it into 2.18.0
>>> >> https://github.com/apache/beam/pull/10296
>>> >>
>>> >> On Thu, Dec 5, 2019 at 10:54 AM jincheng sun <
>>> sunjincheng...@gmail.com> wrote:
>>> >>>
>>> >>> Thanks for the Tracking Udi!
>>> >>>
>>> >>> I have updated the status of some release blockers issues as follows:
>>> >>>
>>> >>> - BEAM-8733 closed
>>> >>> - BEAM-8620 reset the fix version to 2.19
>>> >>> - BEAM-8618 reset the fix version to 2.19
>>> >>>
>>> >>> Best,
>>> >>> Jincheng
>>> >>>
>>> >>> On Thu, Dec 5, 2019 at 5:38 PM Colm O hEigeartaigh  wrote:
>>> >>>>
>>> >>>> Could we get this one in 2.18 as well?
>>> https://issues.apache.org/jira/browse/BEAM-8861
>>> >>>>
>>> >>>> Colm.
>>> >>>>
>>> >>>> On Wed, Dec 4, 2019 at 8:02 PM Udi Meiri  wrote:
>>> >>>>>
>>> >>>>> Following the release calendar, I plan on cutting the 2.18 release
>>> branch today.
>>> >>>>>
>>> >>>>> There are currently 8 release blockers.
>>> >>>>>
>>>
>>


Re: [RELEASE] Tracking 2.18

2019-12-12 Thread Udi Meiri
Also marked 3 Jiras from these cherrypicks as blockers.
Current open blocker count: 7
<https://issues.apache.org/jira/projects/BEAM/versions/12346383>.

On Thu, Dec 12, 2019 at 5:21 PM Udi Meiri  wrote:

> Just merged 6 PRs. :)
>
> On Thu, Dec 12, 2019 at 4:52 PM Udi Meiri  wrote:
>
>> Update: I'm accepting cherrypicks with failing tests if the corresponding
>> PRs have passed them on master.
>>
>> I recall (without proof) that in the past, even with released worker
>> containers for the in-process release, ITs against the release branch
>> still failed.
>>
>> On Tue, Dec 10, 2019 at 10:58 AM Udi Meiri  wrote:
>>
>>> Re: cherrypicks on top of the release-2.18.0 branch
>>> The precommit tests are failing most likely due to some integration
>>> tests (wordcount, etc.) that are expecting the new 2.18 worker on Dataflow.
>>> I'm working on building an initial version of that worker so that the
>>> tests may pass.
>>>
>>> On Thu, Dec 5, 2019 at 4:39 PM Robert Bradshaw 
>>> wrote:
>>>
>>>> Yeah, so I saw...
>>>>
>>>> On Thu, Dec 5, 2019 at 4:31 PM Udi Meiri  wrote:
>>>> >
>>>> > Sorry Robert the release was already cut yesterday.
>>>> >
>>>> >
>>>> >
>>>> > On Thu, Dec 5, 2019 at 8:37 AM Ismaël Mejía 
>>>> wrote:
>>>> >>
>>>> >> Colm, I just merged your PR and cherry picked it into 2.18.0
>>>> >> https://github.com/apache/beam/pull/10296
>>>> >>
>>>> >> On Thu, Dec 5, 2019 at 10:54 AM jincheng sun <
>>>> sunjincheng...@gmail.com> wrote:
>>>> >>>
>>>> >>> Thanks for the Tracking Udi!
>>>> >>>
>>>> >>> I have updated the status of some release blockers issues as
>>>> follows:
>>>> >>>
>>>> >>> - BEAM-8733 closed
>>>> >>> - BEAM-8620 reset the fix version to 2.19
>>>> >>> - BEAM-8618 reset the fix version to 2.19
>>>> >>>
>>>> >>> Best,
>>>> >>> Jincheng
>>>> >>>
>>>> >>> On Thu, Dec 5, 2019 at 5:38 PM Colm O hEigeartaigh  wrote:
>>>> >>>>
>>>> >>>> Could we get this one in 2.18 as well?
>>>> https://issues.apache.org/jira/browse/BEAM-8861
>>>> >>>>
>>>> >>>> Colm.
>>>> >>>>
>>>> >>>> On Wed, Dec 4, 2019 at 8:02 PM Udi Meiri  wrote:
>>>> >>>>>
>>>> >>>>> Following the release calendar, I plan on cutting the 2.18
>>>> release branch today.
>>>> >>>>>
>>>> >>>>> There are currently 8 release blockers.
>>>> >>>>>
>>>>
>>>


Re: Python: pytest migration update

2019-12-13 Thread Udi Meiri
Update: 37m <https://scans.gradle.com/s/hqcvbxm2h6svg/timeline>
precommit time with the latest PR
<https://github.com/apache/beam/pull/10377> (in review).

On Tue, Dec 10, 2019 at 11:21 AM Udi Meiri  wrote:

>
>
> On Mon, Dec 9, 2019 at 9:33 PM Kenneth Knowles  wrote:
>
>>
>>
>> On Mon, Dec 9, 2019 at 6:34 PM Udi Meiri  wrote:
>>
>>> Valentyn, the speedup is due to parallelization.
>>>
>>> On Mon, Dec 9, 2019 at 6:12 PM Chad Dombrova  wrote:
>>>
>>>>
>>>> On Mon, Dec 9, 2019 at 5:36 PM Udi Meiri  wrote:
>>>>
>>>>> I have given this some thought and honestly don't know if splitting into
>>>>> separate jobs will help.
>>>>> - I have seen race conditions with running setuptools in parallel, so
>>>>> more isolation is better.
>>>>>
>>>>
>>>> What race conditions have you seen?  I think if we're doing things
>>>> right, this should not be happening, but I don't think we're doing things
>>>> right. One thing that I've noticed is that we're building into the source
>>>> directory, but I think we're also doing weird things like trying to
>>>> copy the source directory beforehand.  I really think this system is
>>>> tripping over many non-standard choices that have been made along the way.
>>>> I have never had these sorts of problems in unit tests that use tox, even
>>>> when many are running in parallel.  I got pulled away from it, but I'm
>>>> really hoping to address these issues here:
>>>> https://github.com/apache/beam/pull/10038.
>>>>
>>>
>>> This comment
>>> <https://issues.apache.org/jira/browse/BEAM-8481?focusedCommentId=16988369&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16988369>
>>> summarizes what I believe may be the issue (setuptools races).
>>>
>>> I believe copying the source directory was done in an effort to isolate
>>> the parallel builds (setuptools, cythonize).
>>>
>>
>> Peanut gallery: containerized Jenkins builds seem like they would help,
>> and they are the current recommended best practice, but we are not there
>> yet. Agree/disagree?
>>
>
> I'm okay with containerized Jenkins builds as long as using pytest/tox
> directly still works.
>
>
>>
>> What benefits do you see from splitting up the jobs?
>>>>>
>>>>
>>>> The biggest problem is that the jobs are doing too much and take too
>>>> long.  This simple fact compounds all of the other problems.  It seems
>>>> pretty obvious that we need to do less in each job, as long as the sum of
>>>> all of these smaller jobs is not substantially longer than the one
>>>> monolithic job.
>>>>
>>>
>> For some reason I keep forgetting the answer to this question: are we
>> caching pypi immutable artifacts on every Jenkins worker?
>>
>
> I don't know.
>
>
>>
>>>
>>>> Benefits:
>>>>
>>>> - failures specific to a particular python version will be easier to
>>>> spot in the jenkins error summary, and cheaper to re-queue.  right now the
>>>> jenkins report mushes all of the failures together in a way that makes it
>>>> nearly impossible to tell which python version they correspond to.  only
>>>> the gradle scan gives you this insight, but it doesn't break the errors by
>>>> test.
>>>>
>>>
>>> I agree Jenkins handles duplicate test names pretty badly (reloading
>>> will periodically give you a different result).
>>>
>>
>> Saw this in Java too w/ ValidatesRunner suites when they ran in one
>> Jenkins job. Worthwhile to avoid.
>>
>> Kenn
>>
>>
>>> With pytest I've been able to set the suite name so that should help
>>> with identification. (I need to add pytest*.xml collection to the Jenkins
>>> job first)
>>>
>>>
>>>> - failures common to all python versions will be reported to the user
>>>> earlier, at which point they can cancel the other jobs if desired.  *this
>>>> is by far the biggest benefit. * why wait for 2 hours to see the same
>>>> failure reported for 5 versions of python?  if that had run on one version
>>>> of python I could maybe see that error in 30 minutes (while potentially
>>>> other python versions waited in the queue).  Repeat 

Re: [RELEASE] Tracking 2.18

2019-12-16 Thread Udi Meiri
The remaining 4 open blockers all have recently merged cherrypicks (at
least 1 blocker is waiting on verification since it's a release process
issue).

Will attempt an RC today.

On Thu, Dec 12, 2019 at 5:33 PM Udi Meiri  wrote:

> Also marked 3 Jiras from these cherrypicks as blockers.
> Current open blocker count: 7
> <https://issues.apache.org/jira/projects/BEAM/versions/12346383>.
>
> On Thu, Dec 12, 2019 at 5:21 PM Udi Meiri  wrote:
>
>> Just merged 6 PRs. :)
>>
>> On Thu, Dec 12, 2019 at 4:52 PM Udi Meiri  wrote:
>>
>>> Update: I'm accepting cherrypicks with failing tests if the
>>> corresponding PRs have passed them on master.
>>>
>>> I recall (without proof) that in the past, even with released worker
>>> containers for the in-process release, ITs against the release branch
>>> still failed.
>>>
>>> On Tue, Dec 10, 2019 at 10:58 AM Udi Meiri  wrote:
>>>
>>>> Re: cherrypicks on top of the release-2.18.0 branch
>>>> The precommit tests are failing most likely due to some integration
>>>> tests (wordcount, etc.) that are expecting the new 2.18 worker on Dataflow.
>>>> I'm working on building an initial version of that worker so that the
>>>> tests may pass.
>>>>
>>>> On Thu, Dec 5, 2019 at 4:39 PM Robert Bradshaw 
>>>> wrote:
>>>>
>>>>> Yeah, so I saw...
>>>>>
>>>>> On Thu, Dec 5, 2019 at 4:31 PM Udi Meiri  wrote:
>>>>> >
>>>>> > Sorry Robert the release was already cut yesterday.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Dec 5, 2019 at 8:37 AM Ismaël Mejía 
>>>>> wrote:
>>>>> >>
>>>>> >> Colm, I just merged your PR and cherry picked it into 2.18.0
>>>>> >> https://github.com/apache/beam/pull/10296
>>>>> >>
>>>>> >> On Thu, Dec 5, 2019 at 10:54 AM jincheng sun <
>>>>> sunjincheng...@gmail.com> wrote:
>>>>> >>>
>>>>> >>> Thanks for the Tracking Udi!
>>>>> >>>
>>>>> >>> I have updated the status of some release blockers issues as
>>>>> follows:
>>>>> >>>
>>>>> >>> - BEAM-8733 closed
>>>>> >>> - BEAM-8620 reset the fix version to 2.19
>>>>> >>> - BEAM-8618 reset the fix version to 2.19
>>>>> >>>
>>>>> >>> Best,
>>>>> >>> Jincheng
>>>>> >>>
>>>>> >>> On Thu, Dec 5, 2019 at 5:38 PM Colm O hEigeartaigh  wrote:
>>>>> >>>>
>>>>> >>>> Could we get this one in 2.18 as well?
>>>>> https://issues.apache.org/jira/browse/BEAM-8861
>>>>> >>>>
>>>>> >>>> Colm.
>>>>> >>>>
>>>>> >>>> On Wed, Dec 4, 2019 at 8:02 PM Udi Meiri 
>>>>> wrote:
>>>>> >>>>>
>>>>> >>>>> Following the release calendar, I plan on cutting the 2.18
>>>>> release branch today.
>>>>> >>>>>
>>>>> >>>>> There are currently 8 release blockers.
>>>>> >>>>>
>>>>>
>>>>


Re: Root logger configuration

2019-12-17 Thread Udi Meiri
Pablo, does the issue affect debuggability of pipelines?

On Mon, Dec 16, 2019 at 6:23 PM Chad Dombrova  wrote:

>
>
> On Mon, Dec 16, 2019 at 5:59 PM Pablo Estrada  wrote:
>
>> +chad...@gmail.com  is this consistent with behavior
>> that you observed?
>>
>
> I honestly can't recall, sorry.  I just remember that while I was testing
> I updated sdk version and some logging stopped.  I *think* I was missing
> the state/message stream, which would be on the client side after pipeline
> construction.
>
> -chad
>
>


Re: PostCommit_Py_VR_Dataflow timing out

2019-12-18 Thread Udi Meiri
Yes, there are objections since this would take up a Jenkins slot for
longer.

An alternative would be to set timeouts on individual tests.
Debugging options: run the gradle tasks locally, try to pinpoint the
culprit PR

https://issues.apache.org/jira/browse/BEAM-8877
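
One stdlib option for pinpointing where a hung test is stuck, sketched under the assumption that the hang can be reproduced locally: Python's faulthandler module can dump the stack of every thread after a delay without stopping the process. This was not proposed in the thread; the wrapper name run_with_stack_dump is hypothetical.

```python
import faulthandler
import tempfile
import time


def run_with_stack_dump(func, dump_after_seconds, log_file):
    """Run func(); if it is still running after the delay, dump all thread stacks.

    The dump does not interrupt func, it only records where each thread is.
    log_file must be a real file object (faulthandler writes to its descriptor).
    """
    faulthandler.dump_traceback_later(dump_after_seconds, file=log_file)
    try:
        return func()
    finally:
        # Cancel the pending dump if func finished before the delay elapsed.
        faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    with tempfile.TemporaryFile(mode="w+") as log:
        # Stand-in for a slow or stuck test body.
        run_with_stack_dump(lambda: time.sleep(0.5), 0.1, log)
        log.seek(0)
        print(log.read())
```

For a real deadlock the dump shows which locks or calls each thread is blocked in, which is usually enough to narrow down the culprit change.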

On Wed, Dec 18, 2019 at 1:25 PM Brian Hulette  wrote:

> It looks like beam_PostCommit_Py_VR_Dataflow has been timing out at 1h40m
> since Dec 4 [1]. Are there any objections to bumping up the timeout to
> alleviate this? Or any other thoughts on potential causes and/or solutions?
>
> Brian
>
> [1] https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/
>


re: BEAM-8989 fix for 2.18.0 release

2019-12-19 Thread Udi Meiri
The JIRA issue was assigned to me, but I have no background in the issue.
Who would be the most suitable to take care of fixing, testing (Nemo
quickstart), and cherrypicking?


Re: BEAM-8989 fix for 2.18.0 release

2019-12-19 Thread Udi Meiri
Thanks. I've reassigned the bug to Reuven and pushed the fix back to 2.19.0.

On Thu, Dec 19, 2019 at 11:11 AM Luke Cwik  wrote:

> Either Salman Raza who developed the PR or Reuven Lax who reviewed it
> would have the most context. I don't know Salman's contact information
> though.
>
> On Thu, Dec 19, 2019 at 10:18 AM Udi Meiri  wrote:
>
>> The JIRA issue was assigned to me, but I have no background in the issue.
>> Who would be the most suitable to take care of fixing, testing (Nemo
>> quickstart), and cherrypicking?
>>
>


Re: Unifying Build/contributing instructions

2019-12-19 Thread Udi Meiri
+1 for website focus

On Thu, Dec 19, 2019 at 10:22 AM Elliotte Rusty Harold 
wrote:

> That's two votes for
> https://beam.apache.org/contribute/contribution-guide/ and a lot of
> abstentions. I'll update the PR to move content to
> https://beam.apache.org/contribute/contribution-guide/
>
> On Thu, Dec 19, 2019 at 12:29 PM Luke Cwik  wrote:
> >
> > +1 on Kenn's suggestion.
> >
> > On Thu, Dec 12, 2019 at 8:17 PM Kenneth Knowles  wrote:
> >>
> >> Thanks for taking this on! My preference would be to have
> >> CONTRIBUTING.md link to
> >> https://beam.apache.org/contribute/contribution-guide/ and focus work on
> >> the latter.
> >>
> >> Kenn
> >>
> >> On Thu, Dec 12, 2019 at 12:38 PM Elliotte Rusty Harold <
> elh...@ibiblio.org> wrote:
> >>>
> >>> I've started work on updating and combining the four (or more?)
> >>> different pages where build instructions are found. The initial PR is
> >>> here:
> >>>
> >>> https://github.com/apache/beam/pull/10366
> >>>
> >>> To put a stake in the ground, this PR chooses CONTRIBUTING.md as the
> >>> ultimate source of truth. A possible alternative is to unify around
> >>> https://beam.apache.org/contribute/contribution-guide/
> >>>
> >>> I'm not wedded to one or the other, but I do think we should pick one
> >>> and stick with it. If the community prefers to focus on
> >>> https://beam.apache.org/contribute/contribution-guide/ we can use that
> >>> instead.
> >>>
> >>> I've added some additional prerequisites to the instructions that were
> >>> not yet included. I don't have it all yet though. Any further
> >>> additions would be much appreciated.
> >>>
> >>> Please leave comments on the PR.
> >>>
> >>> --
> >>> Elliotte Rusty Harold
> >>> elh...@ibiblio.org
>
>
>
> --
> Elliotte Rusty Harold
> elh...@ibiblio.org
>


Re: PostCommit_Py_VR_Dataflow timing out

2019-12-19 Thread Udi Meiri
On Thu, Dec 19, 2019 at 12:06 PM Kenneth Knowles  wrote:

> Is there a default timeout for all individual test methods? Can we make
> longer timeouts opt-in?
>

Long-term: yes, once they're migrated to run via pytest. (
https://pypi.org/project/pytest-timeout/)


>
> On Wed, Dec 18, 2019 at 2:27 PM Brian Hulette  wrote:
>
>> Ah thanks for the jira link. There's some critical context there - this
>> seems to be caused by a deadlock, so increasing the timeout won't make more
>> tests finish/pass, it will just consume a jenkins slot for longer.
>>
>> On Wed, Dec 18, 2019 at 1:43 PM Udi Meiri  wrote:
>>
>>> Yes, there are objections since this would take up a Jenkins slot for
>>> longer.
>>>
>>> An alternative would be to set timeouts on individual tests.
>>> Debugging options: run the gradle tasks locally, try to pinpoint the
>>> culprit PR
>>>
>>> https://issues.apache.org/jira/browse/BEAM-8877
>>>
>>> On Wed, Dec 18, 2019 at 1:25 PM Brian Hulette 
>>> wrote:
>>>
>>>> It looks like beam_PostCommit_Py_VR_Dataflow has been timing out at
>>>> 1h40m since Dec 4 [1]. Are there any objections to bumping up the timeout
>>>> to alleviate this? Or any other thoughts on potential causes and/or
>>>> solutions?
>>>>
>>>> Brian
>>>>
>>>> [1] https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/
>>>>
>>>


[PROPOSAL] python precommit timeouts

2019-12-19 Thread Udi Meiri
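Looking at this console log
<https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/timestamps/?time=HH:mm:ss&timeZone=GMT-8&appendLog&locale=en_US>,
it seems that some pytests got stuck (or slowed down considerably).
I'd like to put a 10 minute default timeout on all unit tests, using the
pytest-timeout <https://pypi.org/project/pytest-timeout/> plugin.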
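
To illustrate the shape of the proposal (one default limit, with individual tests able to opt into a different one), here is a stdlib-only sketch. It is not how pytest-timeout is implemented; with_timeout and TimeoutExceeded are hypothetical names, and unlike the plugin this version cannot stop the stuck code, it only stops waiting for it.

```python
import concurrent.futures
import functools
import time

DEFAULT_TIMEOUT_SECONDS = 600  # the 10 minute default proposed above


class TimeoutExceeded(Exception):
    """Raised when the wrapped function did not finish within its limit."""


def with_timeout(seconds=DEFAULT_TIMEOUT_SECONDS):
    """Decorator: fail the call if it runs longer than `seconds`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
            future = executor.submit(func, *args, **kwargs)
            try:
                return future.result(timeout=seconds)
            except concurrent.futures.TimeoutError:
                raise TimeoutExceeded(
                    "%s exceeded %s seconds" % (func.__name__, seconds)) from None
            finally:
                # Do not block on the worker; it keeps running in the background.
                executor.shutdown(wait=False)
        return wrapper
    return decorator


@with_timeout()  # uses the default limit
def quick_check():
    return "ok"


@with_timeout(0.05)  # per-function override
def slow_check():
    time.sleep(0.3)
```

With the plugin itself, the default would be set through its `timeout` ini option (or `--timeout` on the command line), and a single test can override it with the `@pytest.mark.timeout(...)` marker.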


Re: [PROPOSAL] python precommit timeouts

2019-12-20 Thread Udi Meiri
https://issues.apache.org/jira/browse/BEAM-9009

On Fri, Dec 20, 2019 at 6:18 AM Maximilian Michels  wrote:

> +1 Good idea. We should also have this for Java if possible.
>
> On 20.12.19 02:59, Ahmet Altay wrote:
> > This sounds reasonable. Would this be configurable per-test if needed?
> >
> > On Thu, Dec 19, 2019 at 5:52 PM Udi Meiri <eh...@google.com> wrote:
> >
> > Looking at this console log
> > <https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/timestamps/?time=HH:mm:ss&timeZone=GMT-8&appendLog&locale=en_US>,
> > it seems that some pytests got stuck (or slowed down considerably).
> > I'd like to put a 10 minute default timeout on all unit tests, using
> > the pytest-timeout <https://pypi.org/project/pytest-timeout/>
> > plugin.
> >
>


Re: [PROPOSAL] python precommit timeouts

2019-12-20 Thread Udi Meiri
ITs will have a different timeout, but they're still not migrated to pytest
so unaffected at the moment.

So I created a PR <https://github.com/apache/beam/pull/10437> and already
seem to have found an issue. One test timed out while scanning the local
filesystem.
It seems that it was scanning /tmp, which for apache-beam-jenkins-9 has
400k files (test output filenames are like:
/tmp/tmpnv2uyqas.result-chars-0-of-1).
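
I don't know exactly how the affected test matched its files, so the following is only a sketch of one way to keep a test's output out of a shared /tmp: write the shards into a directory created for that test and glob only inside it, so the cost depends on the test's own files rather than on whatever has accumulated on the worker. The helper names are hypothetical.

```python
import glob
import os
import tempfile


def write_shards(output_dir, prefix, shards):
    """Write one file per shard, named <prefix>-<i>-of-<n>, and return the paths."""
    paths = []
    for i, text in enumerate(shards):
        path = os.path.join(output_dir, "%s-%d-of-%d" % (prefix, i, len(shards)))
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)
    return paths


def read_shards(output_dir, prefix):
    """Read shards back, matching only files inside output_dir."""
    pattern = os.path.join(
        glob.escape(output_dir), glob.escape(prefix) + "-*-of-*")
    contents = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            contents.append(f.read())
    return contents


if __name__ == "__main__":
    # The directory and everything in it is removed when the block exits.
    with tempfile.TemporaryDirectory() as test_dir:
        write_shards(test_dir, "result-chars", ["a", "b"])
        print(read_shards(test_dir, "result-chars"))
```

A side effect of the per-test directory is that cleanup happens with the test, which also addresses the leftover files themselves.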


On Fri, Dec 20, 2019 at 9:41 AM Pablo Estrada  wrote:

> big +1!
>
> As Ahmet suggested, the IT-marked tests may need to have a different
> timeout. But other than that, I think this is great.
>
> On Fri, Dec 20, 2019 at 9:39 AM Udi Meiri  wrote:
>
>> https://issues.apache.org/jira/browse/BEAM-9009
>>
>> On Fri, Dec 20, 2019 at 6:18 AM Maximilian Michels 
>> wrote:
>>
>>> +1 Good idea. We should also have this for Java if possible.
>>>
>>> On 20.12.19 02:59, Ahmet Altay wrote:
>>> > This sounds reasonable. Would this be configurable per-test if needed?
>>> >
>>> > On Thu, Dec 19, 2019 at 5:52 PM Udi Meiri <eh...@google.com> wrote:
>>> >
>>> > Looking at this console log
>>> > <https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/timestamps/?time=HH:mm:ss&timeZone=GMT-8&appendLog&locale=en_US>,
>>> > it seems that some pytests got stuck (or slowed down considerably).
>>> > I'd like to put a 10 minute default timeout on all unit tests, using
>>> > the pytest-timeout <https://pypi.org/project/pytest-timeout/>
>>> > plugin.
>>> >
>>>
>>


Re: [ANNOUNCE] New committer: Kasia Kucharczyk

2019-12-23 Thread Udi Meiri
Congrats Kasia!

On Mon, Dec 23, 2019 at 1:23 PM Kyle Weaver  wrote:

> Congrats Kasia! And thanks for sharing, Pablo.
>
> On Mon, Dec 23, 2019 at 4:16 PM Pablo Estrada  wrote:
>
>> Hi everyone,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new committer:
>> Kasia Kucharczyk
>>
>> Kasia has contributed to Beam in many ways, including the performance
>> testing infrastructure, and has even spoken at events about Beam.
>>
>> In consideration of Kasia's contributions, the Beam PMC trusts her with
>> the responsibilities of a Beam committer[1].
>>
>> Thanks for your contributions Kasia!
>>
>> Pablo, on behalf of the Apache Beam PMC.
>>
>> [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>
>


Re: [Spark Structured Streaming runner] perfs and encoders

2019-12-23 Thread Udi Meiri
That's great! The optimizations seem to have helped a lot!

On Mon, Dec 23, 2019 at 6:49 AM Etienne Chauchot 
wrote:

> Hi all,
>
> good news !
>
> I did some refactoring of the encoders to improve maintainability and
> replace as much string-generated code as possible with compiled code, and
> the perf results are awesome!
>
> Best
>
> Etienne
>
>
>


Re: [DISCUSS] Bump the version of GRPC from 1.21.0 to 1.22.0+ (May be the latest 1.26.0?)

2019-12-26 Thread Udi Meiri
Probably best to take one of these existing update bugs:
https://issues.apache.org/jira/browse/BEAM-4938
There's no discussion on these bugs, so I'd go with 1.26.0 unless someone
else has an objection.

On Mon, Dec 23, 2019 at 10:04 PM jincheng sun 
wrote:

> Hi folks,
>
> When submitting a Python word count job to a Flink session/standalone
> cluster repeatedly, the metaspace usage of the Flink task manager
> increases continuously (about 40MB each time). The reason is
> that the Beam classes are loaded with the user class loader (child-first by
> default) in Flink, and there is a minor problem with the implementation of
> `ProcessManager` (from Beam) and `ThreadPoolCache` (from Netty) which may
> prevent the user class loader from being garbage collected even after the
> job has finished, which eventually causes the metaspace memory leak. You can
> refer to FLINK-15338[1] for more information.
>
> Regarding `ProcessManager`, I have created a JIRA BEAM-9006[2] to track
> it. Regarding `ThreadPoolCache`, it is a Netty problem and has been
> fixed in NETTY#8955[3]. Netty 4.1.35 Final already includes this fix
> and GRPC 1.22.0 already depends on Netty 4.1.35 Final. So we need to
> bump the version of GRPC to 1.22.0+ (currently 1.21.0).
>
> My proposal is to upgrade the GRPC version to 1.22.0+ (maybe the
> latest, 1.26.0?)
>
> I've created JIRA [4], but I'm not sure whether bumping the GRPC version
> will cause any other problems. So, I'd like to bring up this discussion
> and welcome your feedback!
>
> [1] https://issues.apache.org/jira/browse/FLINK-15338
> [2] https://issues.apache.org/jira/browse/BEAM-9006
> [3] https://github.com/netty/netty/pull/8955
> [4] https://issues.apache.org/jira/browse/BEAM-9030
>
> Best,
> Jincheng
>


Re: [BEAM-9015] Adding pyXX-cloud instead of pyXX-gcp and pyXX-aws

2019-12-26 Thread Udi Meiri
+1

On Mon, Dec 23, 2019, 17:28 Robert Bradshaw  wrote:

> Makes sense to me.
>
> On Mon, Dec 23, 2019 at 3:33 PM Pablo Estrada  wrote:
> >
> > Hi all,
> > a couple of contributors [1][2] have been kind enough to add support for
> s3 filesystem[3] for the Python SDK. Part of this involved adding a tox
> task called py37-aws, to install the relevant dependencies and run unit
> tests for it (in a mocked-out environment).
> >
> > To avoid running a full extra test suite, I thought we could add the new
> aws-related dependencies to the current pyXX-gcp suites, and perhaps rename
> to pyXX-cloud, to include all unit tests that require cloud-specific
> dependencies. What do others think?
> >
> > This is tracked here: https://jira.apache.org/jira/browse/BEAM-9015
> >
> > [1] https://github.com/tamera-lanham
> > [2] https://github.com/MattMorgis
> > [3] https://github.com/apache/beam/pull/9955
>


[DISCUSS] Python static type checkers

2020-01-07 Thread Udi Meiri
Hi,
We recently added mypy to the Jenkins Lint job for PRs (currently ignores
errors). Mypy is a static type checker.

There's a JIRA for adding another static type checker named pytype
https://issues.apache.org/jira/browse/BEAM-9064

I wanted to ask the community their thoughts on this. (see JIRA issue
comments as well)

- Should PRs have to pass more than 1 static type checker? (in pre-commit
tests)
- If not, should the remaining type checkers be run as post-commit tests?
- How much effort should be put into supporting more than 1 type checker?
(i.e. making sure that they all pass)
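
For readers less familiar with these tools, a small sketch of the kind of mistake a static type checker reports without running the code. The functions are made up for illustration and are not from the Beam codebase; how mypy and pytype differ on less obvious cases is exactly what the questions above are about.

```python
from typing import List, Optional


def parse_timeout(value: Optional[str], default: int = 600) -> int:
    """Return the timeout in seconds, falling back to the default."""
    if value is None:
        return default
    return int(value)


def total(timeouts: List[int]) -> int:
    """Sum a list of timeouts in seconds."""
    return sum(timeouts)


# A checker such as mypy or pytype would flag the call below from the
# annotations alone, because a str is passed where List[int] is expected.
# At runtime it would only fail once that line is actually executed.
#   total("600")
```

Supporting a second checker mostly means keeping annotations like these acceptable to both tools, which is where the extra effort mentioned in the last question would go.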


Request for new dockerhub repos

2020-01-09 Thread Udi Meiri
Hi,
As part of the 2.18 release, we're adding 3 additional containers for Flink.
I have write access but since I am not an owner I cannot create new repos.

Could someone with access add these?

flink1.7_job_server
flink1.8_job_server
flink1.9_job_server

(go to https://hub.docker.com/repositories and select apachebeam from the
dropdown)

Thank you


release scripts as interactive notebooks?

2020-01-10 Thread Udi Meiri
What does the community think about converting our release scripts to
be Jupyter notebooks using bash_kernel?

Since these scripts frequently fail (especially for first time releasers),
we often need to rerun parts manually. The notebook format lets you do that.

Certain steps require verification/inspection, such as before pushing
commits. This is naturally done by splitting into multiple notebook cells.

The notebook format also lends itself well to inline documentation and
on-the-fly modification.
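
As a rough sketch of what such a conversion could look like, the function below splits a bash script into notebook cells and emits notebook JSON (nbformat 4). The "# %%" cell marker is a hypothetical convention chosen for this example, not something the release scripts use, and the kernelspec name "bash" is assumed to be what bash_kernel registers.

```python
import json


def script_to_notebook(script_text, marker="# %%"):
    """Split a bash script into notebook cells at lines starting with marker.

    Text after the marker becomes a markdown cell (inline documentation);
    the commands up to the next marker become one code cell.
    """
    cells = []
    current = []

    def flush():
        source = "\n".join(current).strip("\n")
        if source:
            cells.append({
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": source,
            })
        del current[:]

    for line in script_text.splitlines():
        if line.startswith(marker):
            flush()
            title = line[len(marker):].strip()
            if title:
                cells.append(
                    {"cell_type": "markdown", "metadata": {}, "source": title})
        else:
            current.append(line)
    flush()

    return {
        "cells": cells,
        "metadata": {
            # Assumed kernelspec for bash_kernel.
            "kernelspec": {
                "name": "bash", "display_name": "Bash", "language": "bash"},
        },
        "nbformat": 4,
        "nbformat_minor": 4,
    }


if __name__ == "__main__":
    demo = "# %% Build\necho build\n# %% Verify, then push\necho push\n"
    print(json.dumps(script_to_notebook(demo), indent=1))
```

Each step that needs inspection before continuing then sits in its own cell, and a failed step can be rerun without restarting the whole script.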


Re: Request for new dockerhub repos

2020-01-10 Thread Udi Meiri
Thank you, the pushes were successful.

On Fri, Jan 10, 2020 at 8:47 AM Hannah Jiang  wrote:

> Hi Udi
>
> The repositories are created. Were you added as a maintainer? If not, we
> need your docker hub user ID.
>
> Thanks,
> Hannah
>
> On Thu, Jan 9, 2020 at 5:48 PM Udi Meiri  wrote:
>
>> Hi,
>> As part of the 2.18 release, we're adding 3 additional containers for
>> Flink.
>> I have write access but since I am not an owner I cannot create new repos.
>>
>> Could someone with access add these?
>>
>> flink1.7_job_server
>> flink1.8_job_server
>> flink1.9_job_server
>>
>> (go to https://hub.docker.com/repositories and select apachebeam from
>> the dropdown)
>>
>> Thank you
>>
>


Re: [RELEASE] Tracking 2.18

2020-01-10 Thread Udi Meiri
RC1 is almost ready, but Nexus login is down due to LDAP issues with Apache.

On Mon, Dec 16, 2019 at 9:53 AM Udi Meiri  wrote:

> The remaining 4 open blockers all have recently merged cherrypicks (at
> least 1 blocker is waiting on verification since it's a release process
> issue).
>
> Will attempt an RC today.
>
> On Thu, Dec 12, 2019 at 5:33 PM Udi Meiri  wrote:
>
>> Also marked 3 Jiras from these cherrypicks as blockers.
>> Current open blocker count: 7
>> <https://issues.apache.org/jira/projects/BEAM/versions/12346383>.
>>
>> On Thu, Dec 12, 2019 at 5:21 PM Udi Meiri  wrote:
>>
>>> Just merged 6 PRs. :)
>>>
>>> On Thu, Dec 12, 2019 at 4:52 PM Udi Meiri  wrote:
>>>
>>>> Update: I'm accepting cherrypicks with failing tests if the
>>>> corresponding PRs have passed them on master.
>>>>
>>>> I recall (without proof) that in the past, even with released worker
>>>> containers for the in-process release, ITs against the release branch
>>>> still failed.
>>>>
>>>> On Tue, Dec 10, 2019 at 10:58 AM Udi Meiri  wrote:
>>>>
>>>>> Re: cherrypicks on top of the release-2.18.0 branch
>>>>> The precommit tests are failing most likely due to some integration
>>>>> tests (wordcount, etc.) that are expecting the new 2.18 worker on 
>>>>> Dataflow.
>>>>> I'm working on building an initial version of that worker so that the
>>>>> tests may pass.
>>>>>
>>>>> On Thu, Dec 5, 2019 at 4:39 PM Robert Bradshaw 
>>>>> wrote:
>>>>>
>>>>>> Yeah, so I saw...
>>>>>>
>>>>>> On Thu, Dec 5, 2019 at 4:31 PM Udi Meiri  wrote:
>>>>>> >
>>>>>> > Sorry Robert the release was already cut yesterday.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Dec 5, 2019 at 8:37 AM Ismaël Mejía 
>>>>>> wrote:
>>>>>> >>
>>>>>> >> Colm, I just merged your PR and cherry picked it into 2.18.0
>>>>>> >> https://github.com/apache/beam/pull/10296
>>>>>> >>
>>>>>> >> On Thu, Dec 5, 2019 at 10:54 AM jincheng sun <
>>>>>> sunjincheng...@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> Thanks for the Tracking Udi!
>>>>>> >>>
>>>>>> >>> I have updated the status of some release blockers issues as
>>>>>> follows:
>>>>>> >>>
>>>>>> >>> - BEAM-8733 closed
>>>>>> >>> - BEAM-8620 reset the fix version to 2.19
>>>>>> >>> - BEAM-8618 reset the fix version to 2.19
>>>>>> >>>
>>>>>> >>> Best,
>>>>>> >>> Jincheng
>>>>>> >>>
>>>>>> >>> On Thu, Dec 5, 2019 at 5:38 PM Colm O hEigeartaigh wrote:
>>>>>> >>>>
>>>>>> >>>> Could we get this one in 2.18 as well?
>>>>>> https://issues.apache.org/jira/browse/BEAM-8861
>>>>>> >>>>
>>>>>> >>>> Colm.
>>>>>> >>>>
>>>>>> >>>> On Wed, Dec 4, 2019 at 8:02 PM Udi Meiri 
>>>>>> wrote:
>>>>>> >>>>>
>>>>>> >>>>> Following the release calendar, I plan on cutting the 2.18
>>>>>> release branch today.
>>>>>> >>>>>
>>>>>> >>>>> There are currently 8 release blockers.
>>>>>> >>>>>
>>>>>>
>>>>>




[VOTE] Release 2.18.0, release candidate #1

2020-01-13 Thread Udi Meiri
Hi everyone,
Please review and vote on the release candidate #3 for the version 1.2.3,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint 8961 F3EF 8E79 6688 4067
 87CF 587B 049C 36DA AFE6 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v1.2.3-RC3" [5],
* website pull request listing the release [6], publishing the API
reference manual [7], and the blog post [8].
* Java artifacts were built with Maven MAVEN_VERSION and OpenJDK/Oracle JDK
JDK_VERSION.
TODO: do these versions matter, and are they stamped into the artifacts?
* Python artifacts are deployed along with the source release to the
dist.apache.org [2].
* Validation sheet with a tab for 2.18.0 release to help with validation
[9].
* Docker images published to Docker Hub [10].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Release Manager

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346383&projectId=12319527
[2] https://dist.apache.org/repos/dist/dev/beam/2.18.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1090/
[5] https://github.com/apache/beam/tree/v2.18.0-RC1
[6] https://github.com/apache/beam/pull/10574
[7] https://github.com/apache/beam-site/pull/595
[8] https://github.com/apache/beam/pull/10575
[9]
https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1178617819
[10] https://hub.docker.com/u/apachebeam




Re: [DISCUSS] Python static type checkers

2020-01-13 Thread Udi Meiri
The most important gain would be compatibility with Google internal code.
TLDR: I don't expect non-Googlers to fix pytype issues in Beam, nor would
they have access to internal code that is validated against pytype with
Beam.

Pytype seems to detect attribute errors that mypy has not, so it acts as a
kind-of linter in this case.
Examples:
https://github.com/apache/beam/pull/10528/files#diff-0cb34b4622b0b7d7256d28b1ee1d52fc
https://github.com/apache/beam/pull/10528/files#diff-7e4ad8c086414399957cdbea711ebd36
https://github.com/apache/beam/pull/10528/files#diff-d5c3f4f603204c5c5917d89e90dba53d
(it also makes pytype more strict in a sense)

Pytype has bugs. I've filed/found these so far:
https://github.com/google/pytype/issues/491 (isinstance bug)
https://github.com/google/pytype/issues/445 (@overload bug)
https://github.com/google/pytype/issues/488 (builtins.object != object bug)
https://github.com/google/pytype/issues/485 (# type: ignore[...] mypy
compatibility issue)
https://github.com/google/pytype/issues/480 (alternate comment syntax -
wontfix)


On Mon, Jan 13, 2020 at 3:49 PM Kyle Weaver  wrote:

> Udi, what would we gain by using pytype?
>
> Also, has anyone tried running pytype against Beam? If it's not too much
> trouble, it might be helpful to diff the pytype and mypy results to get a
> feel for exactly how big the discrepancy is.
>
> On Mon, Jan 13, 2020 at 3:26 PM Kenneth Knowles  wrote:
>
>> Looking at this from the outside, it seems like mypy is the obvious
>> choice. Also running pytype could potentially be informative in some cases
>> but only if there is a specific gap. What about maintenance/governance of
>> the two projects?
>>
>> Kenn
>>
>> On Sun, Jan 12, 2020 at 7:48 PM Chad Dombrova  wrote:
>>
>>> Hi folks,
>>> I agree with Robert that we need to wait and see before making any
>>> decisions, but I do have some opinions about the probable/desired outcome.
>>>
>>> I haven't used pytype, but my experience working with mypy over the past
>>> few years -- and following various issues and peps related to it and typing
>>> in general -- has taught me there's still a lot of room for interpretation
>>> and thus variation between type checkers.
>>>
>>> Here's a simple example: ignoring errors.  Both tools support ignoring
>>> errors using a `type: ignore` comment, but only mypy (to my knowledge)
>>> supports specifying an error type so that only that error is suppressed,
>>> e.g. `type: ignore[error-code-here]`.   There's even room for differences
>>> with regard to the line number where the error is emitted and thus where
>>> the ignore comment must be placed (end of statement, site of open paren,
>>> site of close paren, etc).  I know this because mypy has actually made
>>> adjustments to this once or twice over the years, which necessitated moving
>>> existing ignore comments.  So just imagine having to ignore the same error
>>> separately for each type checker.  It's not the end of the world, but it's
>>> ugly and frustrating.
>>>
>>> As a user, it can be quite challenging to solve certain typing issues,
>>> and there's a fairly steep learning curve –  I wouldn't want to burden
>>> users with *two* type checkers, each with its own idiosyncrasies.  That
>>> said, a linter that doesn't actually prevent merges when an error occurs
>>> will be ignored by users and quickly become less-than-useful.  Post-commit
>>> would not be a good idea for all the reasons that a post-commit lint check
>>> would be annoying (users will trip it often and feel
>>> surprised/blind-sided).
>>>
>>> In the little exposure that I've had with pytype it seems to lag behind
>>> mypy in terms of features, especially wrt typing-related peps (it never
>>> fully supported pep484 multi-line type comments and it still doesn't
>>> support pep561, I see no mention of pep589/TypedDict in the docs, but then
>>> again they are *incredibly* light).  I've gotten mypy completely
>>> passing, and I know it very well, so I'm pretty biased towards making it
>>> the one and only type checker that generates pre-commit errors.  I see
>>> little advantage to most end users in supporting pytype, except y'know,
>>> Google has kind of an important presence in the Apache Beam project  :)
>>>
>>> Some quick pypi download figures to back that up:
>>>
>>> Downloads last month:
>>> pytype: 24,864
>>> mypy: 1,502,582
>>>
>>> So to sum up this email in a sentence: runnin

Re: [VOTE] Release 2.18.0, release candidate #1

2020-01-14 Thread Udi Meiri
Sorry about the messiness.
The links at the bottom should be correct though.

I intentionally did not replace MAVEN_VERSION because I didn't know how to
get it (I didn't execute mvn for the release).
As for JDK_VERSION, do we still need that? (If so, what about Python
versions, such as the ones used for testing?)
javac -version on my machine is 1.8.0_181-google-v7


On Mon, Jan 13, 2020 at 7:37 PM Valentyn Tymofieiev 
wrote:

> There are some issues in this message, part of the message is still a
> template (1.2.3, TODO, MAVEN_VERSION).
> Before I noticed these issues, I ran a few Batch and Streaming Python 3.7
> pipelines using Direct and Dataflow runners, and they all succeeded.
>
> On Mon, Jan 13, 2020 at 4:09 PM Udi Meiri  wrote:
>
>> Hi everyone,
>> Please review and vote on the release candidate #3 for the version 1.2.3,
>> as follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>>
>>
>> The complete staging area is available for your review, which includes:
>> * JIRA release notes [1],
>> * the official Apache source release to be deployed to dist.apache.org
>> [2], which is signed with the key with fingerprint 8961 F3EF 8E79 6688 4067
>>  87CF 587B 049C 36DA AFE6 [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "v1.2.3-RC3" [5],
>> * website pull request listing the release [6], publishing the API
>> reference manual [7], and the blog post [8].
>> * Java artifacts were built with Maven MAVEN_VERSION and OpenJDK/Oracle
>> JDK JDK_VERSION.
>> TODO: do these versions matter, and are they stamped into the artifacts?
>> * Python artifacts are deployed along with the source release to the
>> dist.apache.org [2].
>> * Validation sheet with a tab for 2.18.0 release to help with validation
>> [9].
>> * Docker images published to Docker Hub [10].
>>
>> The vote will be open for at least 72 hours. It is adopted by majority
>> approval, with at least 3 PMC affirmative votes.
>>
>> Thanks,
>> Release Manager
>>
>> [1]
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346383&projectId=12319527
>> [2] https://dist.apache.org/repos/dist/dev/beam/2.18.0/
>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>> [4]
>> https://repository.apache.org/content/repositories/orgapachebeam-1090/
>> [5] https://github.com/apache/beam/tree/v2.18.0-RC1
>> [6] https://github.com/apache/beam/pull/10574
>> [7] https://github.com/apache/beam-site/pull/595
>> [8] https://github.com/apache/beam/pull/10575
>> [9]
>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1178617819
>> [10] https://hub.docker.com/u/apachebeam
>>
>




Re: [VOTE] Release 2.18.0, release candidate #1

2020-01-14 Thread Udi Meiri
Please don't do any Dataflow-based verifications yet, because we'll have to
redo them once new Dataflow containers are built.

On Tue, Jan 14, 2020 at 6:27 PM Ahmet Altay  wrote:

> I verified python 2 quickstarts with batch and streaming pipelines, wheel
> files, and reviewed changes to the blog/website.
>
> Udi, could you send an updated version of the voting text with TODOs,
> template pieces removed? We can discuss changes to the template separately.
> My vote is +1 pending an updated vote text.
>
> On Tue, Jan 14, 2020 at 4:47 PM Udi Meiri  wrote:
>
>> Sorry about the messiness.
>> The links at the bottom should be correct though.
>>
>> I intentionally did not replace MAVEN_VERSION because I didn't know how
>> to get it (I didn't execute mvn for the release).
>> As for JDK_VERSION, do we still need that? (If so, what about Python
>> versions, such as the ones used for testing?)
>> javac -version on my machine is 1.8.0_181-google-v7
>>
>
> I believe we can drop MAVEN_VERSION now that it is no longer used. I do
> not think it is needed to add a Gradle version either because the version
> itself is part of the repo anyway.
>
> I do not know if java, python etc. versions are helpful. Maybe others can
> comment. I would prefer to reduce the load on the release manager and drop
> this if this is not particularly important.
>
>
>>
>>
>> On Mon, Jan 13, 2020 at 7:37 PM Valentyn Tymofieiev 
>> wrote:
>>
>>> There are some issues in this message, part of the message is still a
>>> template (1.2.3, TODO, MAVEN_VERSION).
>>> Before I noticed these issues, I ran a few Batch and Streaming Python
>>> 3.7 pipelines using Direct and Dataflow runners, and they all succeeded.
>>>
>>> On Mon, Jan 13, 2020 at 4:09 PM Udi Meiri  wrote:
>>>
>>>> Hi everyone,
>>>> Please review and vote on the release candidate #3 for the version
>>>> 1.2.3, as follows:
>>>> [ ] +1, Approve the release
>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>
>>>>
>>>> The complete staging area is available for your review, which includes:
>>>> * JIRA release notes [1],
>>>> * the official Apache source release to be deployed to dist.apache.org
>>>> [2], which is signed with the key with fingerprint 8961 F3EF 8E79 6688 4067
>>>>  87CF 587B 049C 36DA AFE6 [3],
>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>> * source code tag "v1.2.3-RC3" [5],
>>>>
>>>
> Tag is "v2.18.0-RC1". This is correct in the referenced link.
>
>
>> * website pull request listing the release [6], publishing the API
>>>> reference manual [7], and the blog post [8].
>>>> * Java artifacts were built with Maven MAVEN_VERSION and OpenJDK/Oracle
>>>> JDK JDK_VERSION.
>>>> TODO: do these versions matter, and are they stamped into the artifacts?
>>>> * Python artifacts are deployed along with the source release to the
>>>> dist.apache.org [2].
>>>> * Validation sheet with a tab for 2.18.0 release to help with
>>>> validation [9].
>>>> * Docker images published to Docker Hub [10].
>>>>
>>>> The vote will be open for at least 72 hours. It is adopted by majority
>>>> approval, with at least 3 PMC affirmative votes.
>>>>
>>>> Thanks,
>>>> Release Manager
>>>>
>>>> [1]
>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346383&projectId=12319527
>>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.18.0/
>>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>> [4]
>>>> https://repository.apache.org/content/repositories/orgapachebeam-1090/
>>>> [5] https://github.com/apache/beam/tree/v2.18.0-RC1
>>>> [6] https://github.com/apache/beam/pull/10574
>>>> [7] https://github.com/apache/beam-site/pull/595
>>>> [8] https://github.com/apache/beam/pull/10575
>>>> [9]
>>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1178617819
>>>> [10] https://hub.docker.com/u/apachebeam
>>>>
>>>




Re: [VOTE] Release 2.18.0, release candidate #1

2020-01-14 Thread Udi Meiri
Here's my second take:

Hi everyone,
Please review and vote on the release candidate #1 for the version 2.18.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint 8961 F3EF 8E79 6688 4067
 87CF 587B 049C 36DA AFE6 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.18.0-RC1" [5],
* website pull request listing the release [6], publishing the API
reference manual [7], and the blog post [8].
* Java artifacts were built with Maven N/A and OpenJDK 1.8.0_181-google-v7.
* Python artifacts are deployed along with the source release to the
dist.apache.org [2].
* Validation sheet with a tab for 2.18.0 release to help with validation
[9].
* Docker images published to Docker Hub [10].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.
NOTE: The vote will start once new Dataflow containers are built.

Thanks,
Release Manager

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346383&projectId=12319527
[2] https://dist.apache.org/repos/dist/dev/beam/2.18.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1090/
[5] https://github.com/apache/beam/tree/v2.18.0-RC1
[6] https://github.com/apache/beam/pull/10574
[7] https://github.com/apache/beam-site/pull/595
[8] https://github.com/apache/beam/pull/10575
[9]
https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1178617819
[10] https://hub.docker.com/u/apachebeam
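
The adoption rule stated in the vote text above (majority approval, with at
least 3 PMC affirmative votes) can be sketched as a small check. This is only
an illustration: the Vote type and vote_passes function are hypothetical
names, and the sketch assumes that only PMC votes count toward both the
threshold and the majority.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    voter: str
    value: int      # +1 approve, -1 do not approve
    is_pmc: bool    # assumption: only PMC votes are counted


def vote_passes(votes, min_pmc_approvals=3):
    """Apply the rule from the vote text: majority approval with at
    least `min_pmc_approvals` affirmative PMC votes."""
    pmc_plus = sum(1 for v in votes if v.is_pmc and v.value > 0)
    pmc_minus = sum(1 for v in votes if v.is_pmc and v.value < 0)
    return pmc_plus >= min_pmc_approvals and pmc_plus > pmc_minus


votes = [
    Vote("pmc-member-a", +1, True),
    Vote("pmc-member-b", +1, True),
    Vote("contributor-c", +1, False),
]
print(vote_passes(votes))  # two PMC approvals are not enough
```

The 72-hour minimum voting period is not modeled here; it would be a separate
check on the time elapsed since the vote opened.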


On Tue, Jan 14, 2020 at 6:34 PM Udi Meiri  wrote:

> Please don't do any Dataflow-based verifications yet, because we'll have
> to redo them once new Dataflow containers are built.
>
> On Tue, Jan 14, 2020 at 6:27 PM Ahmet Altay  wrote:
>
>> I verified python 2 quickstarts with batch and streaming pipelines, wheel
>> files, and reviewed changes to the blog/website.
>>
>> Udi, could you send an updated version of the voting text with TODOs,
>> template pieces removed? We can discuss changes to the template separately.
>> My vote is +1 pending an updated vote text.
>>
>> On Tue, Jan 14, 2020 at 4:47 PM Udi Meiri  wrote:
>>
>>> Sorry about the messiness.
>>> The links at the bottom should be correct though.
>>>
>>> I intentionally did not replace MAVEN_VERSION because I didn't know how
>>> to get it (I didn't execute mvn for the release).
>>> As for JDK_VERSION, do we still need that? (If so, what about Python
>>> versions, such as the ones used for testing?)
>>> javac -version on my machine is 1.8.0_181-google-v7
>>>
>>
>> I believe we can drop MAVEN_VERSION now that it is no longer used. I do
>> not think it is needed to add a Gradle version either because the version
>> itself is part of the repo anyway.
>>
>> I do not know if java, python etc. versions are helpful. Maybe others can
>> comment. I would prefer to reduce the load on the release manager and drop
>> this if this is not particularly important.
>>
>>
>>>
>>>
>>> On Mon, Jan 13, 2020 at 7:37 PM Valentyn Tymofieiev 
>>> wrote:
>>>
>>>> There are some issues in this message, part of the message is still a
>>>> template (1.2.3, TODO, MAVEN_VERSION).
>>>> Before I noticed these issues, I ran a few Batch and Streaming Python
>>>> 3.7 pipelines using Direct and Dataflow runners, and they all succeeded.
>>>>
>>>> On Mon, Jan 13, 2020 at 4:09 PM Udi Meiri  wrote:
>>>>
>>>>> Hi everyone,
>>>>> Please review and vote on the release candidate #3 for the version
>>>>> 1.2.3, as follows:
>>>>> [ ] +1, Approve the release
>>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>>
>>>>>
>>>>> The complete staging area is available for your review, which includes:
>>>>> * JIRA release notes [1],
>>>>> * the official Apache source release to be deployed to dist.apache.org
>>>>> [2], which is signed with the key with fingerprint 8961 F3EF 8E79 6688 
>>>>> 4067
>>>>>  87CF 587B 049C 36DA AFE6 [3],
>>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>>> * source code tag "v1.2.3-RC3" [5],
>>>>>

Re: [PROPOSAL] Transition released containers to the official ASF dockerhub organization

2020-01-15 Thread Udi Meiri
SG +1
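
For reference, the repository naming proposed in the quoted message below,
apache/beam-{image_desc}:{version}, can be put next to the current
apachebeam/{image_desc}:{version} form in a short sketch. The function names
are hypothetical and only illustrate the string layout; nothing here talks to
Docker Hub:

```python
def current_image_name(image_desc, version):
    """Current layout: one repository per image under the apachebeam namespace."""
    return f"apachebeam/{image_desc}:{version}"


def proposed_image_name(image_desc, version):
    """Proposed layout: apache namespace, with a beam- prefix on the repository."""
    return f"apache/beam-{image_desc}:{version}"


for desc in ("python2.7_sdk", "flink1.9_job_server"):
    print(current_image_name(desc, "2.19.0"), "->", proposed_image_name(desc, "2.19.0"))
```

Keeping the version as the only tag means the image description stays in the
repository name in both layouts.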

On Wed, Jan 15, 2020 at 12:59 PM Hannah Jiang 
wrote:

> I have done some research about images released under apache namespace at
> docker hub, and here is my proposal.
>
> Currently, we are using apachebeam as our namespace and each image has its
> own repository. Version number is used to tag the images.
> e.g., apachebeam/python2.7_sdk:2.19.0, apachebeam/flink1.9_job_server:2.19.0
>
> Now we are migrating to apache namespace and docker hub doesn't support
> nested repository names, so we cannot use
> apache/beam/{image-desc}:{version}.
> Instead, I propose to use *apache/beam-{image_desc}:{version}* as our
> repository name.
> e.g., apache/beam-python2.7_sdk:2.19.0,
> apache/beam-flink1.9_job_server:2.19.0
> => When a user searches for *apache/beam* at docker hub, it will list all
> the repositories we deployed with apache/beam-, so no concerns that some
> released images are missed by users.
> => Repository names give insights to the users which repositories they
> should use.
> => A downside of this approach is that we need to create a new repository
> whenever we release a new image. The time and effort needed for this is
> still unknown; I am contacting the Apache Docker Hub management team.
>
> I have considered using beam as the repository name and moving the image name
> and version into tags (e.g., apache/beam:python3.7_sdk_2.19.0), which means
> putting all images in a single repository. However, this approach has some
> downsides.
> => When a user searches for apache/beam, only one repository is returned.
> Users need to use tags to identify which images they should use. Since we
> release images with new tags for each version, it will overwhelm the users
> and give them an impression that the images are not organized well. It's
> also difficult to know what kind of images we deployed.
> => With both image name and version included in tags, it is a little bit
> more complicated to maintain the code.
> => There is no clear answer as to which image the latest tag should point to.
>
> Are there any concerns with this proposal?
>
> Thanks,
> Hannah
>
>
>
>
> On Fri, Jan 10, 2020 at 4:19 PM Ahmet Altay  wrote:
>
>>
>>
>> On Fri, Jan 10, 2020 at 3:33 PM Ahmet Altay  wrote:
>>
>>>
>>>
>>> On Fri, Jan 10, 2020 at 3:32 PM Ankur Goenka  wrote:
>>>
 Also curious to know if Apache provides any infra support for projects
 under the Apache umbrella, and any quota limits they might have.

>>>
>> Maybe Hannah can ask with an infra ticket?
>>
>>
>>>
 On Fri, Jan 10, 2020, 2:26 PM Robert Bradshaw 
 wrote:

> One downside is that, unlike many of these projects, we release a
> dozen or so containers. Is there exactly (and only) one level of
> namespacing/nesting we can leverage here? (This isn't a blocker, but
> something to consider.)
>

>>> After a quick search, I could not find a way to use more than one level
>>> of repositories. We can use the naming scheme we currently use to help
>>> with this. Our repositories are named apachebeam/X; we could start using
>>> apache/beam/X.
>>>
>>>

> On Fri, Jan 10, 2020 at 2:06 PM Hannah Jiang 
> wrote:
> >
> > Thanks Ahmet for proposing it.
> > I will take it and work towards v2.19.
>

>> Missed this part. Thank you Hannah!
>>
>>
>>> >
> > Hannah
> >
> > On Fri, Jan 10, 2020 at 1:50 PM Kyle Weaver 
> wrote:
> >>
> >> It'd be nice to have the clout/official sheen of apache attached to
> our containers. Although getting the required permissions might add some
> small overhead to the release process. For example, yesterday, when we
> needed to create new repositories (not just update existing ones), since
> we have top-level ownership of the apachebeam organization, it was quick
> and easy to add them. I imagine we'd have had to get approval from someone
> outside the project to do that under the apache org. But this won't need
> to happen very often, so it's probably not that big a deal.
> >>
> >> On Fri, Jan 10, 2020 at 1:40 PM Ahmet Altay 
> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> I saw recent progress on the containers and wanted to bring this
> question to the attention of the dev list.
> >>>
> >>> Would it be possible to use the official ASF dockerhub
> organization for new Beam container releases? Concretely, starting from
> 2.19 could we release Beam containers to
> https://hub.docker.com/u/apache instead of
> https://hub.docker.com/u/apachebeam ?
> >>>
> >>> Ahmet
>





Re: [VOTE] Release 2.18.0, release candidate #1

2020-01-15 Thread Udi Meiri
Dataflow containers have been updated. Test away.

On Tue, Jan 14, 2020 at 6:37 PM Udi Meiri  wrote:

> Here's my second take:
>
> Hi everyone,
> Please review and vote on the release candidate #1 for the version 2.18.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org
> [2], which is signed with the key with fingerprint 8961 F3EF 8E79 6688 4067
>  87CF 587B 049C 36DA AFE6 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "v2.18.0-RC1" [5],
> * website pull request listing the release [6], publishing the API
> reference manual [7], and the blog post [8].
> * Java artifacts were built with Maven N/A and OpenJDK 1.8.0_181-google-v7.
> * Python artifacts are deployed along with the source release to the
> dist.apache.org [2].
> * Validation sheet with a tab for 2.18.0 release to help with validation
> [9].
> * Docker images published to Docker Hub [10].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
> NOTE: The vote will start once new Dataflow containers are built.
>
> Thanks,
> Release Manager
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346383&projectId=12319527
> [2] https://dist.apache.org/repos/dist/dev/beam/2.18.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1090/
> [5] https://github.com/apache/beam/tree/v2.18.0-RC1
> [6] https://github.com/apache/beam/pull/10574
> [7] https://github.com/apache/beam-site/pull/595
> [8] https://github.com/apache/beam/pull/10575
> [9]
> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1178617819
> [10] https://hub.docker.com/u/apachebeam
>
>
> On Tue, Jan 14, 2020 at 6:34 PM Udi Meiri  wrote:
>
>> Please don't do any Dataflow-based verifications yet, because we'll have
>> to redo them once new Dataflow containers are built.
>>
>> On Tue, Jan 14, 2020 at 6:27 PM Ahmet Altay  wrote:
>>
>>> I verified python 2 quickstarts with batch and streaming pipelines,
>>> wheel files, and reviewed changes to the blog/website.
>>>
>>> Udi, could you send an updated version of the voting text with TODOs,
>>> template pieces removed? We can discuss changes to the template separately.
>>> My vote is +1 pending an updated vote text.
>>>
>>> On Tue, Jan 14, 2020 at 4:47 PM Udi Meiri  wrote:
>>>
>>>> Sorry about the messiness.
>>>> The links at the bottom should be correct though.
>>>>
>>>> I intentionally did not replace MAVEN_VERSION because I didn't know how
>>>> to get it (I didn't execute mvn for the release).
>>>> As for JDK_VERSION, do we still need that? (If so, what about Python
>>>> versions, such as the ones used for testing?)
>>>> javac -version on my machine is 1.8.0_181-google-v7
>>>>
>>>
>>> I believe we can drop MAVEN_VERSION now that it is no longer used. I do
>>> not think it is needed to add a Gradle version either because the version
>>> itself is part of the repo anyway.
>>>
>>> I do not know if java, python etc. versions are helpful. Maybe others
>>> can comment. I would prefer to reduce the load on the release manager and
>>> drop this if this is not particularly important.
>>>
>>>
>>>>
>>>>
>>>> On Mon, Jan 13, 2020 at 7:37 PM Valentyn Tymofieiev <
>>>> valen...@google.com> wrote:
>>>>
>>>>> There are some issues in this message, part of the message is still a
>>>>> template (1.2.3, TODO, MAVEN_VERSION).
>>>>> Before I noticed these issues, I ran a few Batch and Streaming Python
>>>>> 3.7 pipelines using Direct and Dataflow runners, and they all succeeded.
>>>>>
>>>>> On Mon, Jan 13, 2020 at 4:09 PM Udi Meiri  wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>> Please review and vote on the release candidate #3 for the version
>>>>>> 1.2.3, as follows:
>>>>>> [ ] +1, Approve the release
>>>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>>>
>>>>>>
>>>>>

Re: [BEAM-9015] Adding pyXX-cloud instead of pyXX-gcp and pyXX-aws

2020-01-15 Thread Udi Meiri
We would like to run unit tests using these dependencies (gcp+aws) in
presubmits.
Having separate tox environments for each would require running tox for
each, increasing presubmit time and duplicating work (since most tests
don't depend on aws or gcp).


On Wed, Jan 15, 2020 at 1:54 PM Kyle Weaver  wrote:

> Just now seeing this -- are we sure we want to mix the signal from what
> are logically two totally separate test suites?
> Or from the opposite perspective, what's the motivation for wanting one
> test suite instead of two?
>
> On Tue, Jan 14, 2020 at 3:25 PM Pablo Estrada  wrote:
>
>> now back from the holidays, I intend to do this - one of these days.
>>
>> On Thu, Dec 26, 2019 at 12:51 PM Udi Meiri  wrote:
>>
>>> +1
>>>
>>> On Mon, Dec 23, 2019, 17:28 Robert Bradshaw  wrote:
>>>
>>>> Makes sense to me.
>>>>
>>>> On Mon, Dec 23, 2019 at 3:33 PM Pablo Estrada 
>>>> wrote:
>>>> >
>>>> > Hi all,
>>>> > a couple of contributors [1][2] have been kind enough to add support
>>>> for s3 filesystem[3] for the Python SDK. Part of this involved adding a tox
>>>> task called py37-aws, to install the relevant dependencies and run unit
>>>> tests for it (in a mocked-out environment).
>>>> >
>>>> > To avoid running a full extra test suite, I thought we could add the
>>>> new aws-related dependencies to the current pyXX-gcp suites, and perhaps
>>>> rename to pyXX-cloud, to include all unit tests that require cloud-specific
>>>> dependencies. What do others think?
>>>> >
>>>> > This is tracked here: https://jira.apache.org/jira/browse/BEAM-9015
>>>> >
>>>> > [1] https://github.com/tamera-lanham
>>>> > [2] https://github.com/MattMorgis
>>>> > [3] https://github.com/apache/beam/pull/9955
>>>>
>>>




Re: [VOTE] Release 2.18.0, release candidate #1

2020-01-21 Thread Udi Meiri
I was not aware of https://issues.apache.org/jira/browse/BEAM-9123 or the
PR on the release branch.

On Tue, Jan 21, 2020 at 11:16 AM Robert Bradshaw 
wrote:

> The source tarball seems to be missing the commit at
>
> https://github.com/apache/beam/commit/a61dfbf4570e3adb30e15315c116751faeda897e
>
> On Tue, Jan 21, 2020 at 9:49 AM Ahmet Altay  wrote:
> >
> > All, could you help with validations and voting?
> >
> > On Wed, Jan 15, 2020 at 6:14 PM Ahmet Altay  wrote:
> >>
> >> +1, validated the same things, they still work. Thank you.
> >>
> >> On Wed, Jan 15, 2020 at 5:01 PM Udi Meiri  wrote:
> >>>
> >>> Dataflow containers have been updated. Test away.
> >>>
> >>> On Tue, Jan 14, 2020 at 6:37 PM Udi Meiri  wrote:
> >>>>
> >>>> Here's my second take:
> >>>>
> >>>> Hi everyone,
> >>>> Please review and vote on the release candidate #1 for the version
> 2.18.0, as follows:
> >>>> [ ] +1, Approve the release
> >>>> [ ] -1, Do not approve the release (please provide specific comments)
> >>>>
> >>>> The complete staging area is available for your review, which
> includes:
> >>>> * JIRA release notes [1],
> >>>> * the official Apache source release to be deployed to
> dist.apache.org [2], which is signed with the key with fingerprint 8961
> F3EF 8E79 6688 4067  87CF 587B 049C 36DA AFE6 [3],
> >>>> * all artifacts to be deployed to the Maven Central Repository [4],
> >>>> * source code tag "v2.18.0-RC1" [5],
> >>>> * website pull request listing the release [6], publishing the API
> reference manual [7], and the blog post [8].
> >>>> * Java artifacts were built with Maven N/A and OpenJDK
> 1.8.0_181-google-v7.
> >>>> * Python artifacts are deployed along with the source release to the
> dist.apache.org [2].
> >>>> * Validation sheet with a tab for 2.18.0 release to help with
> validation [9].
> >>>> * Docker images published to Docker Hub [10].
> >>>>
> >>>> The vote will be open for at least 72 hours. It is adopted by
> majority approval, with at least 3 PMC affirmative votes.
> >>>> NOTE: The vote will start once new Dataflow containers are built.
> >>>>
> >>>> Thanks,
> >>>> Release Manager
> >>>>
> >>>> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346383&projectId=12319527
> >>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.18.0/
> >>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> >>>> [4]
> https://repository.apache.org/content/repositories/orgapachebeam-1090/
> >>>> [5] https://github.com/apache/beam/tree/v2.18.0-RC1
> >>>> [6] https://github.com/apache/beam/pull/10574
> >>>> [7] https://github.com/apache/beam-site/pull/595
> >>>> [8] https://github.com/apache/beam/pull/10575
> >>>> [9]
> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1178617819
> >>>> [10] https://hub.docker.com/u/apachebeam
> >>>>
> >>>>
> >>>> On Tue, Jan 14, 2020 at 6:34 PM Udi Meiri  wrote:
> >>>>>
> >>>>> Please don't do any Dataflow-based verifications yet, because we'll
> have to redo them once new Dataflow containers are built.
> >>>>>
> >>>>> On Tue, Jan 14, 2020 at 6:27 PM Ahmet Altay 
> wrote:
> >>>>>>
> >>>>>> I verified python 2 quickstarts with batch and streaming pipelines,
> wheel files, and reviewed changes to the blog/website.
> >>>>>>
> >>>>>> Udi, could you send an updated version of the voting text with
> TODOs, template pieces removed? We can discuss changes to the template
> separately. My vote is +1 pending an updated vote text.
> >>>>>>
> >>>>>> On Tue, Jan 14, 2020 at 4:47 PM Udi Meiri  wrote:
> >>>>>>>
> >>>>>>> Sorry about the messiness.
> >>>>>>> The links at the bottom should be correct though.
> >>>>>>>
> >>>>>>> I intentionally did not replace MAVEN_VERSION because I didn't
> know how to get it (I didn't execute mvn for the release).
> >>>>>>> As for JDK_VERSION, do we still need that? (If so, what about
> Python ver

Re: [DISCUSS] Autoformat python code with Black

2020-01-22 Thread Udi Meiri
+1 to autoformatting

On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik  wrote:

> +1 to autoformatters. Also the Beam Java SDK went through a one time pass
> to apply the spotless formatting.
>
> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay  wrote:
>
>> +1 to autoformatters and yapf. It appears to be a well maintained
>> project. I do support making a one time pass to apply formatting the whole
>> code base.
>>
>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova  wrote:
>>
>>>
>>>> It'd be good if there was a way to only apply to violating (or at
>>>> least changed) lines.
>>>
>>>
>>> I assumed the first thing we’d do is convert all of the code in one go,
>>> since it’s a very safe operation. Did you have something else in mind?
>>>
>>> -chad
>>>
>>>
>>>
>>>
>>>

 On Tue, Jan 21, 2020 at 1:56 PM Chad Dombrova 
 wrote:
 >
 > +1 to autoformatting
 >
 > Let me add some nuance to that.
 >
 > The way I see it there are 2 varieties of formatters:  those which
 take the original formatting into consideration (autopep8) and those which
 disregard it (yapf, black).
 >
 > I much prefer yapf to black, because you have plenty of options to
 tweak with yapf (enough to make the output a pretty close match to the
 current Beam style), and you can mark areas to preserve the original
 formatting, which could be very useful with Pipeline building with pipe
 operators.  Please don't pick black.
 >
 > autopep8 is more along the lines of spotless in Java -- it only
 corrects code that breaks the project's style rules.  The big problem with
 Beam's current style is that it is so esoteric that autopep8 can't enforce
 it -- and I'm not just talking about 2-spaces, which I don't really have a
 problem with -- the problem is the use of either 2 or 4 spaces depending on
 context (expression start vs hanging indent, etc).  This is my *biggest*
 gripe about the current style.  PyCharm doesn't have enough control
 either.  So, if we can choose a style that can be expressed by flake8 or
 pycodestyle then we can use autopep8 to enforce it.
 >
 > I'd prefer autopep8 to yapf because I like having a little wiggle
 room to influence the style, but on a big project like Beam all that wiggle
 room ends up leading to minor but noticeable inconsistencies in style throughout
 the project.  yapf ensures completely consistent style, but the tradeoff is
 that it's sometimes ugly, especially in scenarios with similar repeated
 entries like argparse, where yapf might insert line breaks in visually
 inconsistent and unappealing ways depending on the lengths of the keywords
 and expressions involved.
 >
 > Either way (but especially if we choose yapf) I think it'd be a nice
 addition to setup a pre-commit [1] config so that people can opt in to
 running *lightweight* autofixers prior to commit.  This will not only
 reduce dev frustration but will also reduce the amount of cpu cycles that
 Jenkins spends pointing out lint errors.
 >
 > [1] https://pre-commit.com/
 >
 > -chad
 >
 >
 >
 >
 > On Tue, Jan 21, 2020 at 12:52 PM Ismaël Mejía 
 wrote:
 >>
 >> Last time we discussed this there did not seem to be much progress on
 autoformatting.
 >> This tool looks more tweakable, so maybe it could be more
 appropriate for Beam's use case.
 >> https://github.com/google/yapf/
 >> WDYT?
 >>
 >>
 >> On Thu, May 30, 2019 at 10:50 AM Łukasz Gajowy 
 wrote:
 >>>
 >>> +1 for any autoformatter for Python SDK that does the job. My
 experience is that since spotless in Java SDK I would never start a new
 Java project without it. So many great benefits not only for one person
 coding but for all community.
 >>>
 >>> It is a GitHub UI issue that you cannot easily browse past the
 reformat. It is not actually that hard, but does take a couple extra clicks
 to get GitHub to display blame before a reformat. It is easier with the
 command line. I do a lot of code history digging and the global Java
 reformat is not really a problem.
 >>>
 >>> It's actually one more click on Github but I agree it's not the
 best way to search the history. The most convenient and clear one I've
 found so far is in JetBrains IDEs (IntelliJ) where you can:
 >>>
 >>> right click on line number -> "annotate" -> click again ->
 "annotate previous revision" -> ...
 >>>
 >>> You can also use "compare with" to see the diff between two
 revisions.
 >>>
 >>> Łukasz
 >>>
 >>>
 >>>
 >>>
 >>>
 >>> czw., 30 maj 2019 o 06:15 Kenneth Knowles 
 napisał(a):
 
  +1 pending good enough tooling (I can't quite tell - seems there
 are some issues?)
 
  On Wed, May 29, 2019 at 2:40 PM Katarzyna Kucharczyk <
 ka.kucharc...@gmail.com> wrot

Re: [VOTE] Release 2.18.0, release candidate #1

2020-01-22 Thread Udi Meiri
Thomas, please let us know if you learn more about possible root causes of
the regression you're seeing.
Also, if you believe this should block the release then please vote -1.

Does Beam have performance tests for the Python Flink portable streaming
case?

On Wed, Jan 22, 2020 at 8:08 AM Jean-Baptiste Onofré 
wrote:

> +1 (binding)
>
> Quickly tested on beam-samples.
>
> Regards
> JB
>
> On 22/01/2020 16:33, Ismaël Mejía wrote:
> > +1 (binding)
> >
> > - Validated signatures
> > - Run Python wordcount on Direct runner (from wheels)
> > - Run Python wordcount on Flink runner with job-server image (via wheels)
> > - Run Python wordcount on Spark runner with job-server from source (via
> > wheels)
> > - Validate no regressions on Nexmark for Spark classic runner
> > - Validate provided artifacts in two external projects beam-samples +
> > one internal company project
> >
> > Thanks Kyle, I ran it as you said and everything worked, apart from a
> > weird exception on removal of a directory.
> > +1 to updating the release validation guide/script; it looks worthwhile for this case.
> >
> >
> > On Wed, Jan 22, 2020 at 1:59 AM Kyle Weaver wrote:
> >
> > > Also, does anyone know how I (we) can validate the new docker
> > image for Flink's job server included in this release?
> >
> > To start the job server:
> >
> > docker run --net=host apachebeam/flink1.9_job_server:2.18.0_rc1
> >
> > Then you can run any Beam Java/Python/Go job with job endpoint
> > localhost:8099 to validate.
> >
> > I can update the release validation script/guide with more
> instructions.
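
A minimal sketch of the validation step described above, assuming the job server container is already running and listening on localhost:8099. `PortableRunner`, `--job_endpoint`, and `--environment_type` are existing Beam pipeline options; the helper names and the choice of `LOOPBACK` as the worker environment are illustrative assumptions, not something taken from the release guide.

```python
def portable_runner_args(job_endpoint="localhost:8099",
                         environment_type="LOOPBACK"):
    """Build pipeline arguments that point a Beam Python job at a job server.

    The defaults are assumptions for a local smoke test: the endpoint matches
    the one mentioned in the thread, and LOOPBACK runs the SDK worker in the
    submitting process.
    """
    return [
        "--runner=PortableRunner",
        "--job_endpoint={}".format(job_endpoint),
        "--environment_type={}".format(environment_type),
    ]


def run_smoke_test(argv=None):
    """Submit a tiny counting job; needs apache_beam and a running job server."""
    # Imported lazily so the argument helper can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(argv or portable_runner_args())
    with beam.Pipeline(options=options) as p:
        _ = (
            p
            | beam.Create(["a", "b", "a"])
            | beam.combiners.Count.PerElement()
            | beam.Map(print))
```

Calling `run_smoke_test()` after starting the container should be enough to confirm that the job server accepts and runs a job; anything beyond that (Java or Go jobs, streaming) would need its own check.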
> >
> > On Tue, Jan 21, 2020 at 1:48 PM Robert Bradshaw wrote:
> >
> > On Tue, Jan 21, 2020 at 12:04 PM Ahmet Altay wrote:
> > >
> > > This change (https://github.com/apache/beam/pull/10625) was
> > merged after the RC1 email was out. IMO, we do not need to block
> > RC1 vote for this. If there will be an RC2 the change will be
> > included.
> >
> > Agreed, we do not need to block RC1 due to this PR that didn't
> make
> > it. Just wanted to confirm that it wasn't an oversight.
> >
> > The signatures and wheels look good to me. +1 (binding).
> >
> > > I recall we had a similar thread before. Please, include the
> > release managers in the PRs that are/will be merged into the
> > release branch and tag JIRA issues with the all relevant
> > releases that should be blocked on it.
> > >
> > > Ahmet
> > >
> > > On Tue, Jan 21, 2020 at 11:36 AM Udi Meiri wrote:
> > >>
> > >> I was not aware of
> > https://issues.apache.org/jira/browse/BEAM-9123 or the PR on the
> > release branch.
> > >>
> > >> On Tue, Jan 21, 2020 at 11:16 AM Robert Bradshaw wrote:
> > >>>
> > >>> The source tarball seems to be missing the commit at
> > >>>
> >
> https://github.com/apache/beam/commit/a61dfbf4570e3adb30e15315c116751faeda897e
> > >>>
> > >>> On Tue, Jan 21, 2020 at 9:49 AM Ahmet Altay wrote:
> > >>> >
> > >>> > All, could you help with validations and voting?
> > >>> >
> > >>> > On Wed, Jan 15, 2020 at 6:14 PM Ahmet Altay wrote:
> > >>> >>
> > >>> >> +1, validated the same things, they still work. Thank you.
> > >>> >>
> > >>> >> On Wed, Jan 15, 2020 at 5:01 PM Udi Meiri wrote:
> > >>> >>>
> > >>> >>> Dataflow containers have been updated. Test away.
> > >>> >>>
> > >>> >>> On Tue, Jan 14, 2020 at 6:37 PM Udi Meiri wrote:
> > >>> >>>>

Re: [DISCUSS] Autoformat python code with Black

2020-01-22 Thread Udi Meiri
It sounds like there's a consensus for yapf. I volunteer to take this on

On Wed, Jan 22, 2020, 10:31 Udi Meiri  wrote:

> +1 to autoformatting
>
> On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik  wrote:
>
>> +1 to autoformatters. Also the Beam Java SDK went through a one time pass
>> to apply the spotless formatting.
>>
>> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay  wrote:
>>
>>> +1 to autoformatters and yapf. It appears to be a well maintained
>>> project. I do support making a one time pass to apply formatting the whole
>>> code base.
>>>
>>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova  wrote:
>>>
>>>>
>>>>> It'd be good if there was a way to only apply to violating (or at
>>>>> least changed) lines.
>>>>
>>>>
>>>> I assumed the first thing we’d do is convert all of the code in one go,
>>>> since it’s a very safe operation. Did you have something else in mind?
>>>>
>>>> -chad
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> On Tue, Jan 21, 2020 at 1:56 PM Chad Dombrova 
>>>>> wrote:
>>>>> >
>>>>> > +1 to autoformatting
>>>>> >
>>>>> > Let me add some nuance to that.
>>>>> >
>>>>> > The way I see it there are 2 varieties of formatters:  those which
>>>>> take the original formatting into consideration (autopep8) and those which
>>>>> disregard it (yapf, black).
>>>>> >
>>>>> > I much prefer yapf to black, because you have plenty of options to
>>>>> tweak with yapf (enough to make the output a pretty close match to the
>>>>> current Beam style), and you can mark areas to preserve the original
>>>>> formatting, which could be very useful with Pipeline building with pipe
>>>>> operators.  Please don't pick black.
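
As an illustration of the "mark areas to preserve the original formatting" point, here is a small sketch. The `# yapf: disable` and `# yapf: enable` comments are yapf's own directives; the `Step` class is a hypothetical stand-in used only so that the `|` chain runs without Beam installed.

```python
class Step:
    """Hypothetical stand-in for a transform; it only provides a `|` operator."""

    def __init__(self, name, previous=()):
        # Names of all steps chained so far, ending with this one.
        self.steps = tuple(previous) + (name,)

    def __or__(self, other):
        # Chain `other` after everything already recorded on `self`.
        return Step(other.steps[-1], self.steps)


# yapf: disable
pipeline = (
    Step("read")
    | Step("parse")
    | Step("count")
    | Step("write"))
# yapf: enable
```

With the directives in place, yapf leaves the one-step-per-line layout between them untouched and formats the rest of the file as usual.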
>>>>> >
>>>>> > autopep8 is more along the lines of spotless in Java -- it only
>>>>> corrects code that breaks the project's style rules.  The big problem with
>>>>> Beam's current style is that it is so esoteric that autopep8 can't enforce
>>>>> it -- and I'm not just talking about 2-spaces, which I don't really have a
>>>>> problem with -- the problem is the use of either 2 or 4 spaces depending on
>>>>> context (expression start vs hanging indent, etc).  This is my *biggest*
>>>>> gripe about the current style.  PyCharm doesn't have enough control
>>>>> either.  So, if we can choose a style that can be expressed by flake8 or
>>>>> pycodestyle then we can use autopep8 to enforce it.
>>>>> >
>>>>> > I'd prefer autopep8 to yapf because I like having a little wiggle
>>>>> room to influence the style, but on a big project like Beam all that wiggle
>>>>> room ends up leading to minor but noticeable inconsistencies in style throughout
>>>>> the project.  yapf ensures completely consistent style, but the tradeoff is
>>>>> that it's sometimes ugly, especially in scenarios with similar repeated
>>>>> entries like argparse, where yapf might insert line breaks in visually
>>>>> inconsistent and unappealing ways depending on the lengths of the keywords
>>>>> and expressions involved.
>>>>> >
>>>>> > Either way (but especially if we choose yapf) I think it'd be a nice
>>>>> addition to setup a pre-commit [1] config so that people can opt in to
>>>>> running *lightweight* autofixers prior to commit.  This will not only
>>>>> reduce dev frustration but will also reduce the amount of cpu cycles that
>>>>> Jenkins spends pointing out lint errors.
>>>>> >
>>>>> > [1] https://pre-commit.com/
>>>>> >
>>>>> > -chad
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, Jan 21, 2020 at 12:52 PM Ismaël Mejía 
>>>>> wrote:
>>>>> >>
>>>>> >> Last time we discussed this there did not seem to be much progress
>>>>> on autoformatting.
>>>>> >> This tool looks more tweakable, so maybe it could be more
>>>>> appropriate for Beam's use c

Re: [DISCUSS] Autoformat python code with Black

2020-01-22 Thread Udi Meiri
Sorry, backing off on this due to time constraints.

On Wed, Jan 22, 2020 at 3:39 PM Udi Meiri  wrote:

> It sounds like there's a consensus for yapf. I volunteer to take this on
>
> On Wed, Jan 22, 2020, 10:31 Udi Meiri  wrote:
>
>> +1 to autoformatting
>>
>> On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik  wrote:
>>
>>> +1 to autoformatters. Also the Beam Java SDK went through a one time
>>> pass to apply the spotless formatting.
>>>
>>> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay  wrote:
>>>
>>>> +1 to autoformatters and yapf. It appears to be a well maintained
>>>> project. I do support making a one time pass to apply formatting the whole
>>>> code base.
>>>>
>>>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova 
>>>> wrote:
>>>>
>>>>>
>>>>>> It'd be good if there was a way to only apply to violating (or at
>>>>>> least changed) lines.
>>>>>
>>>>>
>>>>> I assumed the first thing we’d do is convert all of the code in one
>>>>> go, since it’s a very safe operation. Did you have something else in mind?
>>>>>
>>>>> -chad
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> On Tue, Jan 21, 2020 at 1:56 PM Chad Dombrova 
>>>>>> wrote:
>>>>>> >
>>>>>> > +1 to autoformatting
>>>>>> >
>>>>>> > Let me add some nuance to that.
>>>>>> >
>>>>>> > The way I see it there are 2 varieties of formatters:  those which
>>>>>> take the original formatting into consideration (autopep8) and those which
>>>>>> disregard it (yapf, black).
>>>>>> >
>>>>>> > I much prefer yapf to black, because you have plenty of options to
>>>>>> tweak with yapf (enough to make the output a pretty close match to the
>>>>>> current Beam style), and you can mark areas to preserve the original
>>>>>> formatting, which could be very useful with Pipeline building with pipe
>>>>>> operators.  Please don't pick black.
>>>>>> >
>>>>>> > autopep8 is more along the lines of spotless in Java -- it only
>>>>>> corrects code that breaks the project's style rules.  The big problem with
>>>>>> Beam's current style is that it is so esoteric that autopep8 can't enforce
>>>>>> it -- and I'm not just talking about 2-spaces, which I don't really have a
>>>>>> problem with -- the problem is the use of either 2 or 4 spaces depending on
>>>>>> context (expression start vs hanging indent, etc).  This is my *biggest*
>>>>>> gripe about the current style.  PyCharm doesn't have enough control
>>>>>> either.  So, if we can choose a style that can be expressed by flake8 or
>>>>>> pycodestyle then we can use autopep8 to enforce it.
>>>>>> >
>>>>>> > I'd prefer autopep8 to yapf because I like having a little wiggle
>>>>>> room to influence the style, but on a big project like Beam all that wiggle
>>>>>> room ends up leading to minor but noticeable inconsistencies in style throughout
>>>>>> the project.  yapf ensures completely consistent style, but the tradeoff is
>>>>>> that it's sometimes ugly, especially in scenarios with similar repeated
>>>>>> entries like argparse, where yapf might insert line breaks in visually
>>>>>> inconsistent and unappealing ways depending on the lengths of the keywords
>>>>>> and expressions involved.
>>>>>> >
>>>>>> > Either way (but especially if we choose yapf) I think it'd be a
>>>>>> nice addition to set up a pre-commit [1] config so that people can opt in to
>>>>>> running *lightweight* autofixers prior to commit.  This will not only
>>>>>> reduce dev frustration but will also reduce the amount of cpu cycles that
>>>>>> Jenkins spends pointing out lint errors.
>>>>>> >
>>>>>> >

[RESULT] [VOTE] Release 2.18.0, release candidate #1

2020-01-23 Thread Udi Meiri
I'm happy to announce that we have unanimously approved this release.

There are 5 approving votes, 4 of which are binding:
* Ahmet Altay
* Robert Bradshaw
* Ismaël Mejía
* Jean-Baptiste Onofré

There are no disapproving votes.

Thanks everyone!




Re: [DISCUSS] Autoformat python code with Black

2020-01-27 Thread Udi Meiri
I've done a pass on the PR on code I'm familiar with.
Please make a pass and add your suggestions on the PR.

On Fri, Jan 24, 2020 at 7:15 AM Ismaël Mejía  wrote:

> Java build fails on any unformatted code so python probably should be like
> that.
> We have to ensure however that it fails early on that.
> As Robert said time to debate the knobs :)
>
> On Fri, Jan 24, 2020 at 3:19 PM Kamil Wasilewski <
> kamil.wasilew...@polidea.com> wrote:
>
>> PR is ready: https://github.com/apache/beam/pull/10684. Please share
>> your comments ;-) I've managed to reduce the impact a bit:
>> 501 files changed, 18245 insertions(+), 19495 deletions(-)
>>
>> We still need to consider how to enforce the usage of autoformatter.
>> Pre-commit sounds like a nice addition, but it still needs to be installed
>> manually by a developer. On the other hand, Jenkins precommit job that
>> fails if any unformatted code is detected looks too strict. What do
>> you think?
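
One way to picture the enforcement question is a check that fails when yapf would change anything. The sketch below assumes `yapf` is installed and on the PATH; `--diff` and `--recursive` are existing yapf flags, while the function names and the idea of treating any diff output as a failure are illustrative assumptions rather than what the PR does.

```python
import subprocess


def yapf_check_command(paths):
    """Command line that prints a diff for files yapf would reformat."""
    return ["yapf", "--diff", "--recursive"] + list(paths)


def is_formatted(diff_output):
    """Treat empty diff output as 'already formatted'."""
    return diff_output.strip() == ""


def check(paths):
    """Run yapf in diff mode and report whether the tree is clean."""
    result = subprocess.run(
        yapf_check_command(paths),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True)
    return is_formatted(result.stdout)
```

The same `check` function could back either an opt-in pre-commit hook or a Jenkins precommit job, so the choice between the two is about where it runs, not what it runs.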
>>
>> On Thu, Jan 23, 2020 at 8:37 PM Robert Bradshaw 
>> wrote:
>>
>>> Thanks! Now we get to debate what knobs to twiddle :-P
>>>
>>> FYI, I did a simple run (just pushed to
>>> https://github.com/apache/beam/compare/master...robertwb:yapf) to see
>>> the impact. The diff is
>>>
>>> $ git diff --stat master
>>> ...
>>>  547 files changed, 22118 insertions(+), 21129 deletions(-)
>>>
>>> For reference
>>>
>>> $ find sdks/python/apache_beam -name '*.py' | xargs wc
>>> ...
>>> 200424  612002 7431637 total
>>>
>>> which means a little over 10% of lines get touched. I think there are
>>> some options, such as SPLIT_ALL_TOP_LEVEL_COMMA_SEPARATED_VALUES and
>>> COALESCE_BRACKETS, that will conform more to the style we are already
>>> (mostly) following.
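
For reference, the knobs mentioned above live in a `[style]` section of a yapf style file. The sketch below renders such a section with the standard library; the knob names are real yapf options, but the chosen values (`pep8` base, 2-space indent, both knobs enabled) are assumptions for illustration and not the configuration the project settled on.

```python
import configparser
import io


def render_yapf_style(overrides):
    """Render the text of a yapf style file with a single [style] section."""
    parser = configparser.ConfigParser()
    # Assumed baseline: PEP 8 style with 2-space indentation.
    parser["style"] = {"based_on_style": "pep8", "indent_width": "2"}
    # configparser lowercases option names, matching yapf's file format.
    parser["style"].update({key: str(value) for key, value in overrides.items()})
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


style_text = render_yapf_style({
    "SPLIT_ALL_TOP_LEVEL_COMMA_SEPARATED_VALUES": "true",
    "COALESCE_BRACKETS": "true",
})
```

Writing `style_text` to a `.style.yapf` file at the repository root and re-running `git diff --stat` is a cheap way to compare how many lines each combination of knobs touches.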
>>>
>>>
>>> On Thu, Jan 23, 2020 at 1:59 AM Kamil Wasilewski
>>>  wrote:
>>> >
>>> > Thank you Michał for creating the ticket. I have some free time and
>>> I'd like to volunteer myself for this task.
>>> > Indeed, it looks like there's consensus for `yapf`, so I'll try `yapf`
>>> first.
>>> >
>>> > Best,
>>> > Kamil
>>> >
>>> >
>>> > On Thu, Jan 23, 2020 at 10:37 AM Michał Walenia <
>>> michal.wale...@polidea.com> wrote:
>>> >>
>>> >> Hi all,
>>> >> I created a JIRA issue for this and summarized the available tools
>>> >>
>>> >> https://issues.apache.org/jira/browse/BEAM-9175
>>> >>
>>> >> Cheers,
>>> >> Michal
>>> >>
>>> >> On Thu, Jan 23, 2020 at 1:49 AM Udi Meiri  wrote:
>>> >>>
>>> >>> Sorry, backing off on this due to time constraints.
>>> >>>
>>> >>> On Wed, Jan 22, 2020 at 3:39 PM Udi Meiri  wrote:
>>> >>>>
>>> >>>> It sounds like there's a consensus for yapf. I volunteer to take
>>> this on
>>> >>>>
>>> >>>> On Wed, Jan 22, 2020, 10:31 Udi Meiri  wrote:
>>> >>>>>
>>> >>>>> +1 to autoformatting
>>> >>>>>
>>> >>>>> On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik 
>>> wrote:
>>> >>>>>>
>>> >>>>>> +1 to autoformatters. Also the Beam Java SDK went through a one
>>> time pass to apply the spotless formatting.
>>> >>>>>>
>>> >>>>>> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay 
>>> wrote:
>>> >>>>>>>
>>> >>>>>>> +1 to autoformatters and yapf. It appears to be a well
>>> maintained project. I do support making a one time pass to apply formatting
>>> the whole code base.
>>> >>>>>>>
>>> >>>>>>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova 
>>> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> It'd be good if there was a way to only apply to violating (or
>>> at
>>> >>>>>>>>> least changed) lines.
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> I assumed the first thing we’d do is co

Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-27 Thread Udi Meiri
Congratulations Michał!

On Mon, Jan 27, 2020 at 3:49 PM Chamikara Jayalath 
wrote:

> Congrats Michał!
>
> On Mon, Jan 27, 2020 at 2:59 PM Reza Rokni  wrote:
>
>> Congratulations buddy!
>>
>> On Tue, 28 Jan 2020, 06:52 Valentyn Tymofieiev, 
>> wrote:
>>
>>> Congratulations, Michał!
>>>
>>> On Mon, Jan 27, 2020 at 2:24 PM Austin Bennett <
>>> whatwouldausti...@gmail.com> wrote:
>>>
 Nice -- keep up the good work!

 On Mon, Jan 27, 2020 at 2:02 PM Mikhail Gryzykhin 
 wrote:
 >
 > Congratulations Michal!
 >
 > --Mikhail
 >
 > On Mon, Jan 27, 2020 at 1:01 PM Kyle Weaver 
 wrote:
 >>
 >> Congratulations Michał! Looking forward to your future contributions
 :)
 >>
 >> Thanks,
 >> Kyle
 >>
 >> On Mon, Jan 27, 2020 at 12:47 PM Pablo Estrada 
 wrote:
 >>>
 >>> Hi everyone,
 >>>
 >>> Please join me and the rest of the Beam PMC in welcoming a new
 committer: Michał Walenia
 >>>
 >>> Michał has contributed to Beam in many ways, including the
 performance testing infrastructure, and has even spoken at events about
 Beam.
 >>>
 >>> In consideration of his contributions, the Beam PMC trusts him with
 the responsibilities of a Beam committer[1].
 >>>
 >>> Thanks for your contributions Michał!
 >>>
 >>> Pablo, on behalf of the Apache Beam PMC.
 >>>
 >>> [1]
 https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

>>>




[ANNOUNCE] Beam 2.18.0 Released

2020-01-28 Thread Udi Meiri
The Apache Beam team is pleased to announce the release of version 2.18.0.

Apache Beam is an open source unified programming model to define and
execute data processing pipelines, including ETL, batch and stream
(continuous) processing. See https://beam.apache.org

You can download the release here:

https://beam.apache.org/get-started/downloads/

This release includes bug fixes, features, and improvements detailed on
the Beam blog: https://beam.apache.org/blog/2020/01/13/beam-2.18.0.html

Thanks to everyone who contributed to this release, and we hope you enjoy
using Beam 2.18.0.
-- Udi Meiri, on behalf of The Apache Beam team


anyone working on updating com.google.cloud:google-cloud-spanner?

2020-01-28 Thread Udi Meiri
It's currently at 1.6.0, which is a year old.
https://github.com/googleapis/java-spanner/releases?after=1.9.0

I would appreciate any help with this.

Tracking issues:
https://issues.apache.org/jira/browse/BEAM-8758
https://issues.apache.org/jira/browse/BEAM-8682




Re: [ANNOUNCE] New committer: Hannah Jiang

2020-01-28 Thread Udi Meiri
Welcome and congrats Hannah!

On Tue, Jan 28, 2020 at 4:52 PM Robin Qiu  wrote:

> Congratulations, Hannah!
>
> On Tue, Jan 28, 2020 at 4:50 PM Alan Myrvold  wrote:
>
>> Congrats, Hannah
>>
>> On Tue, Jan 28, 2020 at 4:46 PM Connell O'Callaghan 
>> wrote:
>>
>>> Thank you for sharing Luke!!!
>>>
>>> Well done and congratulations Hannah!!
>>>
>>> On Tue, Jan 28, 2020 at 4:45 PM Heejong Lee  wrote:
>>>
 Congratulations! :)

 On Tue, Jan 28, 2020 at 4:43 PM Yichi Zhang  wrote:

> Congrats Hannah!
>
> On Tue, Jan 28, 2020 at 3:57 PM Yifan Zou  wrote:
>
>> Congratulations Hannah!!
>>
>> On Tue, Jan 28, 2020 at 3:55 PM Boyuan Zhang 
>> wrote:
>>
>>> Thanks for all your contributions! Congratulations~
>>>
>>> On Tue, Jan 28, 2020 at 3:44 PM Pablo Estrada 
>>> wrote:
>>>
 yoooho : D

 On Tue, Jan 28, 2020 at 3:21 PM Luke Cwik  wrote:

> Hi everyone,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Hannah Jiang
>
> Hannah has contributed to Beam in many ways, including work on
> building and releasing the Apache Beam SDK containers.
>
> In consideration of their contributions, the Beam PMC trusts them
> with the responsibilities of a Beam committer[1].
>
> Thanks for your contributions Hannah!
>
> Luke, on behalf of the Apache Beam PMC.
>
> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>





Re: Jenkins jobs not running for my PR 10438

2020-01-31 Thread Udi Meiri
done

On Fri, Jan 31, 2020 at 9:07 AM Tomo Suzuki  wrote:

> Hi Beam committers,
>
> Would you re-trigger the 2 failed checks in
> https://github.com/apache/beam/pull/10714 ?
> Run Java PreCommit
> Run Java_Examples_Dataflow PreCommit
>
>
> On Fri, Jan 31, 2020 at 7:51 AM Rehman Murad Ali <
> rehman.murad...@venturedive.com> wrote:
>
>> Hi,
>>
>> I'd appreciate it if someone could trigger the jobs for this PR:
>>
>> https://github.com/apache/beam/pull/10627
>>
>> *Thanks*
>>
>> *Rehman Murad Ali*
>> Software Engineer
>> Mobile: +92 3452076766 <+92%20345%202076766>
>> Skype: rehman.muradali
>>
>>
>> On Thu, Jan 30, 2020 at 10:19 PM Luke Cwik  wrote:
>>
>>> done
>>>
>>> On Thu, Jan 30, 2020 at 12:28 AM Shoaib Zafar <
>>> shoaib.za...@venturedive.com> wrote:
>>>
 Hi Beam Committer,

 I'd appreciate it if someone could trigger jobs for
 https://github.com/apache/beam/pull/10712.

 Thanks!

 *Shoaib Zafar*

 Software Engineering Lead
 Mobile: +92 333 274 6242
 Skype: live:shoaibzafar_1

 


 On Thu, Jan 30, 2020 at 9:09 AM Boyuan Zhang 
 wrote:

> Done : )
>
> On Wed, Jan 29, 2020 at 7:52 PM Tomo Suzuki 
> wrote:
>
>> Hi Beam committers:
>> (Thanks, Luke!)
>>
>> Can somebody retrigger the following 2 failed checks for
>> https://github.com/apache/beam/pull/10714 ?
>> Run Java PreCommit
>> Run Java_Examples_Dataflow PreCommit
>>
>> On Wed, Jan 29, 2020 at 4:48 PM Luke Cwik  wrote:
>> >
>> > done
>> >
>> > On Wed, Jan 29, 2020 at 11:07 AM Tomo Suzuki 
>> wrote:
>> >>
>> >> Hi Beam committers,
>> >>
>> >> I'd appreciate it if you could trigger the precommit checks for
>> >> https://github.com/apache/beam/pull/10714
>> >>
>> >> with following 6 additional commands (one command per comment):
>> >>
>> >> Run Java PostCommit
>> >> Run Java HadoopFormatIO Performance Test
>> >> Run BigQueryIO Streaming Performance Test Java
>> >> Run Dataflow ValidatesRunner
>> >> Run Spark ValidatesRunner
>> >> Run SQL Postcommit
>> >>
>> >> Regards,
>> >> Tomo
>> >>
>> >>
>>
>>
>> --
>> Regards,
>> Tomo
>>
>
>
> --
> Regards,
> Tomo
>




Re: [DISCUSSION] Improve release notes by adding a change list file

2020-02-03 Thread Udi Meiri
+1 to add this to the checklist

On Mon, Feb 3, 2020 at 4:57 PM Robert Bradshaw  wrote:

> On Mon, Feb 3, 2020 at 4:49 PM Ahmet Altay  wrote:
> >
> > On Mon, Feb 3, 2020 at 2:09 PM Robert Bradshaw 
> wrote:
> >>
> >> I would suggest we start with the simpler single file. If merge
> >> conflicts become an issue, we could look at other options, but I think
> >> it's worth keeping in mind that what we're trying to produce here is a
> >> single, higher-level, cohesive summary of the release rather than a
> >> 1:1 listing of commits, pull request, or jira entries (which we can
> >> link to). While new features often merit their own bullet points, this
> >> will allow for entries such as "Several improvements to portability
> >> including ..."
> >
> > I agree. If there are no objections I will go ahead with the PR I
> proposed. It adds a single change log file to begin with.
> >
> > We would need all committers to help after that by asking PR authors to
> update this file whenever it makes sense.
>
> Yes. Should we add it to the PR template checklist?
>
> >> On Mon, Feb 3, 2020 at 1:55 PM Ahmet Altay  wrote:
> >> >
> >> >
> >> >
> >> > On Sat, Feb 1, 2020 at 9:22 AM Chad Dombrova 
> wrote:
> >> >>
> >> >> In case it's of any use, there's a tool called towncrier[1] to help
> compile changelog fragments and compile them at time of delivery.
> >> >
> >> >
> >> > I would prefer not to have the complexity of multiple files and an
> added tool to the release process. I do not have a strong opinion though.
> If others prefer we can switch to this tool. One nice benefit of this tool
> would be to avoid merge conflicts if many different PRs edit the change log
> file all at the same time in a conflicting way.
> >> >
> >> >>
> >> >>
> >> >> I came across this when working on the python-attrs[2] project,
> which has some good documentation for contributors on how to use it:
> https://www.attrs.org/en/stable/contributing.html#changelog
> >> >>
> >> >>
> >> >> [1] https://github.com/hawkowl/towncrier
> >> >> [2] https://github.com/python-attrs/attrs
> >> >>
> >> >>
> >> >> On Fri, Jan 31, 2020 at 5:09 PM Ahmet Altay 
> wrote:
> >> >>>
> >> >>> Thank you for the quick responses. I sent out
> https://github.com/apache/beam/pull/10743 to make this change. Please
> provide feedback or directly edit the PR.
> >> >>>
> >> >>>
> >> >>> On Fri, Jan 31, 2020 at 3:58 PM Robert Bradshaw <
> rober...@google.com> wrote:
> >> 
> >>  Yes, yes, yes! This is the one model of release notes that I've
> >>  actually seen work well at scale.
> >> 
> >> 
> https://lists.apache.org/thread.html/41e03ace17dbcccf7e267ba6d538736b2a99a8e73e7fb45702766b17%40%3Cdev.beam.apache.org%3E
> >> 
> >>  Let's make it happen.
> >> 
> >>  On Fri, Jan 31, 2020 at 3:47 PM Robert Burke 
> wrote:
> >>  >
> >>  > I like this suggestion, Jira titles and commit summaries don't
> necessarily reflect the user impact for a given change (or set of changes).
> Being able to see the Forest instead of the trees.
> >>  >
> >>  > On Fri, Jan 31, 2020, 3:37 PM Kenneth Knowles 
> wrote:
> >>  >>
> >>  >> +1
> >>  >>
> >>  >> This is a great idea. Hope it can lead to higher-value view of
> relevant changes.
> >>  >>
> >>  >> I like it being in the root of the repo, so it lives next to
> the code.
> >>  >>
> >>  >> Since the website is also markdown, it could be copied over
> directly at release time, so it can be browsed there, too.
> >>  >>
> >>  >> Kenn
> >>  >>
> >>  >> On Fri, Jan 31, 2020 at 3:16 PM Ahmet Altay 
> wrote:
> >>  >>>
> >>  >>> Hi all,
> >>  >>>
> >>  >>> We currently have two major ways to communicate changes in a
> release:
> >>  >>> - A blog post, to highlight major changes in the release.
> (Example for 2.17: [1])
> >>  >>> - JIRA release notes pages listing all issues tagged for a
> specific release. (Example for 2.17 [2]).
> >>  >>>
> >>  >>> There are a few issues with this process:
> >>  >>> - It is difficult for the release manager to know what is
> important, what is a breaking change, what is a dependency change, etc. For
> example, there were more than 150 Jira issues tagged for 2.17 release.
> >>  >>> - Release blog has many items, and does not necessarily
> communicate important changes. It is difficult for users to discover major
> changes short of going through a large list.
> >>  >>> - People involved in authoring or reviewing a PR usually have
> the most context about the change, and they are not necessarily involved in
> the release process to provide this additional information.
> >>  >>>
> >>  >>> Would it be helpful if we maintain a simple change list file
> and update it as part of the PRs with noteworthy changes? Release managers
> could use this information as is in their blog posts (or link to it). Users
> will have a single place to find highlights from various versions.
> >>  >>>
> >> >

Re: [RELEASE VOTE RESULT] Release 2.19.0, release candidate #1

2020-02-03 Thread Udi Meiri
Thank you Boyuan!

On Mon, Feb 3, 2020 at 3:40 PM Ahmet Altay  wrote:

> On Mon, Feb 3, 2020 at 1:22 PM Thomas Weise  wrote:
>
>> Impressive, probably the fastest/smoothest Beam release so far.
>>
>
> I agree! Thank you, Boyuan!
>
>
>>
>> On Mon, Feb 3, 2020 at 10:45 AM Boyuan Zhang  wrote:
>>
>>> I'm happy to announce that we have unanimously approved this release.
>>>
>>> There are 5 approving votes, 4 of which are binding:
>>> * Ahmet Altay
>>> * Ismaël Mejía
>>> * Jean-Baptiste Onofré
>>> * Robert Bradshaw
>>>
>>> There are no disapproving votes.
>>>
>>> Thanks for everyone's help! I'm going to finalize the release and send
>>> out the official release announcement later.
>>>
>>




Re: [DISCUSS] Autoformat python code with Black

2020-02-07 Thread Udi Meiri
>> >>> On Thu, Feb 6, 2020 at 1:45 PM Kamil Wasilewski <
>> kamil.wasilew...@polidea.com> wrote:
>> >>>>
>> >>>> Thanks to everyone involved in the discussion.
>> >>>>
>> >>>> I've taken a look at the first 50 recently updated Pull Requests.
>> Only a few of them were affected. I hope it won't be too hard to fix them.
>> >>>>
>> >>>> In any case, here you can find instructions on how to run formatter:
>> https://cwiki.apache.org/confluence/display/BEAM/Python+Tips (section
>> "Formatting").
>> >>>>
>> >>>> On Thu, Feb 6, 2020 at 12:42 PM Michał Walenia <
>> michal.wale...@polidea.com> wrote:
>> >>>>>
>> >>>>> Hi,
>> >>>>> the PR is merged, all checks were green :)
>> >>>>> Enjoy prettier Python!
>> >>>>>
>> >>>>> On Thu, Feb 6, 2020 at 11:11 AM Ismaël Mejía 
>> wrote:
>> >>>>>>
>> >>>>>> Agree no need for vote for this because the consensus is clear and
>> the sole
>> >>>>>> impact I can think of are pending PRs that will be broken. In the
>> Java case
>> >>>>>> what we did was to just notify every PR that was affected by the
>> change.
>> >>>>>> And clearly document how to validate and autoformat the code.
>> >>>>>>
>> >>>>>> So the earlier the better, go go autoformat!
>> >>>>>>
>> >>>>>> On Thu, Feb 6, 2020 at 1:38 AM Robert Bradshaw <
>> rober...@google.com> wrote:
>> >>>>>>>
>> >>>>>>> No, perhaps not. I agree there's consensus, just wondering what
>> the
>> >>>>>>> next steps should be to get this in. (The presubmits look like
>> they're
>> >>>>>>> all passing, with the exception of some breakage in java that
>> should
>> >>>>>>> be completely unrelated. Of course there's already merge
>> conflicts...)
>> >>>>>>>
>> >>>>>>> On Wed, Feb 5, 2020 at 3:55 PM Ahmet Altay 
>> wrote:
>> >>>>>>> >
>> >>>>>>> > Do we need a formal vote? There is consensus on this thread and
>> on the PR.
>> >>>>>>> >
>> >>>>>>> > On Wed, Feb 5, 2020 at 3:37 PM Robert Bradshaw <
>> rober...@google.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> The PR is looking good. Should we call a vote?
>> >>>>>>> >>
>> >>>>>>> >> On Mon, Jan 27, 2020 at 11:03 AM Robert Bradshaw <
>> rober...@google.com> wrote:
>> >>>>>>> >> >
>> >>>>>>> >> > Thanks. I commented on the PR. I think if we're going this
>> route we
>> >>>>>>> >> > should add a pre-commit, plus instructions on how to run the
>> tool
>> >>>>>>> >> > (similar to spotless).
>> >>>>>>> >> >
>> >>>>>>> >> > On Mon, Jan 27, 2020 at 10:00 AM Udi Meiri 
>> wrote:
>> >>>>>>> >> > >
>> >>>>>>> >> > > I've done a pass on the PR on code I'm familiar with.
>> >>>>>>> >> > > Please make a pass and add your suggestions on the PR.
>> >>>>>>> >> > >
>> >>>>>>> >> > > On Fri, Jan 24, 2020 at 7:15 AM Ismaël Mejía <
>> ieme...@gmail.com> wrote:
>> >>>>>>> >> > >>
>> >>>>>>> >> > >> Java build fails on any unformatted code so python
>> probably should be like that.
>> >>>>>>> >> > >> We have to ensure however that it fails early on that.
>> >>>>>>> >> > >> As Robert said time to debate the knobs :)
>> >>>>>>> >> > >>
>> >>>>>>> >> > >> On Fri, Jan 24, 2020 at 3:19 PM Kamil Wasilewski <
>> kamil.wasilew...@polidea.com> wrote:
>> >>>>>>> >> >

Re: Labels on PR

2020-02-10 Thread Udi Meiri
Cool!

On Mon, Feb 10, 2020 at 9:27 AM Robert Burke  wrote:

> +1 to autolabeling
>
> On Mon, Feb 10, 2020, 9:21 AM Luke Cwik  wrote:
>
>> Nice
>>
>> On Mon, Feb 10, 2020 at 2:52 AM Alex Van Boxel  wrote:
>>
>>> Ha, cool. I'll have a look at the autolabeler. The infra stuff is not
>>> something I've looked at... I'll dive into that.
>>>
>>>  _/
>>> _/ Alex Van Boxel
>>>
>>>
>>> On Mon, Feb 10, 2020 at 11:49 AM Ismaël Mejía  wrote:
>>>
>>>> +1
>>>>
>>>> You don't need to write your own action, there is already one
>>>> autolabeler action [1].
>>>> INFRA can easily configure it for Beam (as they did for Avro [2]) if
>>>> we request it.
>>>> The plugin is quite easy to configure and works like a charm [3].
>>>>
>>>> [1] https://github.com/probot/autolabeler
>>>> [2] https://issues.apache.org/jira/browse/INFRA-17367
>>>> [3] https://github.com/apache/avro/blob/master/.github/autolabeler.yml
>>>>
>>>>
>>>> On Mon, Feb 10, 2020 at 11:20 AM Alexey Romanenko <
>>>> aromanenko@gmail.com> wrote:
>>>>
>>>>> Great initiative, thanks Alex! I was thinking of adding such labels to
>>>>> the PR title, but I believe that GitHub labels are better since they can
>>>>> be used easily for filtering, for example.
>>>>>
>>>>> Maybe it could be useful to add more granularity for labels, like
>>>>> “release”, “runners”, “website”, etc., but I’m afraid to make the titles
>>>>> too heavy because of this.
>>>>>
>>>>> > On 10 Feb 2020, at 08:35, Alex Van Boxel  wrote:
>>>>> >
>>>>> > I've started putting labels on PRs. I've done the first page for
>>>>> > now (as I'm afraid putting them on older ones could affect the stale
>>>>> > bot). I hope this is ok.
>>>>> >
>>>>> > For now I'm only focussing on language and I'm going to see if I can
>>>>> > write a GitHub action for it. I hope this is useful. Other kinds of
>>>>> > suggestions for labels that can be automated are welcome.
>>>>> >
>>>>> >
>>>>> >  _/
>>>>> > _/ Alex Van Boxel
>>>>>
>>>>>




Re: Sphinx Docs Command Error (:sdks:python:test-suites:tox:pycommon:docs)

2020-02-10 Thread Udi Meiri
I don't have those issues (running on Linux), but a possible workaround
could be to remove the "-j 8" flags (2 locations) in generate_pydoc.sh.
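
A minimal sketch of that workaround, assuming the flag is written literally
as "-j 8" on the sphinx-build command lines. strip_parallel_flag is a
hypothetical helper for illustration, not part of Beam, and the sample command
line is made up rather than copied from generate_pydoc.sh:

```python
import re


def strip_parallel_flag(script_text: str) -> str:
    """Remove every literal "-j <N>" option from a shell script's text.

    Assumption: the option is written as "-j", whitespace, then a number
    (as in "-j 8"). Other spellings such as "-j8" are left untouched.
    """
    return re.sub(r"\s-j\s+\d+\b", "", script_text)


# Illustrative input only; the real command lines in the script may differ.
example = "sphinx-build -j 8 -b html source out\n"
print(strip_parallel_flag(example))
```

Without "-j", sphinx-build reads and writes documents in a single process, so
the fork() that triggers the Objective-C crash on macOS should not happen.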


On Mon, Feb 10, 2020 at 11:06 AM Shoaib Zafar 
wrote:

> Hello Beamers.
>
> Just curious: is anyone having trouble running the
> ':sdks:python:test-suites:tox:pycommon:docs' command locally?
>
> After rebasing onto master recently, I am facing a Sphinx thread fork error
> on macOS Mojave, using Python 3.7.0.
> I tried to set the environment variable "export
> OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES" (which I found on Google) but no
> luck!
>
> Any suggestions/help?
>
> Thanks!
>
> Console Log:
> --
> 
> Creating file target/docs/source/apache_beam.utils.proto_utils.rst.
> Creating file target/docs/source/apache_beam.utils.retry.rst.
> Creating file target/docs/source/apache_beam.utils.subprocess_server.rst.
> Creating file
> target/docs/source/apache_beam.utils.thread_pool_executor.rst.
> Creating file target/docs/source/apache_beam.utils.timestamp.rst.
> Creating file target/docs/source/apache_beam.utils.urns.rst.
> Creating file target/docs/source/apache_beam.utils.rst.
> objc[8384]: +[__NSCFConstantString initialize] may have been in progress
> in another thread when fork() was called.
> objc[8384]: +[__NSCFConstantString initialize] may have been in progress
> in another thread when fork() was called. We cannot safely call it or
> ignore it in the fork() child process. Crashing instead. Set a breakpoint
> on objc_initializeAfterForkError to debug.
>
> Traceback (most recent call last):
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/cmd/build.py",
> line 304, in build_main
> app.build(args.force_all, filenames)
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/application.py",
> line 335, in build
> self.builder.build_all()
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
> line 305, in build_all
> self.build(None, summary=__('all source files'), method='all')
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
> line 360, in build
> updated_docnames = set(self.read())
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
> line 466, in read
> self._read_parallel(docnames, nproc=self.app.parallel)
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
> line 521, in _read_parallel
> tasks.join()
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/util/parallel.py",
> line 114, in join
> self._join_one()
>   File
> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/util/parallel.py",
> line 120, in _join_one
> exc, logs, result = pipe.recv()
>   File
> "/Users/shoaib/.pyenv/versions/3.7.0/lib/python3.7/multiprocessing/connection.py",
> line 250, in recv
> buf = self._recv_bytes()
>   File
> "/Users/shoaib/.pyenv/versions/3.7.0/lib/python3.7/multiprocessing/connection.py",
> line 407, in _recv_bytes
> buf = self._recv(4)
>   File
> "/Users/shoaib/.pyenv/versions/3.7.0/lib/python3.7/multiprocessing/connection.py",
> line 383, in _recv
> raise EOFError
> EOFError
>
> Exception occurred:
>   File
> "/Users/shoaib/.pyenv/versions/3.7.0/lib/python3.7/multiprocessing/connection.py",
> line 383, in _recv
> raise EOFError
> EOFError
> The full traceback has been saved in
> /Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/tmp/sphinx-err-mphtfnei.log,
> if you want to report the issue to the developers.
> Please also report this if it was a user error, so that a better error
> message can be provided next time.
> A bug report can be filed in the tracker at <
> https://github.com/sphinx-doc/sphinx/issues>. Thanks!
> objc[8385]: +[__NSCFConstantString initialize] may have been in progress
> in another thread when fork() was called.
> objc[8385]: +[__NSCFConstantString initialize] may have been in progress
> in another thread when fork() was called. W

Re: Sphinx Docs Command Error (:sdks:python:test-suites:tox:pycommon:docs)

2020-02-11 Thread Udi Meiri
For me the difference was about 20s longer (40s -> 60s approx). Not
significant IMO

On Tue, Feb 11, 2020 at 9:59 AM Ahmet Altay  wrote:

> Should we remove the "-j 8" option by default? The Sphinx docs say this is an
> experimental option [1]. I do not recall docs generation taking a long
> time; does it take significantly longer without this option?
>
> [1] http://www.sphinx-doc.org/en/stable/man/sphinx-build.html
>
> On Tue, Feb 11, 2020 at 1:16 AM Shoaib Zafar 
> wrote:
>
>> Thanks, Udi and Jincheng for the response.
>> The suggested solution worked for me as well.
>>
>> Regards,
>>
>> *Shoaib Zafar*
>> Software Engineering Lead
>> Mobile: +92 333 274 6242
>> Skype: live:shoaibzafar_1
>>
>> <http://venturedive.com/>
>>
>>
>> On Tue, Feb 11, 2020 at 1:17 PM jincheng sun 
>> wrote:
>>
>>> I have verified that this issue could be reproduced in my local
>>> environment (MacOS) and the solution suggested by Udi could work!
>>>
>>> Best,
>>> Jincheng
>>>
>>> Udi Meiri  于2020年2月11日周二 上午8:51写道:
>>>
>>>> I don't have those issues (running on Linux), but a possible workaround
>>>> could be to remove the "-j 8" flags (2 locations) in generate_pydoc.sh.
>>>>
>>>>
>>>> On Mon, Feb 10, 2020 at 11:06 AM Shoaib Zafar <
>>>> shoaib.za...@venturedive.com> wrote:
>>>>
>>>>> Hello Beamers.
>>>>>
>>>>> Just curious: is anyone having trouble running the
>>>>> ':sdks:python:test-suites:tox:pycommon:docs' command locally?
>>>>>
>>>>> After rebasing onto master recently, I am facing a Sphinx thread fork
>>>>> error on macOS Mojave, using Python 3.7.0.
>>>>> I tried to set the environment variable "export
>>>>> OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES" (which I found on Google)
>>>>> but no luck!
>>>>>
>>>>> Any suggestions/help?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Console Log:
>>>>> --
>>>>> 
>>>>> Creating file target/docs/source/apache_beam.utils.proto_utils.rst.
>>>>> Creating file target/docs/source/apache_beam.utils.retry.rst.
>>>>> Creating file
>>>>> target/docs/source/apache_beam.utils.subprocess_server.rst.
>>>>> Creating file
>>>>> target/docs/source/apache_beam.utils.thread_pool_executor.rst.
>>>>> Creating file target/docs/source/apache_beam.utils.timestamp.rst.
>>>>> Creating file target/docs/source/apache_beam.utils.urns.rst.
>>>>> Creating file target/docs/source/apache_beam.utils.rst.
>>>>> objc[8384]: +[__NSCFConstantString initialize] may have been in
>>>>> progress in another thread when fork() was called.
>>>>> objc[8384]: +[__NSCFConstantString initialize] may have been in
>>>>> progress in another thread when fork() was called. We cannot safely call 
>>>>> it
>>>>> or ignore it in the fork() child process. Crashing instead. Set a
>>>>> breakpoint on objc_initializeAfterForkError to debug.
>>>>>
>>>>> Traceback (most recent call last):
>>>>>   File
>>>>> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/cmd/build.py",
>>>>> line 304, in build_main
>>>>> app.build(args.force_all, filenames)
>>>>>   File
>>>>> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/application.py",
>>>>> line 335, in build
>>>>> self.builder.build_all()
>>>>>   File
>>>>> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-packages/sphinx/builders/__init__.py",
>>>>> line 305, in build_all
>>>>> self.build(None, summary=__('all source files'), method='all')
>>>>>   File
>>>>> "/Users/shoaib/Projects/beam/newbeam/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py37-docs/py37-docs/lib/python3.7/site-package

Re: jira search in chrome omnibox

2020-02-14 Thread Udi Meiri
The JIRA tips page already has the instructions for Chrome, Ismaël. Feel
free to add the same for Firefox.
https://cwiki.apache.org/confluence/display/BEAM/Jira+Tips
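
For anyone wiring this up outside the browser, here is a rough sketch of what
the "%s" placeholder in the quoted search URLs stands for: the typed query,
URL-encoded, is substituted into the template. expand_search_url is a
hypothetical helper, and the exact encoding a browser applies may differ
slightly from urllib's:

```python
from urllib.parse import quote

# Search URL template from the quoted thread; "%s" marks where the query goes.
JIRA_QUICKSEARCH = (
    "https://issues.apache.org/jira/secure/QuickSearch.jspa?searchString=%s"
)


def expand_search_url(template: str, query: str) -> str:
    """Substitute a URL-encoded query for the "%s" placeholder."""
    return template.replace("%s", quote(query, safe=""))


print(expand_search_url(JIRA_QUICKSEARCH, "beam open pubsub"))
```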

On Fri, Feb 14, 2020 at 8:20 AM Ismaël Mejía  wrote:

> For Firefox users:
>
> You can replicate the same behavior but it requires a bit more work:
>
> Firefox uses a format called OpenSearch so you have to generate an
> installable XML via this page.
> https://ready.to/search/en/#
>
> the search name: Beam Issues
> the front search term: https://issues.apache.org/jira/browse/BEAM-
> Then click on Make search plugin to generate a URL:
>
> https://ready.to/search/en/?sna=Beam%20issues&prf=https%3A%2F%2Fissues.apache.org%2Fjira%2Fbrowse%2FBEAM-&in=utf&ou=ono&mod=pn#
> then you open the URL and click on `Opensearch plug-in Beam Issues` to
> install it
>
> Now you configure the keyword on Firefox preferences
>
> Preferences > Search on the left panel.
> Scroll down to One-Click Search Engines.
> Double-click on the Keyword column for the search engine you want to
> assign a shortcut to.
> Enter @ followed by your search shortcut keyword. For example: @beam
>
> Another useful URL (to search open PRs on GitHub):
>
> http://ready.to/search/en/?sna=Beam%20PRs&prf=https%3A%2F%2Fgithub.com%2Fapache%2Fbeam%2Fpulls%3Futf8%3D%E2%9C%93%26amp%3Bq%3Dis%3Apr%2B&in=utf&ou=ono&mod=pn
>
> Enjoy!
>
> P.S. This is so useful that it is probably worth putting in cwiki. Create the
> page, Udi, and I will add the Firefox section if you agree.
>
>
>
>
> On Fri, Aug 31, 2018 at 2:31 AM Udi Meiri  wrote:
>
>> Correction: this is the correct URL:
>> https://issues.apache.org/jira/secure/QuickSearch.jspa?searchString=%s
>>
>> It uses smart querying. Ex: Searching for "beam open pubsub" will search
>> for open bugs in project BEAM with the keyword "pubsub".
>>
>> On Tue, Aug 28, 2018 at 4:49 PM Valentyn Tymofieiev 
>> wrote:
>>
>>> Thanks for sharing.
>>>
>>> I have also found the following custom search query for PRs useful:
>>> https://github.com/apache/beam/pulls?q=is%3Apr%20%s
>>>
>>> Sample usage: type 'pr', space, type: 'author:tvalentyn'.
>>>
>>> You could also incorporate 'author:' into the query:
>>> https://github.com/apache/beam/pulls?q=is%3Apr%20author%3A
>>>
>>> On Tue, Aug 28, 2018 at 4:26 PM Daniel Oliveira 
>>> wrote:
>>>
>>>> This seems pretty useful. Thanks Udi!
>>>>
>>>> On Mon, Aug 27, 2018 at 3:54 PM Udi Meiri  wrote:
>>>>
>>>>> In case you want to quickly look up JIRA tickets, e.g., typing 'j',
>>>>> space, 'BEAM-4696'.
>>>>> Search URL:
>>>>> https://issues.apache.org/jira/QuickSearch.jspa?searchString=%s
>>>>>
>>>>>




Re: [ANNOUNCE] New committer: Chad Dombrova

2020-02-24 Thread Udi Meiri
Congrats and welcome, Chad!

On Mon, Feb 24, 2020 at 1:21 PM Pablo Estrada  wrote:

> Hi everyone,
>
> Please join me and the rest of the Beam PMC in welcoming a new committer:
> Chad Dombrova
>
> Chad has contributed to the project in multiple ways, including
> improvements to the testing infrastructure, and adding type annotations
> throughout the Python SDK, as well as working closely with the community on
> these improvements.
>
> In consideration of his contributions, the Beam PMC trusts him with the
> responsibilities of a Beam Committer[1].
>
> Thanks Chad for your contributions!
>
> -Pablo, on behalf of the Apache Beam PMC.
>
> [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>
>




Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-02-26 Thread Udi Meiri
I agree with having low-frequency tests for low-priority versions.
Low-priority versions could be determined according to least usage.
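
As a rough sketch of how such tiers could be assigned, the snippet below
follows the lowest/highest rule from the quoted reply (full testing for the
oldest and newest supported 3.x versions, lighter testing for the ones in
between). assign_test_tiers and the tier names are hypothetical; a usage-based
variant would rank by download share instead, which is not shown because
per-version usage figures are not given in this thread:

```python
from typing import Dict, List, Tuple


def _version_key(version: str) -> Tuple[int, ...]:
    """Sort "3.10" after "3.9" by comparing numeric components."""
    return tuple(int(part) for part in version.split("."))


def assign_test_tiers(versions: List[str]) -> Dict[str, str]:
    """Give the lowest and highest versions full testing, the rest less."""
    ordered = sorted(versions, key=_version_key)
    tiers: Dict[str, str] = {}
    for version in ordered:
        if version in (ordered[0], ordered[-1]):
            tiers[version] = "full"
        else:
            tiers[version] = "smoke+infrequent"
    return tiers


print(assign_test_tiers(["3.6", "3.5", "3.7"]))
```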



On Wed, Feb 26, 2020 at 4:06 PM Robert Bradshaw  wrote:

> On Wed, Feb 26, 2020 at 3:29 PM Kenneth Knowles  wrote:
> >
> > Are these divergent enough that they all need to consume testing
> resources? For example can lower priority versions be daily runs or some
> such?
>
> For the 3.x series, I think we will get the most signal out of the
> lowest and highest version, and can get by with smoke tests +
> infrequent post-commits for the ones between.
>
> > Kenn
> >
> > On Wed, Feb 26, 2020 at 3:25 PM Robert Bradshaw 
> wrote:
> >>
> >> +1 to consulting users. Currently 3.5 downloads sit at 3.7%, or about
> >> 20% of all Python 3 downloads.
> >>
> >> I would propose getting in warnings about 3.5 EoL well ahead of time,
> >> at the very least as part of the 2.7 warning.
> >>
> >> Fortunately, supporting multiple 3.x versions is significantly easier
> >> than spanning 2.7 and 3.x. I would rather not impose an ordering on
> >> dropping 3.5 and adding 3.8 but consider their merits independently.
> >>
> >>
> >> On Wed, Feb 26, 2020 at 3:16 PM Kyle Weaver 
> wrote:
> >> >
> >> > 5 versions is too many IMO. We've had issues with Python precommit
> resource usage in the past, and adding another version would surely
> exacerbate those issues. And we have also already had to leave out certain
> features on 3.5 [1]. Therefore, I am in favor of dropping 3.5 before adding
> 3.8. After dropping Python 2 and adding 3.8, that will leave us with the
> latest three minor versions (3.6, 3.7, 3.8), which I think is closer to the
> "sweet spot." Though I would be interested in hearing if there are any
> users who would prefer we continue supporting 3.5.
> >> >
> >> > [1]
> https://github.com/apache/beam/blob/8658b95545352e51f35959f38334f3c7df8b48eb/sdks/python/apache_beam/runners/portability/flink_runner.py#L55
> >> >
> >> > On Wed, Feb 26, 2020 at 3:00 PM Valentyn Tymofieiev <
> valen...@google.com> wrote:
> >> >>
> >> >> I would like to start a discussion about identifying a guideline for
> answering questions like:
> >> >>
> >> >> 1. When will Beam support a new Python version (say, Python 3.8)?
> >> >> 2. When will Beam drop support for an old Python version (say,
> Python 3.5)?
> >> >> 3. How many Python versions should we aim to support concurrently
> (investigate issues, have continuous integration tests)?
> >> >> 4. What comes first: adding support for a new version (3.8) or
> deprecating older one (3.5)? This may affect the max load our test
> infrastructure needs to sustain.
> >> >>
> >> >> We are already getting requests for supporting Python 3.8 and there
> were some good reasons[1] to drop support for Python 3.5 (at least, early
> versions of 3.5). Answering these questions would help set expectations in
> Beam user community, Beam dev community, and may help us establish
> resource requirements for test infrastructure and plan efforts.
> >> >>
> >> >> PEP-0602 [2] establishes a yearly release cycle for Python versions
> starting from 3.9. Each release is a long-term support release and is
> supported for 5 years: first 1.5 years allow for general bug fix support,
> remaining 3.5 years have security fix support.
> >> >>
> >> >> At every point, there may be up to 5 Python minor versions that did
> not yet reach EOL, see "Release overlap with 12 month diagram" [3]. We can
> try to support all of them, but that may come at a cost of velocity: we
> will have more tests to maintain, and we will have to develop Beam against
> a lower version for a longer period. Supporting fewer versions will have
> implications for user experience. It also may be difficult to ensure
> support of the most recent version early, since our dependencies (e.g.
> picklers) may not be supporting them yet.
> >> >>
> >> >> Currently we support 4 Python versions (2.7, 3.5, 3.6, 3.7).
> >> >>
> >> >> Is 4 versions a sweet spot? Too much? Too little? What do you think?
> >> >>
> >> >> [1] https://github.com/apache/beam/pull/10821#issuecomment-590167711
> >> >> [2] https://www.python.org/dev/peps/pep-0602/
> >> >> [3] https://www.python.org/dev/peps/pep-0602/#id17
>




Re: [ANNOUNCE] New Committer: Kamil Wasilewski

2020-02-28 Thread Udi Meiri
Welcome Kamil!

On Fri, Feb 28, 2020 at 12:53 PM Mark Liu  wrote:

> Congrats, Kamil!
>
> On Fri, Feb 28, 2020 at 12:23 PM Ismaël Mejía  wrote:
>
>> Congratulations Kamil!
>>
>> On Fri, Feb 28, 2020 at 7:09 PM Yichi Zhang  wrote:
>>
>>> Congrats, Kamil!
>>>
>>> On Fri, Feb 28, 2020 at 9:53 AM Valentyn Tymofieiev 
>>> wrote:
>>>
>>>> Congratulations, Kamil!
>>>>
>>>> On Fri, Feb 28, 2020 at 9:34 AM Pablo Estrada 
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>> committer: Kamil Wasilewski
>>>>>
>>>>> Kamil has contributed to Beam in many ways, including the performance
>>>>> testing infrastructure, and a custom BQ source, along with other
>>>>> contributions.
>>>>>
>>>>> In consideration of his contributions, the Beam PMC trusts him with
>>>>> the responsibilities of a Beam committer[1].
>>>>>
>>>>> Thanks for your contributions Kamil!
>>>>>
>>>>> Pablo, on behalf of the Apache Beam PMC.
>>>>>
>>>>> [1]
>>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>
>>>>>




Re: Python Static Typing: Next Steps

2020-03-02 Thread Udi Meiri
Let's go forward with this and see. I volunteer to help as well.

I believe that mypy via pre-commit hook will be faster than 10s since it
only applies to modified files.

On Mon, Mar 2, 2020 at 10:53 AM Robert Bradshaw  wrote:

> +1
>
> We should enable this on jenkins, plus trivial instructions (ideally a
> one-liner tox command) to run it locally. Hopefully the errors will be
> easy enough for contributors to figure out (in particular local to and
> commensurate in complexity with the code that they're editing), and I
> agree it's the only way to keep them accurate (which is a net positive
> for tooling and developers).
>
> Running it as part of a pre-commit hook could be discussed once we
> have a bit more experience (but 10s is certainly on the long side).
>
> On Mon, Mar 2, 2020 at 10:01 AM Luke Cwik  wrote:
> >
> > +1
> >
> > The typing information has really helped me several times in figuring out
> API contracts and expected types.
> >
> > On Mon, Mar 2, 2020 at 9:54 AM Pablo Estrada  wrote:
> >>
> >> I am in favor of enabling the test, and also am happy to start
> answering questions too.
> >> Thanks so much Chad for leading this.
> >> Best
> >> -P.
> >>
> >> On Mon, Mar 2, 2020 at 9:44 AM Chad Dombrova  wrote:
> >>>
> >>> Good news everyone!
> >>> We nearly have the full beam codebase passing in mypy.
> >>>
> >>> As we are now approaching the zero-error event horizon, I'd like to
> open up a discussion around enabling mypy in the PythonLint job.  Every day
> or so a PR is merged that introduces some new mypy errors, so enabling this
> test is the only way I see to keep the annotations accurate and thus useful.
> >>>
> >>> Developer fatigue is a real concern here, since static typing has a
> steep learning curve, and there are still not a lot of experts to help
> consult on PRs.  Here are some things that I hope will mitigate those
> concerns:
> >>>
> >>> - We have a lot of typing coverage, so that means plenty of examples of
> how to solve different types of problems.
> >>> - Running mypy only takes 10 seconds to complete (if you execute it
> outside of gradle / tox), and that will get better when we get to 0
> errors. Also, running mypy in daemon mode should speed that up even more.
> >>> - I have a PR[1] to allow developers to easily (and optionally) set up
> yapf to run in a local git pre-commit hook; I'd like to do the same for
> mypy.
> >>> - I will make myself and members of my team available to help out with
> typing questions in PRs.
> >>>
> >>> Is there anyone else on the list who is knowledgeable about Python
> static typing who would like to volunteer to be flagged on typing questions?
> >>>
> >>> What else can we do to make this transition easier?
> >>>
> >>> [1] https://github.com/apache/beam/pull/10810
> >>>
> >>> -chad
> >>>
>




Re: Python Static Typing: Next Steps

2020-03-02 Thread Udi Meiri
Off-topic: Python lint via pre-commit should be much faster. (I wrote my
own modified-file-only lint in the past)
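
A minimal sketch of the modified-file-only idea, assuming the hook compares
against a git ref and only cares about files ending in ".py". Both helper
names are hypothetical, and the actual linter invocation is left out on
purpose:

```python
import subprocess
from typing import Iterable, List


def python_files(paths: Iterable[str]) -> List[str]:
    """Keep only the paths that look like Python source files."""
    return [path for path in paths if path.endswith(".py")]


def changed_files(base: str = "HEAD") -> List[str]:
    """List files added, copied, modified, or renamed relative to `base`.

    Must be run inside a git checkout; deleted files are filtered out so
    the linter is never handed a path that no longer exists.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=ACMR", base],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


# Typical use inside a hook (not executed here):
#   targets = python_files(changed_files())
#   if targets: run the linter on `targets` only.
```

This suits per-file linters; as noted in the quoted reply, mypy still needs
the whole codebase because a change in one file can affect others.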

On Mon, Mar 2, 2020 at 2:08 PM Kyle Weaver  wrote:

> > Python lint takes 4-5mins to complete. I think if the mypy analysis is
> really on the order of 10s, the additional time won't matter and could
> always be enabled.
>
> +1 of course it would be nice to make mypy as fast as possible, but I
> don't think speed needs to be a blocker. The productivity gains we'd get
> from reliable type analysis more than offset the cost IMO.
>
> On Mon, Mar 2, 2020 at 2:03 PM Luke Cwik  wrote:
>
>> Python lint takes 4-5mins to complete. I think if the mypy analysis is
>> really on the order of 10s, the additional time won't matter and could
>> always be enabled.
>>
>> On Mon, Mar 2, 2020 at 1:21 PM Chad Dombrova  wrote:
>>
>>>> I believe that mypy via pre-commit hook will be faster than 10s since it
>>>> only applies to modified files.
>>>>
>>>
>>> Correct, with a few caveats:
>>>
>>>    - pre-commit can be set up to only run if a Python file changes, so
>>>    modifying a Java file won't trigger mypy to run.
>>>    - if *any* Python file changes, mypy has to run on the whole
>>>    codebase, because a change to one file can affect the others (e.g. a
>>>    function arg type changes). It's not really meaningful to run mypy on a
>>>    single file.
>>>    - the mypy daemon tracks which files have changed, and runs
>>>    incremental updates, so if we set up the pre-commit hook to run the
>>>    daemon, we should see that get appreciably faster. I'll do some tests
>>>    and report back.
>>>
>>>




Re: Install Jenkins AnsiColor plugin

2020-03-12 Thread Udi Meiri
https://github.com/apache/beam/pull/2 is out, attempts to colorize
pytest output


On Sun, Mar 8, 2020 at 10:56 AM Chad Dombrova  wrote:

> I don’t believe that it was ever resolved.  I have a PR with a bunch of
> attempts to get it working but I never did figure it out.  IIRC there did
> seem to be some ansi plugin already installed but I couldn’t get it to
> work.
>
> -chad
>
>
> On Sun, Mar 8, 2020 at 10:52 AM Ismaël Mejía  wrote:
>
>> Did this ever happen? If not what is blocking it?
>>
>>
>>
>> On Tue, Oct 22, 2019 at 10:13 PM Udi Meiri  wrote:
>> >
>> > Your proposal will only affect the seed job (which doesn't do color
>> outputs AFAIK).
>> > I think you want to add colorizeOutput() here:
>> >
>> https://github.com/apache/beam/blob/bfebbd0d16361f61fa40bfdec2f0cb6f943f7c9a/.test-infra/jenkins/CommonJobProperties.groovy#L79-L95
>> >
>> > Otherwise no concerns from me.
>> >
>> > On Tue, Oct 22, 2019 at 12:01 PM Chad Dombrova 
>> wrote:
>> >>
>> >> thanks, so IIUC, I’m going to update job_00_seed.groovy like this:
>> >>
>> >>   wrappers {
>> >> colorizeOutput()
>> >> timeout {
>> >>   absolute(60)
>> >>   abortBuild()
>> >> }
>> >>   }
>> >>
>> >> Then add the comment run seed job
>> >>
>> >> Does anyone have any concerns with me trying this out now?
>> >>
>> >> -chad
>> >>
>> >>
>> >> On Tue, Oct 22, 2019 at 11:42 AM Udi Meiri  wrote:
>> >>>
>> >>> Also note that changing the job DSL doesn't take effect until the
>> "seed" job runs. (use the "run seed job" phrase)
>> >>>
>> >>> On Tue, Oct 22, 2019 at 11:06 AM Chad Dombrova 
>> wrote:
>> >>>>
>> >>>> Thanks, I'll look into this.  I have a PR I'm building up with a
>> handful of minor changes related to this.
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Tue, Oct 22, 2019 at 10:45 AM Yifan Zou 
>> wrote:
>> >>>>>
>> >>>>> Thanks, Udi! The ansicolor plugin was applied to ASF Jenkins
>> universally. You might need to explicitly enable the color output in your
>> Jenkins DSL.
>> >>>>>
>> >>>>> On Tue, Oct 22, 2019 at 10:33 AM Udi Meiri 
>> wrote:
>> >>>>>>
>> >>>>>> Seems to be already installed:
>> https://issues.apache.org/jira/browse/INFRA-16944
>> >>>>>> Do we just need to enable it somehow?
>> >>>>>> This might work:
>> https://jenkinsci.github.io/job-dsl-plugin/#method/javaposse.jobdsl.dsl.helpers.wrapper.WrapperContext.colorizeOutput
>> >>>>>>
>> >>>>>> BTW, our Jenkins is maintained by ASF's Infrastructure team:
>> https://cwiki.apache.org/confluence/display/INFRA/Jenkins
>> >>>>>>
>> >>>>>> On Tue, Oct 22, 2019 at 10:23 AM Chad Dombrova 
>> wrote:
>> >>>>>>>
>> >>>>>>> Hi all,
>> >>>>>>> As a user trying to grok failures in jenkins I think it would be
>> a huge help to have color output support.  This is something that works out
>> of the box for CI tools like gitlab and travis, and it really helps bring
>> that 21st century feel to your logs :)
>> >>>>>>>
>> >>>>>>> There's a Jenkins plugin for colorizing ansi escape sequences
>> here:
>> >>>>>>> https://plugins.jenkins.io/ansicolor
>> >>>>>>>
>> >>>>>>> I think this is something that has to be deployed by a Jenkins
>> admin.
>> >>>>>>>
>> >>>>>>> -chad
>> >>>>>>>
>>
>




Re: Hello Beam Community!

2020-03-13 Thread Udi Meiri
Welcome!


On Fri, Mar 13, 2020 at 9:47 AM Yichi Zhang  wrote:

> Welcome!
>
> On Fri, Mar 13, 2020 at 9:40 AM Ahmet Altay  wrote:
>
>> Welcome Brittany!
>>
>> On Thu, Mar 12, 2020 at 6:32 PM Brittany Hermann 
>> wrote:
>>
>>> Hello Beam Community!
>>>
>>> My name is Brittany Hermann and I recently joined the Open Source team
>>> in Data Analytics at Google. As a Program Manager, I will be focusing on
>>> community engagement while getting to work on Apache Beam and Airflow
>>> projects! I have always thrived on creating healthy, diverse, and overall
>>> happy communities and am excited to bring that to the team. For a fun fact,
>>> I am a big Wisconsin Badgers Football fan and have a goldendoodle puppy
>>> named Ollie!
>>>
>>> I look forward to collaborating with you all!
>>>
>>> Kind regards,
>>>
>>> Brittany Hermann
>>>
>>>
>>>




Re: Are docker image tags shared within a jenkins worker?

2020-03-26 Thread Udi Meiri
The Python HDFS IT uses the jenkins BUILD_TAG to create unique names:
PROJECT_NAME=$(echo hdfs_IT-${BUILD_TAG:-non-jenkins})

The BUILD_TAG is unique and easily traced back to the Jenkins job that made
it.
It might need some sanitizing though if it contains any invalid characters.
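
A minimal sketch of the sanitizing step, in Python rather than the shell used by the actual script. The function name and the allowed character set (lowercase letters, digits, '-' and '_') are assumptions for illustration, not something taken from the HDFS IT itself:

```python
import re


def sanitize_project_name(build_tag=None, prefix="hdfs_IT-"):
    """Build a project name restricted to lowercase letters, digits, '-' and '_'.

    The allowed character set is an assumption for illustration; adjust it to
    whatever the consuming tool actually accepts.
    """
    # Mirror the shell default ${BUILD_TAG:-non-jenkins}.
    raw = prefix + (build_tag or "non-jenkins")
    # Lowercase first, then collapse each run of disallowed characters to '-'.
    cleaned = re.sub(r"[^a-z0-9_-]+", "-", raw.lower())
    return cleaned.strip("-")


if __name__ == "__main__":
    # Hypothetical BUILD_TAG value containing a space and a slash.
    print(sanitize_project_name("jenkins-My Job/Name-7"))
```

The name stays traceable to the Jenkins job because only the offending characters are replaced; the rest of the tag is kept as-is.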

On Tue, Mar 24, 2020 at 1:50 PM Hannah Jiang  wrote:

> This can be done by 1). passing "-Pdocker-tag=xxx" to the test and 2).
> make sure to specify the custom tag when using docker images.
> For example, *:sdks:python:test-suites:portable:py35:preCommitPy35
> -Pdocker-tag=20200324 *will create an image with a tag 20200324.
> *--environment_config=path/to/container/image* pipeline option can be
> used for Python pipeline to pass custom docker images.
>
>
>
> On Tue, Mar 24, 2020 at 11:42 AM Brian Hulette 
> wrote:
>
>> Failing run:
>> https://builds.apache.org/job/beam_PostCommit_XVR_Flink_PR/65/
>> Passing run:
>> https://builds.apache.org/job/beam_PostCommit_XVR_Flink_PR/66/
>>
>> On Tue, Mar 24, 2020 at 11:33 AM Hannah Jiang 
>> wrote:
>>
>>> Hi Brian
>>>
>>> I think that's possible if we use the default tag for the Jenkins tests.
>>> To prevent this, we can use a customized tag, for example, timestamp, for
>>> each build.
>>> Can you please point me to the failing tests? I will check more details.
>>>
>>> Thanks,
>>> Hannah
>>>
>>>
>>> On Tue, Mar 24, 2020 at 10:11 AM Brian Hulette 
>>> wrote:
>>>
>>>> I ran into a test failure on the XVR tests in [1] which looked like the
>>>> test was executing with a python docker container that did _not_ include
>>>> the python changes in my PR. The test ran successfully after a second run.
>>>>
>>>> It seems likely that the initial failure occurred because some other
>>>> job was running concurrently on the same jenkins worker and overwrote the
>>>> `apache/beam_python2.7_sdk:2.21.0.dev` image that my run had generated.
>>>> Is this possible? If so, is there something we should do to isolate these
>>>> images?
>>>>
>>>> [1] https://github.com/apache/beam/pull/10055
>>>>
>>>




Re: [BEAM-9322] Python SDK discussion on correct output tag names

2020-03-26 Thread Udi Meiri
On Thu, Mar 26, 2020 at 10:13 AM Luke Cwik  wrote:

> The issue seems to be that a PCollection can have a "tag" associated with
> it and PTransform expansion can return an arbitrary nested dictionary/tuple
> yet we need to figure out what the user wanted as the local name for the
> PCollection from all this information.
>
> Will this break people who rely on the generated PCollection output tags?
> One big question is whether a composite transform cares about the name
> that is used. For primitive transforms such as ParDo, this is very much a
> yes because the pickled code likely references that name in some way. Some
> composites could have the same need where the payload that is stored as
> part of the composite references these local names and hence we have to
> tell people how to instruct the SDK during transform expansion about what
> name will be used unambiguously (as long as we document and have tests
> around this we can choose from many options). Finally, in the XLang world,
> we need to preserve the names that were provided to us and not change them;
> which is more about making the Python SDK handle XLang transform expansion
> carefully.
>
> Am I missing edge cases?
> Concatenation of strings leads to collisions if the delimiter character is
> used within the tags or map keys. You could use an escaping encoding to
> guarantee that the concatenation always generates unique names.
>
> Some alternatives I thought about were:
> * Don't allow arbitrary nestings returned during expansion, force
> composite transforms to always provide an unambiguous name (either a tuple
> with PCollections with unique tags or a dictionary with untagged
> PCollections or a singular PCollection (Java and Go SDKs do this)).
>

I believe that aligning with Java and Go would be the right way to go here.
I don't know if this would limit expressiveness.


> * Have a "best" effort naming system (note the example I give can have
> many of the "rules" re-ordered) e.g. if all the PCollection tags are unique
> then use only them, followed by if a flat dictionary is returned then use
> only the keys as names, followed by if a flat tuple is returned then use
> indices, and finally fallback to the hierarchical naming scheme.
>
>
> On Tue, Mar 24, 2020 at 1:07 PM Sam Rohde  wrote:
>
>> Hi All,
>>
>> *Problem*
>> I would like to discuss BEAM-9322 and the
>> correct way to set the output tags of a transform with nested PCollections,
>> e.g. a dict of PCollections, a tuple of dicts of PCollections. Before the
>> fixing of BEAM-1833,
>> the Python SDK when applying a PTransform would auto-generate the output
>> tags for the output PCollections even if they are manually set by the user:
>>
>> class MyComposite(beam.PTransform):
>>   def expand(self, pcoll):
>>     a = PCollection.from_(pcoll)
>>     a.tag = 'a'
>>
>>     b = PCollection.from_(pcoll)
>>     b.tag = 'b'
>>     return (a, b)
>>
>> would yield a PTransform with two output PCollections and output tags with
>> 'None' and '0' instead of 'a' and 'b'. This was corrected for simple cases
>> like this. However, this fails when the PCollections share the same output
>> tag (of course). This can happen like so:
>>
>> class MyComposite(beam.PTransform):
>>   def expand(self, pcoll):
>>     partition_1 = beam.Partition(pcoll, ...)
>>     partition_2 = beam.Partition(pcoll, ...)
>>     return (partition_1[0], partition_2[0])
>>
>> With the new code, this leads to an error because both output
>> PCollections have an output tag of '0'.
>>
>> *Proposal*
>> When applying PTransforms to a pipeline (pipeline.py:550) we name the
>> PCollections according to their position in the tree concatenated with the
>> PCollection tag and a delimiter. From the first example, the output
>> PCollections of the applied transform will be: '0.a' and '1.b' because it
>> is a tuple of PCollections. In the second example, the outputs should be:
>> '0.0' and '1.0'. In the case of a dict of PCollections, it should simply be
>> the keys of the dict.
>>
>> What do you think? Am I missing edge cases? Will this be unexpected to
>> users? Will this break people who rely on the generated PCollection output
>> tags?
>>
>> Regards,
>> Sam
>>
>
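
To make the naming proposal and the escaping suggestion quoted above concrete, here is a rough sketch in plain Python. It does not use Beam: FakePColl is a hypothetical stand-in for a PCollection that only carries a tag, and the '.' delimiter, the backslash escape, and the rule that dict keys replace tags are one reading of the proposal, not the SDK's behavior:

```python
class FakePColl:
    """Hypothetical stand-in for a PCollection; only the tag matters here."""

    def __init__(self, tag=None):
        self.tag = tag


DELIM = "."
ESCAPE = "\\"


def escape(part):
    # Escape the escape character first, then the delimiter, so that a key
    # such as 'a.b' can never collide with the nested path 'a' -> 'b'.
    return part.replace(ESCAPE, ESCAPE * 2).replace(DELIM, ESCAPE + DELIM)


def output_names(result, path=(), in_dict=False):
    """Yield (name, pcoll) pairs for an arbitrarily nested expand() result."""
    if isinstance(result, dict):
        for key, value in result.items():
            yield from output_names(value, path + (str(key),), in_dict=True)
    elif isinstance(result, (tuple, list)):
        for index, value in enumerate(result):
            yield from output_names(value, path + (str(index),), in_dict=False)
    else:
        # Leaf: a dict key is the whole local name; otherwise append the tag.
        parts = path if in_dict else path + (str(result.tag),)
        yield DELIM.join(escape(p) for p in parts), result


if __name__ == "__main__":
    for name, _ in output_names((FakePColl("a"), FakePColl("b"))):
        print(name)
```

With this reading, the tuple from the first example gets '0.a' and '1.b', the two Partition outputs get '0.0' and '1.0', and a flat dict is named by its keys alone.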




Re: Implementing type hints on multi-output PTransforms

2020-03-31 Thread Udi Meiri
Hi Joshua,
I've been working on type hints recently.
Your issue is similar to this:
https://issues.apache.org/jira/browse/BEAM-8782 (exactly the same if tags
are passed to with_outputs() in the example).
There's also this related bug about type inference:
https://issues.apache.org/jira/browse/BEAM-4132

I agree with Luke that it would be helpful to point to a workaround in the
error message (such as removing with_output_types).

From what I remember, we'll need to formalize how multi-output type hints
are provided to Beam.
For example, by passing keywords to with_output_types: main=type, TAG=type,
etc.

On Tue, Mar 31, 2020 at 9:55 AM Luke Cwik  wrote:

> I can see that argument but what does a user need to do in this case if we
> raise NotImplementedError? Would they need to disable type checking
> everywhere?
>
> Over the long term users will need to deal with improvements to type
> checking and will need to fix typing errors when they change Apache Beam
> versions.
>
>
> On Tue, Mar 31, 2020 at 9:34 AM Joshua B. Harrison <
> josh.harri...@gmail.com> wrote:
>
>> The current code errors out with a cryptic message around tag types in
>> the multi-output. Adding a NotImplementedError was just an attempt to make
>> the failure reason more clear.
>>
>> I would be worried about trivially passing because then the user might
>> think they have type checking safety when they don't, which could cause
>> failures at later stages and might be hard to debug. Do you agree?
>>
>> Best,
>> Joshua
>>
>> On Tue, Mar 31, 2020 at 10:16 AM Luke Cwik  wrote:
>>
>>> Would the NotImplementedError cause users pipeline errors or is that a
>>> signal to the type checking mechanism to ignore it?
>>> If this would cause failures I would rather make the unsupported case
>>> return something that would be trivially true.
>>>
>>> On Mon, Mar 30, 2020 at 12:01 PM Joshua B. Harrison <
>>> josh.harri...@gmail.com> wrote:
>>>
>>>> Hey all,
>>>>
>>>> I brought up an issue recently on the user forums noting issues around
>>>> type hints and multi-output PTransforms:
>>>> https://lists.apache.org/thread.html/r94bf2e43f09a290dbe87d5a8d7eedb34ea215e0bea861521cbdb0c1c%40%3Cuser.beam.apache.org%3E
>>>>
>>>> As mentioned there, I think that a NotImplementedError should be raised
>>>> when attaching type hints to multi-output PTransforms while the correct
>>>> implementation is figured out. And that a 'correct' implementation would
>>>> look something like the Union typehints that are expected on multi-input
>>>> PTransforms.
>>>>
>>>> I am happy to help out and wanted to get the discussion started around
>>>> what the community would like to see here. Thank you all for a great
>>>> product.
>>>>
>>>> Best,
>>>> Joshua
>>>>
>>>> --
>>>> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>>>>  |  404-433-0242 <(404)%20433-0242>
>>>>
>>>
>>
>> --
>> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>>  |  404-433-0242 <(404)%20433-0242>
>>
>




Re: [DISCUSS] Let's establish a guideline for using Python type annotations in Beam codebase

2020-04-13 Thread Udi Meiri
I agree with Robert to only put Any where it really can be any type.

I'm not sure how much typing we should add. At minimum: external APIs and
wherever mypy complains.
Ideally I would like to have annotations everywhere, because this reduces
uncertainty when modifying existing code.
You are increasing test coverage (e.g. via mypy) when you add type
annotations.
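
For reference, a small sketch of the two styles being discussed. The function names are made up for illustration; the comment form is the one that still works while 2.7 is supported:

```python
from typing import Any, List


def split_words(line):
    # type: (str) -> List[str]
    """Type comment: valid on Python 2.7 and 3, and checked by mypy."""
    return line.split()


def passthrough(element):
    # type: (Any) -> Any
    """Any is reserved for values that really can be of any type."""
    return element


def split_words_py3(line: str) -> List[str]:
    """The same signature as an annotation, usable once 2.7 is dropped."""
    return line.split()
```

One visible difference: annotations are stored on the function object at runtime, while type comments are only seen by tools that read the source.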

OTOH, the tradeoff is that adding types changes structural (duck) typing to
nominal.
To keep using structural typing you can define a custom Protocol type,
which slightly increases code complexity and may be a barrier to entry for
new developers.
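
A minimal sketch of what a custom Protocol looks like, assuming Python 3.8+ where typing.Protocol is in the standard library. The class and method names are hypothetical and not taken from the Beam codebase:

```python
from typing import List, Protocol


class HasEstimateSize(Protocol):
    """Anything with an estimate_size() method satisfies this protocol."""

    def estimate_size(self) -> int:
        ...


class InMemorySource:
    # Deliberately does NOT inherit from HasEstimateSize: the match is
    # structural, based on the method signature alone.
    def __init__(self, items):
        self._items = list(items)

    def estimate_size(self) -> int:
        return len(self._items)


def total_size(sources: List[HasEstimateSize]) -> int:
    """Accepts any objects that structurally match HasEstimateSize."""
    return sum(source.estimate_size() for source in sources)
```

A type checker such as mypy accepts total_size([InMemorySource([1, 2, 3])]) without any inheritance relationship, so duck typing is kept at the cost of one extra class definition.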


On Mon, Apr 13, 2020 at 12:11 PM Robert Bradshaw 
wrote:

> On Mon, Apr 13, 2020 at 11:48 AM Valentyn Tymofieiev 
> wrote:
>
>>
>> On Mon, Apr 13, 2020 at 10:53 AM Robert Bradshaw 
>> wrote:
>>
>>> On Mon, Apr 13, 2020 at 10:38 AM Valentyn Tymofieiev <
>>> valen...@google.com> wrote:
>>>
>>>> To clarify, I don't suggest that every variable should have a defined
>>>> type that doesn't change. However, I'd like to establish a culture where we
>>>> consistently add type annotations when we write new code. Where type is
>>>> defined gradually, we can use flexible annotations: "# type: (Any) -> Any"
>>>> or something like that.
>>>>
>>>
>>> IMHO we should use such annotations when the inputs/outputs are truly
>>> Any, or at least a wide enough variety of types that it's not worth the
>>> effort to be more explicit.
>>>
>> Agreed.
>>
>>>
>>>
>>>> We can argue that there is a point of diminishing returns as well, and
>>>> this is a valid point too. A possible tradeoff may be to
>>>> require annotations, docstrings or both in *most* functions/methods.
>>>> Possible definition of 'most' - all functions unless they meet three of the
>>>> following criteria[1]:
>>>> - not externally visible
>>>> - very short
>>>> - obvious
>>>>
>>>> [1]
>>>> http://google.github.io/styleguide/pyguide.html#383-functions-and-methods
>>>>
>>>
>>> While I'd be open to this, let's get the type checkers enabled in
>>> presubmit and see what it takes to keep those happy before establishing
>>> more strict criteria.
>>>
>> That's reasonable, thanks for your feedback. Is there a JIRA issue
>> tracking this effort?
>>
>
> https://issues.apache.org/jira/browse/BEAM-7746
>
>
>>> (It does sound like we have consensus on using type comments until 2.7 is
>>> dropped.)
>>>
>>>
>>>> On Fri, Apr 10, 2020 at 4:56 PM Robert Bradshaw
>>>> wrote:
>>>>
>>>>> On Fri, Apr 10, 2020 at 4:00 PM Valentyn Tymofieiev <
>>>>> valen...@google.com> wrote:
>>>>>
>>>>>> My preference is also for type-comments for now.
>>>>>>
>>>>>> Is it possible to configure the type checkers that we use to require
>>>>>> type-comments in new code?
>>>>>>
>>>>>
>>>>> My personal opinion is that there comes a point where there's
>>>>> diminishing return on explicitly typing everything (there's a reason people
>>>>> choose Python over Java) which is one of the big selling points of gradual
>>>>> typing, but before we can consider this the first step is to simply enable
>>>>> the type checkers on presubmit (IIRC we're really close).
>>>>>
>>>>>
>>>>>> On Fri, Apr 10, 2020 at 1:46 PM Robert Bradshaw
>>>>>> wrote:
>>>>>>
>>>>>>> I prefer type-comments, as they can be validated by type checkers.
>>>>>>> Once we drop 2.7, we can go with actual type annotations (and the comments
>>>>>>> can be automatically converted over).
>>>>>>>
>>>>>>> On Fri, Apr 10, 2020 at 11:17 AM Valentyn Tymofieiev <
>>>>>>> valen...@google.com> wrote:
>>>>>>>
>>>>>>>> I am seeing several styles we use to annotate non-pipeline code in
>>>>>>>> Beam codebase:
>>>>>>>>
>>>>>>>> - informal docstring comments:
>>>>>>>> file_pattern (str): the file glob to read,
>>>>>>>> assign_context: Instance of AssignContext,
>>>>>>>> - type comments like # type: (...) -> iobase.RestrictionTracker
>>>>>>>> - pydoc-style annotation: A :class:`PTransform` object.
>>>>>>>>
>>>>>>>> It may be a good idea to create a guideline which style to use
>>>>>>>> when, that we can point at in code reviews, and be more consistent.
>>>>>>>>
>>>>>>>> Please suggest your opinions and preferences.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>




Re: Implementing type hints on multi-output PTransforms

2020-04-13 Thread Udi Meiri
On Tue, Mar 31, 2020 at 10:16 AM Joshua B. Harrison 
wrote:

> Ok - that makes sense. My specific workaround was to remove the
> with_output_types for now, so advising the user on this in the error
> message would be nice. I was just worried about silently passing.
>
> As for the formalization:
>
>1. I am a little confused on how this is different than passing
>multiple tagged inputs to a PTransform that does a CoGroupBy*. In this
>case, with_input_types seems to expect a union of all the types for the
>keyed values. Why would the same not work for output types?
>
Since the outputs are distinct PCollections (as in the example
in BEAM-4132) they each have their own element type. If one of these
PCollections is then passed as an input to a transform, our type checking
is more precise if we use the type of just that pcoll instead of the union
of all pcoll types returned.
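
A toy illustration of that precision argument, using only the standard typing module (Python 3.8+). The output_types dict and is_compatible are hypothetical and far simpler than Beam's actual type checking; they only show why a per-output type can pass a check that a union fails:

```python
from typing import Union, get_args, get_origin

# Hypothetical per-tag declaration, in the spirit of main=type, TAG=type.
output_types = {"main": int, "errors": str}

# What a single union hint over all outputs would look like instead.
union_of_outputs = Union[int, str]


def members(type_hint):
    """Return the alternatives of a Union, or the hint itself as a 1-tuple."""
    if get_origin(type_hint) is Union:
        return get_args(type_hint)
    return (type_hint,)


def is_compatible(produced, expected):
    """True only if every type that may be produced is acceptable downstream."""
    return all(issubclass(member, expected) for member in members(produced))


if __name__ == "__main__":
    # A downstream transform that declares int as its input type.
    print(is_compatible(output_types["main"], int))
    print(is_compatible(union_of_outputs, int))
```

With the per-tag type, the 'main' output can feed a transform that expects int; with the union, the same pipeline would be rejected (or the check would have to be loosened) because str is also a possible element type.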

>
>1. What is the process for proposing a formalized solution? Should I
>start a document, or does one already exist? Or does this kind of thing get
>tracked via Jira issues?
>
In this case I think an email titled "[PROPOSAL] ..." to this mailing list
describing what you want to change should be enough. A document could also
work; I'm not aware of one that touches on this.
Your PR would need to have an associated JIRA (feel free to take over any
of the ones I've mentioned).


> Best,
> Joshua
>
> On Tue, Mar 31, 2020 at 11:07 AM Udi Meiri  wrote:
>
>> Hi Joshua,
>> I've been working on type hints recently.
>> Your issue is similar to this:
>> https://issues.apache.org/jira/browse/BEAM-8782 (exactly the same if
>> tags are passed to with_outputs() in the example).
>> There's also this related bug about type inference:
>> https://issues.apache.org/jira/browse/BEAM-4132
>>
>> I agree with Luke that it would be helpful to point to a workaround in
>> the error message (such as removing with_output_types).
>>
>> From what I remember, we'll need to formalize how multi-output type hints
>> are provided to Beam.
>> For example, by passing keywords to with_output_types: main=type,
>> TAG=type, etc.
>>
>> On Tue, Mar 31, 2020 at 9:55 AM Luke Cwik  wrote:
>>
>>> I can see that argument but what does a user need to do in this case if
>>> we raise NotImplementedError? Would they need to disable type checking
>>> everywhere?
>>>
>>> Over the long term users will need to deal with improvements to type
>>> checking and will need to fix typing errors when they change Apache Beam
>>> versions.
>>>
>>>
>>> On Tue, Mar 31, 2020 at 9:34 AM Joshua B. Harrison <
>>> josh.harri...@gmail.com> wrote:
>>>
>>>> The current code errors out with a cryptic message around tag types in
>>>> the multi-output. Adding a NotImplementedError was just an attempt to make
>>>> the failure reason more clear.
>>>>
>>>> I would be worried about trivially passing because then the user might
>>>> think they have type checking safety when they don't, which could cause
>>>> failures at later stages and might be hard to debug. Do you agree?
>>>>
>>>> Best,
>>>> Joshua
>>>>
>>>> On Tue, Mar 31, 2020 at 10:16 AM Luke Cwik  wrote:
>>>>
>>>>> Would the NotImplementedError cause users pipeline errors or is that a
>>>>> signal to the type checking mechanism to ignore it?
>>>>> If this would cause failures I would rather make the unsupported case
>>>>> return something that would be trivially true.
>>>>>
>>>>> On Mon, Mar 30, 2020 at 12:01 PM Joshua B. Harrison <
>>>>> josh.harri...@gmail.com> wrote:
>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> I brought up an issue recently on the user forums noting issues
>>>>>> around type hints and multi-output PTransforms:
>>>>>> https://lists.apache.org/thread.html/r94bf2e43f09a290dbe87d5a8d7eedb34ea215e0bea861521cbdb0c1c%40%3Cuser.beam.apache.org%3E
>>>>>>
>>>>>> As mentioned there, I think that a NotImplementedError should be
>>>>>> raised when attaching type hints to multi-output PTransforms while the
>>>>>> correct implementation is figured out. And that a 'correct' implementation
>>>>>> would look something like the Union typehints that are expected on
>>>>>> multi-input PTransforms.
>>>>>>
>>>>>> I am happy to help out and wanted to get the discussion started
>>>>>> around what the community would like to see here. Thank you all for a
>>>>>> great product.
>>>>>>
>>>>>> Best,
>>>>>> Joshua
>>>>>>
>>>>>> --
>>>>>> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>>>>>>  |  404-433-0242 <(404)%20433-0242>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>>>>  |  404-433-0242 <(404)%20433-0242>
>>>>
>>>
>
> --
> Joshua Harrison |  Software Engineer |  joshharri...@gmail.com
>  |  404-433-0242 <(404)%20433-0242>
>




Re: Website publish jobs fail recently

2020-04-14 Thread Udi Meiri
Hey, I was looking at this today but could not figure it out.
The machines we run the publish jobs on probably differ from our regular
apache-beam-testing Jenkins ones.
I tried researching all the reasons why this might be happening but came up
empty.

On Tue, Apr 14, 2020 at 10:19 AM Kyle Weaver  wrote:

> I think Udi is fixing it. Jira:
> https://issues.apache.org/jira/browse/BEAM-9737
>
> On Tue, Apr 14, 2020 at 1:11 PM Mikhail Gryzykhin 
> wrote:
>
>> Hi all,
>>
>> Has anyone seen the following error of website publish?
>> 
>>
>> *16:33:47* jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir -
>> /repo/build/website/generated-local-content/security
>>
>> I tried the failed target locally and it succeeded. Seems there's some
>> issue with jenkins configuration.
>>
>> Regards,
>> Mikhail.
>>
>>
>>
>>




Re: sdks:java:container:generateThirdPartyLicenses effect on build time / stability

2020-04-15 Thread Udi Meiri
If this process is used in releases we would benefit from running it
regularly to ensure it isn't broken, which would delay releases (and add work
for the release manager).
Does it make sense to put it in postcommit?

On Wed, Apr 15, 2020 at 2:30 PM Kyle Weaver  wrote:

> Looks like the same error as this Jira:
> https://issues.apache.org/jira/browse/BEAM-9764
>
> Even if/when we are able to fix this particular issue, I agree it is best
> not to run this job except for releases because of the inherent network
> cost and possible reliability issues. +Hannah Jiang
>  What do you think?
>
> On Wed, Apr 15, 2020 at 5:20 PM Thomas Weise  wrote:
>
>> The new feature to assemble licenses is very useful but appears to add
>> several minutes (7-8?)  build time to jobs that need to build a container.
>>
>> Does it also seem to cause occasional build failures?
>>
>> https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Phrase/131/
>>
>> Would it be possible to perform this task only during release builds?
>>
>> Thanks,
>> Thomas
>>
>>




Re: Website publish jobs fail recently

2020-04-16 Thread Udi Meiri
I emailed them, yes.

On Thu, Apr 16, 2020, 11:09 Ahmet Altay  wrote:

> Did we end up emailing builds@ or should this be filed as a ticket to
> infra ?
>
> +Rui Wang  -- This is preventing Rui from updating the
> website post 2.20 release.
>
> On Tue, Apr 14, 2020 at 9:01 PM Kenneth Knowles  wrote:
>
>> Indeed, publish jobs have write access to things, which normal builds do
>> not.
>>
>> I suggest reaching out to bui...@apache.org
>>
>> Kenn
>>
>> On Tue, Apr 14, 2020 at 5:58 PM Udi Meiri  wrote:
>>
>>> Hey, I was looking at this today but could not figure it out.
>>> The machines we run the publish jobs on probably differ from our regular
>>> apache-beam-testing Jenkins ones.
>>> I tried researching all the reasons why this might be happening but came
>>> up empty.
>>>
>>> On Tue, Apr 14, 2020 at 10:19 AM Kyle Weaver 
>>> wrote:
>>>
>>>> I think Udi is fixing it. Jira:
>>>> https://issues.apache.org/jira/browse/BEAM-9737
>>>>
>>>> On Tue, Apr 14, 2020 at 1:11 PM Mikhail Gryzykhin 
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Has anyone seen the following error of website publish?
>>>>> <https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Website_Publish/6018/console>
>>>>>
>>>>> *16:33:47* jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir -
>>>>> /repo/build/website/generated-local-content/security
>>>>>
>>>>> I tried the failed target locally and it succeeded. Seems there's some
>>>>> issue with jenkins configuration.
>>>>>
>>>>> Regards,
>>>>> Mikhail.
>>>>>
>>>>>
>>>>>
>>>>>




Re: Website publish jobs fail recently

2020-04-16 Thread Udi Meiri
I did get a response but I didn't see it because I'm not subscribed to
builds@.

This is the original announcement of the breaking change:

https://lists.apache.org/thread.html/r00c669dd82bbde47958e81ecb330116de131d774e2d4df26a06fe92f@%3Cbuilds.apache.org%3E


On Thu, Apr 16, 2020 at 11:14 AM Udi Meiri  wrote:

> I emailed them yes
>
> On Thu, Apr 16, 2020, 11:09 Ahmet Altay  wrote:
>
>> Did we end up emailing builds@ or should this be filed as a ticket to
>> infra ?
>>
>> +Rui Wang  -- This is preventing Rui from updating
>> the website post 2.20 release.
>>
>> On Tue, Apr 14, 2020 at 9:01 PM Kenneth Knowles  wrote:
>>
>>> Indeed, publish jobs have write access to things, which normal builds do
>>> not.
>>>
>>> I suggest reaching out to bui...@apache.org
>>>
>>> Kenn
>>>
>>> On Tue, Apr 14, 2020 at 5:58 PM Udi Meiri  wrote:
>>>
>>>> Hey, I was looking at this today but could not figure it out.
>>>> The machines we run the publish jobs on probably differ from our regular
>>>> apache-beam-testing Jenkins ones.
>>>> I tried researching all the reasons why this might be happening but
>>>> came up empty.
>>>>
>>>> On Tue, Apr 14, 2020 at 10:19 AM Kyle Weaver 
>>>> wrote:
>>>>
>>>>> I think Udi is fixing it. Jira:
>>>>> https://issues.apache.org/jira/browse/BEAM-9737
>>>>>
>>>>> On Tue, Apr 14, 2020 at 1:11 PM Mikhail Gryzykhin 
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Has anyone seen the following error of website publish?
>>>>>> <https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Website_Publish/6018/console>
>>>>>>
>>>>>> *16:33:47* jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir -
>>>>>> /repo/build/website/generated-local-content/security
>>>>>>
>>>>>> I tried the failed target locally and it succeeded. Seems there's
>>>>>> some issue with jenkins configuration.
>>>>>>
>>>>>> Regards,
>>>>>> Mikhail.
>>>>>>
>>>>>>
>>>>>>
>>>>>>




Re: [ANNOUNCE] Beam 2.20.0 Released

2020-04-24 Thread Udi Meiri
You'll need to add --tags

On Fri, Apr 24, 2020 at 11:53 AM Jan Lukavský  wrote:

> Hm, that is strange:
>
> ~/git/apache/beam$ git status
> On branch master
> Your branch is up to date with 'origin/master'.
> ~/git/apache/beam$ git pull
> Already up to date.
> ~/git/apache/beam$ git tag | grep v2.19.0
> v2.19.0
> v2.19.0-RC1
> ~/git/apache/beam$ git tag | grep v2.20.0
> ~/git/apache/beam$
>
> I'm obviously missing something.
>
> Jan
> On 4/24/20 7:01 PM, Thomas Weise wrote:
>
> Here is the release tag:
> https://github.com/apache/beam/releases/tag/v2.20.0
>
>
> On Fri, Apr 24, 2020 at 9:28 AM Kyle Weaver  wrote:
>
>> > Is it possible we are missing git tag for this release? I cannot find
>> it.
>>
>> You mean https://github.com/apache/beam/tree/release-2.20.0?
>>
>> On Fri, Apr 24, 2020 at 9:04 AM Jan Lukavský  wrote:
>>
>>> Hi Rui,
>>>
>>> thanks for making this release! Is it possible we are missing git tag
>>> for this release? I cannot find it.
>>>
>>> Thanks,
>>>
>>>  Jan
>>> On 4/16/20 8:47 PM, Rui Wang wrote:
>>>
>>> Note that due to a bug on infrastructure, the website change failed to
>>> publish. But 2.20.0 artifacts are available to use right now.
>>>
>>>
>>>
>>> -Rui
>>>
>>> On Thu, Apr 16, 2020 at 11:45 AM Rui Wang  wrote:
>>>
>>>> The Apache Beam team is pleased
>>>> to announce the release of version 2.20.0.
>>>>
>>>> Apache Beam is an open source unified programming model to define and
>>>> execute data processing pipelines, including ETL, batch and stream
>>>> (continuous) processing. See https://beam.apache.org
>>>>
>>>> You can download the release here:
>>>>
>>>> https://beam.apache.org/get-started/downloads/
>>>>
>>>> This release includes bug fixes, features, and improvements detailed on
>>>> the Beam blog: https://beam.apache.org/blog/2020/04/15/beam-2.20.0.html
>>>>
>>>> Thanks to everyone who contributed to this release, and we hope you enjoy
>>>> using Beam 2.20.0.
>>>> -- Rui Wang, on behalf of The Apache Beam team
>>>>
>>>




Re: Jira PR links not being generated?

2020-04-27 Thread Udi Meiri
We had such a file for a short while but it was removed:
https://github.com/apache/beam/pull/10645
I don't believe it contained any PR link settings though
+Pablo Estrada 

On Mon, Apr 27, 2020 at 1:56 PM Kyle Weaver  wrote:

> I went ahead and filed https://issues.apache.org/jira/browse/BEAM-9833 since
> it looks like this is how things will be done from now on. Which raises the
> question, does anyone know how Beam managed these settings before? Or were
> there previously no project-level controls?
>
> On Mon, Apr 27, 2020 at 4:39 PM Kyle Weaver  wrote:
>
>> Thanks for the pointer Kenn. I searched existing INFRA issues and found
>> [1] (among others). Looks like we may need to add a .asf.yaml file [2]. I
>> guess infra must have changed this recently without us picking up on it?
>> 
>>
>> [1] https://issues.apache.org/jira/browse/INFRA-20171
>> [2]
>> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Notificationsettingsforrepositories
>>
>> On Mon, Apr 27, 2020 at 4:25 PM Kenneth Knowles  wrote:
>>
>>> I suggest filing an issue with INFRA.
>>>
>>> Kenn
>>>
>>> On Fri, Apr 24, 2020 at 10:12 AM Kyle Weaver 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I've noticed links from Jira issues to related Github PRs have not been
>>>> generated the past few days. Does anyone know why?
>>>>
>>>> Kyle
>>>>
>>>




Re: JIRA Committer Permissions

2020-04-27 Thread Udi Meiri
Should this step be added to our new committer guide?

On Fri, Apr 24, 2020 at 6:21 PM Luke Cwik  wrote:

> I noticed that several committers only had contributor level permissions
> and I went and updated your account permissions for the Beam project to be
> committer level. Feel free to let me know if you run into any issues.
>
> There were ~25 accounts like this.
>




Re: Jira PR links not being generated?

2020-04-27 Thread Udi Meiri
I think it's a recent change. The page https://s.apache.org/asfyaml-notify
was updated last week but I didn't see an announcement.

On Mon, Apr 27, 2020 at 3:34 PM Kyle Weaver  wrote:

> I made a PR for this, though I still haven't found sufficient explanation
> as to why we did not need this file last week, and now we do this week.
> https://github.com/apache/beam/pull/11541
>
> On Mon, Apr 27, 2020 at 6:24 PM Udi Meiri  wrote:
>
>> We had such a file for a short while but it was removed:
>> https://github.com/apache/beam/pull/10645
>> I don't believe it contained any PR link settings though
>> +Pablo Estrada 
>>
>> On Mon, Apr 27, 2020 at 1:56 PM Kyle Weaver  wrote:
>>
>>> I went ahead and filed https://issues.apache.org/jira/browse/BEAM-9833 since
>>> it looks like this is how things will be done from now on. Which raises the
>>> question, does anyone know how Beam managed these settings before? Or were
>>> there previously no project-level controls?
>>>
>>> On Mon, Apr 27, 2020 at 4:39 PM Kyle Weaver  wrote:
>>>
>>>> Thanks for the pointer Kenn. I searched existing INFRA issues and found
>>>> [1] (among others). Looks like we may need to add a .asf.yaml file [2]. I
>>>> guess infra must have changed this recently without us picking up on it?
>>>>
>>>> [1] https://issues.apache.org/jira/browse/INFRA-20171
>>>> [2]
>>>> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Notificationsettingsforrepositories
>>>>
>>>> On Mon, Apr 27, 2020 at 4:25 PM Kenneth Knowles 
>>>> wrote:
>>>>
>>>>> I suggest filing an issue with INFRA.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Fri, Apr 24, 2020 at 10:12 AM Kyle Weaver 
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I've noticed links from Jira issues to related Github PRs have not
>>>>>> been generated the past few days. Does anyone know why?
>>>>>>
>>>>>> Kyle
>>>>>>
>>>>>




Re: Jenkins jobs not running for my PR 10438

2020-04-28 Thread Udi Meiri
Alexey, what you're doing should be working (commits should trigger tests,
as should "retest this please" and other phrases).

https://issues.apache.org/jira/browse/INFRA-19836 tracks this issue

On Tue, Apr 28, 2020 at 10:04 AM Alexey Romanenko 
wrote:

> Does anyone know the “golden rule” for triggering Jenkins tests?
>
> For example:
> https://github.com/apache/beam/pull/11341
> I tried several times and it’s still not triggered.
>
> On 28 Apr 2020, at 13:33, Ismaël Mejía  wrote:
>
> done
>
> On Tue, Apr 28, 2020 at 12:47 PM Shoaib Zafar <
> shoaib.za...@venturedive.com> wrote:
>
>> Hello Beam Committers,
>>
>> I would appreciate if you could trigger precommit checks for the PR:
>> https://github.com/apache/beam/pull/11210 along with the python
>> post-commit check (Run Python 3.5 PostCommit).
>>
>> Thanks and Regards.
>>
>> *Shoaib Zafar*
>> Software Engineering Lead
>> Mobile: +92 333 274 6242
>> Skype: live:shoaibzafar_1
>>
>> 
>>
>>
>> On Wed, Apr 22, 2020 at 9:40 PM Rehman Murad Ali <
>> rehman.murad...@venturedive.com> wrote:
>>
>>> Hello Beam Committers.
>>>
>>> Would you please trigger basic tests as well as all *validatesRunner*
>>> test on this PR:
>>> https://github.com/apache/beam/pull/11154 
>>> 
>>>
>>>
>>> *Thanks & Regards*
>>>
>>>
>>>
>>> *Rehman Murad Ali*
>>> Software Engineer
>>> Mobile: +92 3452076766
>>> Skype: rehman.muradali
>>>
>>>
>>> On Wed, Apr 22, 2020 at 9:25 PM Yoshiki Obata 
>>> wrote:
>>>
>>>> Hello Beam Committers,
>>>>
>>>> I would appreciate if you could trigger precommit checks for these PRs:
>>>> https://github.com/apache/beam/pull/11493
>>>> https://github.com/apache/beam/pull/11494
>>>>
>>>> Regards
>>>> yoshiki
>>>>
>>>> On Tue, Apr 21, 2020 at 1:11 AM Luke Cwik wrote:

> The precommits started and I provided the comments for the postcommits
> as you have requested but they have yet to start.
>
> On Mon, Apr 20, 2020 at 8:31 AM Shoaib Zafar <
> shoaib.za...@venturedive.com> wrote:
>
>> Hello Beam Committers.
>>
>> Would you please trigger the pre-commit checks on the PR:
>> https://github.com/apache/beam/pull/11210 along with the python
>> post-commit checks (Run Python PostCommit, Run Python 3.5 PostCommit)?
>>
>> Thanks! Regards,
>>
>> *Shoaib Zafar*
>> Software Engineering Lead
>> Mobile: +92 333 274 6242
>> Skype: live:shoaibzafar_1
>>
>> 
>>
>>
>> On Fri, Apr 17, 2020 at 1:19 PM Ismaël Mejía 
>> wrote:
>>
>>> done
>>>
>>> On Thu, Apr 16, 2020 at 4:32 PM Rehman Murad Ali <
>>> rehman.murad...@venturedive.com> wrote:
>>>
>>>> Hello Beam Committers.
>>>>
>>>> Would you please trigger basic tests as well as validatesRunner
>>>> test on this PR:
>>>>
>>>> https://github.com/apache/beam/pull/11350
>>>>
>>>>
>>>> *Thanks & Regards*
>>>>
>>>>
>>>> *Rehman Murad Ali*
>>>> Software Engineer
>>>> Mobile: +92 3452076766
>>>> Skype: rehman.muradali
>>>>
>>>>
>>>> On Mon, Apr 13, 2020 at 10:16 PM Ahmet Altay
>>>> wrote:

> Done.
>
> On Mon, Apr 13, 2020 at 8:52 AM Shoaib Zafar <
> shoaib.za...@venturedive.com> wrote:
>
>> Hello Beam Committers.
>>
>> Would you please trigger the pre-commit checks on the PR:
>> https://github.com/apache/beam/pull/11210 along with the python
>> post-commit checks (Run Python PostCommit, Run Python 3.5 
>> PostCommit)?
>>
>> Thanks!
>>
>> *Shoaib Zafar*
>> Software Engineering Lead
>> Mobile: +92 333 274 6242
>> Skype: live:shoaibzafar_1
>>
>> 
>>
>>
>> On Mon, Apr 13, 2020 at 4:00 PM Ismaël Mejía 
>> wrote:
>>
>>> done
>>>
>>> On Mon, Apr 13, 2020 at 12:42 PM Rehman Murad Ali
>>>  wrote:
>>> >
>>> > Hi Beam Committers!
>>> >
>>> > Thanks( Ismael )
>>> >
>>> > I appreciate if someone could trigger these tests on this PR
>>> https://github.com/apache/beam/pull/11154
>>> >
>>> > run dataflow validatesrunner
>>> > run flink validatesrunner
>>> > Run Java Flink PortableValidatesRunner Streaming
>>> >
>>> > Thanks
>>> >
>>> >
>>> >
>>> > Rehman Murad Ali
>>> > Software Engineer
>>> > Mobile: +92 3452076766
>>> > Skype: rehman.muradali
>>> >
>>> >
>>> >
>>> > On Wed, Apr 1, 2020 at 1:19 PM Ismaël Mejía 
>>> wrote:
>>> >>
>>> >> done
>>> >>
>>> >> On Wed, Ap

Re: How to submit PRs for dependant changes?

2020-04-28 Thread Udi Meiri
(a) or (c) should work. (c) is preferred if you want faster reviews.

For multiple JIRAs, I've seen both [BEAM-123,BEAM-456] and
[BEAM-123][BEAM-456] formats. One of them works but I'm not sure which. :D
You can always manually add a PR to a JIRA.
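
As an illustration only: the sketch below is not Beam or ASF tooling, and
the name extract_jira_ids is made up. It shows how a script could pull every
BEAM-NNN key out of a PR title written in either of the two formats above.
It says nothing about which format the Jira link bot actually accepts.

```python
import re

# Matches Jira keys such as BEAM-123 anywhere in a string.
# The BEAM prefix is hard-coded here purely for illustration.
JIRA_KEY = re.compile(r"\bBEAM-\d+\b")


def extract_jira_ids(title):
    """Return the Jira keys found in a PR title, in order, without duplicates."""
    seen = []
    for key in JIRA_KEY.findall(title):
        if key not in seen:
            seen.append(key)
    return seen


if __name__ == "__main__":
    print(extract_jira_ids("[BEAM-123,BEAM-456] Fix SpannerIO batching"))
    print(extract_jira_ids("[BEAM-123][BEAM-456] Fix SpannerIO batching"))
```

Both titles yield the same list of keys, so a check like this could confirm
that a title mentions every ticket before falling back to linking the PR
manually.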



On Sun, Apr 26, 2020 at 2:49 PM Reuven Lax  wrote:

> For c), I don't think you need merge resolutions. You can submit each
> commit in a separate PR, and rebase your branch after each one.
>
> On Sun, Apr 26, 2020 at 10:25 AM Niel Markwick  wrote:
>
>>
>> Hey Beam devs...
>>
>> I have 4 changes to submit as PRs to fix 4 independent issues in the
>> io.gcp.SpannerIO class.
>>
>> The PRs are notionally independent, but will cause merge conflicts if
>> submitted separately, as the fix for each issue will change code related to
>> the fix for some of the others.
>>
>> How do you prefer the PRs to be submitted?
>>
>> a) one single PR with 4 sequential commits within it
>> b) one single PR with all changes squashed.
>> c) 4 separate conflicting PRs which will have to be merged separately,
>> and a merge conflict resolution after each one.
>>
>> a) is how it is in my repo.
>> b) would be easy, but less clear what the changes were for.
>> c) I guess would be clearest in the Beam changelog.
>>
>> If the answer is a) or b), how would I specify multiple JIRA tickets in
>> the PR title?
>>
>> Thanks!
>>
>> --
>> 
>> * •  **Niel Markwick*
>> * •  *Cloud Solutions Architect 
>> * •  *Google Belgium
>> * •  *ni...@google.com
>> * •  *+32 2 894 6771
>>
>>
>> Google Belgium NV/SA, Steenweg op Etterbeek 180, 1040 Brussel, Belgie. RPR: 
>> 0878.065.378
>>
>> If you have received this communication by mistake, please don't forward
>> it to anyone else (it may contain confidential or privileged information),
>> please erase all copies of it, including all attachments, and please let
>> the sender know it went to the wrong person. Thanks
>>
>




Jenkins test triggers

2020-04-28 Thread Udi Meiri
Hi,
Has anyone noticed any changes today (since ~6h ago) to how tests are
triggered on PRs?

Are they triggering always, some of the time, not at all?
Are phrase comment triggers working?

Thanks.
(background: https://issues.apache.org/jira/browse/INFRA-19836)




Re: Python 3.7 docker container fails to build

2020-04-30 Thread Udi Meiri
I checked node 8 and it had over 40GB space available. Does your job
require more than that?

Long term, I'm thinking we could clean up workspaces for successful jobs.
This should free up additional space (I guess at least 100GB).
https://plugins.jenkins.io/ws-cleanup/ - we already use this plugin to
clean workspaces at job start.
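
For anyone who wants to check a node before starting a build, here is a
minimal sketch of a free-space check. It is only an assumption about how
such a check could look: has_free_space and the 40 GiB figure in the usage
line are illustrative, and this is unrelated to the ws-cleanup plugin.

```python
import shutil


def has_free_space(path, required_gib):
    """Return True if the filesystem holding `path` has at least `required_gib` GiB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gib * 1024 ** 3


if __name__ == "__main__":
    # Hypothetical threshold; pick whatever the job actually needs.
    if not has_free_space(".", 40):
        print("Less than 40 GiB free; the Docker image build may fail.")
```

Failing fast on a check like this would give a clearer message than pip's
"[Errno 28] No space left on device" halfway through an image build.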


On Thu, Apr 30, 2020, 07:33 Maximilian Michels  wrote:

> *It's working again, probably because it's running on a different
> machine now.
>
> Who can check the disk space of the Jenkins hosts?
>
> Thanks,
> Max
>
> On 30.04.20 11:55, Maximilian Michels wrote:
> > Sorry, I meant to include the Jenkins log:
> >
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console
> >
> > Thanks for investigating Hannah! Indeed, I can see the no space left on
> > device in the following but not in the log above:
> >
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/473/console
> >
> > I'm going to try running the build again. Do you think we could add more
> > storage to our Jenkins hosts or delete old build data?
> >
> > Thanks,
> > Max
> >
> > On 30.04.20 08:43, Hannah Jiang wrote:
> >> Max, I found a link from your PR and noticed below errors. This would be
> >> the true error.
> >>
> >> *07:57:03* > *Task :sdks:python:container:py37:docker*
> >> *07:57:03* ERROR: Could not install packages due to an
> >> EnvironmentError: [Errno 28] No space left on device
> >> *07:57:03*
> >> *07:57:03* > *Task :sdks:python:container:py35:docker*
> >> *07:57:03* ERROR: Could not install packages due to an
> >> EnvironmentError: [Errno 28] No space left on device
> >>
> >>
> >>
> >> On Wed, Apr 29, 2020 at 5:59 PM Hannah Jiang wrote:
> >>
> >> There is a PythonDocker Precommit test running for PRs with Python
> >> changes. It seems running well.[1]
> >> Max, can you please give me a link so I can check more details? Do
> >> other images with different Python versions fail as well?
> >>
> >> 1.
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/
> >>
> >>
> >> On Wed, Apr 29, 2020 at 2:44 PM Ahmet Altay wrote:
> >>
> >> +Valentyn Tymofieiev +Hannah Jiang -- in case they have relevant
> >> information.
> >>
> >> On Wed, Apr 29, 2020 at 12:35 PM Maximilian Michels wrote:
> >>
> >> Hi,
> >>
> >> has anyone noticed the Python 3.7 Docker container fails to
> >> build? I
> >> haven't been able to build the Python 3.7 container, neither
> >> locally nor
> >> on Jenkins.
> >>
> >> I get:
> >>
> >> 17:48:10 > Task :sdks:python:container:py37:docker
> >> 17:49:36 The command '/bin/sh -c pip install -r
> >> /tmp/base_image_requirements.txt && python -c "from
> >> google.protobuf.internal import api_implementation; assert
> >> api_implementation._default_implementation_type == 'cpp';
> print
> >> ('Verified fast protobuf used.')" && rm -rf
> >> /root/.cache/pip' returned a
> >> non-zero code: 1
> >> 17:49:36
> >> 17:49:36 > Task :sdks:python:container:py37:docker FAILED
> >>
> >>
> >> Cheers,
> >> Max
> >>
>




Re: Python 3.7 docker container fails to build

2020-04-30 Thread Udi Meiri
I summarized my idea here: https://issues.apache.org/jira/browse/BEAM-9865


On Thu, Apr 30, 2020 at 2:01 PM Maximilian Michels  wrote:

> On 30.04.20 21:48, Hannah Jiang wrote:
> > --info tag was passed to docker image build commands with PythonDocker
> > Precommit to capture more logs. Without the tag, errors from
> > DockerFile step are not printed out to the console.
>
> Thanks for the info (pun intended).
>
> On 30.04.20 21:48, Hannah Jiang wrote:
> > Indeed, I can see the no space left on device in the following but
> > not in the log above:
> >
> > --info tag was passed to docker image build commands with PythonDocker
> > Precommit to capture more logs. Without the tag, errors from DockerFile
> > step are not printed out to the console.
> >
> > On Thu, Apr 30, 2020 at 11:19 AM Udi Meiri wrote:
> >
> > I checked node 8 and it had over 40GB space available. Does your job
> > require more than that?
> >
> > Long term, I'm thinking we could clean up workspaces for successful
> > jobs. This should free up additional space (I guess at least 100GB).
> > https://plugins.jenkins.io/ws-cleanup/ - we already use this plugin
> > to clean workspaces at job start.
> >
> >
> > On Thu, Apr 30, 2020, 07:33 Maximilian Michels wrote:
> >
> > *It's working again, probably because it's running on a different
> > machine now.
> >
> > Who can check the disk space of the Jenkins hosts?
> >
> > Thanks,
> > Max
> >
> > On 30.04.20 11:55, Maximilian Michels wrote:
> > > Sorry, I meant to include the Jenkins log:
> > >
> >
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console
> > >
> > > Thanks for investigating Hannah! Indeed, I can see the no
> > space left on
> > > device in the following but not in the log above:
> > >
> >
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/473/console
> > >
> > > I'm going to try running the build again. Do you think we
> > could add more
> > > storage to our Jenkins hosts or delete old build data?
> > >
> > > Thanks,
> > > Max
> > >
> > > On 30.04.20 08:43, Hannah Jiang wrote:
> > >> Max, I found a link from your PR and noticed below errors.
> > This would be
> > >> the true error.
> > >>
> > >> *07:57:03* > *Task :sdks:python:container:py37:docker*
> > >> *07:57:03* ERROR: Could not install packages due to an
> > >> EnvironmentError: [Errno 28] No space left on device
> > >> *07:57:03*
> > >> *07:57:03* > *Task :sdks:python:container:py35:docker*
> > >> *07:57:03* ERROR: Could not install packages due to an
> > >> EnvironmentError: [Errno 28] No space left on device
> > >>
> > >>
> > >>
> > >> On Wed, Apr 29, 2020 at 5:59 PM Hannah Jiang wrote:
> > >>
> > >> There is a PythonDocker Precommit test running for PRs
> > with Python
> > >> changes. It seems running well.[1]
> > >> Max, can you please give me a link so I can check more
> > details? Do
> > >> other images with different Python versions fail as well?
> > >>
> > >>
> >  1.
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/
> > >>
> > >>
> > >> On Wed, Apr 29, 2020 at 2:44 PM Ahmet Altay wrote:
> > >>
> > >> +Valentyn Tymofieiev <mailto:valen...@google.com
> > <mailto:valen...@google.com>> +Hannah Jiang
> > >> <mailto:hannahji...@google.com
> > <mailto:hannahji...@g

Re: Beam 2.21 release update

2020-05-07 Thread Udi Meiri
Probably not the issue, but double checking: are you running "pip install
-r sdks/python/build-requirements.txt" first?

On Wed, May 6, 2020 at 7:22 PM Thomas Weise  wrote:

> I'm working on rebasing our fork to 2.21.0 and ran into a problem
> installing grpcio-tools that leads to *ModuleNotFoundError: No module
> named 'grpc_tools'* (see details below).
>
> I cannot reproduce this locally.
>
> Any suggestions on what to look for?
>
> Thanks,
> Thomas
>
> Building wheels for collected packages: future, gr
>   Running setup.py bdist_wheel for future ...
>   Stored in directory:
> /root/.cache/pip/wheels/bf/c9/a3/c538d90ef17cf7823fa51fc701a7a7a910a80f6a405bf1
>   Running setup.py bdist_wheel for grpcio ...
>   Stored in directory:
> /root/.cache/pip/wheels/00/4d/5f/07d0d4283911d2b917b867a11b1622d9d2cc8c286eefd1
> Successfully built future grpcio
> Installing collected packages: six, grpcio, setuptools, protobuf,
> grpcio-tools, future, mypy-protobuf
> Successfully installed future-0.16.0 grpcio-1.28.1 grpcio-tools-1.14.2
> mypy-protobuf-1.18 protobuf-3.11.3 setuptools-46.1.3 six-1.14.0
> WARNING:root:Installing grpcio-tools took 305.39 seconds.
> INFO:gen_protos:Regenerating Python proto definitions (no output files).
> Process Process-1:
> Traceback (most recent call last):
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 292, in generate_proto_files
> from grpc_tools import protoc
> ModuleNotFoundError: No module named 'grpc_tools'
>
> On Fri, Apr 10, 2020 at 10:01 AM Kyle Weaver  wrote:
>
>> Hi everyone,
>>
>> Just a heads up that the Beam 2.21 release branch [1] is cut.
>> - If you find any important issues that you think should be addressed in
>> the release, please tag the jira with fix version 2.21.0 and cc me
>> (username `ibzib`).
>> - Make sure to update the change log [2] with any significant changes if
>> you haven't already. Send a PR with the change and tag me. (I imagine I'm
>> not the only one who forgot to do this :).)
>>
>> Thanks,
>> Kyle
>>
>> [1] https://github.com/apache/beam/blob/release-2.21.0
>> [2] https://github.com/apache/beam/blob/master/CHANGES.md
>>
>




Re: Beam 2.21 release update

2020-05-07 Thread Udi Meiri
It's hard to say without more details what's going on. Ahmet you're right
that it installs build-requirements.txt and retries calling
generate_proto_files().

Thomas, were there additional stacktraces? (after a "During handling of the
above exception, another exception occurred:" message?)
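
To make the "install, then retry" behaviour being discussed concrete, here
is a generic sketch of the pattern. It is not the code in gen_protos.py:
import_with_fallback and its install callback are hypothetical names, and
the real script may do the install in a separate process.

```python
import importlib


def import_with_fallback(module_name, install):
    """Import `module_name`; if it is missing, call `install()` once and retry.

    `install` is any zero-argument callable, for example a function that runs
    pip in a subprocess. A second ImportError is allowed to propagate.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        install()
        # Make sure freshly installed packages are visible to the import system.
        importlib.invalidate_caches()
        return importlib.import_module(module_name)


if __name__ == "__main__":
    # json is always available, so the install callback is never invoked here.
    print(import_with_fallback("json", lambda: None).__name__)
```

If the retry still raises, the install step did not put the module where
this interpreter looks for it, which is the situation a second stacktrace
would reveal.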


On Thu, May 7, 2020 at 11:59 AM Ahmet Altay  wrote:

>
>
> On Thu, May 7, 2020 at 11:56 AM Thomas Weise  wrote:
>
>> Thanks Udi! This is the issue. I'm trying to upgrade from 2.18 where
>> build-requirements.txt didn't exist.
>>
>> Is there a reason why this cannot happen automatically when
>> running python3.6 setup.py sdist bdist_wheel ?
>>
>
> I _believe_ this should happen automatically here:
> https://github.com/apache/beam/blob/master/sdks/python/gen_protos.py#L365.
> Maybe there is a problem there?
>
>
>>
>> Thomas
>>
>>
>> On Thu, May 7, 2020 at 11:07 AM Udi Meiri  wrote:
>>
>>> Probably not the issue, but double checking: are you running "pip
>>> install -r sdks/python/build-requirements.txt" first?
>>>
>>> On Wed, May 6, 2020 at 7:22 PM Thomas Weise  wrote:
>>>
>>>> I'm working on rebasing our fork to 2.21.0 and ran into a problem
>>>> installing grpcio-tools that leads to *ModuleNotFoundError: No module
>>>> named 'grpc_tools'* (see details below).
>>>>
>>>> I cannot reproduce this locally.
>>>>
>>>> Any suggestions on what to look for?
>>>>
>>>> Thanks,
>>>> Thomas
>>>>
>>>> Building wheels for collected packages: future, gr
>>>>   Running setup.py bdist_wheel for future ...
>>>>   Stored in directory:
>>>> /root/.cache/pip/wheels/bf/c9/a3/c538d90ef17cf7823fa51fc701a7a7a910a80f6a405bf1
>>>>   Running setup.py bdist_wheel for grpcio ...
>>>>   Stored in directory:
>>>> /root/.cache/pip/wheels/00/4d/5f/07d0d4283911d2b917b867a11b1622d9d2cc8c286eefd1
>>>> Successfully built future grpcio
>>>> Installing collected packages: six, grpcio, setuptools, protobuf,
>>>> grpcio-tools, future, mypy-protobuf
>>>> Successfully installed future-0.16.0 grpcio-1.28.1 grpcio-tools-1.14.2
>>>> mypy-protobuf-1.18 protobuf-3.11.3 setuptools-46.1.3 six-1.14.0
>>>> WARNING:root:Installing grpcio-tools took 305.39 seconds.
>>>> INFO:gen_protos:Regenerating Python proto definitions (no output files).
>>>> Process Process-1:
>>>> Traceback (most recent call last):
>>>>   File
>>>> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
>>>> 292, in generate_proto_files
>>>> from grpc_tools import protoc
>>>> ModuleNotFoundError: No module named 'grpc_tools'
>>>>
>>>> On Fri, Apr 10, 2020 at 10:01 AM Kyle Weaver 
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> Just a heads up that the Beam 2.21 release branch [1] is cut.
>>>>> - If you find any important issues that you think should be addressed
>>>>> in the release, please tag the jira with fix version 2.21.0 and cc me
>>>>> (username `ibzib`).
>>>>> - Make sure to update the change log [2] with any significant changes
>>>>> if you haven't already. Send a PR with the change and tag me. (I imagine
>>>>> I'm not the only one who forgot to do this :).)
>>>>>
>>>>> Thanks,
>>>>> Kyle
>>>>>
>>>>> [1] https://github.com/apache/beam/blob/release-2.21.0
>>>>> [2] https://github.com/apache/beam/blob/master/CHANGES.md
>>>>>
>>>>




Re: Beam 2.21 release update

2020-05-08 Thread Udi Meiri
+Chad Dombrova, who added _find_protoc_gen_mypy.

I'm guessing that the code
in _install_grpcio_tools_and_generate_proto_files creates a kind of
virtualenv, but it only works well for staging Python modules and not
binaries like protoc-gen-mypy.
(I assume there's a reason why it doesn't invoke virtualenv, probably since
the list of things setup.py can expect to be installed is very minimal
(setuptools).)

One solution would be to make these setup.py dependencies explicit in
pyproject.toml, such that pip installs them before running setup.py:
https://pip.pypa.io/en/stable/reference/pip/#pep-517-and-518-support
It would help when using tools like pip ("pip wheel"), but I'm not sure
what the alternative for "python setup.py sdist" is.
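
To illustrate the module-versus-binary distinction above: a Python module
only needs to be importable, but a plugin like protoc-gen-mypy has to be
found as an executable in some directory. The sketch below is a guess at
what such a lookup could look like, not the actual _find_protoc_gen_mypy;
find_executable and extra_dirs are made-up names.

```python
import os
import shutil


def find_executable(name, extra_dirs=()):
    """Search `extra_dirs` first, then PATH, for an executable called `name`."""
    search_path = os.pathsep.join(list(extra_dirs) + [os.environ.get("PATH", "")])
    found = shutil.which(name, path=search_path)
    if found is None:
        raise RuntimeError("Could not find %s in %s" % (name, search_path))
    return found


if __name__ == "__main__":
    try:
        print(find_executable("protoc-gen-mypy"))
    except RuntimeError as err:
        print(err)
```

If the temporary install location's script directory is not passed in (or
is not on PATH), the lookup fails even though the package itself installed
fine.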


On Thu, May 7, 2020 at 10:40 PM Thomas Weise  wrote:

> No additional stacktraces. Full error output below.
>
> It's not clear what is going wrong.
>
> There isn't any exception from the subprocess execution since the
> "WARNING:root:Installing grpcio-tools took 305.39 seconds." is printed.
>
> Also, the time it takes to perform the install is equivalent to
> successfully running the pip command.
>
> I will report back if I find anything else. Currently doing the
> explicit install via pip install -r sdks/python/build-requirements.txt
>
> Thanks,
> Thomas
>
> WARNING:root:Installing grpcio-tools took 269.27 seconds.
> INFO:gen_protos:Regenerating Python proto definitions (no output files).
> Process Process-1:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in
> _bootstrap
> self.run()
>   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
> self._target(*self._args, **self._kwargs)
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 378, in _install_grpcio_tools_and_generate_proto_files
> generate_proto_files(force=force)
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 315, in generate_proto_files
> protoc_gen_mypy = _find_protoc_gen_mypy()
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 233, in _find_protoc_gen_mypy
> (fname, ', '.join(search_paths)))
> RuntimeError: Could not find protoc-gen-mypy in /code/venvs/venv2/bin,
> /code/venvs/venv2/bin, /code/venvs/venv3/bin, /usr/local/sbin,
> /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin
> Traceback (most recent call last):
>   File "setup.py", line 311, in <module>
> 'mypy': generate_protos_first(mypy),
>   File
> "/code/venvs/venv2/local/lib/python2.7/site-packages/setuptools/__init__.py",
> line 129, in setup
> return distutils.core.setup(**attrs)
>   File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
> dist.run_commands()
>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
> self.run_command(cmd)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File
> "/code/venvs/venv2/local/lib/python2.7/site-packages/wheel/bdist_wheel.py",
> line 204, in run
> self.run_command('build')
>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> self.distribution.run_command(command)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
> self.run_command(cmd_name)
>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> self.distribution.run_command(command)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "setup.py", line 235, in run
> gen_protos.generate_proto_files()
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 310, in generate_proto_files
> raise ValueError("Proto generation failed (see log for details).")
> ValueError: Proto generation failed (see log for details).
>
>
> On Thu, May 7, 2020 at 2:25 PM Udi Meiri  wrote:
>
>> It's hard to say without more details what's going on. Ahmet you're right
>> that it installs build-requirements.txt and retries calling
>> generate_proto_files().
>>
>> Thomas, were there additional stacktraces? (after a "During handling of
>> the above exception, another exception occurred:" message?)
>>
>>
>> On Thu, May 7, 2020 at 11:59 AM Ahme

Re: [ANNOUNCE] New committer: Robin Qiu

2020-05-19 Thread Udi Meiri
Congratulations Robin!

On Tue, May 19, 2020, 10:15 Valentyn Tymofieiev  wrote:

> Congratulations, Robin!
>
> On Tue, May 19, 2020 at 9:10 AM Yichi Zhang  wrote:
>
>> Congrats Robin!
>>
>> On Tue, May 19, 2020 at 8:56 AM Kamil Wasilewski <
>> kamil.wasilew...@polidea.com> wrote:
>>
>>> Congrats!
>>>
>>> On Tue, May 19, 2020 at 5:33 PM Jan Lukavský  wrote:
>>>
 Congrats Robin!
 On 5/19/20 5:01 PM, Tyson Hamilton wrote:

 Congratulations!

 On Tue, May 19, 2020 at 6:10 AM Omar Ismail 
 wrote:

> Congrats!
>
> On Tue, May 19, 2020 at 5:00 AM Gleb Kanterov 
> wrote:
>
>> Congratulations!
>>
>> On Tue, May 19, 2020 at 7:31 AM Aizhamal Nurmamat kyzy <
>> aizha...@apache.org> wrote:
>>
>>> Congratulations, Robin! Thank you for your contributions!
>>>
>>> On Mon, May 18, 2020, 7:18 PM Boyuan Zhang 
>>> wrote:
>>>
>>>> Congrats~~
>>>>
>>>> On Mon, May 18, 2020 at 7:17 PM Reza Rokni wrote:

> Congratulations!
>
> On Tue, May 19, 2020 at 10:06 AM Ahmet Altay 
> wrote:
>
>> Hi everyone,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new
>> committer: Robin Qiu.
>>
>> Robin has been active in the community for close to 2 years,
>> worked on HyperLogLog++ [1], SQL [2], improved documentation, and
>> helped with releases(*).
>>
>> In consideration of his contributions, the Beam PMC trusts him
>> with the responsibilities of a Beam committer [3].
>>
>> Thank you for your contributions Robin!
>>
>> -Ahmet, on behalf of the Apache Beam PMC
>>
>> [1]
>> https://www.meetup.com/Zurich-Apache-Beam-Meetup/events/265529665/
>> [2]
>> https://www.meetup.com/Belgium-Apache-Beam-Meetup/events/264933301/
>> [3] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>> (*) And maybe he will be a release manager soon :)
>>
>> --
>
> Omar Ismail |  Technical Solutions Engineer |  omarism...@google.com |
>
>





Re: [ANNOUNCE] Beam 2.21.0 Released

2020-05-28 Thread Udi Meiri
Woohoo!

On Thu, May 28, 2020 at 4:16 AM Kyle Weaver  wrote:

> The Apache Beam team is pleased to announce the release of version 2.21.0.
>
> Apache Beam is an open source unified programming model to define and
> execute data processing pipelines, including ETL, batch and stream
> (continuous) processing. See https://beam.apache.org
>
> You can download the release here:
>
> https://beam.apache.org/get-started/downloads/
>
> This release includes bug fixes, features, and improvements detailed on
> the Beam blog: https://beam.apache.org/blog/beam-2.21.0/
>
> Thanks to everyone who contributed to this release, and we hope you enjoy
> using Beam 2.21.0.
> -- Kyle Weaver, on behalf of The Apache Beam team
>
>




latest release on Github

2020-06-01 Thread Udi Meiri
Hi,
Trying out the new Github layout (feature preview) I've noticed that the
latest release is featured more prominently and is at v2.16.0 for some
reason. https://github.com/apache/beam/releases/tag/v2.16.0

Any idea how to fix this, and what instructions should be added to the
release guide?




Re: latest release on Github

2020-06-01 Thread Udi Meiri
Thanks Nathan and Kyle!

On Mon, Jun 1, 2020 at 12:22 PM Kyle Weaver  wrote:

> Filed https://jira.apache.org/jira/browse/BEAM-10168
>
> On Mon, Jun 1, 2020 at 3:20 PM Kyle Weaver  wrote:
>
>> I fixed it. There was a discussion on the mailing list recently about
>> automating Github releases [1], short of that however we can just use the
>> UI.
>>
>> [1]
>> https://lists.apache.org/thread.html/re367d262333078501c47856e9b8a0fc3fd7db60c2d2ebb181275481a%40%3Cdev.beam.apache.org%3E
>>
>>
>> On Mon, Jun 1, 2020 at 3:07 PM Udi Meiri  wrote:
>>
>>> Hi,
>>> Trying out the new Github layout (feature preview) I've noticed that the
>>> latest release is featured more prominently and is at v2.16.0 for some
>>> reason. https://github.com/apache/beam/releases/tag/v2.16.0
>>>
>>> Any idea how to fix this, and what instructions should be added to the
>>> release guide?
>>>
>>




Re: [RESULT][VOTE] Accept the Firefly design donation as Beam Mascot - Deadline Mon April 6

2020-06-04 Thread Udi Meiri
That's great!

Can someone please make custom emojis for our Slack from the expression
sheet images? :) (
https://github.com/apache/beam/blob/master/website/www/site/static/images/mascot/model_sheet.png
)
A smiling firefly and one with heart eyes would be awesome.

On Wed, Jun 3, 2020 at 11:04 AM Aizhamal Nurmamat kyzy 
wrote:

> Thank you Ismael for reviewing and merging the pull request.
>
> The mascot files can now be found on the website at
> https://beam.apache.org/community/mascot/ and in the repo at
> https://github.com/apache/beam/tree/master/website/www/site/static/images/mascot
>
> This project is complete now, therefore closing this thread.
>
> Thanks everyone!
>
> On Thu, May 21, 2020 at 7:07 PM Aizhamal Nurmamat kyzy <
> aizha...@apache.org> wrote:
>
>> Hello everyone,
>> Julian and I have opened this PR [1] to add a page with the model
>> sheet, and links to the mascot image files. Can someone please review?
>> Thanks!
>> Aizhamal
>>
>> [1] https://github.com/apache/beam/pull/11780
>>
>> On Mon, May 11, 2020 at 10:14 AM Aizhamal Nurmamat kyzy <
>> aizha...@apache.org> wrote:
>>
>>> @Ismael, this is in my work items for as soon as we complete the
>>> migration of the website.
>>> @Kyle, thanks for filing Jira, I assigned it to myself.
>>>
>>> On Mon, May 11, 2020 at 10:03 AM Kyle Weaver 
>>> wrote:
>>>
 > Now that the vote has passed maybe we should add the images somewhere
 > in the website so people can easily find the Firefly to use it

 +1 Maybe something to revisit after the website overhaul is complete. I
 filed https://jira.apache.org/jira/browse/BEAM-9948 if anyone wants to
 take it.

 On Mon, May 11, 2020 at 12:57 PM Ismaël Mejía 
 wrote:

> Now that the vote has passed maybe we should add the images somewhere
> in the website so people can easily find the Firefly to use it.
> Something like what we do with our logos
> https://beam.apache.org/community/logos/
>
> WDYT? any taker?
>
> On Tue, Apr 28, 2020 at 7:43 PM Pablo Estrada 
> wrote:
> >
> > I'll be happy to as well!
> >
> > On Sun, Apr 26, 2020 at 4:18 AM Maximilian Michels 
> wrote:
> >>
> >> Hey Maria,
> >>
> >> I can testify :)
> >>
> >> Cheers,
> >> Max
> >>
> >> On 23.04.20 20:49, María Cruz wrote:
> >> > Hi everyone!
> >> > It is amazing to see how this process developed to collaboratively
> >> > create Apache Beam's mascot. Thank you to everyone who got
> involved!
> >> > I would like to write a blogpost for the Beam website, and I
> wanted to
> >> > ask you: would anyone like to offer their testimony about the
> process of
> >> > creating the Beam mascot, and what this means to you? Everyone's
> >> > testimony is welcome! If you witnessed the development of a
> mascot for
> >> > another open source project, even better =)
> >> >
> >> > Please feel free to express interest on this thread, and I'll
> reach out
> >> > to you off-list.
> >> >
> >> > Thanks,
> >> >
> >> > María
> >> >
> >> > On Fri, Apr 17, 2020 at 6:19 AM Jeff Klukas wrote:
> >> >
> >> > I personally like the sound of "Datum" as a name. I also like
> the
> >> > idea of not assigning them a gender.
> >> >
> >> > As a counterpoint on the naming side, one of the slide decks
> >> > provided while iterating on the design mentions:
> >> >
> >> > > Mascot can change colors when it is “full of data” or has a
> “batch
> >> > of data” to process.  Yellow is supercharged and ready to
> process!
> >> >
> >> > Based on that, I'd argue that the mascot maps to the concept
> of a
> >> > bundle in the beam execution model and we should consider a
> name
> >> > that's a play on "bundle" or perhaps a play on "checkpoint".
> >> >
> >> > On Thu, Apr 16, 2020 at 3:44 PM Julian Bruno <
> juliangbr...@gmail.com
> >> > > wrote:
> >> >
> >> > Hi all,
> >> >
> >> > While working on the design of our Mascot
> >> > Some ideas showed up and I wish to share them.
> >> > In regard to Alex Van Boxel's question about the name of
> our Mascot.
> >> >
> >> > I was thinking about this yesterday night and feel it
> could be a
> >> > great idea to name the Mascot "*Data*" or "*Datum*". Both
> names
> >> > sound cute and make sense to me. I prefer the latter.
> Datum means
> >> > a single piece of information. The Mascot is the first
> piece of
> >> > information and its job is to collect batches of data and
> >> > process it. Datum is in charge of linking information
> together.
> >> >
> >> > In addit

Kafka IO performance tests leaving behind disks on GCP

2020-06-04 Thread Udi Meiri
Hi,
I opened a bug on what seems to be leftover GKE disk images:
https://issues.apache.org/jira/browse/BEAM-10145

Can anyone familiar with these take a look?

Thanks!




python precommit error - google-auth dependency?

2020-06-10 Thread Udi Meiri
Hi,
I'm trying to understand these "pip check" failures:

ERROR: google-auth 1.16.1 has requirement rsa<4.1,>=3.1.4, but you'll
have rsa 4.1 which is incompatible


https://builds.apache.org/job/beam_PreCommit_Python_Cron/2860/console

However, when I do
pip install dist/apache-beam-2.23.0.dev0.tar.gz[test,cloud]

locally, the google-auth package is not installed at all.
Any ideas on how to debug where this requirement is coming from?
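
One way to look into this, as a rough sketch rather than a known fix: ask the
environment where "pip check" fails which installed distributions declare a
requirement on google-auth. The script below uses only the standard library
(importlib.metadata, Python 3.8+). The helper names requirement_name and
find_dependents are made up for this illustration, and the name parsing is a
simplified approximation of requirement syntax.

```python
import re
from importlib import metadata


def requirement_name(req: str) -> str:
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
    name = match.group(1) if match else ""
    # Treat runs of '-', '_' and '.' as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", name).lower()


def find_dependents(target: str):
    """List (distribution, requirement) pairs for installed packages that require target."""
    wanted = requirement_name(target)
    hits = []
    for dist in metadata.distributions():
        meta = dist.metadata
        dist_name = (meta.get("Name") if meta is not None else None) or "unknown"
        # dist.requires is None when a distribution declares no requirements.
        for req in dist.requires or []:
            if requirement_name(req) == wanted:
                hits.append((dist_name, req))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical usage: run inside the virtualenv that reports the conflict.
    for name, req in find_dependents("google-auth"):
        print(f"{name} requires {req}")
```

Running it in both the failing environment and the local one and comparing the
output should show whether the two environments resolve to different package
sets. An empty result locally would match google-auth not being installed
there.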



