Re: NOTICE: New Python PreCommit jobs

2019-10-07 Thread Chad Dombrova
There's a lot of value to switching to pytest even without xdist.  Could we
prune back the goals of this first PR to just achieving feature parity with
nose, and make a followup PR for xdist?

-chad

On Mon, Oct 7, 2019 at 12:04 PM Udi Meiri  wrote:

>
>
> On Fri, Oct 4, 2019 at 10:35 AM Chad Dombrova  wrote:
>
>>
>> I have a WiP PR to convert Beam to use pytest, but it's been stalled.
>>>
>>
>> What would it take to get it back on track?
>>
>
> Besides needing to convert ITs (removing save_main_session), which can be
> split out to a later PR, there's verifying that the same set of tests are
> collected for each suite.
>
>
>>
>>
>>> Another nice thing about pytest is that you'll be able to tell which
>>> suite a test belongs to.
>>>
>>
>> pytest has a lot of quality of life improvements over nose.  The biggest
>> and simplest one is that the test name that it prints is in the same format
>> as the runner expects for specifying individual tests to run, so you can
>> just copy and paste on the command line to run that one test.  Genius.
>> Also, since it uses directory names for tests and not module names, you can
>> tab complete.   The whole fixture
>>
> LOL at the copy-paste issue.
>
>
>> concept is also great, since it gives you a new axis for test
>> composability and reuse, instead of just complex sub-classing or
>> copy-and-paste.   After switching to pytest we went through our tests and
>> replaced all of our horrible test mixins with fixtures and the end result
>> is much more legible and maintainable.  There's honestly nothing I miss
>> about nose.
>>
>> -chad
>>
>>
>>
>>
>>


Re: NOTICE: New Python PreCommit jobs

2019-10-07 Thread Udi Meiri
On Fri, Oct 4, 2019 at 10:35 AM Chad Dombrova  wrote:

>
> I have a WiP PR to convert Beam to use pytest, but it's been stalled.
>>
>
> What would it take to get it back on track?
>

Besides needing to convert the integration tests (ITs) by removing
save_main_session, which can be split out to a later PR, the remaining work
is verifying that the same set of tests is collected for each suite.
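
A rough sketch of how that collection check could be scripted, under some assumptions: the pytest ids are the node ids printed by `pytest --collect-only -q`, and the nose ids have already been converted to a dotted `package.module.Class.method` form by some other step. The function names here are hypothetical and are not part of the PR.

```python
def normalize_pytest_id(node_id):
    """Convert a pytest node id such as 'pkg/test_mod.py::TestFoo::test_bar'
    into the dotted form 'pkg.test_mod.TestFoo.test_bar'."""
    path, _, rest = node_id.partition("::")
    module = path[:-3] if path.endswith(".py") else path
    module = module.replace("/", ".")
    if not rest:
        return module
    return ".".join([module] + rest.split("::"))


def diff_collections(nose_ids, pytest_node_ids):
    """Return (missing, extra): tests nose collected that pytest did not,
    and tests pytest collected that nose did not.

    Assumption: nose_ids are already in dotted form.
    """
    nose_set = set(nose_ids)
    pytest_set = {normalize_pytest_id(n) for n in pytest_node_ids}
    missing = sorted(nose_set - pytest_set)
    extra = sorted(pytest_set - nose_set)
    return missing, extra
```

Running this once per suite and checking that both lists are empty would give a simple pass/fail signal for the migration.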


>
>
>> Another nice thing about pytest is that you'll be able to tell which
>> suite a test belongs to.
>>
>
> pytest has a lot of quality of life improvements over nose.  The biggest
> and simplest one is that the test name that it prints is in the same format
> as the runner expects for specifying individual tests to run, so you can
> just copy and paste on the command line to run that one test.  Genius.
> Also, since it uses directory names for tests and not module names, you can
> tab complete.   The whole fixture
>
LOL at the copy-paste issue.


> concept is also great, since it gives you a new axis for test
> composability and reuse, instead of just complex sub-classing or
> copy-and-paste.   After switching to pytest we went through our tests and
> replaced all of our horrible test mixins with fixtures and the end result
> is much more legible and maintainable.  There's honestly nothing I miss
> about nose.
>
> -chad
>
>
>
>
>


Re: NOTICE: New Python PreCommit jobs

2019-10-04 Thread Chad Dombrova
> I have a WiP PR to convert Beam to use pytest, but it's been stalled.
>

What would it take to get it back on track?


> Another nice thing about pytest is that you'll be able to tell which suite
> a test belongs to.
>

pytest has a lot of quality of life improvements over nose.  The biggest
and simplest one is that the test name that it prints is in the same format
as the runner expects for specifying individual tests to run, so you can
just copy and paste on the command line to run that one test.  Genius.
Also, since it uses directory names for tests and not module names, you can
tab complete.   The whole fixture concept is also great, since it gives you
a new axis for test composability and reuse, instead of just complex
sub-classing or copy-and-paste.   After switching to pytest we went through
our tests and replaced all of our horrible test mixins with fixtures and
the end result is much more legible and maintainable.  There's honestly
nothing I miss about nose.
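
To illustrate the fixture point, here is a minimal sketch of fixtures standing in for a test mixin. `FakeClient` and the fixture names are made up for the example and do not come from Beam or from our codebase.

```python
import pytest


class FakeClient:
    """Hypothetical stand-in for a resource that several tests need."""

    def __init__(self):
        self.closed = False
        self.sent = []

    def send(self, item):
        self.sent.append(item)

    def close(self):
        self.closed = True


@pytest.fixture
def client():
    # Setup runs before the test; the code after `yield` is teardown.
    c = FakeClient()
    yield c
    c.close()


@pytest.fixture
def primed_client(client):
    # Fixtures compose by naming other fixtures as parameters,
    # which replaces the inheritance chain a mixin would need.
    client.send("hello")
    return client


def test_send_records_item(client):
    client.send("x")
    assert client.sent == ["x"]


def test_primed_client_has_greeting(primed_client):
    assert primed_client.sent == ["hello"]
```

If this lived in a file such as `tests/test_client.py` (a hypothetical path), a single test could be run by pasting its id from the pytest output: `pytest tests/test_client.py::test_send_records_item`.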

-chad


Re: NOTICE: New Python PreCommit jobs

2019-10-02 Thread Chad Dombrova
Hi all,
I've posted a new PR that just splits out the python lint job here:
https://github.com/apache/beam/pull/9706

I'll be running the seed job shortly unless anyone objects.

-chad


On Tue, Oct 1, 2019 at 9:04 PM Chad Dombrova  wrote:

> I haven’t used nose’s parallel execution plugin, but I have used pytest
> with xdist with success. If your tests are designed to run in any order and
> are properly sandboxed to prevent crosstalk between concurrent runs, which
> they *should* be, then in my experience it works very well.
>
>
> On Fri, Sep 27, 2019 at 6:51 PM Kenneth Knowles  wrote:
>
>> Do things go wrong when nose is configured to use parallel execution?
>>
>> On Fri, Sep 27, 2019 at 5:09 PM Chad Dombrova  wrote:
>>
>>> By the way, the outcome on this was that splitting the python precommit
>>> job into one job per python version resulted in increasing the total test
>>> completion time by 66%, which is obviously not good.  This is because we
>>> are using Gradle to run the python test tasks in parallel (the Jenkins VMs
>>> have 16 cores each, utilized across 2 slots, IIRC), but after the split
>>> there were only 1-2 Gradle tasks per job.  Since the python test runner,
>>> nose, is currently not using parallel execution, there were not enough
>>> concurrent tasks to make proper use of the VM's CPUs.
>>>
>>> tl;dr  I'm going to create a followup PR to split out just the Lint job
>>> (same as we have Spotless for Java).   This is our best ROI for now.
>>>
>>> -chad
>>>
>>>
>>> On Fri, Sep 27, 2019 at 3:27 PM Kyle Weaver  wrote:
>>>
>>>> > Do we have good pypi caching?
>>>>
>>>> Building Python SDK harness containers takes 2 mins each (times 4, the
>>>> number of versions) on my machine, even if nothing has changed. But we're
>>>> already paying that cost, so I don't think splitting the jobs should make
>>>> it any worse. (https://issues.apache.org/jira/browse/BEAM-8277 if
>>>> anyone has any ideas)
>>>>
>>>> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>>>>
>>>>
>>>> On Wed, Sep 25, 2019 at 11:21 AM Pablo Estrada wrote:
>>>>
>>>>> Thanks Chad, and thank you for notifying on the dev list.
>>>>>
>>>>> On Wed, Sep 25, 2019 at 10:59 AM Kenneth Knowles wrote:
>>>>>
>>>>>> Nice.
>>>>>>
>>>>>> Do we have good pypi caching? If not this could add a lot of overhead
>>>>>> to our already-backed-up CI queue. (btw I still think your change is good,
>>>>>> and just makes proper caching more important)
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Tue, Sep 24, 2019 at 9:55 PM Chad Dombrova wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>> I'm working to make the CI experience with python a bit better, and
>>>>>>> my current initiative is splitting up the giant Python PreCommit job
>>>>>>> into 5 separate jobs: Lint, Py2, Py3.5, Py3.6, and Py3.7.
>>>>>>>
>>>>>>> Around 11am Pacific time tomorrow I'm going to initiate the seed
>>>>>>> jobs, at which point all PRs will start to run the new precommit jobs.
>>>>>>> It's a bit of a chicken-and-egg scenario with testing this, so there could
>>>>>>> be issues that pop up after the seed jobs are created, but I'll be working
>>>>>>> to resolve those issues as quickly as possible.
>>>>>>>
>>>>>>> If you run into problems because of this change, please let me know
>>>>>>> on the github PR.
>>>>>>>
>>>>>>> Here's the PR: https://github.com/apache/beam/pull/9642
>>>>>>> Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#
>>>>>>>
>>>>>>> The upshot is that after this is done you'll get better feedback on
>>>>>>> python test failures!
>>>>>>>
>>>>>>> Let me know if you have any concerns.
>>>>>>>
>>>>>>> thanks,
>>>>>>> chad
>>>>>>>
>>>>>>>


Re: NOTICE: New Python PreCommit jobs

2019-10-01 Thread Chad Dombrova
I haven’t used nose’s parallel execution plugin, but I have used pytest
with xdist with success. If your tests are designed to run in any order and
are properly sandboxed to prevent crosstalk between concurrent runs, which
they *should* be, then in my experience it works very well.
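
As a small illustration of what "properly sandboxed" can look like, here is a sketch that uses pytest's built-in `tmp_path` fixture so each test writes to its own temporary directory. `write_report` is a hypothetical function under test, not something from Beam.

```python
from pathlib import Path


def write_report(directory: Path, name: str, body: str) -> Path:
    """Hypothetical code under test: writes a file into the given directory."""
    target = directory / name
    target.write_text(body)
    return target


def test_write_report(tmp_path):
    # tmp_path is a pytest built-in fixture that provides a fresh directory
    # for each test, so concurrent workers never share this file.
    out = write_report(tmp_path, "report.txt", "ok")
    assert out.read_text() == "ok"
```

With the pytest-xdist plugin installed, `pytest -n auto` spreads tests like this across one worker per available CPU. Tests that write to a fixed shared path instead are the ones likely to interfere with each other.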


On Fri, Sep 27, 2019 at 6:51 PM Kenneth Knowles  wrote:

> Do things go wrong when nose is configured to use parallel execution?
>
> On Fri, Sep 27, 2019 at 5:09 PM Chad Dombrova  wrote:
>
>> By the way, the outcome on this was that splitting the python precommit
>> job into one job per python version resulted in increasing the total test
>> completion time by 66%, which is obviously not good.  This is because we
>> are using Gradle to run the python test tasks in parallel (the Jenkins VMs
>> have 16 cores each, utilized across 2 slots, IIRC), but after the split
>> there were only 1-2 Gradle tasks per job.  Since the python test runner,
>> nose, is currently not using parallel execution, there were not enough
>> concurrent tasks to make proper use of the VM's CPUs.
>>
>> tl;dr  I'm going to create a followup PR to split out just the Lint job
>> (same as we have Spotless for Java).   This is our best ROI for now.
>>
>> -chad
>>
>>
>> On Fri, Sep 27, 2019 at 3:27 PM Kyle Weaver  wrote:
>>
>>> > Do we have good pypi caching?
>>>
>>> Building Python SDK harness containers takes 2 mins each (times 4, the
>>> number of versions) on my machine, even if nothing has changed. But we're
>>> already paying that cost, so I don't think splitting the jobs should make
>>> it any worse. (https://issues.apache.org/jira/browse/BEAM-8277 if
>>> anyone has any ideas)
>>>
>>> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>>>
>>>
>>> On Wed, Sep 25, 2019 at 11:21 AM Pablo Estrada 
>>> wrote:
>>>
>>>> Thanks Chad, and thank you for notifying on the dev list.
>>>>
>>>> On Wed, Sep 25, 2019 at 10:59 AM Kenneth Knowles wrote:
>>>>
>>>>> Nice.
>>>>>
>>>>> Do we have good pypi caching? If not this could add a lot of overhead
>>>>> to our already-backed-up CI queue. (btw I still think your change is good,
>>>>> and just makes proper caching more important)
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Tue, Sep 24, 2019 at 9:55 PM Chad Dombrova wrote:
>>>>>
>>>>>> Hi all,
>>>>>> I'm working to make the CI experience with python a bit better, and
>>>>>> my current initiative is splitting up the giant Python PreCommit job
>>>>>> into 5 separate jobs: Lint, Py2, Py3.5, Py3.6, and Py3.7.
>>>>>>
>>>>>> Around 11am Pacific time tomorrow I'm going to initiate the seed
>>>>>> jobs, at which point all PRs will start to run the new precommit jobs.
>>>>>> It's a bit of a chicken-and-egg scenario with testing this, so there could
>>>>>> be issues that pop up after the seed jobs are created, but I'll be working
>>>>>> to resolve those issues as quickly as possible.
>>>>>>
>>>>>> If you run into problems because of this change, please let me know
>>>>>> on the github PR.
>>>>>>
>>>>>> Here's the PR: https://github.com/apache/beam/pull/9642
>>>>>> Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#
>>>>>>
>>>>>> The upshot is that after this is done you'll get better feedback on
>>>>>> python test failures!
>>>>>>
>>>>>> Let me know if you have any concerns.
>>>>>>
>>>>>> thanks,
>>>>>> chad
>>>>>>
>>>>>>


Re: NOTICE: New Python PreCommit jobs

2019-09-27 Thread Kenneth Knowles
Do things go wrong when nose is configured to use parallel execution?

On Fri, Sep 27, 2019 at 5:09 PM Chad Dombrova  wrote:

> By the way, the outcome on this was that splitting the python precommit
> job into one job per python version resulted in increasing the total test
> completion time by 66%, which is obviously not good.  This is because we
> are using Gradle to run the python test tasks in parallel (the Jenkins VMs
> have 16 cores each, utilized across 2 slots, IIRC), but after the split
> there were only 1-2 Gradle tasks per job.  Since the python test runner,
> nose, is currently not using parallel execution, there were not enough
> concurrent tasks to make proper use of the VM's CPUs.
>
> tl;dr  I'm going to create a followup PR to split out just the Lint job
> (same as we have Spotless for Java).   This is our best ROI for now.
>
> -chad
>
>
> On Fri, Sep 27, 2019 at 3:27 PM Kyle Weaver  wrote:
>
>> > Do we have good pypi caching?
>>
>> Building Python SDK harness containers takes 2 mins each (times 4, the
>> number of versions) on my machine, even if nothing has changed. But we're
>> already paying that cost, so I don't think splitting the jobs should make
>> it any worse. (https://issues.apache.org/jira/browse/BEAM-8277 if anyone
>> has any ideas)
>>
>> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>>
>>
>> On Wed, Sep 25, 2019 at 11:21 AM Pablo Estrada 
>> wrote:
>>
>>> Thanks Chad, and thank you for notifying on the dev list.
>>>
>>> On Wed, Sep 25, 2019 at 10:59 AM Kenneth Knowles 
>>> wrote:
>>>
>>>> Nice.
>>>>
>>>> Do we have good pypi caching? If not this could add a lot of overhead
>>>> to our already-backed-up CI queue. (btw I still think your change is good,
>>>> and just makes proper caching more important)
>>>>
>>>> Kenn
>>>>
>>>> On Tue, Sep 24, 2019 at 9:55 PM Chad Dombrova wrote:
>>>>
>>>>> Hi all,
>>>>> I'm working to make the CI experience with python a bit better, and my
>>>>> current initiative is splitting up the giant Python PreCommit job into 5
>>>>> separate jobs: Lint, Py2, Py3.5, Py3.6, and Py3.7.
>>>>>
>>>>> Around 11am Pacific time tomorrow I'm going to initiate the seed jobs,
>>>>> at which point all PRs will start to run the new precommit jobs.  It's a
>>>>> bit of a chicken-and-egg scenario with testing this, so there could be
>>>>> issues that pop up after the seed jobs are created, but I'll be working to
>>>>> resolve those issues as quickly as possible.
>>>>>
>>>>> If you run into problems because of this change, please let me know on
>>>>> the github PR.
>>>>>
>>>>> Here's the PR: https://github.com/apache/beam/pull/9642
>>>>> Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#
>>>>>
>>>>> The upshot is that after this is done you'll get better feedback on
>>>>> python test failures!
>>>>>
>>>>> Let me know if you have any concerns.
>>>>>
>>>>> thanks,
>>>>> chad
>>>>>
>>>>>


Re: NOTICE: New Python PreCommit jobs

2019-09-27 Thread Chad Dombrova
By the way, the outcome on this was that splitting the python precommit job
into one job per python version resulted in increasing the total test
completion time by 66%, which is obviously not good.  This is because we
are using Gradle to run the python test tasks in parallel (the Jenkins VMs
have 16 cores each, utilized across 2 slots, IIRC), but after the split
there were only 1-2 Gradle tasks per job.  Since the python test runner,
nose, is currently not using parallel execution, there were not enough
concurrent tasks to make proper use of the VM's CPUs.
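
The arithmetic behind that can be sketched as follows. This assumes each nose task is a single process that keeps at most one core busy, which is a simplification, and the function names are only for illustration.

```python
def cores_per_slot(total_cores, slots):
    """Cores available to one Jenkins slot when a VM is split evenly."""
    return total_cores // slots


def utilization(concurrent_tasks, cores):
    """Fraction of cores kept busy, assuming one core per task."""
    return min(concurrent_tasks, cores) / cores


cores = cores_per_slot(16, 2)   # 8 cores per slot
low = utilization(1, cores)     # 0.125 with a single Gradle task
high = utilization(2, cores)    # 0.25 with two Gradle tasks
print(cores, low, high)
```

So under that assumption a split job keeps only an eighth to a quarter of its slot busy, which is consistent with the slowdown we saw.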

tl;dr  I'm going to create a followup PR to split out just the Lint job
(same as we have Spotless for Java).   This is our best ROI for now.

-chad


On Fri, Sep 27, 2019 at 3:27 PM Kyle Weaver  wrote:

> > Do we have good pypi caching?
>
> Building Python SDK harness containers takes 2 mins each (times 4, the
> number of versions) on my machine, even if nothing has changed. But we're
> already paying that cost, so I don't think splitting the jobs should make
> it any worse. (https://issues.apache.org/jira/browse/BEAM-8277 if anyone
> has any ideas)
>
> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
>
>
> On Wed, Sep 25, 2019 at 11:21 AM Pablo Estrada  wrote:
>
>> Thanks Chad, and thank you for notifying on the dev list.
>>
>> On Wed, Sep 25, 2019 at 10:59 AM Kenneth Knowles  wrote:
>>
>>> Nice.
>>>
>>> Do we have good pypi caching? If not this could add a lot of overhead to
>>> our already-backed-up CI queue. (btw I still think your change is good, and
>>> just makes proper caching more important)
>>>
>>> Kenn
>>>
>>> On Tue, Sep 24, 2019 at 9:55 PM Chad Dombrova  wrote:
>>>
>>>> Hi all,
>>>> I'm working to make the CI experience with python a bit better, and my
>>>> current initiative is splitting up the giant Python PreCommit job into 5
>>>> separate jobs: Lint, Py2, Py3.5, Py3.6, and Py3.7.
>>>>
>>>> Around 11am Pacific time tomorrow I'm going to initiate the seed jobs,
>>>> at which point all PRs will start to run the new precommit jobs.  It's a
>>>> bit of a chicken-and-egg scenario with testing this, so there could be
>>>> issues that pop up after the seed jobs are created, but I'll be working to
>>>> resolve those issues as quickly as possible.
>>>>
>>>> If you run into problems because of this change, please let me know on
>>>> the github PR.
>>>>
>>>> Here's the PR: https://github.com/apache/beam/pull/9642
>>>> Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#
>>>>
>>>> The upshot is that after this is done you'll get better feedback on
>>>> python test failures!
>>>>
>>>> Let me know if you have any concerns.
>>>>
>>>> thanks,
>>>> chad




Re: NOTICE: New Python PreCommit jobs

2019-09-27 Thread Kyle Weaver
> Do we have good pypi caching?

Building Python SDK harness containers takes 2 mins each (times 4, the
number of versions) on my machine, even if nothing has changed. But we're
already paying that cost, so I don't think splitting the jobs should make
it any worse. (https://issues.apache.org/jira/browse/BEAM-8277 if anyone
has any ideas)

Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com


On Wed, Sep 25, 2019 at 11:21 AM Pablo Estrada  wrote:

> Thanks Chad, and thank you for notifying on the dev list.
>
> On Wed, Sep 25, 2019 at 10:59 AM Kenneth Knowles  wrote:
>
>> Nice.
>>
>> Do we have good pypi caching? If not this could add a lot of overhead to
>> our already-backed-up CI queue. (btw I still think your change is good, and
>> just makes proper caching more important)
>>
>> Kenn
>>
>> On Tue, Sep 24, 2019 at 9:55 PM Chad Dombrova  wrote:
>>
>>> Hi all,
>>> I'm working to make the CI experience with python a bit better, and my
>>> current initiative is splitting up the giant Python PreCommit job into 5
>>> separate jobs: Lint, Py2, Py3.5, Py3.6, and Py3.7.
>>>
>>> Around 11am Pacific time tomorrow I'm going to initiate the seed jobs,
>>> at which point all PRs will start to run the new precommit jobs.  It's a
>>> bit of a chicken-and-egg scenario with testing this, so there could be
>>> issues that pop up after the seed jobs are created, but I'll be working to
>>> resolve those issues as quickly as possible.
>>>
>>> If you run into problems because of this change, please let me know on
>>> the github PR.
>>>
>>> Here's the PR: https://github.com/apache/beam/pull/9642
>>> Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#
>>>
>>> The upshot is that after this is done you'll get better feedback on
>>> python test failures!
>>>
>>> Let me know if you have any concerns.
>>>
>>> thanks,
>>> chad
>>>
>>>


Re: NOTICE: New Python PreCommit jobs

2019-09-25 Thread Pablo Estrada
Thanks Chad, and thank you for notifying on the dev list.

On Wed, Sep 25, 2019 at 10:59 AM Kenneth Knowles  wrote:

> Nice.
>
> Do we have good pypi caching? If not this could add a lot of overhead to
> our already-backed-up CI queue. (btw I still think your change is good, and
> just makes proper caching more important)
>
> Kenn
>
> On Tue, Sep 24, 2019 at 9:55 PM Chad Dombrova  wrote:
>
>> Hi all,
>> I'm working to make the CI experience with python a bit better, and my
>> current initiative is splitting up the giant Python PreCommit job into 5
>> separate jobs: Lint, Py2, Py3.5, Py3.6, and Py3.7.
>>
>> Around 11am Pacific time tomorrow I'm going to initiate the seed jobs, at
>> which point all PRs will start to run the new precommit jobs.  It's a bit
>> of a chicken-and-egg scenario with testing this, so there could be issues
>> that pop up after the seed jobs are created, but I'll be working to resolve
>> those issues as quickly as possible.
>>
>> If you run into problems because of this change, please let me know on
>> the github PR.
>>
>> Here's the PR: https://github.com/apache/beam/pull/9642
>> Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#
>>
>> The upshot is that after this is done you'll get better feedback on
>> python test failures!
>>
>> Let me know if you have any concerns.
>>
>> thanks,
>> chad
>>
>>


Re: NOTICE: New Python PreCommit jobs

2019-09-25 Thread Kenneth Knowles
Nice.

Do we have good pypi caching? If not this could add a lot of overhead to
our already-backed-up CI queue. (btw I still think your change is good, and
just makes proper caching more important)

Kenn

On Tue, Sep 24, 2019 at 9:55 PM Chad Dombrova  wrote:

> Hi all,
> I'm working to make the CI experience with python a bit better, and my
> current initiative is splitting up the giant Python PreCommit job into 5
> separate jobs: Lint, Py2, Py3.5, Py3.6, and Py3.7.
>
> Around 11am Pacific time tomorrow I'm going to initiate the seed jobs, at
> which point all PRs will start to run the new precommit jobs.  It's a bit
> of a chicken-and-egg scenario with testing this, so there could be issues
> that pop up after the seed jobs are created, but I'll be working to resolve
> those issues as quickly as possible.
>
> If you run into problems because of this change, please let me know on the
> github PR.
>
> Here's the PR: https://github.com/apache/beam/pull/9642
> Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#
>
> The upshot is that after this is done you'll get better feedback on python
> test failures!
>
> Let me know if you have any concerns.
>
> thanks,
> chad
>
>


NOTICE: New Python PreCommit jobs

2019-09-24 Thread Chad Dombrova
Hi all,
I'm working to make the CI experience with python a bit better, and my
current initiative is splitting up the giant Python PreCommit job into 5
separate jobs: Lint, Py2, Py3.5, Py3.6, and Py3.7.

Around 11am Pacific time tomorrow I'm going to initiate the seed jobs, at
which point all PRs will start to run the new precommit jobs.  It's a bit
of a chicken-and-egg scenario with testing this, so there could be issues
that pop up after the seed jobs are created, but I'll be working to resolve
those issues as quickly as possible.

If you run into problems because of this change, please let me know on the
github PR.

Here's the PR: https://github.com/apache/beam/pull/9642
Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#

The upshot is that after this is done you'll get better feedback on python
test failures!

Let me know if you have any concerns.

thanks,
chad