Re: NOTICE: New Python PreCommit jobs
There's a lot of value in switching to pytest even without xdist. Could we prune back the goals of this first PR to just achieving feature parity with nose, and make a follow-up PR for xdist?

-chad

On Mon, Oct 7, 2019 at 12:04 PM Udi Meiri wrote:

> Besides needing to convert ITs (removing save_main_session), which can be
> split out to a later PR, there's verifying that the same set of tests is
> collected for each suite.
Re: NOTICE: New Python PreCommit jobs
On Fri, Oct 4, 2019 at 10:35 AM Chad Dombrova wrote:

>> I have a WiP PR to convert Beam to use pytest, but it's been stalled.
>
> What would it take to get it back on track?

Besides needing to convert ITs (removing save_main_session), which can be split out to a later PR, there's verifying that the same set of tests is collected for each suite.

>> Another nice thing about pytest is that you'll be able to tell which
>> suite a test belongs to.
>
> pytest has a lot of quality of life improvements over nose. The biggest
> and simplest one is that the test name that it prints is in the same format
> as the runner expects for specifying individual tests to run, so you can
> just copy and paste on the command line to run that one test. Genius.
> Also, since it uses directory names for tests and not module names, you can
> tab complete. The whole fixture

LOL at the copy-paste issue.

> concept is also great, since it gives you a new axis for test
> composability and reuse, instead of just complex sub-classing or
> copy-and-paste. After switching to pytest we went through our tests and
> replaced all of our horrible test mixins with fixtures and the end result
> is much more legible and maintainable. There's honestly nothing I miss
> about nose.
>
> -chad
Re: NOTICE: New Python PreCommit jobs
> I have a WiP PR to convert Beam to use pytest, but it's been stalled.

What would it take to get it back on track?

> Another nice thing about pytest is that you'll be able to tell which suite
> a test belongs to.

pytest has a lot of quality of life improvements over nose. The biggest and simplest one is that the test name that it prints is in the same format as the runner expects for specifying individual tests to run, so you can just copy and paste on the command line to run that one test. Genius. Also, since it uses directory names for tests and not module names, you can tab complete.

The whole fixture concept is also great, since it gives you a new axis for test composability and reuse, instead of just complex sub-classing or copy-and-paste. After switching to pytest we went through our tests and replaced all of our horrible test mixins with fixtures and the end result is much more legible and maintainable. There's honestly nothing I miss about nose.

-chad
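To make the copy-and-paste point concrete, here is a minimal sketch of how a pytest test identifier (a "node ID") is put together: the file path, an optional class name, and the test name, joined by `::`. The helper `pytest_node_id` and the path and names passed to it are hypothetical, chosen only for illustration; they are not taken from the Beam repository.

```python
from pathlib import PurePosixPath


def pytest_node_id(test_file, test_name, test_class=None):
    """Build a pytest node ID: file path, optional class, test name, joined by '::'."""
    parts = [PurePosixPath(test_file).as_posix()]
    if test_class:
        parts.append(test_class)
    parts.append(test_name)
    return "::".join(parts)


# Hypothetical test location, for illustration only.
node_id = pytest_node_id("pkg/io/example_test.py", "test_read", "ExampleTest")

# The same string pytest prints for a test can be passed back to it as an argument.
print("pytest " + node_id)
```

Because the identifier starts with a real file path, the shell can tab-complete the first part of it.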
Re: NOTICE: New Python PreCommit jobs
I have a WiP PR to convert Beam to use pytest, but it's been stalled.

The nice thing about pytest-xdist is that it runs tests in a multi-process, single-thread-per-process fashion, so one test isn't affected by another changing some global setting. The not-so-nice thing is that xdist adds some globals to the main session that fail to pickle, so I'm having to remove save_main_session from our tests first.

Another nice thing about pytest is that you'll be able to tell which suite a test belongs to.

On Wed, Oct 2, 2019 at 10:16 AM Chad Dombrova wrote:

> Hi all,
> I've posted a new PR that just splits out the python lint job here:
> https://github.com/apache/beam/pull/9706
>
> I'll be running the seed job shortly unless anyone objects.
>
> -chad
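The pickling problem with save_main_session mentioned in this message can be illustrated with a small sketch. It uses the standard `pickle` module as a stand-in (the serializer Beam actually uses is not shown in this thread), and a `threading.Lock` as a stand-in for whatever globals xdist injects. The helper `unpicklable_names` and the `fake_main_session` dict are hypothetical.

```python
import pickle
import threading


def unpicklable_names(namespace):
    """Return the sorted names in `namespace` whose values cannot be pickled."""
    bad = []
    for name, value in namespace.items():
        try:
            pickle.dumps(value)
        except Exception:
            # pickle raises different exception types depending on the object.
            bad.append(name)
    return sorted(bad)


# Hypothetical stand-in for the globals of a main session.
fake_main_session = {
    "counter": 3,
    "label": "ok",
    # Assumption: represents a global added by a plugin; locks cannot be pickled.
    "worker_lock": threading.Lock(),
}

print(unpicklable_names(fake_main_session))
```

A single unpicklable global is enough to make serializing the whole session fail, which is why the tests have to stop relying on save_main_session before xdist can be used.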
Re: NOTICE: New Python PreCommit jobs
Hi all,
I've posted a new PR that just splits out the python lint job here:
https://github.com/apache/beam/pull/9706

I'll be running the seed job shortly unless anyone objects.

-chad

On Tue, Oct 1, 2019 at 9:04 PM Chad Dombrova wrote:

> I haven’t used nose’s parallel execution plugin, but I have used pytest
> with xdist with success. If your tests are designed to run in any order and
> are properly sandboxed to prevent crosstalk between concurrent runs, which
> they *should* be, then in my experience it works very well.
Re: NOTICE: New Python PreCommit jobs
I haven’t used nose’s parallel execution plugin, but I have used pytest with xdist with success. If your tests are designed to run in any order and are properly sandboxed to prevent crosstalk between concurrent runs, which they *should* be, then in my experience it works very well.

On Fri, Sep 27, 2019 at 6:51 PM Kenneth Knowles wrote:

> Do things go wrong when nose is configured to use parallel execution?
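As a rough illustration of what "properly sandboxed" can mean here, the sketch below gives every test its own temporary directory, so tests neither depend on execution order nor collide when run concurrently. It uses only the standard library; the class `SandboxedWriteTest` and the helper `run_suite` are hypothetical names, not Beam code.

```python
import os
import tempfile
import unittest


class SandboxedWriteTest(unittest.TestCase):
    def setUp(self):
        # Each test gets a private directory, removed again after the test.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.out_path = os.path.join(self._tmp.name, "out.txt")

    def _write_and_read(self, text):
        with open(self.out_path, "w") as f:
            f.write(text)
        with open(self.out_path) as f:
            return f.read()

    def test_write_a(self):
        self.assertEqual(self._write_and_read("a"), "a")

    def test_write_b(self):
        # Uses the same file name as test_write_a, but in a different directory.
        self.assertEqual(self._write_and_read("b"), "b")


def run_suite():
    """Run the tests in-process and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SandboxedWriteTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
print(result.testsRun, result.wasSuccessful())
```

Tests written against a shared, fixed path would pass serially and fail intermittently once several workers run them at the same time.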
Re: NOTICE: New Python PreCommit jobs
Do things go wrong when nose is configured to use parallel execution?

On Fri, Sep 27, 2019 at 5:09 PM Chad Dombrova wrote:

> By the way, the outcome on this was that splitting the python precommit
> job into one job per python version increased the total test completion
> time by 66%, which is obviously not good. This is because we are using
> Gradle to run the python test tasks in parallel (the jenkins VMs have 16
> cores each, utilized across 2 slots, IIRC), but after the split there were
> only 1-2 gradle tasks per test. Since the python test runner, nose, is
> currently not using parallel execution, there were not enough concurrent
> tasks to make proper use of the VM's CPUs.
>
> tl;dr I'm going to create a follow-up PR to split out just the Lint job
> (same as we have Spotless for Java). This is our best ROI for now.
>
> -chad
Re: NOTICE: New Python PreCommit jobs
By the way, the outcome on this was that splitting the python precommit job into one job per python version increased the total test completion time by 66%, which is obviously not good. This is because we are using Gradle to run the python test tasks in parallel (the jenkins VMs have 16 cores each, utilized across 2 slots, IIRC), but after the split there were only 1-2 gradle tasks per test. Since the python test runner, nose, is currently not using parallel execution, there were not enough concurrent tasks to make proper use of the VM's CPUs.

tl;dr I'm going to create a follow-up PR to split out just the Lint job (same as we have Spotless for Java). This is our best ROI for now.

-chad

On Fri, Sep 27, 2019 at 3:27 PM Kyle Weaver wrote:

>> Do we have good pypi caching?
>
> Building Python SDK harness containers takes 2 mins each (times 4, the
> number of versions) on my machine, even if nothing has changed. But we're
> already paying that cost, so I don't think splitting the jobs should make
> it any worse. (https://issues.apache.org/jira/browse/BEAM-8277 if anyone
> has any ideas)
>
> Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
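The under-utilization described in this message can be put into rough numbers. The sketch below assumes, purely for illustration, that the 16 cores are divided evenly between the 2 slots and that each Gradle task keeps exactly one core busy (nose running without parallel execution). Neither assumption is confirmed in the thread, and `slot_utilization` is a hypothetical helper.

```python
def slot_utilization(cores_per_vm, slots_per_vm, concurrent_tasks):
    """Fraction of one slot's cores kept busy by single-core tasks."""
    # Assumption: cores are split evenly between the slots of a VM.
    cores_per_slot = cores_per_vm / slots_per_vm
    # Assumption: each task uses one core; extra tasks beyond the core count add nothing.
    busy_cores = min(concurrent_tasks, cores_per_slot)
    return busy_cores / cores_per_slot


for tasks in (1, 2):
    print(tasks, "task(s):", slot_utilization(16, 2, tasks))
```

Under these assumptions, 1-2 concurrent tasks occupy between an eighth and a quarter of a slot, which is consistent with the split jobs taking longer overall despite doing the same work.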
Re: NOTICE: New Python PreCommit jobs
> Do we have good pypi caching?

Building Python SDK harness containers takes 2 mins each (times 4, the number of versions) on my machine, even if nothing has changed. But we're already paying that cost, so I don't think splitting the jobs should make it any worse. (https://issues.apache.org/jira/browse/BEAM-8277 if anyone has any ideas)

Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com

On Wed, Sep 25, 2019 at 11:21 AM Pablo Estrada wrote:

> Thanks Chad, and thank you for notifying on the dev list.
Re: NOTICE: New Python PreCommit jobs
Thanks Chad, and thank you for notifying on the dev list.

On Wed, Sep 25, 2019 at 10:59 AM Kenneth Knowles wrote:

> Nice.
>
> Do we have good pypi caching? If not this could add a lot of overhead to
> our already-backed-up CI queue. (btw I still think your change is good, and
> just makes proper caching more important)
>
> Kenn
Re: NOTICE: New Python PreCommit jobs
Nice.

Do we have good pypi caching? If not this could add a lot of overhead to our already-backed-up CI queue. (btw I still think your change is good, and just makes proper caching more important)

Kenn

On Tue, Sep 24, 2019 at 9:55 PM Chad Dombrova wrote:

> Hi all,
> I'm working to make the CI experience with python a bit better, and my
> current initiative is splitting up the giant Python PreCommit job into 5
> separate jobs for Lint, Py2, Py3.5, Py3.6, and Py3.7.
>
> Around 11am Pacific time tomorrow I'm going to initiate the seed jobs, at
> which point all PRs will start to run the new precommit jobs. It's a bit
> of a chicken-and-egg scenario with testing this, so there could be issues
> that pop up after the seed jobs are created, but I'll be working to resolve
> those issues as quickly as possible.
>
> If you run into problems because of this change, please let me know on the
> github PR.
>
> Here's the PR: https://github.com/apache/beam/pull/9642
> Here's the Jira: https://issues.apache.org/jira/browse/BEAM-8213#
>
> The upshot is that after this is done you'll get better feedback on python
> test failures!
>
> Let me know if you have any concerns.
>
> thanks,
> chad