Re: [openstack-dev] [tempest]Tempest test concurrency

2016-09-21 Thread Bob Hansen
Matthew, this helps tremendously. As you can tell, the conclusion I was
heading towards was not accurate.

Now to look a bit deeper.

Thanks,

Bob Hansen
z/VM OpenStack Enablement

Matthew Treinish <mtrein...@kortar.org> wrote on 09/21/2016 11:07:04 AM:

> From: Matthew Treinish <mtrein...@kortar.org>
> To: "OpenStack Development Mailing List (not for usage questions)"
> <openstack-dev@lists.openstack.org>
> Date: 09/21/2016 11:09 AM
> Subject: Re: [openstack-dev] [tempest]Tempest test concurrency
>
> On Wed, Sep 21, 2016 at 10:44:51AM -0400, Bob Hansen wrote:
> >
> >
> > I have been looking at some of the stackviz output as I'm trying to improve
> > the run time of my third-party CI. As an example:
> >
> > http://logs.openstack.org/36/371836/1/check/gate-tempest-dsvm-full-ubuntu-xenial/087db0f/logs/stackviz/#/stdin/timeline
> >
> > What jumps out is the amount of time that each worker is not running any
> > tests. I would have expected quite a bit more concurrency between the two
> > workers in the chart, e.g. more overlap. I've noticed a similar thing with
> > my test runs using 4 workers.
>
> So the gaps between tests aren't actually wait time; the workers are saturated
> doing stuff during a run. Those gaps are missing data in the subunit streams
> that are used as the source of the data for rendering those timelines. The gaps
> are where things like setUp, setUpClass, tearDown, tearDownClass, and
> addCleanups run, none of which are added to the subunit stream. It's just an
> artifact of the incomplete data, not bad scheduling. This also means that testr
> does not take into account any of the missing timing when it makes decisions
> based on previous runs.
>
> >
> > Can anyone explain why this is and where can I find out more information
> > about the scheduler and what information it is using to decide when to
> > dispatch tests? I'm already feeding my system a prior subunit stream to
> > help influence the scheduler as my test run times are different due to the
> > way our OpenStack implementation is architected. A simple round-robin
> > approach is not the most efficient in my case.
>
> If you're curious about how testr does scheduling, most of that happens here:
>
> https://github.com/testing-cabal/testrepository/blob/master/testrepository/testcommand.py
>
> One thing to remember is that testr isn't actually a test runner, it's a test
> runner runner. It partitions the tests based on time information and passes
> those to (multiple) test runner workers. The actual order of execution inside
> those partitions is handled by the test runner itself (in our case,
> subunit.run).
>
> -Matt Treinish
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tempest]Tempest test concurrency

2016-09-21 Thread Matthew Treinish
On Wed, Sep 21, 2016 at 10:44:51AM -0400, Bob Hansen wrote:
> 
> 
> I have been looking at some of the stackviz output as I'm trying to improve
> the run time of my third-party CI. As an example:
> 
> http://logs.openstack.org/36/371836/1/check/gate-tempest-dsvm-full-ubuntu-xenial/087db0f/logs/stackviz/#/stdin/timeline
> 
> What jumps out is the amount of time that each worker is not running any
> tests. I would have expected quite a bit more concurrency between the two
> workers in the chart, e.g. more overlap. I've noticed a similar thing with
> my test runs using 4 workers.

So the gaps between tests aren't actually wait time; the workers are saturated
doing stuff during a run. Those gaps are missing data in the subunit streams
that are used as the source of the data for rendering those timelines. The gaps
are where things like setUp, setUpClass, tearDown, tearDownClass, and
addCleanups run, none of which are added to the subunit stream. It's just an
artifact of the incomplete data, not bad scheduling. This also means that testr
does not take into account any of the missing timing when it makes decisions
based on previous runs.
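
To make the gap idea concrete, here is a minimal sketch (not stackviz or testr
code) that splits each worker's timeline into time covered by test events and
time between them. The (worker, test_id, start, stop) tuple layout and the
sample values are assumptions made up for illustration; a real run would pull
these fields from the parsed subunit stream.

```python
from collections import defaultdict


def worker_gaps(results):
    """Summarise per-worker time inside and between recorded tests.

    results: iterable of (worker, test_id, start, stop), times in seconds.
    Returns {worker: (tested_seconds, untracked_seconds)}.
    """
    by_worker = defaultdict(list)
    for worker, _test_id, start, stop in results:
        by_worker[worker].append((start, stop))

    summary = {}
    for worker, spans in by_worker.items():
        spans.sort()
        # Time the stream attributes to test bodies.
        tested = sum(stop - start for start, stop in spans)
        # Time between one test's stop and the next test's start. This is
        # where fixtures and cleanups would be running, unrecorded.
        untracked = sum(
            max(0.0, nxt[0] - cur[1]) for cur, nxt in zip(spans, spans[1:])
        )
        summary[worker] = (tested, untracked)
    return summary


# Hypothetical data: two workers, two tests each.
sample = [
    (0, "test_a", 0.0, 2.0),
    (0, "test_b", 5.0, 6.0),
    (1, "test_c", 1.0, 4.0),
    (1, "test_d", 4.5, 5.0),
]
print(worker_gaps(sample))
```

A large "untracked" share for a worker points at expensive fixtures rather
than an idle worker.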

> 
> Can anyone explain why this is and where can I find out more information
> about the scheduler and what information it is using to decide when to
> dispatch tests? I'm already feeding my system a prior subunit stream to
> help influence the scheduler as my test run times are different due to the
> way our OpenStack implementation is architected. A simple round-robin
> approach is not the most efficient in my case.

If you're curious about how testr does scheduling, most of that happens here:

https://github.com/testing-cabal/testrepository/blob/master/testrepository/testcommand.py

One thing to remember is that testr isn't actually a test runner, it's a test
runner runner. It partitions the tests based on time information and passes
those to (multiple) test runner workers. The actual order of execution inside
those partitions is handled by the test runner itself (in our case,
subunit.run).
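
As a rough illustration of time-based partitioning, the sketch below assigns
the longest known tests first, each to whichever worker currently has the
least total time, and spreads tests with no recorded duration round-robin.
This is an assumed simplification for discussion, not a copy of the code in
testcommand.py; the function name, arguments, and sample durations are all
hypothetical.

```python
def partition_by_time(test_ids, durations, workers):
    """Split test_ids into `workers` lists with roughly equal total time.

    durations: {test_id: seconds} from a previous run; may be incomplete.
    """
    partitions = [[] for _ in range(workers)]
    totals = [0.0] * workers

    timed = sorted(
        (t for t in test_ids if t in durations),
        key=lambda t: durations[t],
        reverse=True,
    )
    untimed = [t for t in test_ids if t not in durations]

    # Longest tests first, each going to the currently lightest partition.
    for test_id in timed:
        idx = totals.index(min(totals))
        partitions[idx].append(test_id)
        totals[idx] += durations[test_id]

    # No timing data available: fall back to round-robin.
    for i, test_id in enumerate(untimed):
        partitions[i % workers].append(test_id)

    return partitions


# Hypothetical timings; "e" has never been run before.
durations = {"a": 10.0, "b": 6.0, "c": 5.0, "d": 1.0}
print(partition_by_time(["a", "b", "c", "d", "e"], durations, 2))
```

Since fixture time is missing from the recorded durations, a balance computed
this way can still be uneven in wall-clock terms.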

-Matt Treinish




[openstack-dev] [tempest]Tempest test concurrency

2016-09-21 Thread Bob Hansen


I have been looking at some of the stackviz output as I'm trying to improve
the run time of my third-party CI. As an example:

http://logs.openstack.org/36/371836/1/check/gate-tempest-dsvm-full-ubuntu-xenial/087db0f/logs/stackviz/#/stdin/timeline

What jumps out is the amount of time that each worker is not running any
tests. I would have expected quite a bit more concurrency between the two
workers in the chart, e.g. more overlap. I've noticed a similar thing with
my test runs using 4 workers.

Can anyone explain why this is and where can I find out more information
about the scheduler and what information it is using to decide when to
dispatch tests? I'm already feeding my system a prior subunit stream to
help influence the scheduler as my test run times are different due to the
way our OpenStack implementation is architected. A simple round-robin
approach is not the most efficient in my case.

(maybe openstack-infra is a better place to ask?)

Thanks!

Bob Hansen
z/VM OpenStack Enablement