I have discovered that the suite passed with 756 tests, and if I added another test (just a copy of an existing one with a different name) it locked up at some test (which was not the one I copied). I suspect it is not related to the actual test code, but to something in nose/python/sandbox.
On Mon, Sep 22, 2014 at 3:40 AM, Igor Bondarenko <[email protected]> wrote:

> On Sat, Sep 20, 2014 at 12:25 AM, Dave Brondsema <[email protected]> wrote:
>
> > On 9/19/14 12:18 PM, Dave Brondsema wrote:
> >
> > > Starting with Igor's comments on https://sourceforge.net/p/allura/tickets/7657/#c7d9
> > >
> > >> There's a couple of new tests commented out in the last commit. I can't figure out why, but they cause allura/tests/test_dispatch.py to hang when run together with other tests. Also I have added and then removed tests for enable/disable user for the same reason.
> > >>
> > >> I think it needs another pair of eyes on it, since I've already spent too much time dealing with these tests and have no idea what's happening... Maybe I'm missing something obvious.
> > >
> > > Alex and I have seen this recently too, and it's hard to figure out what exactly the problem is. I first noticed it when running `./run_tests --with-coverage`, which runs nosetests in the Allura dir without --processes=N because of the with-coverage param. So basically just a regular run of the tests in the Allura dir would cause the CPU to go to 100% usage and the tests wouldn't finish. I couldn't ctrl-C or profile them; I had to kill -9 it.
> > >
> > > That was on Centos 5.10, and a workaround was to run with --processes=N, and then the tests would finish fine. On the Ubuntu vagrant image, I didn't encounter any problem in the first place. So perhaps it's related to the environment.
> > >
> > > I tried to narrow down to a specific test that might be the culprit. I found tests consistently got up to TestSecurity.test_auth (which is a bit weird and old test anyway), and also that commenting out that test let them all pass.
> > >
> > > But I'm pretty sure Alex said he dug into this as well and found variation in which tests could cause the problem. I think he told me that going back in git history to before the problem, and then adding a single test (a copy of an existing one), caused the problem. So perhaps some limit or resource tipping point is hit.
> > >
> > > Alex or Igor, any more data points you know from what you've seen?
> > >
> > > Anyone else seen anything like this? Or have ideas for how to approach nailing it down better?
> >
> > I tried checking out branch je/42cc_7657 and going back to commit 4cc63586e5728d7d0c5c2f09150eb07eb7e4edc1 (before the tests were commented out) to see what happened for me:
> >
> > On vagrant / ubuntu, it froze at test_dispatch.py, same as for you. So some consistency there. Tests passed when I ran `nosetests --process-timeout=180 --processes=4 -v` in the Allura dir. It seemed slow at the end though; I almost thought it froze.
> >
> > On centos, it froze at a different spot with a regular nosetests run. It passed with `nosetests allura/tests/ --processes=4 --process-timeout=180 -v`. For some reason (hopefully unrelated), I needed to specify the path "allura/tests/" to avoid an IOError from multiprocessing.
> >
> > So at least multiprocess tests still seem like a workaround for me. Note: ./run_tests picks a --processes=N value dynamically based on the machine's CPU cores, so with a single core you don't get multiple processes that way. Also note: if you have nose-progressive installed and active, it is incompatible with multiple processes.
>
> It works exactly as you described for me too.
>
> I've reverted some commits with those tests, since the problem is not with them and they are useful: https://sourceforge.net/p/allura/tickets/7657/#8c06. I also made a fix in 42cc's Makefile (committed directly to master) so that it always runs the tests in parallel (it turns out that here at 42cc we have single-core CPUs on the boxes that run tests, which is why I was getting lockups on our CI as well :( ).
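For anyone else hitting this on a single-core box, here is a rough sketch of the kind of workaround Igor describes: forcing at least two nose worker processes so the multiprocess plugin always kicks in, regardless of core count. The getconf call and variable names are my own assumptions; the actual 42cc Makefile change may look different.

    # Sketch only: derive a worker count from the online CPU cores,
    # but never let it drop below 2, so nose always runs multiprocess.
    PROCS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    if [ "$PROCS" -lt 2 ]; then PROCS=2; fi
    nosetests allura/tests/ --processes="$PROCS" --process-timeout=180 -v

This mirrors what ./run_tests already does on multi-core machines, just without ever letting the process count fall to 1.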
