On Sat, Sep 20, 2014 at 12:25 AM, Dave Brondsema <[email protected]> wrote:
> On 9/19/14 12:18 PM, Dave Brondsema wrote:
> > Starting with Igor's comments on
> > https://sourceforge.net/p/allura/tickets/7657/#c7d9
> >
> >> There are a couple of new tests commented out in the last commit. I can't
> >> figure out why, but they cause allura/tests/test_dispatch.py to hang when
> >> run together with other tests. I also added and then removed tests for
> >> enable/disable user for the same reason.
> >>
> >> I think it needs another pair of eyes on it, since I've already spent
> >> too much time dealing with these tests and have no idea what's
> >> happening... Maybe I'm missing something obvious.
> >
> > Alex and I have seen this recently too, and it's hard to figure out what
> > exactly the problem is. I first noticed it when running `./run_tests
> > --with-coverage`, which runs nosetests in the Allura dir and does not use
> > --processes=N because of the with-coverage param. So basically just a
> > regular run of the tests in the Allura dir would cause the CPU to go to
> > 100% usage and the tests wouldn't finish. I couldn't ctrl-C or profile
> > them; I had to kill -9 it.
> >
> > That was on CentOS 5.10, and a workaround was to run with --processes=N,
> > after which the tests finished fine. On the Ubuntu vagrant image, I didn't
> > encounter any problem in the first place. So perhaps it's related to the
> > environment.
> >
> > I tried to narrow it down to a specific test that might be the culprit. I
> > found the tests consistently got up to TestSecurity.test_auth (which is a
> > bit of a weird and old test anyway), and that commenting out that test let
> > them all pass.
> >
> > But I'm pretty sure Alex dug into this as well and found variation in
> > which tests could cause the problem. I think he told me that going back in
> > git history to before the problem and then adding a single test (a copy of
> > an existing one) caused the problem. So perhaps some limit or resource
> > tipping point is being hit.
> >
> > Alex or Igor, any more data points from what you've seen?
> >
> > Anyone else seen anything like this? Or have ideas for how to approach
> > nailing it down better?
>
> I tried checking out branch je/42cc_7657 and going back to commit
> 4cc63586e5728d7d0c5c2f09150eb07eb7e4edc1 (before the tests were commented
> out) to see what happened for me:
>
> On vagrant/ubuntu, it froze at test_dispatch.py, same as for you. So some
> consistency there. The tests passed when I ran `nosetests
> --process-timeout=180 --processes=4 -v` in the Allura dir. It seemed slow
> at the end, though; I almost thought it froze.
>
> On centos, it froze at a different spot with a regular nosetests run. It
> passed with `nosetests allura/tests/ --processes=4 --process-timeout=180
> -v`. For some reason (hopefully unrelated), I needed to specify the path
> "allura/tests/" to avoid an IOError from multiprocessing.
>
> So at least running the tests multiprocess still seems like a workaround
> for me. Note: ./run_tests picks a --processes=N value dynamically based on
> the machine's CPU cores, so with a single core you don't get multiple
> processes that way. Also note: if you have nose-progressive installed and
> active, it is incompatible with multiple processes.

It works exactly as you described for me too.
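As an aside on that last note: the dynamic selection presumably amounts to
something like the following (an illustrative sketch, not the actual
run_tests code):

    # Illustrative sketch, not the actual run_tests code: deriving a
    # --processes=N value for nosetests from the machine's CPU count.
    import multiprocessing

    def nose_parallel_args():
        n = multiprocessing.cpu_count()
        # On a single-core box n is 1, so you don't get multiple
        # processes this way and the hang can still occur.
        return ['--processes=%d' % n, '--process-timeout=180']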
I've reverted the commits that commented out those tests, since the problem
isn't with them and they are useful:
https://sourceforge.net/p/allura/tickets/7657/#8c06. I also made a fix in
42cc's Makefile (committed directly to master) so that it always runs the
tests in parallel. It turns out that here at 42cc we have single-core CPUs
on the boxes that run tests, which is why I was seeing lockups on our CI
too :(
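The gist of the change is just to enforce a minimum process count so the
run is always parallel, even on a single-core box. Roughly (an illustrative
sketch in Python; the actual change is in the Makefile):

    # Illustrative sketch of the idea behind the Makefile fix: never let
    # the process count fall below 2, even on single-core boxes, so
    # nosetests always actually runs in parallel.
    import multiprocessing

    def process_count(minimum=2):
        return max(multiprocessing.cpu_count(), minimum)

    # e.g. pass '--processes=%d' % process_count() to nosetests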
