I have discovered that the suite passed with 756 tests, and if I added another test (just a copy of an existing one with a different name) it locked up at some test (which was not the one I copied). I suspect it is not related to the actual test code, but to something in nose/python/sandbox.
On Mon, Sep 22, 2014 at 3:40 AM, Igor Bondarenko <[email protected]> wrote:

> On Sat, Sep 20, 2014 at 12:25 AM, Dave Brondsema <[email protected]> wrote:
>
> > On 9/19/14 12:18 PM, Dave Brondsema wrote:
> >
> > > Starting with Igor's comments on https://sourceforge.net/p/allura/tickets/7657/#c7d9
> > >
> > >> There's a couple of new tests commented out in the last commit. I can't figure out why, but they cause allura/tests/test_dispatch.py to hang when run together with other tests. Also I have added and then removed tests for enable/disable user for the same reason.
> > >>
> > >> I think it needs another pair of eyes on it, since I've already spent too much time dealing with these tests and have no idea what's happening... Maybe I'm missing something obvious.
> > >
> > > Alex and I have seen this recently too, and it's hard to figure out what exactly the problem is. I first noticed it when running `./run_tests --with-coverage`, which runs nosetests in the Allura dir without --processes=N because of the with-coverage param. So basically just a regular run of the tests in the Allura dir would cause the CPU to go to 100% usage and the tests wouldn't finish. I couldn't ctrl-C or profile them; I had to kill -9 it.
> > >
> > > That was on Centos 5.10, and a workaround was to run with --processes=N, and then the tests would finish fine. On the Ubuntu vagrant image, I didn't encounter any problem in the first place. So perhaps it's related to the environment.
> > >
> > > I tried to narrow down to a specific test that might be the culprit. I found tests consistently got up to TestSecurity.test_auth (which is a bit weird and old test anyway), and also that commenting out that test let them all pass.
> > >
> > > But I'm pretty sure Alex said he dug into this as well and found variation in which tests could cause the problem. I think he told me that going back in git history to before the problem, and then adding a single test (a copy of an existing one), caused the problem. So perhaps some limit or resource tipping point is hit.
> > >
> > > Alex or Igor, any more data points you know from what you've seen?
> > >
> > > Anyone else seen anything like this? Or have ideas for how to approach nailing it down better?
> >
> > I tried checking out branch je/42cc_7657 and going back to commit 4cc63586e5728d7d0c5c2f09150eb07eb7e4edc1 (before the tests were commented out) to see what happened for me:
> >
> > On vagrant / ubuntu, it froze at test_dispatch.py, same as for you. So some consistency there. Tests passed when I ran `nosetests --process-timeout=180 --processes=4 -v` in the Allura dir. It seemed slow at the end though; I almost thought it froze.
> >
> > On centos, it froze at a different spot with a regular nosetests run. It passed with `nosetests allura/tests/ --processes=4 --process-timeout=180 -v`. For some reason (hopefully unrelated), I needed to specify the path "allura/tests/" to avoid an IOError from multiprocessing.
> >
> > So at least multiprocess tests still seem like a workaround for me. Note: ./run_tests picks a --processes=N value dynamically based on the machine's CPU cores, so with a single core you don't get multiple processes that way. Also note: if you have nose-progressive installed and active, it is incompatible with multiple processes.
>
> It works exactly as you described for me too.
>
> I've reverted some commits with those tests, since the problem is not with them and they are useful: https://sourceforge.net/p/allura/tickets/7657/#8c06. I also made a fix in 42cc's Makefile (committed directly to master) so that it always runs the tests in parallel (it turns out that here at 42cc we have single-core CPUs on the boxes that run tests, which is why I was getting lockups on our CI as well :( ).
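For anyone else hitting this on a single-core box, here is a rough sketch of the kind of workaround Igor describes: forcing at least two nose worker processes so the multiprocess plugin always kicks in, regardless of core count. The getconf call and variable names are my own assumptions; the actual 42cc Makefile change may look different.

    # Sketch only: derive a worker count from the online CPU cores,
    # but never let it drop below 2, so nose always runs multiprocess.
    PROCS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    if [ "$PROCS" -lt 2 ]; then PROCS=2; fi
    nosetests allura/tests/ --processes="$PROCS" --process-timeout=180 -v

This mirrors what ./run_tests already does on multi-core machines, just without ever letting the process count fall to 1.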
