Seems like there's no easy fix for this :( Since the workaround we're adopting is to run the Allura package's tests with `nosetests --processes=2` (or more), we should probably force ./run_tests to do that, so it doesn't cause problems for somebody trying out Allura on a single-core machine or VM. Any downside to that?
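To make that concrete, here's a minimal sketch of the clamp ./run_tests could apply. The function names and defaults are illustrative, not the actual run_tests code; only the nosetests flags come from this thread:

```python
# Hypothetical sketch of the proposed change: floor the nose worker
# count at 2 so the suite always runs in parallel, even on a
# single-core machine or VM. Names and defaults are illustrative.
import multiprocessing


def nose_process_count(minimum=2):
    """Return a --processes=N value that is never below `minimum`."""
    try:
        cores = multiprocessing.cpu_count()
    except NotImplementedError:
        cores = 1
    return max(cores, minimum)


def parallel_nose_args(timeout=180):
    """Build the nosetests flags used as the workaround in this thread."""
    return ['--processes=%d' % nose_process_count(),
            '--process-timeout=%d' % timeout]
```

On a single-core box this yields `--processes=2`; on a quad-core box, `--processes=4`.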
On 9/25/14 12:41 AM, Alex Luberg wrote:
> I have discovered that the suite passed with 756 tests, and if I added
> another test (just a copy of an existing one with a different name) it
> locked up at some test (which was not the one I copied). I suspect that it
> is not related to the actual test code, but something with
> nose/python/sandbox.
>
> On Mon, Sep 22, 2014 at 3:40 AM, Igor Bondarenko <[email protected]> wrote:
>
>> On Sat, Sep 20, 2014 at 12:25 AM, Dave Brondsema <[email protected]> wrote:
>>
>>> On 9/19/14 12:18 PM, Dave Brondsema wrote:
>>>> Starting with Igor's comments on
>>>> https://sourceforge.net/p/allura/tickets/7657/#c7d9
>>>>
>>>>> There's a couple of new tests commented out in the last commit. I
>>>>> can't figure out why, but they cause allura/tests/test_dispatch.py to
>>>>> hang when run together with other tests. I have also added and then
>>>>> removed tests for enable/disable user for the same reason.
>>>>>
>>>>> I think it needs another pair of eyes on it, since I've already spent
>>>>> too much time dealing with these tests and have no idea what's
>>>>> happening... Maybe I'm missing something obvious.
>>>>
>>>> Alex and I have seen this recently too, and it's hard to figure out
>>>> what exactly the problem is. I first noticed it when running
>>>> `./run_tests --with-coverage`, which runs nosetests in the Allura dir
>>>> but does not use --processes=N because of the with-coverage param. So
>>>> basically just a regular run of the tests in the Allura dir would cause
>>>> the CPU to go to 100% usage and the tests wouldn't finish. Couldn't
>>>> ctrl-C or profile them; had to kill -9 it.
>>>>
>>>> That was on CentOS 5.10, and a workaround was to run with
>>>> --processes=N, and then the tests would finish fine. On the Ubuntu
>>>> vagrant image, I didn't encounter any problem in the first place. So
>>>> perhaps it's related to the environment.
>>>>
>>>> I tried to narrow it down to a specific test that might be the culprit.
>>>> I found tests consistently got up to TestSecurity.test_auth (which is a
>>>> bit weird and old test anyway), and also that commenting out that test
>>>> let them all pass.
>>>>
>>>> But I'm pretty sure Alex said he dug into this as well and found
>>>> variation in which tests could cause the problem. I think he told me
>>>> that going back in git history to before the problem, and then adding a
>>>> single test (a copy of an existing one), caused the problem. So perhaps
>>>> some limit or resource tipping point is hit.
>>>>
>>>> Alex or Igor, any more data points you know from what you've seen?
>>>>
>>>> Anyone else seen anything like this? Or have ideas for how to approach
>>>> nailing it down better?
>>>
>>> I tried checking out branch je/42cc_7657 and going back to commit
>>> 4cc63586e5728d7d0c5c2f09150eb07eb7e4edc1 (before the tests were
>>> commented out) to see what happened for me:
>>>
>>> On vagrant/ubuntu, it froze at test_dispatch.py, same as for you. So
>>> some consistency there. Tests passed when I ran `nosetests
>>> --process-timeout=180 --processes=4 -v` in the Allura dir. It seemed
>>> slow at the end though; I almost thought it froze.
>>>
>>> On centos, it froze at a different spot with a regular nosetests run. It
>>> passed with `nosetests allura/tests/ --processes=4 --process-timeout=180
>>> -v`. For some reason (hopefully unrelated), I needed to specify the path
>>> "allura/tests/" to avoid an IOError from multiprocessing.
>>>
>>> So at least multiprocess tests still seem like a workaround for me.
>>> Note: ./run_tests picks a --processes=N value dynamically based on the
>>> machine's CPU cores, so with a single core you don't get multiple
>>> processes that way. Also note: if you have nose-progressive installed
>>> and active, it is incompatible with multiple processes.
>>
>> It works exactly as you described for me too.
>>
>> I've reverted some commits with those tests, since the problem is not
>> with them and they are useful
>> https://sourceforge.net/p/allura/tickets/7657/#8c06, and I also made a
>> fix in 42cc's Makefile (committed directly in master) so that it always
>> runs the tests in parallel (it turns out that here at 42cc we have
>> single-core CPUs on the boxes that run tests, which is why I got lockups
>> on our CI too :( )

-- 
Dave Brondsema : [email protected]
http://www.brondsema.net : personal
http://www.splike.com : programming
<><
