On Fri, 2011-05-13 at 16:57 -0700, Erik Rose wrote:
> tl;dr: I've written an alternative TestCase base class which makes
> fixture-using tests much more I/O efficient on transactional DBs, and I'd
> like to upstream it.
>
> Greetings, all! This is my first django-dev post, so please be gentle. :-) I
> hack on support.mozilla.com, a fairly large Django site with about 1000
> tests. Those tests make heavy use of fixtures and, as a result, used to take
> over 5 minutes to run. So, I spent a few days seeing if I could cut the
> amount of DB I/O needed. Ultimately, I got the run down to just over 1
> minute, and almost all of those gains are translatable to any Django site
> running against a transactional DB. No changes to the apps themselves are
> needed. I'd love to push some of this work upstream, if there's interest (or
> even lack of opposition ;-)).
>
> The speedups came from 3 main optimizations:
>
> 1. Class-level fixture setup
>
> Given a transaction DB, there's no reason to reload fixtures via dozens of
> SQL statements before every test. I made use of setup_class() and
> teardown_class() (yay, unittest2!) to change the flow for TestCase-using
> tests to this:
>    a. Load the fixtures at the top of the class, and commit.
>    b. Run a test.
>    c. Roll back, returning to pristine fixtures. Go back to step b.
>    d. At class teardown, figure out which tables the fixtures loaded into,
>       and expressly clear out what was added.
>
> Before this optimization: 302s to run the suite
> After: 97s.
>
> Before: 37,583 queries
> After: 4,116
>
> On top of that, an additional 4s was saved by reusing a single connection
> rather than opening and closing them all the time, bringing the final number
> down to 93s. (We can get away with this because we're committing any
> on-cursor-initialization setup, whereas the old TestCase rolled it back.)
>
> Here's the code:
> https://github.com/erikrose/test-utils/blob/master/test_utils/__init__.py#L121.
> I'd love to generalize it a bit (to fall back to the old behavior with
> non-transactional backends, for example) and offer it as a patch to Django
> proper, replacing TestCase. Thoughts?
>
> (If you notice that copy-and-paste of loaddata sitting off to the side in
> another module, don't fret; in the patch, that would turn into a refactoring
> of loaddata to make the computation of the fixture-referenced tables
> separately reusable.)
>
>
> 2. Fixture grouping
>
> I next observed that many test classes reused the same sets of fixtures,
> often via subclassing. After the previous optimization, our tests still
> loaded fixtures 114 times, even though there were only 11 distinct sets of
> them. So, I thought: why not write a custom testrunner that buckets the
> classes by fixture set and advises the classes that, unless they're the first
> or last in a bucket, they shouldn't bother tearing down or setting up the
> fixtures, respectively? This took the form of a custom nose plugin (we use
> nose for all our Django stuff), and it took another quarter off the test run:
>
> Before: 97s
> After: 74s
>
> Of course, test independence is still preserved. We're just factoring out
> pointlessly repeated setup.
>
> I don't really have plans to upstream this unless someone calls for it, but
> I'll be making it available soon, likely as part of django-nose.
>
>
> 3. Startup optimizations
>
> At this point, it was bothering me that, just to run a single test, I had to
> wait through 15s of DB initialization (mostly auth_permissions and
> django_content_type population)—stuff which was already perfectly valid from
> the previous test run. So, building on some work we had already done in this
> direction, I decided to skip the teardown of the test DB and, symmetrically,
> the setup on future runs. If you make schema changes, just set an env var,
> and it wipes and remakes the DB like usual.
> I could see pushing this into
> django-nose as well, but it's got the hackiest implementation and can
> potentially confuse users. I mention it for completeness.
>
> Before: startup time 15s
> After: 3s (There's quite a wide variance due to I/O caching luck.)
>
> Code: https://github.com/erikrose/test-utils/commit/b95a1b7
>
>
> If you read this far, you get a cookie! I welcome your feedback on merging
> optimization #1 into core, as well as any accusations of insanity re: #2 and
> #3. FWIW, everything works great without touching any of the tests on 3 of
> our Django sites, totaling over 2000 tests.
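
For anyone skimming the thread, the flow in optimization #1 (steps a to d) can be sketched without Django at all. This is only an illustration using the stdlib sqlite3 and unittest modules, not Erik's implementation (that lives at the test-utils link in the quoted message). The users table, FIXTURE_ROWS, the class names and run_demo() are all made up for the example, and an in-memory SQLite database stands in for a real transactional backend.

```python
import sqlite3
import unittest

# Hypothetical fixture data standing in for a Django fixture file.
FIXTURE_ROWS = [(1, "alice"), (2, "bob")]


class ClassLevelFixtureTestCase(unittest.TestCase):
    """Loads fixtures once per class and rolls back after every test."""

    @classmethod
    def setUpClass(cls):
        # Step a: load the fixtures once, at the top of the class, and commit.
        cls.conn = sqlite3.connect(":memory:")
        cls.conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        cls.conn.executemany("INSERT INTO users VALUES (?, ?)", FIXTURE_ROWS)
        cls.conn.commit()

    def tearDown(self):
        # Step c: roll back whatever the test wrote, restoring the
        # committed fixtures without reloading them.
        self.conn.rollback()

    @classmethod
    def tearDownClass(cls):
        # Step d: expressly clear out what the fixtures added.
        cls.conn.execute("DELETE FROM users")
        cls.conn.commit()
        cls.conn.close()

    def count_users(self):
        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


class UserTests(ClassLevelFixtureTestCase):
    # Step b: each test runs against the shared, already-loaded fixtures.
    def test_a_insert_is_visible_within_the_test(self):
        self.conn.execute("INSERT INTO users VALUES (3, 'carol')")
        self.assertEqual(self.count_users(), 3)

    def test_b_fixtures_are_pristine_again(self):
        # The row inserted by the previous test was rolled back.
        self.assertEqual(self.count_users(), 2)


def run_demo():
    """Run the two tests above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_demo() runs both tests on one connection. The fixture INSERTs happen once per class rather than once per test, which is where the drop in query count described above would come from.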

I would be very happy to test this against an Oracle database to see how much the patch improves speed. Running tests against Oracle has been a real pain so far, especially because recreating the database takes a very long time.

--
Jani Tiainen
