Re: Test optimizations (2-5x as fast)

David Cramer Fri, 13 May 2011 22:17:50 -0700

More quick notes. You can do something like this to handle the
flushing:

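                    from django.core.management.color import no_style
                    from django.db import connection

                    # `tables` is the list of table names the fixtures loaded into
                    cursor = connection.cursor()
                    sql_list = connection.ops.sql_flush(
                        no_style(), tables,
                        connection.introspection.sequence_list())
                    for sql in sql_list:
                        cursor.execute(sql)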


Unfortunately, this still relies on nothing having been created via
signals that uses constraints. For us that is very common, and I can't
imagine we're an edge case there.
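
To make the constraint problem concrete, here is a minimal sketch using the standard-library sqlite3 module rather than Django. The table names and the flush() helper are hypothetical, chosen only for illustration: a row created outside the fixtures (as a signal handler might do) references a fixture row, so flushing only the fixture tables fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE profiles ("
    "id INTEGER PRIMARY KEY, "
    "user_id INTEGER NOT NULL REFERENCES users(id))"
)
conn.execute("INSERT INTO users VALUES (1)")        # stands in for a fixture row
conn.execute("INSERT INTO profiles VALUES (1, 1)")  # stands in for a signal-created row
conn.commit()

# Hypothetical: only the tables named in fixtures are known to the flush.
fixture_tables = ["users"]


def flush(tables):
    """Delete all rows from the given tables; return True on success."""
    try:
        for table in tables:
            conn.execute("DELETE FROM %s" % table)
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # A row in some other table still references a row being deleted.
        conn.rollback()
        return False


flushed_fixture_tables_only = flush(fixture_tables)
flushed_with_dependents = flush(["profiles", "users"])
```

In this sketch the first flush is rejected by the foreign key, and it only succeeds once the dependent table is cleared first, which is the information a fixture-derived table list does not have.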

On May 13, 9:42 pm, David Cramer <[email protected]> wrote:
> You sir, are my personal hero for the day :)
>
> We had also been looking at how we could speed up the fixture loading
> (we were almost ready to go so far as to make one giant fixture that
> just loaded at the start of the test runner). This is awesome progress.
>
> On May 13, 4:57 pm, Erik Rose <[email protected]> wrote:
>
> > tl;dr: I've written an alternative TestCase base class which makes 
> > fixture-using tests much more I/O efficient on transactional DBs, and I'd 
> > like to upstream it.
>
> > Greetings, all! This is my first django-dev post, so please be gentle. :-) 
> > I hack on support.mozilla.com, a fairly large Django site with about 1000 
> > tests. Those tests make heavy use of fixtures and, as a result, used to 
> > take over 5 minutes to run. So, I spent a few days seeing if I could cut 
> > the amount of DB I/O needed. Ultimately, I got the run down to just over 1 
> > minute, and almost all of those gains are translatable to any Django site 
> > running against a transactional DB. No changes to the apps themselves are 
> > needed. I'd love to push some of this work upstream, if there's interest 
> > (or even lack of opposition ;-)).
>
> > The speedups came from 3 main optimizations:
>
> > 1. Class-level fixture setup
>
> > Given a transactional DB, there's no reason to reload fixtures via dozens 
> > of SQL statements before every test. I made use of setUpClass() and 
> > tearDownClass() (yay, unittest2!) to change the flow for TestCase-using 
> > tests to this:
> >     a. Load the fixtures at the top of the class, and commit.
> >     b. Run a test.
> >     c. Roll back, returning to pristine fixtures. Go back to step b.
> >     d. At class teardown, figure out which tables the fixtures loaded into, 
> > and expressly clear out what was added.
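
The flow in steps a to d can be sketched without Django, using only the standard-library unittest and sqlite3 modules. This is an illustration under stated assumptions, not the code from the post: the class name, the users table, and FIXTURE_ROWS are all hypothetical stand-ins.

```python
import sqlite3
import unittest

# Hypothetical fixture data; a real suite would read this from fixture files.
FIXTURE_ROWS = [(1, "alice"), (2, "bob")]


class ClassLevelFixtureTestCase(unittest.TestCase):
    """Load fixtures once per class; roll back after every test."""

    @classmethod
    def setUpClass(cls):
        # isolation_level=None turns off sqlite3's implicit transactions,
        # so the BEGIN/COMMIT/ROLLBACK statements below are explicit.
        cls.conn = sqlite3.connect(":memory:", isolation_level=None)
        cls.conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        # Step a: load the fixtures once and commit.
        cls.conn.execute("BEGIN")
        cls.conn.executemany("INSERT INTO users VALUES (?, ?)", FIXTURE_ROWS)
        cls.conn.execute("COMMIT")

    def setUp(self):
        # Step b: each test runs inside its own transaction.
        self.conn.execute("BEGIN")

    def tearDown(self):
        # Step c: roll back, returning to the pristine fixtures.
        self.conn.execute("ROLLBACK")

    @classmethod
    def tearDownClass(cls):
        # Step d: expressly clear the tables the fixtures loaded into.
        cls.conn.execute("DELETE FROM users")
        cls.conn.close()

    def count(self):
        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    def test_delete_is_rolled_back(self):
        self.conn.execute("DELETE FROM users")
        self.assertEqual(self.count(), 0)

    def test_insert_is_rolled_back(self):
        self.conn.execute("INSERT INTO users VALUES (3, 'carol')")
        self.assertEqual(self.count(), 3)
```

The two tests only pass together if the rollback really restores the fixtures: the second one expects the two fixture rows to be back after the first one deleted everything.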
>
> > Before this optimization: 302s to run the suite
> > After: 97s.
>
> > Before: 37,583 queries
> > After: 4,116
>
> > On top of that, an additional 4s was saved by reusing a single connection 
> > rather than opening and closing them all the time, bringing the final 
> > number down to 93s. (We can get away with this because we're committing any 
> > on-cursor-initialization setup, whereas the old TestCase rolled it back.)
>
> > Here's the code: 
> > https://github.com/erikrose/test-utils/blob/master/test_utils/__init_....
> > I'd love to generalize it a bit (to fall back to the old behavior with 
> > non-transactional backends, for example) and offer it as a patch to Django 
> > proper, replacing TestCase. Thoughts?
>
> > (If you notice that copy-and-paste of loaddata sitting off to the side in 
> > another module, don't fret; in the patch, that would turn into a 
> > refactoring of loaddata to make the computation of the fixture-referenced 
> > tables separately reusable.)
>
> > 2. Fixture grouping
>
> > I next observed that many test classes reused the same sets of fixtures, 
> > often via subclassing. After the previous optimization, our tests still 
> > loaded fixtures 114 times, even though there were only 11 distinct sets of 
> > them. So, I thought: why not write a custom test runner that buckets the 
> > classes by fixture set and advises the classes that, unless they're the 
> > first or last in a bucket, they shouldn't bother setting up or tearing down 
> > the fixtures, respectively? This took the form of a custom nose plugin (we 
> > use nose for all our Django stuff), and it took another quarter off the 
> > test run:
>
> > Before: 97s
> > After: 74s
>
> > Of course, test independence is still preserved. We're just factoring out 
> > pointlessly repeated setup.
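
The bucketing idea can be sketched in plain Python. This is not the nose plugin itself; bucket_by_fixtures(), plan(), and the stand-in test classes are hypothetical names, and treating fixture lists as order-insensitive sets is an assumption made for the example.

```python
def bucket_by_fixtures(test_classes):
    """Group test classes by their set of fixtures, keeping first-seen order."""
    buckets = {}
    for cls in test_classes:
        # Assumption: two classes share a bucket if they name the same
        # fixtures, regardless of the order in which they list them.
        key = frozenset(getattr(cls, "fixtures", None) or ())
        buckets.setdefault(key, []).append(cls)
    return buckets


def plan(test_classes):
    """Yield (cls, should_set_up, should_tear_down) in run order."""
    for classes in bucket_by_fixtures(test_classes).values():
        last = len(classes) - 1
        for i, cls in enumerate(classes):
            # Only the first class in a bucket loads the fixtures, and
            # only the last one clears them out again.
            yield cls, i == 0, i == last


# Stand-ins for real test classes; only the `fixtures` attribute matters here.
class UserTests:
    fixtures = ["users.json"]


class ForumTests:
    fixtures = ["users.json", "posts.json"]


class ProfileTests(UserTests):  # inherits its fixtures via subclassing
    pass


class ThreadTests:
    fixtures = ["posts.json", "users.json"]


schedule = [
    (cls.__name__, up, down)
    for cls, up, down in plan([UserTests, ForumTests, ProfileTests, ThreadTests])
]
```

With these four classes the fixtures would be loaded twice instead of four times, while every class still sees exactly the fixtures it declares.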
>
> > I don't really have plans to upstream this unless someone calls for it, but 
> > I'll be making it available soon, likely as part of django-nose.
>
> > 3. Startup optimizations
>
> > At this point, it was bothering me that, just to run a single test, I had 
> > to wait through 15s of DB initialization (mostly auth_permissions and 
> > django_content_type population)—stuff which was already perfectly valid 
> > from the previous test run. So, building on some work we had already done 
> > in this direction, I decided to skip the teardown of the test DB and, 
> > symmetrically, the setup on future runs. If you make schema changes, just 
> > set an env var, and it wipes and remakes the DB like usual. I could see 
> > pushing this into django-nose as well, but it's got the hackiest 
> > implementation and can potentially confuse users. I mention it for 
> > completeness.
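
The decision described above (reuse the existing test DB unless an env var asks for a rebuild) might look roughly like this. The variable name REBUILD_TEST_DB and the function are hypothetical, not taken from the linked commit, and how to detect an existing test DB is left to the caller.

```python
import os


def should_rebuild_test_db(db_exists, environ=None):
    """Decide whether to wipe and recreate the test DB before a run.

    REBUILD_TEST_DB is a hypothetical variable name used for illustration.
    """
    environ = os.environ if environ is None else environ
    forced = environ.get("REBUILD_TEST_DB", "").lower() in ("1", "true", "yes")
    # Rebuild when explicitly asked to (e.g. after a schema change), or
    # when there is no DB left over from a previous run to reuse.
    return forced or not db_exists
```

Passing the environment in as a parameter keeps the decision easy to test without touching the real process environment.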
>
> > Before: startup time 15s
> > After: 3s (There's quite a wide variance due to I/O caching luck.)
>
> > Code: https://github.com/erikrose/test-utils/commit/b95a1b7
>
> > If you read this far, you get a cookie! I welcome your feedback on merging 
> > optimization #1 into core, as well as any accusations of insanity re: #2 
> > and #3. FWIW, everything works great without touching any of the tests on 3 
> > of our Django sites, totaling over 2000 tests.
>
> > Best regards and wishes for a happy weekend,
> > Erik Rose
> > support.mozilla.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.
