Ovid, I lean more towards Mark's approach on this one, albeit with a slight twist.
> For many of the test suites I've worked on, the business rules are complex enough that this is a complete non-starter. I *must* have a database in a known-good state at the start of every test run.
>
> is $customer_table->count, 2, "We should find the correct number of records";

I've just never written a test like that. I can't think that I've ever needed to, really ... although I think I vaguely remember writing something which used $^T to guarantee I was only pulling records that I'd added, so perhaps that's close. But that was only once.

We have several databases, but unit tests definitely don't have their own. Typically unit tests run either against the dev database or the QA database. Primarily, they run against whichever database the current developer has their config pointed to. This has to be the case, since sometimes we make modifications to the schema. If the unit tests all ran against their own database, then my unit tests for my new feature involving the schema change would necessarily fail. Or, contrariwise, if I made the schema modification on the unit test database, then every other dev's unit tests would fail.

I suppose if we were using MySQL, it might be feasible to create a new database on the fly for every unit test run. When you're stuck with Oracle, though ... not so much. :-/

So all our unit tests just connect to whatever database you're currently pointed at, they all create their own data, and they all roll it back at the end. In fact, our common test module (which is based on Test::Most) does the rollback for you. In fact in fact, it won't allow you to commit. So there's never anything to clean up.

As far as leaving the data around for debugging purposes, we've never needed that. The common test module exports a `DBdump` function that will dump out whatever records you need. If you run into a problem with the data and you need to see what the data is, you stick a DBdump in there. When you're finished debugging, you either comment it out or (better yet) change it from `diag DBdump` to `note DBdump`, and that way you can get the dump back any time just by adding -v to your prove.

AFAIK the only time anyone's ever asked me to make it possible for the data to hang around afterwards was when the QA department was toying with the idea of using the common test module to create test data for their manual testing scenarios, but they eventually found another way around that. Certainly no one's ever asked me to do so for a unit test. If they did, there's a way to commit if you really, really want to--I just don't tell anyone what it is. ;->

Our data generation routines generate randomized data for things that have to be unique (e.g. email addresses) using modules such as String::Random. In the unlikely event of a collision, they just retry a few times. If a completely randomly generated string isn't unique after, say, 10 tries, you've probably got a bigger problem anyway. Once it's inserted, we pull it back out again using whatever unique key we generated, so we never have a need to count records or anything like that. Perhaps we count the number of records _attached_ to a record we inserted previously in the test, but that obviously isn't impacted by having extra data in the table. Unlike Mark, I won't say we _count_ on the random data being in the DB; we just don't mind it. We only ever look at the data we just inserted.
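To make that last bit concrete, here's a minimal sketch of the retry loop, assuming a DBI handle with RaiseError on; the helper name, the customers table, and its unique email column are all made up for illustration:

```perl
use strict;
use warnings;
use DBI;
use String::Random;

# Hypothetical helper: insert a row with a random unique email and
# return that email so the test can pull the row back out by it later.
sub insert_random_customer {
    my ($dbh) = @_;
    my $rand = String::Random->new;

    for my $try (1 .. 10) {
        # e.g. "qzpfwkta4821@example.com"
        my $email = $rand->randregex('[a-z]{8}[0-9]{4}') . '@example.com';

        # Skip this candidate if somebody already has it ...
        my ($taken) = $dbh->selectrow_array(
            'SELECT COUNT(*) FROM customers WHERE email = ?', undef, $email,
        );
        next if $taken;

        # ... otherwise insert it and hand back the unique key we generated.
        $dbh->do('INSERT INTO customers (email) VALUES (?)', undef, $email);
        return $email;
    }

    # 10 collisions on a random 12-character string means something else
    # is broken; give up loudly rather than loop forever.
    die "couldn't generate a unique email after 10 tries";
}
```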
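And, for the curious, the no-commit trick is less exotic than it sounds. Here's a rough sketch of one way to do it with standard DBI subclassing; the package names are invented and our real module does considerably more than this:

```perl
use strict;
use warnings;
use DBI;

# Hypothetical names throughout; sketch of the idea, not our actual code.
package Our::Test::DBI;
use parent 'DBI';

package Our::Test::DBI::db;
use parent -norequire, 'DBI::db';

# Refuse to commit, period. (The real back door stays a secret. ;-> )
sub commit { die "no committing from unit tests!" }

package Our::Test::DBI::st;
use parent -norequire, 'DBI::st';

package Our::Test::DB;

my $dbh;

# Every test gets the same handle, with AutoCommit off, so all of its
# inserts live inside one big transaction.
sub test_dbh {
    $dbh ||= Our::Test::DBI->connect(
        $ENV{TEST_DSN}, $ENV{TEST_USER}, $ENV{TEST_PASS},
        { AutoCommit => 0, RaiseError => 1 },
    );
    return $dbh;
}

# When the test process exits, everything it created evaporates.
END { $dbh->rollback if $dbh }

1;
```

The nice side effect is that two tests running at once each see only their own uncommitted rows, which is exactly what the next bit relies on.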
And, since all unit test data is in a transaction (whether ours or someone else's who happens to be running a unit test at the same time), the unit tests can't conflict with each other, or with themselves (i.e. we do run all our unit tests in parallel). The only problems we ever see with this approach are:

* The performance of the unit tests can be bad if lots and lots of things are hitting the same tables at the same time.
* If the inserts or updates aren't judicious with their locking, some tests can lock other tests out of the tables they want.

And the cool thing there is, both of those issues expose problems in the implementation that need fixing anyway: scalability problems and potential DB contention issues. So forcing people to fix those in order to make their unit tests run smoothly is a net gain.

Anyways, just wanted to throw in yet another perspective.

-- Buddy