Craig Ringer writes:
> On 07/18/2012 06:56 AM, Tom Lane wrote:
>> This implies that nobody has done pull-the-plug testing on either HEAD
>> or 9.2 since the checkpointer split went in (2011-11-01)

> That makes me wonder if on top of the buildfarm, extending some 
> buildfarm machines into a "crashfarm" is needed:

Not sure if we need a whole "farm", but certainly having at least one
machine testing this sort of stuff on a regular basis would make me feel
a lot better.

> The main challenge would be coming up with suitable tests to run, ones 
> that could then be checked to make sure nothing was broken.

One fairly simple test scenario could go like this:

        * run the regression tests
        * pg_dump the regression database
        * run the regression tests again
        * hard-kill immediately upon completion
        * restart database, allow it to perform recovery
        * pg_dump the regression database
        * diff previous and new dumps; should be the same
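
As a rough, untested sketch, the steps above could be scripted along
these lines.  Several things here are assumptions and not part of the
proposal itself: the function names, the dump file names, the use of
"make installcheck" against a configured source tree at a hypothetical
srcdir, and above all the use of "pg_ctl stop -m immediate" as a
stand-in for the hard kill.  An immediate shutdown forces crash recovery
at the next start, but it does not discard OS caches the way a real
power loss would, so it is a weaker test than pulling the plug.

```python
import difflib
import subprocess


def crash_test_commands(pgdata, srcdir, dump_before, dump_after):
    """Return one argv list per step of the scenario, in order."""
    # Assumption: "make installcheck" in srcdir runs the regression tests
    # against the already-running server and leaves a "regression" database.
    regress = ["make", "-C", srcdir, "installcheck"]
    return [
        regress,                                               # run the regression tests
        ["pg_dump", "-f", dump_before, "regression"],          # dump the regression database
        regress,                                               # run the regression tests again
        ["pg_ctl", "stop", "-D", pgdata, "-m", "immediate"],   # stand-in for the hard kill
        ["pg_ctl", "start", "-D", pgdata, "-w"],               # restart and wait for recovery
        ["pg_dump", "-f", dump_after, "regression"],           # dump again
    ]


def dump_diff(before_text, after_text):
    """Return unified-diff lines; an empty list means the dumps match."""
    return list(difflib.unified_diff(
        before_text.splitlines(), after_text.splitlines(),
        "before", "after", lineterm=""))


def run_scenario(pgdata, srcdir, dump_before="before.sql", dump_after="after.sql"):
    """Execute every step, then compare the two dumps."""
    for argv in crash_test_commands(pgdata, srcdir, dump_before, dump_after):
        subprocess.check_call(argv)  # raises if any step fails
    with open(dump_before) as f1, open(dump_after) as f2:
        return dump_diff(f1.read(), f2.read())
```

Calling run_scenario() with a data directory and source tree returns an
empty list when the two dumps are identical; any returned lines are the
discrepancies to investigate.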

The main thing this wouldn't cover is discrepancies in user indexes,
since pg_dump doesn't do anything that's likely to result in indexscans
on user tables.  It ought to be enough to detect the sort of system-wide
problem we're talking about here, though.

In general I think the hard part is automated reproduction of an
OS-crash scenario, but your ideas about how to do that sound promising.
Once we have that going, it shouldn't be hard to come up with tests
of the form "do X, hard-crash, recover, check X still looks sane".
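
To illustrate that test shape, here is a minimal sketch of a harness.
Everything in it is hypothetical: the function name and its four
callable parameters are made up for illustration, and how the crash and
the recovery are actually performed is left entirely to the caller.

```python
def crash_recovery_check(do_x, hard_crash, recover, check_x):
    """Run one "do X, hard-crash, recover, check X" test.

    do_x       -- performs the work and returns whatever describes the
                  expected state afterwards
    hard_crash -- brings the server down abruptly (left to the caller)
    recover    -- restarts the server and waits for recovery to finish
    check_x    -- receives the value from do_x and returns True if the
                  recovered state still looks sane
    """
    expected = do_x()
    hard_crash()
    recover()
    return bool(check_x(expected))


if __name__ == "__main__":
    # Toy usage with an in-memory stand-in for a database.
    state = {"rows": 0}

    def insert_rows():
        state["rows"] = 100
        return 100

    ok = crash_recovery_check(
        do_x=insert_rows,
        hard_crash=lambda: None,   # a real test would kill the server here
        recover=lambda: None,      # a real test would restart it here
        check_x=lambda expected: state["rows"] == expected,
    )
    print("sane" if ok else "BROKEN")
```

Keeping the four phases as separate callables means the crash mechanism
can be swapped out later without touching the individual tests.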

> What else should be checked? The main thing that comes to mind for me is 
> something I've worried about for a while: that Pg might not always 
> handle out-of-disk-space anywhere near as gracefully as it's often 
> claimed to.


                        regards, tom lane
