On 7/22/13 2:57 PM, Andres Freund wrote:
* I'd be very surprised if this doesn't make WAL replay of update heavy workloads slower by at least factor of 2.
I was thinking about what a benchmark of WAL replay would look like last year. I don't think that data is captured very well yet, and it should be.
My idea was to break the benchmark into two pieces. One would take a base backup, then run a series of tests and archive the resulting WAL. I doubt you can make a useful benchmark here without a usefully populated database; that's why the base backup step is needed.
The first useful result, then, is to measure how long commit/archiving took and the WAL volume, which is what's done by the test harness for this program. Next, the resulting backup would be set up for replay. Tarring up the backup and WAL archive could even give you a repeatable test set for runs where only the replay side changes. Then the main number that's useful, total replay time, would be measured.
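The two-phase harness described above could be sketched roughly like this. To be clear, this is not an actual harness from this thread; all paths, the pgbench workload, the archive setup, and the timing points are illustrative assumptions:

```shell
#!/bin/sh
# Hypothetical sketch of the two-phase WAL replay benchmark.
# Assumes a running primary configured with archive_mode = on and an
# archive_command that copies WAL into $ARCHIVE. Paths are placeholders.
set -e

PGDATA=/srv/pg/replay-test     # where the replay copy will live
BACKUP=/srv/pg/backup          # base backup of the populated database
ARCHIVE=/srv/pg/wal_archive    # where archive_command deposits WAL

# Phase 1: capture. Base backup of a usefully populated database,
# then run the workload while its WAL is archived. The elapsed time
# here gives the commit/archiving cost; du gives the WAL volume.
pg_basebackup -D "$BACKUP"
time pgbench -T 600 bench      # stand-in for an update-heavy workload
du -sh "$ARCHIVE"

# Snapshot backup + archive so the replay test is repeatable: the same
# tarball can be replayed under different configurations or builds.
tar czf replay-testset.tar.gz "$BACKUP" "$ARCHIVE"

# Phase 2: replay. Restore the backup, point recovery at the archive,
# and time how long the server takes to consume all of the WAL.
rm -rf "$PGDATA" && cp -a "$BACKUP" "$PGDATA"
echo "restore_command = 'cp $ARCHIVE/%f %p'" >> "$PGDATA/postgresql.conf"
touch "$PGDATA/recovery.signal"
time pg_ctl -D "$PGDATA" -w start   # total replay time: the headline number
```

Since the input tarball is fixed, rerunning phase 2 with a different I/O scheduler, configuration, or patched binary isolates exactly the replay-side change being measured.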
The main thing I wanted this for wasn't code changes; it was to benchmark configuration changes. I'd like to be able to answer questions like "which I/O scheduler is best for a standby?" in a way that has real test data behind it. The same approach should be useful for answering your concerns about the replay performance impact of this change, though.
* It makes changeset extraction either more expensive or it would have to be disabled there.
That argues that, if this is committed at all, the ability to turn it off that I was asking about would be necessary. It sounds like this *could* work the way the minimal WAL level does, where the lower level allows optimizations that are disabled at higher ones--like the COPY into a truncated/new table cheat.
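For reference, the existing "cheat" mentioned looks like this (a sketch with hypothetical table and file names): with wal_level = minimal, a COPY into a table created or truncated in the same transaction can skip WAL entirely and just fsync the relation file at commit, an optimization that has to be disabled at the higher WAL levels because standbys and archives need the data in WAL.

```shell
# Sketch of the wal_level = minimal bulk-load optimization.
# Table name and input file are hypothetical.
psql bench <<'SQL'
BEGIN;
CREATE TABLE bulk_load (id int, payload text);   -- new in this transaction
COPY bulk_load FROM '/tmp/data.csv' (FORMAT csv);
-- With wal_level = minimal, the COPY above need not be WAL-logged;
-- the relation file is synced directly at commit. At replica/logical
-- levels this shortcut is disabled so the rows reach the WAL stream.
COMMIT;
SQL
```

A per-feature off switch for the change under discussion could follow the same pattern: cheap by default where nothing downstream consumes the extra WAL detail, disabled when changeset extraction needs it.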
--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers