Re: [HACKERS] improving concurrent transaction commit rate

Greg Smith Tue, 24 Mar 2009 20:24:28 -0700

On Tue, 24 Mar 2009, Sam Mason wrote:

> The conceptual idea is to have at most one outstanding flush for the
> log going through the filesystem at any one time.


Quoting from src/backend/access/transam/xlog.c, inside XLogFlush:

"Since fsync is usually a horribly expensive operation, we try to piggyback as much data as we can on each fsync: if we see any more data entered into the xlog buffer, we'll write and fsync that too, so that the final value of LogwrtResult.Flush is as large as possible. This gives us some chance of avoiding another fsync immediately after."

The logic implementing that idea takes care of bunching up flushes for any WAL data that also happens to be ready to go at that point. You can see this most easily by doing inserts into a system that's limited by a slow fsync, like a single disk without a write cache, where you're bound by the rotation speed. If you have, say, a 7200RPM disk, that's 120 rotations/second, so no one client can commit faster than 120 times/second. But if you have 10 clients all pushing small inserts in, it's fairly easy to see >500 transactions/second, because a bunch of commits will get batched up during the time the last fsync is waiting for the disk to finish.
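To make the arithmetic concrete, here is a toy model of that batching. It is only a sketch under stated assumptions, not how xlog.c is structured: every fsync costs one full rotation (60/7200 s, rounded to 8333 microseconds), a client's commit record reaches the WAL buffer instantly, and each flush covers every record that was waiting when it started. The function name and parameters are made up for illustration.

```python
def simulate_group_commit(clients, fsync_us=8333, duration_us=1_000_000, think_us=0):
    """Count commits finished in duration_us under an idealized batching model.

    Assumptions (not taken from xlog.c): each fsync takes exactly fsync_us,
    and one fsync covers every commit record already waiting when it starts.
    think_us is the pause a client takes between one commit and its next.
    """
    ready_at = [0] * clients   # time each client's next commit record is waiting
    now = 0                    # time the previous fsync finished
    commits = 0
    while True:
        # The next flush starts once the disk is free and a record is waiting.
        start = max(now, min(ready_at))
        end = start + fsync_us
        if end > duration_us:
            return commits
        for i in range(clients):
            if ready_at[i] <= start:
                # This record rides along on the flush that is about to run.
                commits += 1
                ready_at[i] = end + think_us
        now = end


print(simulate_group_commit(1))    # 120: one commit per rotation
print(simulate_group_commit(10))   # 1200: ten commits share each fsync
```

The 1200 figure is the ceiling for this idealized model. Real clients spend time between commits and don't all line up for every flush, which is why the observed number lands somewhere between the single-client limit and that ceiling.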

The other idea you'll already find implemented in there is controlled by commit_delay. If there are more than commit_siblings worth of open transactions at the point where a commit is supposed to happen, that will pause commit_delay microseconds in hopes that other transactions will jump onboard via the mechanism described above. In practice, it's very hard to tune that usefully. You can use it to help bunch together commits a bit better into bigger batches on a really busy system (where not having more than one commit ready is unexpected), but it's not much help outside of that context.
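The decision described there boils down to something like the sketch below. This is a paraphrase in Python, not the server's C code: the function name, its arguments, and the injectable sleep are hypothetical, and treating the threshold as "at least commit_siblings" is my reading of the setting rather than something to rely on.

```python
import time


def maybe_delay_before_flush(open_xacts, commit_delay_us, commit_siblings,
                             sleep=time.sleep):
    """Pause before flushing if enough other transactions are open.

    Hypothetical helper for illustration only. Returns True if it slept.
    open_xacts: number of other transactions currently open (assumed input).
    """
    # commit_delay = 0 disables the feature entirely.
    if commit_delay_us > 0 and open_xacts >= commit_siblings:
        # Wait in the hope that other commits reach the WAL buffer,
        # so that a single fsync can cover all of them.
        sleep(commit_delay_us / 1_000_000)
        return True
    return False


# Quiet system: only 2 other open transactions, threshold 5, so no pause.
print(maybe_delay_before_flush(2, commit_delay_us=100, commit_siblings=5))
```

The sleep is a bet: it adds latency to this commit on the chance that it saves a whole fsync for several others, which is why it only pays off on a system busy enough that more commits are likely to arrive during the pause.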

Check out the rest of the comments in xlog.c; there's a lot in there that's not really covered in the README. If you turn on WAL_DEBUG and XLOG_DEBUG you can actually watch some of this happen. I found time spent reading the source to that file and src/backend/storage/buffer/bufmgr.c to be really well spent. Some of the most interesting parts of the codebase to understand from a low-level performance tuning perspective are in those two.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
