On Fri, 6 Apr 2007, Takayuki Tsunakawa wrote:

> could anyone evaluate O_SYNC approach again that commercial databases use and tell me if and why PostgreSQL's fsync() approach is better than theirs?

I noticed a big improvement switching the WAL to use O_SYNC (+O_DIRECT) instead of fsync on my big and my little servers with battery-backed cache, so I know sync writes perform reasonably well on my hardware. Since I've had problems with the fsync at checkpoint time, I did a similar test to yours recently, adding O_SYNC to the open calls and pulling the fsyncs out to get a rough idea how things would work.
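To make the two write paths concrete, here is a minimal sketch in Python rather than the backend's C. It is not PostgreSQL code: the function names, the 8192-byte block size, and the file layout are assumptions made for illustration only. The first function does buffered writes followed by a single fsync(); the second opens the file with O_SYNC so that every write() waits for stable storage. O_DIRECT is left out because it needs aligned buffers.

```python
import os

BLOCK_SIZE = 8192  # assumed block size, for illustration only


def _write_all(fd, data):
    """Loop until every byte is handed to the kernel (write() may be partial)."""
    view = memoryview(data)
    while view:
        n = os.write(fd, view)
        view = view[n:]


def write_then_fsync(path, blocks):
    """Buffered writes, then one fsync() to force everything out at the end."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for block in blocks:
            _write_all(fd, block)  # returns once the data is in the OS cache
        os.fsync(fd)               # single wait for all of it to become durable
    finally:
        os.close(fd)


def write_with_o_sync(path, blocks):
    """Open with O_SYNC so each write() returns only after reaching storage."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        for block in blocks:
            _write_all(fd, block)  # every call waits for the device
    finally:
        os.close(fd)               # no fsync() needed; writes were synchronous
```

Both functions leave the same bytes on disk. The difference is where the waiting happens: once at the end in the first case, and on every block in the second. os.O_SYNC is only available on POSIX platforms.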

Performance was reasonable most of the time, but when I hit a checkpoint with a lot of the buffer cache dirty it was incredibly bad. It took minutes to write everything out, compared with a few seconds for the current case, and the background writer was too sluggish to help either. This appears to match your data.

If you compare how Oracle handles their writes and checkpoints to the Postgres code, it's obvious they have a different architecture that enables them to support sync writing usefully. I'd recommend the Database Writer Process section of http://www.lc.leidenuniv.nl/awcourse/oracle/server.920/a96524/c09procs.htm as an introduction for those not familiar with that; it's interesting reading for anyone tinkering with background writer code.

It would be great to compare performance of the current PostgreSQL code with a fancy multiple background writer version using the latest sync methods or AIO; there have actually been multiple updates to improve O_SYNC writes within Linux during the 2.6 kernel series that make this more practical than ever on that platform. But as you've already seen, the performance hurdle to overcome is significant, and it would have to be optional as a result. When you add all this up--having to keep the current non-sync writes around as well, needing to redesign the whole background writer/checkpoint approach around the idea of sync writes, and dealing with the OS-specific parts that would come from things like AIO--it gets real messy. Good luck drumming up support for all that when the initial benchmarks suggest it's going to be a big step back.
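For a rough picture of what "multiple background writers doing sync writes" could mean, here is a toy sketch in Python. Everything in it is hypothetical: the function names, the dict of dirty pages, the 8192-byte block size, and the use of threads in place of separate writer processes. It is neither Oracle's database writer design nor anything in the PostgreSQL tree. It only shows the idea of splitting the dirty pages into contiguous ranges and letting several writers issue O_SYNC writes at once, so that no single writer waits on the device for every page in turn.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 8192  # assumed block size, for illustration only


def flush_partition(fd, pages):
    """Write one writer's share of pages; fd is assumed opened with O_SYNC."""
    written = 0
    for page_no, data in pages:
        # pwrite() takes an explicit offset, so writers can share one descriptor
        # (a real writer would also have to handle short writes)
        written += os.pwrite(fd, data, page_no * BLOCK_SIZE)
    return written


def checkpoint_with_writers(path, dirty_pages, n_writers=4):
    """Split {page_no: bytes} across n_writers threads doing sync writes."""
    pages = sorted(dirty_pages.items())  # order by page number
    if not pages:
        return 0
    chunk = -(-len(pages) // n_writers)  # ceiling division
    parts = [pages[i:i + chunk] for i in range(0, len(pages), chunk)]

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
    try:
        with ThreadPoolExecutor(max_workers=n_writers) as pool:
            # each thread blocks in its own sync write, so waits can overlap
            return sum(pool.map(lambda part: flush_partition(fd, part), parts))
    finally:
        os.close(fd)
```

Whether overlapping the waits like this actually helps depends on the storage and the kernel, which is exactly the part that would need benchmarking. os.pwrite and os.O_SYNC are POSIX-only.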

* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
