On Thu, Jun 27, 2013 at 12:54 PM, Atri Sharma <atri.j...@gmail.com> wrote:
>> Well, it does take longer to fsync a larger byte range to disk than a
>> smaller byte range, in some cases. But it's generally more efficient
>> to write one larger range than many smaller ranges, so you come out
>> ahead on the whole.
>
> Right, that does make sense.
>
> So, the overhead of writing a lot of WAL buffers is mitigated because
> one large write is better than lots of smaller writes?

Yep. To take a degenerate case, suppose that you had many small WAL
records, say 64 bytes each, so more than 100 per 8K block. If you
flush those one by one, you're going to rewrite that block more than
100 times. If you flush them all at once, you write that block once.

But even when the range is more than the minimum write size (8K for
WAL), there are still wins. Writing 16K or 24K or 32K submitted as a
single request can likely be done in a single rotation of the platter.
But if you write 8K and wait until it's done, and then write another
8K and wait until that's done, the second request may not arrive until
after the disk head has passed the position where the second block
needs to go. Now you have to wait for the platter to spin back around
to the right position.

There are very few I/O operations where batching smaller requests into
larger chunks doesn't help to some degree. Of course, the details, and
in particular the optimal transfer size, vary considerably based on
the type of I/O and the specific hardware in use.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
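
To put rough numbers on the degenerate case above, here is a minimal
sketch in Python. It is a toy model, not PostgreSQL code: the function
name block_writes and its parameters are made up for illustration, and
it ignores WAL record and page headers, alignment, and everything the
hardware does. It only counts how many 8K block writes get issued when
records are flushed one at a time versus all at once.

```python
BLOCK_SIZE = 8192  # 8K WAL block, as in the discussion above


def block_writes(record_sizes, batch):
    """Count block-sized writes needed to flush a run of records.

    Hypothetical model: records are laid out back to back starting at
    offset 0, and every size is a positive number of bytes.

    batch=False: flush after each record, rewriting every block that
                 the record touches.
    batch=True:  flush once at the end, writing each touched block once.
    """
    if batch:
        total = sum(record_sizes)
        # Ceiling division: number of blocks covering `total` bytes.
        return -(-total // BLOCK_SIZE)

    writes = 0
    offset = 0
    for size in record_sizes:
        first_block = offset // BLOCK_SIZE
        last_block = (offset + size - 1) // BLOCK_SIZE
        writes += last_block - first_block + 1
        offset += size
    return writes


# 128 records of 64 bytes each fill exactly one 8K block.
records = [64] * 128
one_by_one = block_writes(records, batch=False)
all_at_once = block_writes(records, batch=True)
print(one_by_one, all_at_once)
```

Under these assumptions the one-by-one case rewrites the same block
once per record, while the batched case writes it a single time. The
rotational-latency effect for multi-block writes is a separate win
that this count does not capture.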