On Thu, 2011-08-04 at 18:07 -0400, Bruce Momjian wrote: > I am confused how generating WAL traffic that is larger than the heap > file we are fsync'ing can possibly be slower. Are you just throwing out > an idea to try to make me prove it?
That's worded in a slightly confusing way, but here is the trade-off: 1. If you are using WAL, then regardless of what your transaction does, only the WAL needs to be fsync'd at commit time. Conveniently, that's being written sequentially, so it's a single fairly cheap fsync; and all the data page changes are deferred, collected together, and fsync'd at checkpoint time (rather than commit time). The cost is that you double-write the data. 2. If you aren't using WAL, you need to fsync every data file the transaction touched, which are probably not localized with other activity. Also, the _entire_ data files needs to be sync'd, so perhaps many other transactions have made changes to one data file all over, and it may require _many_ seeks to accomplish the one fsync. The benefit is that you don't double-write the data. So, fundamentally, WAL is (in the OLTP case, where a transaction is much shorter than a checkpoint interval) a big performance _win_, because it allows us to do nice sequential writing in a single place for all activities of all transactions; and defer all those random writes to data pages until the next checkpoint. So we shouldn't treat WAL like a cost burden that we want to avoid in every case we can. But in the data load case (where many checkpoints may happen during a single transaction anyway), it happens that avoiding WAL is a performance win, because the seeks are not the dominant cost. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers