On Mon, May 18, 2015 at 1:53 AM, Robert Haas <robertmh...@gmail.com> wrote:
>
> On May 17, 2015, at 11:04 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Sun, May 17, 2015 at 7:45 AM, Robert Haas <robertmh...@gmail.com> wrote:
> >
> > <crazy-idea>I wonder if we could write WAL to two different files in
> > alternation, so that we could be writing to one file which fsync-ing
> > the other.</crazy-idea>
>
> Won't the order of transactions replay during recovery can cause
> problems if we do alternation while writing. I think this is one of
> the reasons WAL is written sequentially. Another thing is that during
> recovery, currently whenever we encounter mismatch in stored CRC
> and actual record CRC, we call it end of recovery, but with writing
> to 2 files simultaneously we might need to rethink that rule.
>
>
> Well, yeah. That's why I said it was a crazy idea.
>

Another idea could be to write in units of the disk sector size, which I
think is 512 bytes in most cases (some newer disks have larger sectors,
so it should be configurable in some way). With this, ideally we don't
need a CRC for each WAL record, as a sector's data will either be written
completely or not at all. Even if we don't want to rely on sector-sized
writes being atomic, we can have a configurable CRC per writable unit
(which in this scheme would be 512 bytes).

This can have a dual benefit. First, it can help minimize the
repeated-writes problem; second, by eliminating the need for a CRC on
each record, it can reduce WAL volume and CPU load.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
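
P.S. To make the CRC-per-writable-unit idea a bit more concrete, here is a
rough sketch in Python. It is only an illustration, not PostgreSQL code:
the unit layout (a 2-byte used-length header, payload, 4-byte CRC), the
names pack_units and read_valid_prefix, and the use of zlib.crc32 as a
stand-in checksum are all assumptions made for the example. The reader
stops at the first unit whose CRC does not match, which mirrors the
current "CRC mismatch means end of recovery" rule, just applied per unit
instead of per record.

```python
import struct
import zlib

UNIT_SIZE = 512                   # assumed sector size; would be configurable
HEADER = struct.Struct("<H")      # hypothetical: payload bytes used in this unit
TRAILER = struct.Struct("<I")     # hypothetical: CRC over header + payload
PAYLOAD_MAX = UNIT_SIZE - HEADER.size - TRAILER.size


def pack_units(data: bytes) -> bytes:
    """Split a WAL byte stream into fixed-size units, each with its own CRC."""
    out = bytearray()
    for off in range(0, len(data), PAYLOAD_MAX):
        chunk = data[off:off + PAYLOAD_MAX]
        # Pad the last chunk so every unit is exactly UNIT_SIZE bytes.
        body = HEADER.pack(len(chunk)) + chunk.ljust(PAYLOAD_MAX, b"\0")
        out += body + TRAILER.pack(zlib.crc32(body))
    return bytes(out)


def read_valid_prefix(buf: bytes) -> bytes:
    """Return the payload up to the first unit that fails its CRC check."""
    out = bytearray()
    for off in range(0, len(buf) - UNIT_SIZE + 1, UNIT_SIZE):
        unit = buf[off:off + UNIT_SIZE]
        body = unit[:-TRAILER.size]
        (crc,) = TRAILER.unpack(unit[-TRAILER.size:])
        if zlib.crc32(body) != crc:
            break                 # treat the mismatch as end of valid WAL
        (used,) = HEADER.unpack(body[:HEADER.size])
        out += body[HEADER.size:HEADER.size + used]
    return bytes(out)
```

With this layout the checksum overhead is fixed per 512-byte unit rather
than per record, so whether it actually reduces WAL volume depends on the
average record size; that would need measuring.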