Mark Wong wrote:
> O_DIRECT + fsync() can make sense. It avoids copying the data
> to the page cache before it is written, and it also guarantees
> that the file's metadata is written to disk. It also
> prevents the page cache from filling up with write data that
> will never be read (I assume it is only read if a recovery
> is necessary, which should be rare). It can also
> help disks with a write-back cache when using a journaling
> file system that uses I/O barriers. You would want to use
> large writes, since the kernel page cache won't be writing
> multiple pages for you.

Right, but it seems O_DIRECT is pretty much the same as O_DIRECT with
O_DSYNC, because the data is always written to disk on write(). Our
logic is that in most cases there is nothing left for fdatasync() to do
after using O_DIRECT, so the O_DIRECT/fdatasync() combination doesn't
make sense. And FreeBSD, and perhaps others, need O_SYNC or fdatasync()
with O_DIRECT, because O_DIRECT doesn't force data to disk in all cases.

> I need to look at the kernel code more to comment on O_DIRECT with
> O_SYNC.
>
> Questions:
>
> Does the database transaction logger preallocate the log file?

Yes.

> Does the logger care about the order in which each write hits the disk?

Not really.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
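
For reference, the combination discussed in this thread (a preallocated log file, large aligned writes through O_DIRECT, then fdatasync()) can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not PostgreSQL's actual WAL code: the helper names preallocate_log() and write_log_direct(), the 4096-byte alignment, and the 256 kB chunk size are all made up for the example, and it assumes a Linux/glibc system. If the file system rejects O_DIRECT, the sketch falls back to ordinary buffered writes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* exposes O_DIRECT on Linux/glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define LOG_ALIGN 4096               /* assumed alignment; real code should query the device */
#define LOG_CHUNK (64 * LOG_ALIGN)   /* one large write of 256 kB */

/* Hypothetical helper: reserve space for the log file up front. */
int preallocate_log(const char *path, off_t size)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    int rc = posix_fallocate(fd, 0, size);   /* returns an errno value, 0 on success */
    if (rc == 0 && fsync(fd) != 0)           /* flush the new size and allocation metadata */
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc == 0 ? 0 : -1;
}

/* Hypothetical helper: fill the first nchunks chunks of a preallocated log. */
int write_log_direct(const char *path, int nchunks)
{
    void *buf = NULL;
    int fd = -1, rc = -1;

    assert(nchunks > 0);
#ifdef O_DIRECT
    fd = open(path, O_WRONLY | O_DIRECT);
#endif
    if (fd < 0)
        fd = open(path, O_WRONLY);   /* fallback: buffered I/O if O_DIRECT is rejected */
    if (fd < 0)
        return -1;

    /* O_DIRECT needs the buffer, file offset, and length to be aligned. */
    if (posix_memalign(&buf, LOG_ALIGN, LOG_CHUNK) != 0)
        goto done;
    memset(buf, 'x', LOG_CHUNK);

    for (int i = 0; i < nchunks; i++) {
        /* Treat a short write as an error to keep the sketch simple. */
        if (write(fd, buf, LOG_CHUNK) != (ssize_t) LOG_CHUNK)
            goto done;
    }

    /* Sync anyway: per the thread, O_DIRECT does not force data to disk
     * on every platform. */
    if (fdatasync(fd) == 0)
        rc = 0;

done:
    free(buf);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Adding O_DSYNC to the open() flags instead of calling fdatasync() afterwards would be the other variant mentioned above; which of the two is cheaper likely depends on the platform and would need to be measured.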