> "Magnus Hagander" <[EMAIL PROTECTED]> writes:
> > Tom, if you look at all the requirements of FILE_FLAG_NO_BUFFERING on
> > http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/createfile.asp,
> > can you say offhand if the WAL code fulfills them?
>
> If I'm reading it right, you are referring to:
>
>     File access must begin at byte offsets within the file that are
>     integer multiples of the volume's sector size.
>
>     File access must be for numbers of bytes that are integer multiples
>     of the volume's sector size. For example, if the sector size is 512
>     bytes, an application can request reads and writes of 512, 1024, or
>     2048 bytes, but not of 335, 981, or 7171 bytes.
>
>     Buffer addresses for read and write operations should be sector
>     aligned (aligned on addresses in memory that are integer multiples
>     of the volume's sector size). Depending on the disk, this
>     requirement may not be enforced.
>
> 1 and 2 should be no problem since we only read or write integral pages
> (8K). 3 is a bit bogus IMHO, or even a lot bogus. You can set
> ALIGNOF_BUFFER in src/include/pg_config_manual.h to whatever you think
> the alignment requirement really needs to be (I'd try 512).
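
For reference, the three quoted requirements can be checked mechanically.
What follows is only a rough sketch in C, not code from the PostgreSQL
tree: the 512-byte sector size is an assumption (the real value is per
volume), and SECTOR_SIZE, is_sector_multiple() and io_request_ok() are
hypothetical names made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sector size; a real program would query it per volume. */
#define SECTOR_SIZE 512u

/* True if n is an integer multiple of the assumed sector size. */
static bool is_sector_multiple(uint64_t n)
{
    return n % SECTOR_SIZE == 0;
}

/*
 * Hypothetical helper: checks one read/write request against the three
 * quoted rules for unbuffered I/O.
 *   1. the file offset is a multiple of the sector size
 *   2. the byte count is a multiple of the sector size
 *   3. the buffer address is a multiple of the sector size
 */
static bool io_request_ok(uint64_t offset, size_t nbytes, const void *buf)
{
    return is_sector_multiple(offset)
        && is_sector_multiple((uint64_t) nbytes)
        && is_sector_multiple((uint64_t) (uintptr_t) buf);
}
```

With a 512-aligned buffer, an 8K write at any 8K page boundary passes all
three checks, while a 335-byte request or a buffer that is off by one
byte fails. Rules 1 and 2 follow from whole-page I/O; rule 3 is the one
that depends on how the buffers are allocated.
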
After multiple runs with different block sizes (a few anomalous results
aside), I didn't see a whole lot of difference in write performance
between FILE_FLAG_NO_BUFFERING being on or off. However, with
NO_BUFFERING set, the file is not *read* cached at all. Read performance
is not terrible, but using it outside of WAL would need some careful
consideration. For WAL, though, it seems perfect.

If my results are to be believed, we can expect up to 30 (yes, that's
three-zero) times faster sync performance by ditching FlushFileBuffers
(although probably far less in practice). Applying
FILE_FLAG_WRITE_THROUGH to non-WAL data files will give similar speedups
to checkpoints, but right now I'm making no assumptions about the safety
issue.

I'd like to point out here that with the FlushFileBuffers() sync
approach it was impossible to get my 3ware RAID controller to cache the
writes at all. This means that unless we change the sync method for data
files, win32 will always have horrible checkpoint performance (and I do
mean horrible).

My suggestion would be to use
FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH for WAL, and
FILE_FLAG_WRITE_THROUGH for everything else. Then it's time to
power-fail test etc. and make sure things work the way they are supposed
to.

By the way, by some quirk of fate, 8k seems to be a fairly good choice
of block size. 4k block sizes give slightly lower latency but not nearly
as much throughput.

Merlin
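
The flag combination suggested above can be sketched as follows. This is
only an illustration, not a patch: sync_flags_for() is a hypothetical
helper, and the two constants are repeated with their Windows SDK values
only so the sketch compiles on other platforms. On Windows the result
would go into the dwFlagsAndAttributes argument of CreateFile().

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Values as in the Windows SDK headers, repeated for portability. */
#define FILE_FLAG_WRITE_THROUGH 0x80000000u
#define FILE_FLAG_NO_BUFFERING  0x20000000u
#endif

/*
 * Hypothetical helper: pick the open flags per the suggestion above.
 *   WAL files:        NO_BUFFERING | WRITE_THROUGH
 *   everything else:  WRITE_THROUGH only, so reads stay cached
 */
static uint32_t sync_flags_for(bool is_wal)
{
    uint32_t flags = FILE_FLAG_WRITE_THROUGH;

    if (is_wal)
        flags |= FILE_FLAG_NO_BUFFERING;
    return flags;
}
```

Keeping NO_BUFFERING off the data files reflects the read-caching concern
described above. Whether WRITE_THROUGH alone is safe for them is exactly
what the power-fail testing would have to show.
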