> > Even with true fdatasync it's not obviously good for
> > performance - it takes too long to write 16Mb files
> > and fills the OS buffer cache with trash -:(
>
> True.  But at least the write is (hopefully) being done at a
> non-performance-critical time.

There is no such hope: XLogWrite may be called from XLogFlush
(at commit time and from bufmgr on buffer replacement) *and* from
XLogInsert - i.e. a new log file may be required at any time.

> > Probably we need a separate process like LGWR (log
> > writer) in Oracle.
>
> I think the create-ahead feature in the checkpoint maker should be
> on by default.

I'm not sure - it increases disk requirements.

> > I considered this mostly as a hint to the OS about how the log file
> > should be allocated (to decrease fragmentation). Not sure how OSes
> > use such hints, but seek+write costs nothing.
>
> AFAIK, extant Unixes will not regard this as a hint at all; they'll
> think it is a great opportunity to not store zeroes :-(.

Yes, but if I were writing a file system I wouldn't allocate space
for a file block by block - I would try to pre-allocate more than
required by write(). So I hoped that seek+write would be a hint to
the OS: "Hey, I need a 16Mb file - try to make it as contiguous as
possible". I don't know whether it works, though -:)

> One reason that I like logfile fill to be done separately is that it's
> easier to convince ourselves that failure (due to out of disk space)
> need not require elog(STOP) than if we have the same failure during
> XLogWrite.  You are right that we don't have time to consider
> each STOP in the WAL code, but I think we should at least look at
> that case...

What is the problem with elog(STOP) when there is no disk space?
I think running out of disk is bad enough to stop DB operations.

Vadim
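
For illustration only, here is a minimal C sketch of the two ways of
creating a 16Mb log segment discussed above: seek+write, which many
Unix file systems store as a sparse file, versus explicitly filling
the file with zeroes. This is not the WAL code; the function names
(create_by_seek, create_by_zero_fill, file_size, allocated_bytes) and
the constants are hypothetical, and st_blocks is assumed to be
reported in 512-byte units, which is common but not guaranteed.

```c
/* Hypothetical sketch: names and constants are illustrative only. */
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define LOG_SEG_SIZE (16 * 1024 * 1024) /* 16Mb segment, as in the thread */
#define FILL_BLOCK   8192               /* assumed write size for zero-fill */

/* seek+write: set the file length by writing one byte at the last offset. */
static int create_by_seek(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    /* "" is a one-byte buffer holding a single zero byte */
    if (lseek(fd, (off_t) LOG_SEG_SIZE - 1, SEEK_SET) < 0 ||
        write(fd, "", 1) != 1)
    {
        close(fd);
        unlink(path);
        return -1;
    }
    return close(fd);
}

/* zero-fill: write every block, so out-of-space shows up here, not later. */
static int create_by_zero_fill(const char *path)
{
    char zeros[FILL_BLOCK];
    long written;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    memset(zeros, 0, sizeof(zeros));
    for (written = 0; written < LOG_SEG_SIZE; written += FILL_BLOCK)
    {
        if (write(fd, zeros, FILL_BLOCK) != FILL_BLOCK)
        {
            close(fd);
            unlink(path);
            return -1;
        }
    }
    if (fsync(fd) != 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }
    return close(fd);
}

/* Logical length of the file in bytes, or -1 on error. */
static long long file_size(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 ? (long long) st.st_size : -1;
}

/* Space actually allocated, assuming 512-byte st_blocks units. */
static long long allocated_bytes(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 ? (long long) st.st_blocks * 512 : -1;
}
```

Both functions produce a file whose length is LOG_SEG_SIZE. Comparing
allocated_bytes() for the two files shows whether a given file system
treated seek+write as a sparse file; the result depends on the file
system, so no particular value should be expected. With zero-fill, an
out-of-disk-space failure is reported at creation time rather than
during a later write into the segment.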