On Thu, Nov 10, 2005 at 11:39:34AM -0500, Tom Lane wrote:
> No, Mike is right: for WAL you shouldn't need any journaling.  This is
> because we zero out *and fsync* an entire WAL file before we ever
> consider putting live WAL data in it.  During live use of a WAL file,
> its metadata is not changing.  As long as the filesystem follows
> the minimal rule of syncing metadata about a file when it fsyncs the
> file, all the live WAL files should survive crashes OK.
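
For illustration only, here is a rough sketch of the "zero out *and fsync*" step described in the quote above. It is not PostgreSQL's code: the function name, the 16 MB default size and the 8 kB chunk size are assumptions chosen for the example.

```python
import os

def preallocate_segment(path, size=16 * 1024 * 1024, chunk=8192):
    """Create `path` holding `size` zero bytes and fsync it before any live use.

    Hypothetical helper for illustration; the name and defaults are assumptions.
    """
    zeros = bytes(chunk)
    # O_EXCL: fail rather than reuse a file that already exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        remaining = size
        while remaining > 0:
            # os.write may write fewer bytes than requested, so count what landed.
            written = os.write(fd, zeros[:min(chunk, remaining)])
            remaining -= written
        # Flush the zeroed data to stable storage while the file is still unused.
        os.fsync(fd)
    finally:
        os.close(fd)
    return size
```

Once this has completed, later writes land in blocks that are already allocated, so the file's length does not change during live use.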

Yes, with emphasis on the zero out... :-)

> You do need metadata journaling for all non-WAL PG files, since we don't
> fsync them every time we extend them; which means the filesystem could
> lose track of which disk blocks belong to such a file, if it's not
> journaled.

I think there may also be theoretical problems with the ordering of
writes within the fsync operation, for files that are not pre-allocated.
For example, when a new block is allocated, two blocks need to be
updated: the indirect reference block (or the inode block, if the block
references fit into the inode entry), and the data block itself. If the
indirect reference block is written first and the system crashes before
the data block is written, the state of the disk is inconsistent. This
would be a crash during the fsync() operation itself.

Metadata journalling can ensure that the data block is allocated first,
and that all the necessary references are updated afterwards, so that an
incomplete operation is either rolled back or committed in full.

Or, that is my understanding, anyways, and this is why I would not use
ext2 for the database, even if it was claimed that fsync() was used.

For WAL, with pre-allocated zero blocks? Sure. Ext2... :-)

mark

-- 
http://mark.mielke.cc/
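
To make the distinction in this message concrete, here is a small sketch contrasting the two kinds of write. Both function names are hypothetical. The code only shows that an in-place overwrite leaves the file length alone while an append changes it. It does not demonstrate crash behaviour, and which on-disk blocks a filesystem actually touches is not visible from here.

```python
import os

def _write_all(fd, data):
    """Write all of `data` at the current offset, retrying on short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]

def overwrite_in_place(fd, offset, data):
    """Overwrite bytes inside an already allocated region, then fsync.

    Returns the file size afterwards; it should be unchanged.
    """
    os.lseek(fd, offset, os.SEEK_SET)
    _write_all(fd, data)
    os.fsync(fd)
    return os.fstat(fd).st_size

def append_and_sync(fd, data):
    """Extend the file by appending `data`, then fsync.

    Returns the new, larger file size: the case where block references
    must be updated along with the data.
    """
    os.lseek(fd, 0, os.SEEK_END)
    _write_all(fd, data)
    os.fsync(fd)
    return os.fstat(fd).st_size
```

The first function corresponds to writing into a pre-allocated WAL file; the second corresponds to extending an ordinary data file.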