Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Consider the variant with extra marker files. In that case, backend B
> doesn't have to know about the .notcommitted status to flush the buffers.

[ shrug ] It's still broken, and the reason is that there's no equivalent
of fsync for directory operations. Consider

	A creates 1234 and 1234.notcommitted
	A commits
	B performs a checkpoint
	crash

all before A manages to delete 1234.notcommitted, or at least before that
deletion has made its way to disk. Upon restart, only WAL events after
the checkpoint will be replayed, so 1234.notcommitted doesn't go away, and
then you've got a problem.

To fix this there would need to be a way (1) for B to be aware of the
pending file deletion and (2) for B to delay committing the checkpoint
until the directory update is surely down on disk. Your proposal doesn't
provide for (1), and even if we fixed that, I know of no portable kernel
API for (2). fsync isn't applicable.

While your original patch is buggy, it's at least fixable and has
localized, limited impact. I don't think these schemes are safe at all ---
they put a great deal more weight on the semantics of the filesystem than
I care to do.

			regards, tom lane
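
For illustration only, the crash sequence described above can be replayed in a toy model. This is a sketch, not PostgreSQL code: the ToyFilesystem class, the simulate function, and the WAL record strings are all hypothetical, and the model assumes that the file creations happened to reach disk while the marker deletion did not. It also assumes that replaying A's commit record would remove the marker, which is what makes the checkpoint position matter.

```python
# Toy model of the crash sequence; every name here is hypothetical.

class ToyFilesystem:
    """Directory whose entry changes reach disk only when flush() is called."""

    def __init__(self):
        self.on_disk = set()  # entries that would survive a crash
        self.cached = set()   # entries as seen by running processes

    def create(self, name):
        self.cached.add(name)

    def unlink(self, name):
        self.cached.discard(name)

    def flush(self):
        # Stands in for the kernel eventually writing the directory out.
        self.on_disk = set(self.cached)

    def crash(self):
        # Directory changes that were never flushed are lost.
        self.cached = set(self.on_disk)


def simulate(deletion_reaches_disk):
    fs = ToyFilesystem()
    wal = []

    # Backend A creates the relation file and its marker file.
    fs.create("1234")
    fs.create("1234.notcommitted")
    wal.append("create 1234")
    fs.flush()  # assumption: the creations happened to reach disk

    # A commits.
    wal.append("commit A")

    # Backend B checkpoints; recovery will replay only later records.
    wal.append("checkpoint")
    redo_start = len(wal)

    # A deletes the marker, but the directory update may not be on disk yet.
    fs.unlink("1234.notcommitted")
    if deletion_reaches_disk:
        fs.flush()

    fs.crash()

    # Recovery: replay only WAL records written after the checkpoint.
    for record in wal[redo_start:]:
        if record == "commit A":  # assumed replay action: drop the marker
            fs.unlink("1234.notcommitted")

    return sorted(fs.cached)


print(simulate(deletion_reaches_disk=False))  # ['1234', '1234.notcommitted']
print(simulate(deletion_reaches_disk=True))   # ['1234']
```

In the first call the commit record sits before the checkpoint, so recovery never revisits it and the stale marker survives the restart. The second call shows that the outcome depends entirely on whether the directory update reached disk before the crash, which is the step the checkpointing backend has no portable way to wait for.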