Proposal: Implement a new option for COMMIT that enhances performance, providing a MySQL-like trade-off between performance and robustness for *only* those that want it.

COMMIT NOWAIT

This form of COMMIT will *not* perform XLogFlush(), but will rely on a special background process to perform regular WAL fsyncs (see later).

COMMIT NOWAIT can co-exist with the normal form of COMMIT and does not threaten the consistency or robustness of other COMMIT modes. Read that again and think about it, before we go further, please. Normal COMMIT still guarantees to flush all of WAL up to the point of the commit, whether or not the previous commits have requested that.

Mixing COMMIT NOWAIT with other modes does not affect the performance of other backends - those that specify that mode are faster, those that do not simply go at the same speed they did before. This is important, because it allows us to have a fully robust server, yet with certain critical applications going along much faster. No need for an all-or-nothing approach at db cluster level.

Unlike fsync = off, WAL is always consistent and the server can be recovered easily, though with some potential for data loss for transactions that chose the COMMIT NOWAIT option. That may sound like a hole, but it isn't: normal COMMITs that rely on data written by COMMIT NOWAIT transactions are still safe, because the normal COMMIT is still bound by the guarantee to go to disk. The buffer manager/WAL interlock is not affected by this change and remains in place, as it should.

This implements the TODO item:

--Allow buffered WAL writes and fsync
"Instead of guaranteeing recovery of all committed transactions, this would provide improved performance by delaying WAL writes and fsync so an abrupt operating system restart might lose a few seconds of committed transactions but still be consistent. We could perhaps remove the 'fsync' parameter (which results in an inconsistent database) in favor of this capability."

Why do we want this? Because some apps have *lots* of data and many really don't care whether they lose a few records.
Honestly, I've met people that want this, even after 2 hours of discussion and understanding. Plus probably lots of MySQLers also.

User Control
------------

New commit mode is available by explicit command, or as a default setting that will be applied to all COMMITs, or both.

The full syntax would be COMMIT [WRITE] NOWAIT [IMMEDIATE], for Oracle compatibility (why choose incompatibility?). Note that this is not a transaction start setting like Isolation Level; this happens at end of transaction. The syntax for END is unchanged, defaulting to normal behaviour unless overridden.

New userset GUC, commit_wait_default = on (default) | off

We change the meaning of the commit_delay parameter:

- If commit_delay = 0 then commit_wait_default cannot be set off.

- WAL will be flushed every commit_delay milliseconds; if no flush is required this will do nothing very quickly, so there is little overhead if no COMMIT NOWAIT commits have been made.

Implementation
--------------

COMMIT NOWAIT in xact.c simply skips XLogFlush() and returns.

Who does the XLogFlush? Well, my recommendation is a totally new process, WALWriter. But I can see that many of you will say bgwriter should be the one to do this work. IMHO doing WAL flushes will take time, and that's time that bgwriter really needs to do other things, plus it can't really guarantee to flush regularly when it's doing checkpoints.

When commit_delay > 0 the WALWriter will start up; it will shut down if commit_delay = 0. WALWriter will XLogFlush every commit_delay milliseconds.

A prototype patch is posted to -patches, which is WORK IN PROGRESS. The following TODO items remain:

1. Discuss which process will issue regular XLogFlush(). If agreed, implement WALWriter process to perform this task. (Yes, the patch isn't fully implemented, yet).
2. Remove fsync parameter
3. Prevent COMMIT NOWAIT when commit_delay = 0
4. Discuss whether commit_delay is OK to usurp; 'twas just an earlier suggestion from someone else, can go either way.
5. Docs

The remaining items can be completed very quickly if this proposal is acceptable. (I wrote this over Christmas, so it turning up now isn't a rushed proposal and I'm pretty certain it ain't broke).

Comments?

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
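
For anyone who wants the flush rule from the COMMIT NOWAIT description in concrete form, here is a minimal toy model in C. It is only a sketch under stated assumptions: every name in it (ToyWal, toy_commit, toy_walwriter_tick and so on) is hypothetical and invented for illustration, it is not taken from the prototype patch, and it models "flushed to disk" as nothing more than a counter. It shows three things: a NOWAIT commit returns without flushing, a later normal commit flushes all WAL up to its own commit record (including earlier NOWAIT records), and a periodic WALWriter-style tick picks up whatever is left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these are NOT PostgreSQL's real types or functions. */
typedef uint64_t ToyLsn;

typedef struct ToyWal {
    ToyLsn inserted; /* end of WAL generated so far (in memory)  */
    ToyLsn flushed;  /* end of WAL assumed to be safely on disk  */
} ToyWal;

/* Append a record of len bytes and return the position of its end. */
ToyLsn toy_wal_insert(ToyWal *wal, ToyLsn len)
{
    wal->inserted += len;
    return wal->inserted;
}

/* Stand-in for XLogFlush(): make all WAL up to 'upto' durable. */
void toy_wal_flush(ToyWal *wal, ToyLsn upto)
{
    assert(upto <= wal->inserted); /* cannot flush WAL that does not exist */
    if (upto > wal->flushed)
        wal->flushed = upto;       /* otherwise already flushed: no-op */
}

/*
 * Commit a transaction whose commit record is len bytes long.
 * wait = true  models normal COMMIT: flush up to our own commit record.
 * wait = false models COMMIT NOWAIT: skip the flush and return at once.
 */
ToyLsn toy_commit(ToyWal *wal, ToyLsn len, bool wait)
{
    ToyLsn commit_lsn = toy_wal_insert(wal, len);

    if (wait)
        toy_wal_flush(wal, commit_lsn);
    return commit_lsn;
}

/* One wakeup of the hypothetical background flusher (every commit_delay ms). */
void toy_walwriter_tick(ToyWal *wal)
{
    toy_wal_flush(wal, wal->inserted);
}

/* Would a commit ending at 'lsn' survive a crash right now? */
bool toy_is_durable(const ToyWal *wal, ToyLsn lsn)
{
    return lsn <= wal->flushed;
}
```

In this model a NOWAIT commit stays at risk only until the next normal commit or the next tick, whichever comes first, which is the window of potential data loss the proposal describes. A normal commit never becomes less safe because of NOWAIT commits before it, since its flush always covers everything up to its own record.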