On Wednesday 20 August 2003 00:06, you wrote:
> > ok, but how do you make sure the file is really on disk instead of, e.g.,
> > half on disk and half on cache?
>
> We close the file before we rename it.

This would be ok if the underlying operating system flushed the disk
cache upon close(), but I'm afraid this is not the case (at least on
Linux). This is from man 2 close:

    A successful close does not guarantee that the data has been
    successfully saved to disk, as the kernel defers writes. It is not
    common for a filesystem to flush the buffers when the stream is
    closed. If you need to be sure that the data is physically stored
    use fsync(2). (It will depend on the disk hardware at this point.)

This behaviour is declared conforming to SVr4, SVID, POSIX, X/OPEN,
BSD 4.3.

Therefore I believe the problem reported here happens in this way:

1. mailman writes the tmp file, closes it and then atomically renames
   it. This is atomic from the userland point of view (i.e.
   applications will see the file instantly changed).

2. under the hood, the operating system is running a disk cache to
   speed up file operations, therefore what really happened is that
   the file has been written to some RAM pages but not yet to disk.

3. at some later time, the disk cache is copied from RAM to disk,
   effectively making the changes permanent. This copy is not atomic,
   e.g. files bigger than 4k will be written in chunks of 4k pages.

A power interruption (or OS crash, or any other unclean shutdown) in
phase 2 could lead to a lost transaction (i.e. the file will appear as
never overwritten, as if phase 1 never happened).

A power interruption (or OS crash, or any other unclean shutdown)
happening in phase 3 could lead to a corrupted file (i.e. some pages
written to disk, some pages not).

MTAs usually provide a configuration setting to enable a cache flush
for each transaction (by use of fsync()), but this is disabled by
default because of the severe impact on performance.

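To make the fsync() variant of phase 1 concrete, here is a minimal
sketch in Python. It is not Mailman's actual code: durable_replace is a
name I made up for illustration, and it assumes a POSIX system where
os.fsync() also works on a directory descriptor (as it does on Linux).
As the man page says, whether the data has really reached the disk
still depends on the hardware.

```python
import os
import tempfile


def durable_replace(path, data):
    """Replace `path` with `data` (bytes), flushing to disk before the rename.

    Hypothetical helper for illustration only, not part of Mailman.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so rename() stays on
    # one filesystem and is therefore atomic.
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # Python's userland buffer -> kernel cache
            os.fsync(f.fileno())  # kernel cache -> disk (hardware permitting)
        os.rename(tmp, path)      # atomic from the userland point of view
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    # Assumption: flushing the directory is needed as well, so that the
    # rename itself (a directory update) is not left sitting in the cache.
    dfd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
```

Every save done this way costs two fsync() calls, which is the
performance impact mentioned above.
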
Use of BerkeleyDB (or similar transactional db libraries) could
eliminate the problem of corrupted files without the need to fsync,
but to solve the problem in phase 2 we need to guarantee at the
application level that losing a file won't leave dangling references
or bad states in the related data we stored elsewhere. Worst case,
when restarting after a power outage we should check for transactions
that have to be cancelled because the related file is not on disk.

An example could be: we put a message on hold for moderation,
therefore we

- save the message in a file (or rename it from the previous location)
- update the moderation queue index in MailList
- Save() the list config pickle

If the system goes down now because of a power outage, when restarting
we could have (even fsync()ing everything):

- the index has been regularly updated
- the message is not on disk, or it's in a different filename/path

This can happen because actual writes to disk can be reordered by the
OS, for performance reasons. Accessing the admindb panel now could
potentially lead to exceptions.

Now, everyone who is serious about administering a server has a big
and dependable UPS, automatically triggering clean shutdowns and so
on, therefore everything I've described is not as much of a problem.

-- 
[EMAIL PROTECTED] pioppo]$ man women
No manual entry for women

_______________________________________________
Mailman-Developers mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-developers