Bert Huijben wrote:
> > -----Original Message-----
> > From: Julian Foad [mailto:julian.f...@wandisco.com]
> >
> > I'm not clear exactly what problem we would avoid by eliminating the
> > "select a unique name" step of this process.  Is it what I describe
> > below at the end of note [1] - that a scanner might be more likely to
> > re-scan the content, and therefore more likely to cause a delay?
>
> No, the problem I try to avoid is
>  * you create a file
>    <virus scanner opens the file to verify that it is not a virus>
>  * you delete the file (after the virusscanner releases the file)
>  * you rename a file to be at the old location
>
> While we really need something like
>  * rename to a unique name.
>    <virusscanner ignores the file, because it was already scanned at the
>    original location>

From discussing on IRC we agree that the main concern is that this
involves too many separate filesystem operations, rather than any
specific problem with a particular step.

I have committed it as is in r1075942, and would like to improve it as
a follow-up.

Options for improvement:

(a) Don't open a pristine file with FILE_SHARE_DELETE.  Instead, accept
    that an attempt to delete it may fail, and if that happens, leave
    the file there as an orphan.  When adding a pristine file, if it
    already exists on disk then just keep the copy that already exists.
    When cleaning up orphan files, if a delete fails, just leave the
    file there.  Consider implementing automatic clean-up to prevent
    orphan files from accumulating indefinitely.

(b) Find or write a "rename to a unique name" function that operates in
    a single step instead of creating a new file and then overwriting
    it.

(c) Don't rename the file before deleting it; instead, just delete it.
    On Windows, when adding a new file, if a file with that name
    already exists, *then* rename the existing file before moving the
    new file into place.  (We can't just keep the existing file because
    it is pending deletion and will disappear when the reader finishes
    reading it.)

Pros and cons:

(a) This would reduce the number of file system operations to a
    minimum.  It would involve bypassing APR to avoid the
    FILE_SHARE_DELETE flag, which is not ideal but possible.

(b) This would remove one of the file system operations.  It would
    involve writing a function similar to APR's fall-back
    implementation of apr_file_mktemp() that exists for systems that do
    not provide a "mkstemp" system API.  I'm not sure if there are any
    concrete problems with doing this sort of thing in "user space".

(c) This would remove two of the file system operations.  It sounds
    straightforward.

Comments?

- Julian
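
To make option (b) a little more concrete, here is a rough sketch of a
"rename to a unique name" helper, written in Python only to keep it
short.  The function name, the ".tmp" suffix, the retry count and the
random-name scheme are all assumptions for illustration; none of them
comes from APR or Subversion, and a real version would be written in C.
It picks a random name and renames straight to it, without first
creating a placeholder file.

```python
import os
import secrets


def rename_to_unique_name(path, attempts=10):
    """Rename `path` to a fresh random name in the same directory.

    Returns the new path.  Hypothetical helper, not an APR or
    Subversion API.
    """
    directory, base = os.path.split(path)
    for _ in range(attempts):
        suffix = secrets.token_hex(8)  # 16 random hex digits
        candidate = os.path.join(directory, f"{base}.{suffix}.tmp")
        # Best-effort check only: on POSIX, os.rename() silently
        # replaces an existing destination, so a small race remains
        # between this test and the rename below.
        if os.path.lexists(candidate):
            continue
        try:
            os.rename(path, candidate)  # the single file system operation
        except FileExistsError:
            # On Windows, os.rename() refuses to overwrite an existing
            # destination; pick another name and try again.
            continue
        return candidate
    raise FileExistsError(f"no unique name found for {path!r}")
```

The remaining race on systems where rename overwrites is probably the
kind of "user space" problem that would need looking at before relying
on something like this.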