wc_db_pristine.c

Julian Foad Tue, 01 Mar 2011 05:34:14 -0800

On Mon, 2011-02-28, Bert Huijben wrote:
> > -----Original Message-----
> > From: Julian Foad
> > 
> > On Mon, 2011-02-28, I (Julian Foad) wrote:
> > > On Sat, 2011-02-26, Branko Čibej wrote:
> > > > On 26.02.2011 20:40, Ivan Zhakov wrote:
> > > > > Btw I think it makes sense to rename the file to the tmp directory in
> > > > > the working copy instead of the pristines directory, since there could
> > > > > be a crash/failure between rename and delete. In that case the
> > > > > pristines directory will be polluted with orphaned pristines.
> > > >
> > > > That works as long as the pristine store lives in the WC root, so yes.
> > >
> > > This seems to be a good plan.  Thanks for the help.  I'll do it right
> > > away.
> > 
> > Please see attached patch.  (I might write a test or two before
> > committing it or I might commit first.)
> 
> If you just created a file in the tempdir it will be scanned by
> virusscanners while you will just want to delete it directly. (Which
> might trigger an access denied and then a wait loop)


I would hope that some (virus/indexing) scanners might avoid re-scanning
the file since all that happened was a rename [1], but yes, if a scanner
does open the file after the rename then there would be a delay.

> I think you can safely assume that the file won't be removed from the
> pristine twice at about the same time, so just using the sha1 as the
> filename should be pretty collision safe. (And the wait loop will
> catch the other cases)

Often, that would be fine.  However, the re-try loop doesn't help if the
file is being held open for a long time.

Let's say I open a graphical diff against a pristine text, and the diff
program holds the file open for reading.  Then, while that's still
displayed, I run

  svn update -rX  # removes the text
  svn update -rY  # re-adds the text
  svn update -rZ  # removes the text

The third update would stall in the rename's re-try loop, and the loop
would time out and fail.  Although that may be an uncommon use case, I
think it is reasonable and should work.
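
To make the time-out concrete, here is a minimal sketch of a bounded re-try loop. It is an illustration in Python, not Subversion's actual C code: the helper name `retry_while_blocked`, its parameters, and the choice of `PermissionError` as the "file is in use" signal are all assumptions made for this example.

```python
import time


def retry_while_blocked(operation, attempts=5, delay=0.01):
    """Call operation() until it succeeds or `attempts` calls have failed.

    Hypothetical helper: only PermissionError is treated as "file is in
    use, try again"; any other exception propagates immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except PermissionError:
            if attempt == attempts - 1:
                # The file stayed blocked for the whole wait budget:
                # give up and report the failure to the caller.
                raise
            time.sleep(delay)
            delay *= 2  # wait a little longer before each new attempt
```

A caller might wrap a rename as `retry_while_blocked(lambda: os.replace(src, dst))`. A scanner that lets go of the file after a moment is absorbed by the loop, but a program that holds the target name open for minutes outlasts any fixed number of attempts, so the final attempt raises.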

I'm not clear exactly what problem we would avoid by eliminating the
"select a unique name" step of this process.  Is it what I describe
below at the end of note [1] - that a scanner might be more likely to
re-scan the content, and therefore more likely to cause a delay?

- Julian


[1] Actually, we don't have an atomic rename-to-a-unique-name function.
We currently use two separate steps, "Create a new file" followed by
"Overwrite the new file".  The scanner might immediately open the new
empty file and that would delay our overwriting of it, but that
shouldn't take long.  Then, I would expect the "overwrite" step to
behave like a delete followed by renaming the original file to the new
file's name, which is equivalent to the hypothetical atomic
rename-to-a-unique-name function.  A scanner might not see the rename
that way; it might instead see a content change of the new file and
then scan the content, which would be a waste of time.
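
The two-step approach in note [1] can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code in wc_db_pristine.c: the function names `move_to_unique_name` and `remove_pristine` are hypothetical, and the sketch assumes the tmp directory is on the same filesystem as the pristine store so that the rename does not turn into a copy.

```python
import os
import tempfile


def move_to_unique_name(src_path, tmp_dir):
    """Move src_path to a freshly reserved unique name inside tmp_dir.

    Hypothetical helper. Step 1 creates a new empty file, which reserves
    a name nobody else is using. Step 2 renames the source over that
    placeholder, replacing it.
    """
    fd, unique_path = tempfile.mkstemp(dir=tmp_dir)
    os.close(fd)  # only the reserved name is needed, not the handle
    os.replace(src_path, unique_path)  # overwrite the empty placeholder
    return unique_path


def remove_pristine(pristine_path, tmp_dir):
    """Move the pristine text out of the store, then delete it."""
    unique_path = move_to_unique_name(pristine_path, tmp_dir)
    # A crash at this point leaves the orphan in tmp_dir rather than in
    # the pristine store.
    os.remove(unique_path)
```

Because each removal gets its own name, a leftover file from an earlier removal that some other program still holds open does not sit at the name the next removal wants to use.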


> Note that we might assume that Subversion opens files with
> FILE_SHARE_DELETE, but we can't assume that other programs -like
> virusscanners, file indexers, etc.- triggered by our disk i.o., do the
> same thing. [...]

RE: svn commit: r1073366 - in /subversion/trunk: notes/wc-ng/pristine-store subversion/libsvn_wc/wc-queries.sql subversion/libsvn_wc/wc_db_pristine.c
