On 2/22/2018 1:48 PM, Junio C Hamano wrote:
Derrick Stolee <sto...@gmail.com> writes:

Teach git-commit-graph to delete the .graph files that are siblings of a
newly-written graph file, except for the file referenced by 'graph-latest'
at the beginning of the process and the newly-written file. If we fail to
delete a graph file, only report a warning because another git process may
be using that file. In a multi-process environment, we expect the previoius
graph file to be used by a concurrent process, so we do not delete it to
avoid race conditions.
I do not understand the later part of the above.  On some operating
systems, you actually can remove a file that is open by another
process without any ill effect.  There are systems that do not allow
removing a file that is in use, and an attempt to unlink it may
fail.  The need to handle such a failure gracefully is not limited
to the case of removing a commit graph file---we need to deal with
it when removing file of _any_ type.

My thought is that we should _warn_ when we fail to delete a .graph file that we think should be safe to delete. However, if we are warning for a file that is currently being accessed (as is the case on Windows, at least), then we will add a lot of noise. This is especially true when using IDEs that run 'status' or 'fetch' in the background, frequently.

Especially the last sentence "we do not delete it to avoid race
conditions" I find problematic.  If a system does not allow removing
a file in use and we detect a failure after an attempt to do so, it
is not "we do not delete it" --- even if you do, you won't succeed
anyway, so there is no point saying that.  And on systems that do
allow safe removal of a file in use (i.e. they allow an open file to
be used by processes that have open filehandles to it after its
removal), there is no point refraining to delete it "to avoid race
conditions", either---in fact it is unlikely that you would even know
somebody else had it open and was using it.

The (unlikely, but possible) race condition involves two processes (P1 and P2):

1. P1 reads from graph-latest to see commit graph file F1.
2. P2 updates graph-latest to point to F2 and deletes F1.
3. P1 tries to read F1 and fails.

I could explicitly mention this condition in the message, or we can just let P2 fail by deleting all files other than the one referenced by 'graph-latest'. Thoughts?

In any case, I do not think '--delete-expired' option that can be
given only when you are writing makes much sense as an API.  An
'expire' command, just like 'set-latest' command, that is a separate
command from 'write',  may make sense, though.

In another message, I proposed dropping the argument and assuming expires happen on every write.


Reply via email to