Re: [PATCH] packed_ref_store: handle a packed-refs file that is a symlink
On Thu, Jul 27, 2017 at 12:40 PM, Junio C Hamanowrote: > Jeff King writes: > >> On Thu, Jul 27, 2017 at 10:19:48AM -0700, Junio C Hamano wrote: >> >>> Makes sense. Makes me wonder why we need a separate .new file >>> (instead of writing into the .lock instead), but that is a different >>> issue. >> >> It comes from 42dfa7ece (commit_packed_refs(): use a staging file >> separate from the lockfile, 2017-06-23). That commit explains that we >> want to be able to put the new contents into service before we release >> the lock. But it doesn't say why that's useful. > > By being able to hold the lock on packed-refs longer, I guess > something like this becomes possible: > > * hold the lock on packed-refs > * hold the lock on loose ref A, B, C, ... > * update packed-refs to include the freshest values of these refs > * start serving packed-refs without releasing the lock > * for X in A, B, C...: delete the loose ref X and unlock X > * unlock the packed-refs Hmmm, I thought that I explained somewhere why this is useful, but I can't find it now. The code even after this patch series is Prepare: 1. lock loose refs Finish: 2. perform creates and updates of loose refs, releasing their locks as we go 3. perform deletes of loose refs 4. lock packed-refs file 5. repack_without_refs() (delete packed versions of refs to be deleted) 6. commit new packed-refs file (which also unlocks it) 7. release locks on loose refs that were deleted This is problematic because after a loose reference is deleted, it briefly (between steps 3 and 5) reveals any packed version of the reference, which might even point at a commit that has been GCed. It is even worse if the attempt to acquire the packed refs lock in step 4 fails, because then the repo could be left in that broken state. Clearly we can avoid the second problem by acquiring the packed-refs lock during the prepare phase. You might also attempt to fix the first problem by deleting the references from the packed-refs file before we delete the corresponding loose references: Prepare: 1. lock loose refs 2. lock packed-refs file Finish: 3. perform creates and updates of loose refs, releasing their locks as we go 4. repack_without_refs() (delete packed versions of refs to be deleted) 5. commit new packed-refs file (which also unlocks it) 6. perform deletes of loose refs 7. release locks on loose refs that were deleted But now we have a different problem. pack-refs does the following: A. lock packed-refs file B. read loose refs C. commit new packed-refs file (which also unlocks it) Then, for each loose reference that should be pruned, delete it in a transaction with REF_ISPRUNING, which means: D. lock the loose reference E. lock packed-refs file F. check that the reference's value is the same as the value just written to packed-refs G. delete the loose ref file if it still exists H. unlock the loose reference If a reference deletion is running at the same time as a pack-refs, you could have a race where steps A and B of a pack-refs run between steps 5 and 6 of a reference deletion. The pack-refs would see the loose reference and would pack it into the packed-refs file, leaving the reference at its old value even though the user was told that the reference was deleted. If a new packed-refs file can be activated while retaining the lock on the file, then a reference update could keep the packed-refs file locked during step 6 of the second algorithm, which would block step A of pack-refs from beginning. I have the feeling that there might be other, more improbably races lurking here, too. The eternal problems with races in the files backend is a main reason that I'm enthusiastic about the reftable proposal. Michael
Re: [PATCH] packed_ref_store: handle a packed-refs file that is a symlink
Jeff Kingwrites: > On Thu, Jul 27, 2017 at 10:19:48AM -0700, Junio C Hamano wrote: > >> Makes sense. Makes me wonder why we need a separate .new file >> (instead of writing into the .lock instead), but that is a different >> issue. > > It comes from 42dfa7ece (commit_packed_refs(): use a staging file > separate from the lockfile, 2017-06-23). That commit explains that we > want to be able to put the new contents into service before we release > the lock. But it doesn't say why that's useful. By being able to hold the lock on packed-refs longer, I guess something like this becomes possible: * hold the lock on packed-refs * hold the lock on loose ref A, B, C, ... * update packed-refs to include the freshest values of these refs * start serving packed-refs without releasing the lock * for X in A, B, C...: delete the loose ref X and unlock X * unlock the packed-refs Other people racing with the sequence to recreate a loose ref that is even fresher than the resulting packed-refs file, while we still hold the lock on packed-refs, is perfectly OK. But we must make sure our packed-refs is visible to others before starting to delete and unlock the loose refs. Hmph, but that is not a sufficient explanation. I am not seeing anything bad to happen if we unlock the packed-refs before deleting loose refs that we have locks on, so there must be something else that needs "new packed-refs is made visible way before we unlock it". > I recall from past discussions that this will help close some races, > and e5cc7d7d2 (repack_without_refs(): don't lock or unlock the packed > refs, 2017-07-01) alludes to this. I think the races in question have to > do with holding the packed-refs lock while pruning the just-packed > files, but I'm having trouble digging up specifics in the archive. > > -Peff
Re: [PATCH] packed_ref_store: handle a packed-refs file that is a symlink
On Thu, Jul 27, 2017 at 10:19:48AM -0700, Junio C Hamano wrote: > Michael Haggertywrites: > > > Change `commit_packed_refs()` to use `get_locked_file_path()` to find > > the path of the file that it should overwrite. Since that path was > > properly resolved when the lockfile was created, this restores the > > pre-42dfa7ecef behavior. > > Because when we take a lock hold_lock_file() eventually calls into > lock_file() which by default takes the lock on the target of the > symbolic link (which is the sensible default, which is triggered in > this codepath), so this change to use the name of that file is all > that is needed. > > Makes sense. Makes me wonder why we need a separate .new file > (instead of writing into the .lock instead), but that is a different > issue. It comes from 42dfa7ece (commit_packed_refs(): use a staging file separate from the lockfile, 2017-06-23). That commit explains that we want to be able to put the new contents into service before we release the lock. But it doesn't say why that's useful. I recall from past discussions that this will help close some races, and e5cc7d7d2 (repack_without_refs(): don't lock or unlock the packed refs, 2017-07-01) alludes to this. I think the races in question have to do with holding the packed-refs lock while pruning the just-packed files, but I'm having trouble digging up specifics in the archive. -Peff
Re: [PATCH] packed_ref_store: handle a packed-refs file that is a symlink
Michael Haggertywrites: > Change `commit_packed_refs()` to use `get_locked_file_path()` to find > the path of the file that it should overwrite. Since that path was > properly resolved when the lockfile was created, this restores the > pre-42dfa7ecef behavior. Because when we take a lock hold_lock_file() eventually calls into lock_file() which by default takes the lock on the target of the symbolic link (which is the sensible default, which is triggered in this codepath), so this change to use the name of that file is all that is needed. Makes sense. Makes me wonder why we need a separate .new file (instead of writing into the .lock instead), but that is a different issue. Thanks. I'll do the SYMLINKS thing while queuing. > diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh > index 2bb4b25ed9..0d8a03e2a9 100755 > --- a/t/t3210-pack-refs.sh > +++ b/t/t3210-pack-refs.sh > @@ -238,4 +238,19 @@ test_expect_success 'retry acquiring packed-refs.lock' ' > git -c core.packedrefstimeout=3000 pack-refs --all --prune > ' > > +test_expect_success 'pack symlinked packed-refs' ' > + # First make sure that symlinking works when reading: > + git update-ref refs/heads/loosy refs/heads/master && > + git for-each-ref >all-refs-before && > + mv .git/packed-refs .git/my-deviant-packed-refs && > + ln -s my-deviant-packed-refs .git/packed-refs && > + git for-each-ref >all-refs-linked && > + test_cmp all-refs-before all-refs-linked && > + git pack-refs --all --prune && > + git for-each-ref >all-refs-packed && > + test_cmp all-refs-before all-refs-packed && > + test -h .git/packed-refs && > + test "$(readlink .git/packed-refs)" = "my-deviant-packed-refs" > +' > + > test_done
Re: [PATCH] packed_ref_store: handle a packed-refs file that is a symlink
Stefan, Jonathan: you're both right, the new test needs the SYMLINKS prerequisite. Junio, would you mind adding that flag? Thanks, Michael
Re: [PATCH] packed_ref_store: handle a packed-refs file that is a symlink
Hi, Michael Haggerty wrote: > One of the tricks that `contrib/workdir/git-new-workdir` plays is to > making `packed-refs` in the new workdir a symlink to the `packed-refs` > file in the original repository. Before > 42dfa7ecef ("commit_packed_refs(): use a staging file separate from > the lockfile", 2017-06-23), a lockfile was used as the staging file, > and because the `LOCK_NO_DEREF` was not used, the pointed-to file was > locked and modified. > > But after that commit, the staging file was created using a tempfile, > with the end result that rewriting the `packed-refs` file in the > workdir overwrote the symlink rather than the original `packed-refs` > file. > > Change `commit_packed_refs()` to use `get_locked_file_path()` to find > the path of the file that it should overwrite. Since that path was > properly resolved when the lockfile was created, this restores the > pre-42dfa7ecef behavior. > > Also add a test case to document this use case and prevent a > regression like this from recurring. > > Signed-off-by: Michael HaggertyReported-by: Dave Walker [...] > refs/packed-backend.c | 24 ++-- > t/t3210-pack-refs.sh | 15 +++ > 2 files changed, 33 insertions(+), 6 deletions(-) The patch looks good, except for one nit marked below (*). I'll apply it locally and ask people to test it, probably tomorrow morning. Reviewed-by: Jonathan Nieder Thanks for writing it. Patch left unsnipped for reference. > diff --git a/refs/packed-backend.c b/refs/packed-backend.c > index a28befbfa3..59e7d1a509 100644 > --- a/refs/packed-backend.c > +++ b/refs/packed-backend.c > @@ -610,19 +610,27 @@ int commit_packed_refs(struct ref_store *ref_store, > struct strbuf *err) > struct packed_ref_cache *packed_ref_cache = > get_packed_ref_cache(refs); > int ok; > + int ret = -1; > struct strbuf sb = STRBUF_INIT; > FILE *out; > struct ref_iterator *iter; > + char *packed_refs_path; > > if (!is_lock_file_locked(>lock)) > die("BUG: commit_packed_refs() called when unlocked"); > > - strbuf_addf(, "%s.new", refs->path); > + /* > + * If packed-refs is a symlink, we want to overwrite the > + * symlinked-to file, not the symlink itself. Also, put the > + * staging file next to it: > + */ > + packed_refs_path = get_locked_file_path(>lock); > + strbuf_addf(, "%s.new", packed_refs_path); > if (create_tempfile(>tempfile, sb.buf) < 0) { > strbuf_addf(err, "unable to create file %s: %s", > sb.buf, strerror(errno)); > strbuf_release(); > - return -1; > + goto out; > } > strbuf_release(); > > @@ -660,17 +668,21 @@ int commit_packed_refs(struct ref_store *ref_store, > struct strbuf *err) > goto error; > } > > - if (rename_tempfile(>tempfile, refs->path)) { > + if (rename_tempfile(>tempfile, packed_refs_path)) { > strbuf_addf(err, "error replacing %s: %s", > refs->path, strerror(errno)); > - return -1; > + goto out; > } > > - return 0; > + ret = 0; > + goto out; > > error: > delete_tempfile(>tempfile); > - return -1; > + > +out: > + free(packed_refs_path); > + return ret; > } > > /* > diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh > index 2bb4b25ed9..0d8a03e2a9 100755 > --- a/t/t3210-pack-refs.sh > +++ b/t/t3210-pack-refs.sh > @@ -238,4 +238,19 @@ test_expect_success 'retry acquiring packed-refs.lock' ' > git -c core.packedrefstimeout=3000 pack-refs --all --prune > ' > > +test_expect_success 'pack symlinked packed-refs' ' (*) Does this need a SYMLINKS prereq to avoid trouble on Windows? > + # First make sure that symlinking works when reading: > + git update-ref refs/heads/loosy refs/heads/master && > + git for-each-ref >all-refs-before && > + mv .git/packed-refs .git/my-deviant-packed-refs && > + ln -s my-deviant-packed-refs .git/packed-refs && > + git for-each-ref >all-refs-linked && > + test_cmp all-refs-before all-refs-linked && > + git pack-refs --all --prune && > + git for-each-ref >all-refs-packed && > + test_cmp all-refs-before all-refs-packed && > + test -h .git/packed-refs && > + test "$(readlink .git/packed-refs)" = "my-deviant-packed-refs" > +' > + > test_done
Re: [PATCH] packed_ref_store: handle a packed-refs file that is a symlink
On Wed, Jul 26, 2017 at 4:39 PM, Michael Haggertywrote: > One of the tricks that `contrib/workdir/git-new-workdir` plays is to > making `packed-refs` in the new workdir a symlink to the `packed-refs` > file in the original repository. Before > 42dfa7ecef ("commit_packed_refs(): use a staging file separate from > the lockfile", 2017-06-23), a lockfile was used as the staging file, > and because the `LOCK_NO_DEREF` was not used, the pointed-to file was > locked and modified. > > But after that commit, the staging file was created using a tempfile, > with the end result that rewriting the `packed-refs` file in the > workdir overwrote the symlink rather than the original `packed-refs` > file. > > Change `commit_packed_refs()` to use `get_locked_file_path()` to find > the path of the file that it should overwrite. Since that path was > properly resolved when the lockfile was created, this restores the > pre-42dfa7ecef behavior. > > Also add a test case to document this use case and prevent a > regression like this from recurring. > > Signed-off-by: Michael Haggerty > --- > Sorry for the slow response; I've been traveling and very busy. > > I didn't even know that a symlinked `packed-refs` file is a thing. The > attached patch should fix the problem. It applies on top of v3 of > mh/packed-ref-store [1] (which is already in next). Thanks for providing a fix. Code looks good to me, I wonder if the test needs SYMLINKS special treatment though? Stefan
[PATCH] packed_ref_store: handle a packed-refs file that is a symlink
One of the tricks that `contrib/workdir/git-new-workdir` plays is to making `packed-refs` in the new workdir a symlink to the `packed-refs` file in the original repository. Before 42dfa7ecef ("commit_packed_refs(): use a staging file separate from the lockfile", 2017-06-23), a lockfile was used as the staging file, and because the `LOCK_NO_DEREF` was not used, the pointed-to file was locked and modified. But after that commit, the staging file was created using a tempfile, with the end result that rewriting the `packed-refs` file in the workdir overwrote the symlink rather than the original `packed-refs` file. Change `commit_packed_refs()` to use `get_locked_file_path()` to find the path of the file that it should overwrite. Since that path was properly resolved when the lockfile was created, this restores the pre-42dfa7ecef behavior. Also add a test case to document this use case and prevent a regression like this from recurring. Signed-off-by: Michael Haggerty--- Sorry for the slow response; I've been traveling and very busy. I didn't even know that a symlinked `packed-refs` file is a thing. The attached patch should fix the problem. It applies on top of v3 of mh/packed-ref-store [1] (which is already in next). Michael [1] http://public-inbox.org/git/cover.1498933362.git.mhag...@alum.mit.edu/ refs/packed-backend.c | 24 ++-- t/t3210-pack-refs.sh | 15 +++ 2 files changed, 33 insertions(+), 6 deletions(-) diff --git a/refs/packed-backend.c b/refs/packed-backend.c index a28befbfa3..59e7d1a509 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -610,19 +610,27 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) struct packed_ref_cache *packed_ref_cache = get_packed_ref_cache(refs); int ok; + int ret = -1; struct strbuf sb = STRBUF_INIT; FILE *out; struct ref_iterator *iter; + char *packed_refs_path; if (!is_lock_file_locked(>lock)) die("BUG: commit_packed_refs() called when unlocked"); - strbuf_addf(, "%s.new", refs->path); + /* +* If packed-refs is a symlink, we want to overwrite the +* symlinked-to file, not the symlink itself. Also, put the +* staging file next to it: +*/ + packed_refs_path = get_locked_file_path(>lock); + strbuf_addf(, "%s.new", packed_refs_path); if (create_tempfile(>tempfile, sb.buf) < 0) { strbuf_addf(err, "unable to create file %s: %s", sb.buf, strerror(errno)); strbuf_release(); - return -1; + goto out; } strbuf_release(); @@ -660,17 +668,21 @@ int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err) goto error; } - if (rename_tempfile(>tempfile, refs->path)) { + if (rename_tempfile(>tempfile, packed_refs_path)) { strbuf_addf(err, "error replacing %s: %s", refs->path, strerror(errno)); - return -1; + goto out; } - return 0; + ret = 0; + goto out; error: delete_tempfile(>tempfile); - return -1; + +out: + free(packed_refs_path); + return ret; } /* diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh index 2bb4b25ed9..0d8a03e2a9 100755 --- a/t/t3210-pack-refs.sh +++ b/t/t3210-pack-refs.sh @@ -238,4 +238,19 @@ test_expect_success 'retry acquiring packed-refs.lock' ' git -c core.packedrefstimeout=3000 pack-refs --all --prune ' +test_expect_success 'pack symlinked packed-refs' ' + # First make sure that symlinking works when reading: + git update-ref refs/heads/loosy refs/heads/master && + git for-each-ref >all-refs-before && + mv .git/packed-refs .git/my-deviant-packed-refs && + ln -s my-deviant-packed-refs .git/packed-refs && + git for-each-ref >all-refs-linked && + test_cmp all-refs-before all-refs-linked && + git pack-refs --all --prune && + git for-each-ref >all-refs-packed && + test_cmp all-refs-before all-refs-packed && + test -h .git/packed-refs && + test "$(readlink .git/packed-refs)" = "my-deviant-packed-refs" +' + test_done -- 2.11.0