Re: [PATCH 1/4] resolve_ref: close race condition for packed refs
On 05/16/2013 05:47 AM, Jeff King wrote: >> I probably would have separated the rest of the patch, which is a pure >> refactoring, from this last chunk, which is a functional change. But >> that's just me. > > Yeah, I go back and forth on whether it is better to have strict > refactors followed by changes or not. Sometimes it is hard to understand > the motivation for the refactor without seeing the change, and you end > up explaining it twice. A pure refactoring doesn't need much justification. Something like "extract function foo()" plus maybe "this function will soon have multiple callers" is IMO usually adequate, especially if the function is well-named and documented in the patch itself. > My usual rule of thumb is: > > - If you are factoring out some code, and then are going to change > that code, make it two separate changes. That keeps the diffs > readable (the first one is pure movement and you do not need to look > closely, and the second shows a sane diff of the change). > > - If you are factoring out some code, and then adding more callers, > it's OK to do it together. The new caller provides the motivation > for the refactor. > > This is the latter case. But I'm open to arguments that the rule is not > a good one. :) Yes, I see how keeping the changes together makes the justification of the refactoring more obvious. On the other hand, splitting has the following benefits: 1. Reviewers have a single thing to check in each patch: "Did he transcribe the code correctly into a function and choose a good API?" vs. "Does it make sense to call the function from this new place?" The threads of feedback emails about each patch are similarly separated. On the other hand, of course these two changes are not completely independent, because having an idea what new callers want to do with the function affects what its API should be. 2. If the patch series needs to be revised, it is quite possible that the revisions affect only one patch or the other. Therefore, the unaffected patch can be carried along through interactive rebases etc. intact, or might serve as a building block for an alternative solution. 3. If there's a problem, bisect can figure out which half of the change was to blame. That being said, this case is very much in the gray area where it is a matter of personal preference and I don't mind at all if you leave it as a single patch. Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] resolve_ref: close race condition for packed refs
On Mon, May 13, 2013 at 12:56:05AM +0200, Michael Haggerty wrote: > > + * This should be called from resolve_ref_unsafe when a loose ref cannot be > > + * accessed; err must represent the errno from the last attempt to access > > the > > + * loose ref, and the other options are forwarded from > > resolve_safe_unsaefe. > > s/resolve_ref_unsaefe/resolve_ref_unsafe/ Oops, thanks. > > - return NULL; > > + return handle_loose_ref_failure(errno, refname, sha1, > > + reading, flag); > > I probably would have separated the rest of the patch, which is a pure > refactoring, from this last chunk, which is a functional change. But > that's just me. Yeah, I go back and forth on whether it is better to have strict refactors followed by changes or not. Sometimes it is hard to understand the motivation for the refactor without seeing the change, and you end up explaining it twice. My usual rule of thumb is: - If you are factoring out some code, and then are going to change that code, make it two separate changes. That keeps the diffs readable (the first one is pure movement and you do not need to look closely, and the second shows a sane diff of the change). - If you are factoring out some code, and then adding more callers, it's OK to do it together. The new caller provides the motivation for the refactor. This is the latter case. But I'm open to arguments that the rule is not a good one. :) > I suggest adding a comment here mentioning briefly the race condition > that the call to handle_loose_ref_failure() is meant to address; > otherwise somebody not thinking of race conditions might have the clever > idea of "inlining" the call to "return NULL" because it seems redundant > with the check of ENOENT following the lstat() call above. Yeah, I thought I had mentioned that at the top of handle_loose_ref_failure, but I see that I didn't. Probably something like this squashed on top makes sense: diff --git a/refs.c b/refs.c index c127baf..1a7e4ef 100644 --- a/refs.c +++ b/refs.c @@ -,7 +,7 @@ static int get_packed_ref(const char *refname, unsigned char *sha1) /* * This should be called from resolve_ref_unsafe when a loose ref cannot be * accessed; err must represent the errno from the last attempt to access the - * loose ref, and the other options are forwarded from resolve_safe_unsaefe. + * loose ref, and the other options are forwarded from resolve_safe_unsafe. */ static const char *handle_loose_ref_failure(int err, const char *refname, @@ -1200,9 +1200,16 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea * a ref */ fd = open(path, O_RDONLY); - if (fd < 0) + if (fd < 0) { + /* +* We need to check again here for ENOENT and fall back +* to the packed-refs file to avoid a race condition in +* which the ref is packed and pruned between our +* lstat() and open() calls. +*/ return handle_loose_ref_failure(errno, refname, sha1, reading, flag); + } len = read_in_full(fd, buffer, sizeof(buffer)-1); close(fd); if (len < 0) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] resolve_ref: close race condition for packed refs
On 05/07/2013 04:38 AM, Jeff King wrote: > [...] > This patch solves it by factoring out the fallback code from > the lstat() case and calling it from both places. We do not > need to do the same thing for the symlink/readlink code > path, even though it might receive ENOENT, too, because > symlinks cannot be packed (so if it disappears after the > lstat, it is truly being deleted). To be really pedantic, we should handle all kinds of changes that another process might make simultaneously: * symlink -> deleted Was already OK, as you explained above. * regular file -> deleted Handled by your patch * symlink -> regular file Could come from update-ref --no-deref if the old version of the reference were a symlink-based symbolic reference. Oops, wait, that's broken anyway: $ ln -sf master .git/refs/heads/foo $ git for-each-ref 4204906e2ee75f9b7224860c40df712d112c004b commit refs/heads/foo 4204906e2ee75f9b7224860c40df712d112c004b commit refs/heads/master $ git update-ref --no-deref refs/heads/foo master $ dir .git/refs/heads/foo lrwxrwxrwx 1 mhagger mhagger 6 May 13 01:07 .git/refs/heads/foo -> master But supposing that were fixed and it occurred between the call to lstat() and the call to readlink(), then readlink() would fail with errno == EINVAL and we should respond by repeating the code starting with the lstat(). * regular file -> symlink Could come from overwriting a reference with a symlink-based symbolic reference. But this could only happen if the parallel process were an old version of Git I think it's been quite a while since Git has used symlinks for symbolic references, so I doubt that it is worth fixing the second two cases. Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] resolve_ref: close race condition for packed refs
On 05/07/2013 04:38 AM, Jeff King wrote: > When we attempt to resolve a ref, the first thing we do is > call lstat() to see if it is a symlink or a real ref. If we > find that the ref is missing, we fall back to looking it up > in the packed-refs file. If we find the loose ref does exist > (and is a regular file), we continue with trying to open it. > > However, we do not do the same fallback if our open() call > fails; we just report the ref as missing. A "git pack-refs > --prune" process which is simultaneously running may remove > the loose ref between our lstat() and open(). In this case, > we would erroneously report the ref as missing, even though > we could find it if we checked the packed-refs file. > > This patch solves it by factoring out the fallback code from > the lstat() case and calling it from both places. We do not > need to do the same thing for the symlink/readlink code > path, even though it might receive ENOENT, too, because > symlinks cannot be packed (so if it disappears after the > lstat, it is truly being deleted). > > Note that this solves only the part of the race within > resolve_ref_unsafe. In the situation described above, we may > still be depending on a cached view of the packed-refs file; > that race will be dealt with in a future patch. > > Signed-off-by: Jeff King +1, with trivial suggestions below. > --- > > refs.c | 63 ++- > 1 file changed, 42 insertions(+), 21 deletions(-) > > diff --git a/refs.c b/refs.c > index de2d8eb..5a14703 100644 > --- a/refs.c > +++ b/refs.c > @@ -1083,6 +1083,43 @@ static int get_packed_ref(const char *refname, > unsigned char *sha1) > return -1; > } > > +/* > + * This should be called from resolve_ref_unsafe when a loose ref cannot be > + * accessed; err must represent the errno from the last attempt to access the > + * loose ref, and the other options are forwarded from resolve_safe_unsaefe. s/resolve_ref_unsaefe/resolve_ref_unsafe/ > + */ > +static const char *handle_loose_ref_failure(int err, > + const char *refname, > + unsigned char *sha1, > + int reading, > + int *flag) > +{ > + /* > + * If we didn't get ENOENT, something is broken > + * with the loose ref, and we should not fallback, > + * but instead pretend it doesn't exist. > + */ > + if (err != ENOENT) > + return NULL; > + > + /* > + * The loose reference file does not exist; > + * check for a packed reference. > + */ > + if (!get_packed_ref(refname, sha1)) { > + if (flag) > + *flag |= REF_ISPACKED; > + return refname; > + } > + > + /* The reference is not a packed reference, either. */ > + if (reading) > + return NULL; > + > + hashclr(sha1); > + return refname; > +} > + > const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int > reading, int *flag) > { > int depth = MAXDEPTH; > @@ -1107,26 +1144,9 @@ const char *resolve_ref_unsafe(const char *refname, > unsigned char *sha1, int rea > > git_snpath(path, sizeof(path), "%s", refname); > > - if (lstat(path, &st) < 0) { > - if (errno != ENOENT) > - return NULL; > - /* > - * The loose reference file does not exist; > - * check for a packed reference. > - */ > - if (!get_packed_ref(refname, sha1)) { > - if (flag) > - *flag |= REF_ISPACKED; > - return refname; > - } > - /* The reference is not a packed reference, either. */ > - if (reading) { > - return NULL; > - } else { > - hashclr(sha1); > - return refname; > - } > - } > + if (lstat(path, &st) < 0) > + return handle_loose_ref_failure(errno, refname, sha1, > + reading, flag); > > /* Follow "normalized" - ie "refs/.." symlinks by hand */ > if (S_ISLNK(st.st_mode)) { > @@ -1156,7 +1176,8 @@ const char *resolve_ref_unsafe(const char *refname, > unsigned char *sha1, int rea >*/ > fd = open(path, O_RDONLY); > if (fd < 0) > - return NULL; > + return handle_loose_ref_failure(errno, refname, sha1, > + reading, flag); I probably would have separated the rest of the patch, which is
[PATCH 1/4] resolve_ref: close race condition for packed refs
When we attempt to resolve a ref, the first thing we do is call lstat() to see if it is a symlink or a real ref. If we find that the ref is missing, we fall back to looking it up in the packed-refs file. If we find the loose ref does exist (and is a regular file), we continue with trying to open it. However, we do not do the same fallback if our open() call fails; we just report the ref as missing. A "git pack-refs --prune" process which is simultaneously running may remove the loose ref between our lstat() and open(). In this case, we would erroneously report the ref as missing, even though we could find it if we checked the packed-refs file. This patch solves it by factoring out the fallback code from the lstat() case and calling it from both places. We do not need to do the same thing for the symlink/readlink code path, even though it might receive ENOENT, too, because symlinks cannot be packed (so if it disappears after the lstat, it is truly being deleted). Note that this solves only the part of the race within resolve_ref_unsafe. In the situation described above, we may still be depending on a cached view of the packed-refs file; that race will be dealt with in a future patch. Signed-off-by: Jeff King --- refs.c | 63 ++- 1 file changed, 42 insertions(+), 21 deletions(-) diff --git a/refs.c b/refs.c index de2d8eb..5a14703 100644 --- a/refs.c +++ b/refs.c @@ -1083,6 +1083,43 @@ static int get_packed_ref(const char *refname, unsigned char *sha1) return -1; } +/* + * This should be called from resolve_ref_unsafe when a loose ref cannot be + * accessed; err must represent the errno from the last attempt to access the + * loose ref, and the other options are forwarded from resolve_safe_unsaefe. + */ +static const char *handle_loose_ref_failure(int err, + const char *refname, + unsigned char *sha1, + int reading, + int *flag) +{ + /* +* If we didn't get ENOENT, something is broken +* with the loose ref, and we should not fallback, +* but instead pretend it doesn't exist. +*/ + if (err != ENOENT) + return NULL; + + /* +* The loose reference file does not exist; +* check for a packed reference. +*/ + if (!get_packed_ref(refname, sha1)) { + if (flag) + *flag |= REF_ISPACKED; + return refname; + } + + /* The reference is not a packed reference, either. */ + if (reading) + return NULL; + + hashclr(sha1); + return refname; +} + const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int reading, int *flag) { int depth = MAXDEPTH; @@ -1107,26 +1144,9 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea git_snpath(path, sizeof(path), "%s", refname); - if (lstat(path, &st) < 0) { - if (errno != ENOENT) - return NULL; - /* -* The loose reference file does not exist; -* check for a packed reference. -*/ - if (!get_packed_ref(refname, sha1)) { - if (flag) - *flag |= REF_ISPACKED; - return refname; - } - /* The reference is not a packed reference, either. */ - if (reading) { - return NULL; - } else { - hashclr(sha1); - return refname; - } - } + if (lstat(path, &st) < 0) + return handle_loose_ref_failure(errno, refname, sha1, + reading, flag); /* Follow "normalized" - ie "refs/.." symlinks by hand */ if (S_ISLNK(st.st_mode)) { @@ -1156,7 +1176,8 @@ const char *resolve_ref_unsafe(const char *refname, unsigned char *sha1, int rea */ fd = open(path, O_RDONLY); if (fd < 0) - return NULL; + return handle_loose_ref_failure(errno, refname, sha1, + reading, flag); len = read_in_full(fd, buffer, sizeof(buffer)-1); close(fd); if (len < 0) -- 1.8.3.rc1.2.g12db477 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel