Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
> >> Is there anything I can do to help? I'm happy to test out changes. > > The patch at the end of his mail obviously passes the newly-added tests > for me, but please confirm that it fixes your test suite. > > I gather your suite is about noticing behavior changes between different > versions. For cases where we know there is an obvious right behavior, it > would be nice if you could contribute them as patches to git's test > suite. This case was overlooked because there was no test coverage at > all. > > Barring that, running your suite and giving easily-reproducible problem > reports is valuable. The earlier the better. So I am happy to see this > on -rc0, and not on the final release. Periodically running it on > "master" during the development cycle would have caught it even sooner. I've applied your patch to the tip of the 2.11.0-rc0 tag (just to make sure I don't accidentally pick up anything else on master; I'll test that separately) and my full test suite passes without issue. I'm going to investigate whether I can setup a version of this build that runs "periodically" (I'm not sure what that period will be) against git/git master. I've got a lot of the infrastructure in place, but I'm going to need to automate a few things to make it really viable. As for contributing extensions to the test suite, that's a good idea. I need to fast track getting a development environment setup.
Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
On Mon, Nov 07, 2016 at 05:12:43PM -0800, Bryan Turner wrote: > > The obvious solution is one of: > > > > 1. Stop calling normalize() at all when we do not have a relative base > > and the path is not absolute. This restores the original quirky > > behavior (plus makes the "foo/../../bar" case work). Actually, I think we want to keep normalizing, as it is possible for relative paths to normalize correctly (e.g., "foo/../bar"). We just need to ignore the error, which is easy. The patch is below, and is the absolute minimum I think we'd need for v2.11. Beyond that, we could go further: a. Actually make a real absolute path based on getcwd(), which would protect against later chdir() calls, and possibly help with duplicate suppression. I'm not sure there are actually any triggerable bugs here, so I went for the minimal fix. b. Pick a more sane base for the absolute path, like $GIT_DIR. If we did so, then people using relative paths in GIT_ALTERNATE_OBJECT_DIRECTORIES from a bare repo would continue to work, and people in non-bare repositories would have to add an extra ".." to most of their paths. So a slight regression, but saner overall semantics. Making it relative to the object directory ($GIT_DIR/objects, or even whatever is in $GIT_OBJECT_DIRECTORY) makes even more sense to me, but would regress even the bare case (and would probably be "interesting" along with the tmp-objdir stuff, which sets $GIT_OBJECT_DIRECTORY on the fly, as that would invalidate $GIT_ALTERNATE_OBJECT_DIRECTORIES). I'm inclined to leave those to anybody interested post-v2.11 (or never, if nobody cares). But it would be pretty trivial to do (a) as part of this initial fix if anybody feels strongly. > Is there anything I can do to help? I'm happy to test out changes. The patch at the end of his mail obviously passes the newly-added tests for me, but please confirm that it fixes your test suite. I gather your suite is about noticing behavior changes between different versions. For cases where we know there is an obvious right behavior, it would be nice if you could contribute them as patches to git's test suite. This case was overlooked because there was no test coverage at all. Barring that, running your suite and giving easily-reproducible problem reports is valuable. The earlier the better. So I am happy to see this on -rc0, and not on the final release. Periodically running it on "master" during the development cycle would have caught it even sooner. > At the moment I have limited ability to actually try to submit patches > myself. I really need to sit down and setup a working development > environment for Git. (My current dream, if I could get such an > environment running, is to follow up on your git blame-tree work. I would be happy for somebody to pick that up, too. It has been powering GitHub's tree-view for several years now, but I know there are some rough edges as well as opportunities to optimize it. -- >8 -- Subject: [PATCH] alternates: re-allow relative paths from environment Commit 670c359da (link_alt_odb_entry: handle normalize_path errors, 2016-10-03) regressed the handling of relative paths in the GIT_ALTERNATE_OBJECT_DIRECTORIES variable. It's not entirely clear this was ever meant to work, but it _has_ worked for several years, so this commit restores the original behavior. When we get a path in GIT_ALTERNATE_OBJECT_DIRECTORIES, we add it the path to the list of alternate object directories as if it were found in objects/info/alternates, but with one difference: we do not provide the link_alt_odb_entry() function with a base for relative paths. That function doesn't turn it into an absolute path, and we end up feeding the relative path to the strbuf_normalize_path() function. Most relative paths break out of the top-level directory (e.g., "../foo.git/objects"), and thus normalizing fails. Prior to 670c359da, we simply ignored the error, and due to the way normalize_path_copy() was implemented it happened to return the original path in this case. We then accessed the alternate objects using this relative path. By storing the relative path in the alt_odb list, the path is relative to wherever we happen to be at the time we do an object lookup. That means we look from $GIT_DIR in a bare repository, and from the top of the worktree in a non-bare repository. If this were being designed from scratch, it would make sense to pick a stable location (probably $GIT_DIR, or even the object directory) and use that as the relative base, turning the result into an absolute path. However, given the history, at this point the minimal fix is to match the pre-670c359da behavior. We can do this simply by ignoring the error when we have no relative base and using the original value (which we now reliably have, thanks to strbuf_normalize_path()). That still leaves us with a relative path that foils our duplicate detection, and m
Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
On Mon, Nov 7, 2016 at 4:30 PM, Jeff King wrote: > On Mon, Nov 07, 2016 at 03:42:35PM -0800, Bryan Turner wrote: > >> > @@ -335,7 +340,9 @@ static void link_alt_odb_entries(const char *alt, int >> > len, int sep, >> > } >> > >> > strbuf_add_absolute_path(&objdirbuf, get_object_directory()); >> > - normalize_path_copy(objdirbuf.buf, objdirbuf.buf); >> > + if (strbuf_normalize_path(&objdirbuf) < 0) >> > + die("unable to normalize object directory: %s", >> > + objdirbuf.buf); >> >> This appears to break the ability to use a relative alternate via an >> environment variable, since normalize_path_copy_len is explicitly >> documented "Returns failure (non-zero) if a ".." component appears as >> first path" > > That shouldn't happen, though, because the path we are normalizing has > been converted to an absolute path via strbuf_add_absolute_path. IOW, if > your relative path is "../../../foo", we should be feeding something > like "/path/to/repo/.git/objects/../../../foo" and normalizing that to > "/path/to/foo". > > But in your example, you see: > > error: unable to normalize alternate object path: ../0/objects > > which cannot come from the code above, which calls die(). It should be > coming from the call in link_alt_odb_entry(). Ah, of course. I should have looked more closely at the call. > No, I had no intention of disallowing relative alternates (and as you > noticed, a commit from the same series actually expands the use of > relative alternates). My use has been entirely within info/alternates > files, though, not via the environment. > > As I said, I'm not sure this was ever meant to work, but as far as I can > tell it mostly _has_ worked, modulo some quirks. So I think we should > consider it a regression for it to stop working in v2.11. > > The obvious solution is one of: > > 1. Stop calling normalize() at all when we do not have a relative base > and the path is not absolute. This restores the original quirky > behavior (plus makes the "foo/../../bar" case work). > > If we want to do the minimum before releasing v2.11, it would be > that. I'm not sure it leaves things in a very sane state, but at > least v2.11 does no harm, and anybody who cares can build saner > semantics for v2.12. > > 2. Fix it for real. Pass a real relative_base when linking from the > environment. The question is: what is the correct relative base? I > suppose "getcwd() at the time we prepare the alt odb" is > reasonable, and would behave similarly to older versions ($GIT_DIR > for bare repos, top of the working tree otherwise). > > If we were designing from scratch, I think saner semantics would > probably be always relative from $GIT_DIR, or even always relative > from the object directory (i.e., behave as if the paths were given > in objects/info/alternates). But that breaks compatibility with > older versions. If we are treating this as a regression, it is not > very friendly to say "you are still broken, but you might just need > to add an extra '..' to your path". > > So I dunno. I guess that inclines me towards (1), as it lets us punt on > the harder question. Is there anything I can do to help? I'm happy to test out changes. I've got a set of ~1,040 tests that verify all sorts of different Git behaviors (those tests flagged this change, for example, and found a regression in git diff-tree in 2.0.2/2.0.3, among other things). I run them on the "newest" patch release for every feature-bearing line of Git from 1.8.x up to 2.10 (currently 1.8.0.3, 1.8.1.5, 1.8.2.3, 1.8.3.4, 1.8.4.5, 1.8.5.6, 1.9.5, 2.0.5, 2.1.4, 2.2.3, 2.3.10, 2.4.11, 2.5.5, 2.6.6, 2.7.4, 2.8.4, 2.9.3 and 2.10.2), and I add in RCs of new as soon as they become available. (I also test Git for Windows; at the moment I've got 1.8.0, 1.8.1.2, 1.8.3, 1.8.4, 1.8.5.2 and 1.9.5.1 from msysgit and 2.3.7.1, 2.4.6, 2.5.3, 2.6.4, 2.7.4, 2.8.4, 2.9.3 and 2.10.2 from Git for Windows. 2.11.0-rc0 on Windows passes my test suite; it looks like it's not tagging the same git/git commit as v2.11.0-rc0 is.) I wish there was an easy way for me to open this up. At the moment, it's something I can really only run in-house, as it were. At the moment I have limited ability to actually try to submit patches myself. I really need to sit down and setup a working development environment for Git. (My current dream, if I could get such an environment running, is to follow up on your git blame-tree work. > > -Peff
Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
On Mon, Nov 07, 2016 at 03:42:35PM -0800, Bryan Turner wrote: > > @@ -335,7 +340,9 @@ static void link_alt_odb_entries(const char *alt, int > > len, int sep, > > } > > > > strbuf_add_absolute_path(&objdirbuf, get_object_directory()); > > - normalize_path_copy(objdirbuf.buf, objdirbuf.buf); > > + if (strbuf_normalize_path(&objdirbuf) < 0) > > + die("unable to normalize object directory: %s", > > + objdirbuf.buf); > > This appears to break the ability to use a relative alternate via an > environment variable, since normalize_path_copy_len is explicitly > documented "Returns failure (non-zero) if a ".." component appears as > first path" That shouldn't happen, though, because the path we are normalizing has been converted to an absolute path via strbuf_add_absolute_path. IOW, if your relative path is "../../../foo", we should be feeding something like "/path/to/repo/.git/objects/../../../foo" and normalizing that to "/path/to/foo". But in your example, you see: error: unable to normalize alternate object path: ../0/objects which cannot come from the code above, which calls die(). It should be coming from the call in link_alt_odb_entry(). I think what is happening is that relative paths via environment variables have always been slightly broken, but happened to mostly work. In prepare_alt_odb(), we call link_alt_odb_entries() with a NULL relative_base. That function does two things with it: - it may unconditionally dereference it for an error message, which would cause a segfault. This is impossible to trigger in practice, though, because the error message is related to the depth, which we know will always be 0 here. - we pass the NULL along to the singular link_alt_odb_entry(). That function only creates an absolute path if given a non-NULL relative_base; otherwise we have always fed the path to normalize_path_copy, which is nonsense for a relative path. So normalize_path_copy() was _always_ returning an error there, but we ignored it and used whatever happened to be left in the buffer anyway. And because of the way normalize_path_copy() is implemented, that happened to be the untouched original string in most cases. But that's mostly an accident. I think it would not be for something like "foo/../../bar", which is technically valid (if done from a relative base that has at least one path component). Moreover, it means we don't have an absolute path to our alternate odb. So the path is taken as relative whenever we do an object lookup, meaning it will behave differently between a bare repository (where we chdir to $GIT_DIR) and one with a working tree (where we are generally in the root of the working tree). It can even behave differently in the same process if we chdir between object lookups. So it did happen to work, but I'm not sure it was planned (and obviously we have no test coverage for it). More on that below. > Other commits, like [1], suggest the ability to use relative paths in > alternates is something still actively developed and enhanced. Is it > intentional that this breaks the ability to use relative alternates? > If this is to be the "new normal", is there any other option when > using environment variables besides using absolute paths? No, I had no intention of disallowing relative alternates (and as you noticed, a commit from the same series actually expands the use of relative alternates). My use has been entirely within info/alternates files, though, not via the environment. As I said, I'm not sure this was ever meant to work, but as far as I can tell it mostly _has_ worked, modulo some quirks. So I think we should consider it a regression for it to stop working in v2.11. The obvious solution is one of: 1. Stop calling normalize() at all when we do not have a relative base and the path is not absolute. This restores the original quirky behavior (plus makes the "foo/../../bar" case work). If we want to do the minimum before releasing v2.11, it would be that. I'm not sure it leaves things in a very sane state, but at least v2.11 does no harm, and anybody who cares can build saner semantics for v2.12. 2. Fix it for real. Pass a real relative_base when linking from the environment. The question is: what is the correct relative base? I suppose "getcwd() at the time we prepare the alt odb" is reasonable, and would behave similarly to older versions ($GIT_DIR for bare repos, top of the working tree otherwise). If we were designing from scratch, I think saner semantics would probably be always relative from $GIT_DIR, or even always relative from the object directory (i.e., behave as if the paths were given in objects/info/alternates). But that breaks compatibility with older versions. If we are treating this as a regression, it is not very friendly to sa
Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
On Mon, Oct 3, 2016 at 1:34 PM, Jeff King wrote: > When we add a new alternate to the list, we try to normalize > out any redundant "..", etc. However, we do not look at the > return value of normalize_path_copy(), and will happily > continue with a path that could not be normalized. Worse, > the normalizing process is done in-place, so we are left > with whatever half-finished working state the normalizing > function was in. > > @@ -335,7 +340,9 @@ static void link_alt_odb_entries(const char *alt, int > len, int sep, > } > > strbuf_add_absolute_path(&objdirbuf, get_object_directory()); > - normalize_path_copy(objdirbuf.buf, objdirbuf.buf); > + if (strbuf_normalize_path(&objdirbuf) < 0) > + die("unable to normalize object directory: %s", > + objdirbuf.buf); This appears to break the ability to use a relative alternate via an environment variable, since normalize_path_copy_len is explicitly documented "Returns failure (non-zero) if a ".." component appears as first path" For example, when trying to run a rev-list over commits in two repositories using GIT_ALTERNATE_OBJECT_DIRECTORIES, in 2.10.x and prior the following command works. I know the alternate worked previously because I'm passing a commit that does not exist in the repository I'm running the command in; it only exists in a repository linked by alternate, as shown by the "fatal: bad object" when the alternates are rejected. Before, using Git 2.7.4 (but I've verified this behavior through to and including 2.10.2): bturner@elysoun /tmp/1478561282706-0/shared/data/repositories/3 $ GIT_ALTERNATE_OBJECT_DIRECTORIES=../0/objects:../1/objects git rev-list --format="%H" 2d8897c9ac29ce42c3442cf80ac977057045e7f6 74de5497dfca9731e455d60552f9a8906e5dc1ac ^6053a1eaa1c009dd11092d09a72f3c41af1b59ad ^017caf31eca7c46eb3d1800fcac431cfa7147a01 -- commit 74de5497dfca9731e455d60552f9a8906e5dc1ac 74de5497dfca9731e455d60552f9a8906e5dc1ac commit 3528cf690cb37f6adb85b7bd40cc7a6118d4b598 3528cf690cb37f6adb85b7bd40cc7a6118d4b598 commit 2d8897c9ac29ce42c3442cf80ac977057045e7f6 2d8897c9ac29ce42c3442cf80ac977057045e7f6 commit 9c05f43f859375e392d90d23a13717c16d0fdcda 9c05f43f859375e392d90d23a13717c16d0fdcda Now, using Git 2.11.0-rc0 bturner@elysoun /tmp/1478561282706-0/shared/data/repositories/3 $ GIT_ALTERNATE_OBJECT_DIRECTORIES=../0/objects:../1/objects /opt/git/2.11.0-rc0/bin/git rev-list --format="%H" 2d8897c9ac29ce42c3442cf80ac977057045e7f6 74de5497dfca9731e455d60552f9a8906e5dc1ac ^6053a1eaa1c009dd11092d09a72f3c41af1b59ad ^017caf31eca7c46eb3d1800fcac431cfa7147a01 -- error: unable to normalize alternate object path: ../0/objects error: unable to normalize alternate object path: ../1/objects fatal: bad object 74de5497dfca9731e455d60552f9a8906e5dc1ac Other commits, like [1], suggest the ability to use relative paths in alternates is something still actively developed and enhanced. Is it intentional that this breaks the ability to use relative alternates? If this is to be the "new normal", is there any other option when using environment variables besides using absolute paths? Best regards, Bryan Turner [1]: https://github.com/git/git/commit/087b6d584062f5b704356286d6445bcc84d686fb -- Also newly tagged in 2.11.0-rc0
Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
On Wed, Oct 05, 2016 at 08:47:29PM +0200, René Scharfe wrote: > > Instead, let's provide a strbuf helper that does an in-place > > normalize, but restores the original contents on error. This > > uses a second buffer under the hood, which is slightly less > > efficient, but this is not a performance-critical code path. > > Hmm, in-place functions are quite rare in the strbuf collection. It looks > like a good fit for the two callers and makes sense in general, though. Yeah, I almost wrote "strbuf_add_normalized_path()" instead. But then the callers end up having to do the allocate-and-swap thing themselves. And I think we're still set in the future to add that if somebody wants it (and we can then implement the in-place version in terms of it). Another alternative is to observe that the strbuf is generally used in the first place to make the path absolute. So another interface is perhaps something like: strbuf_add_path(struct strbuf *sb, const char *path, const char *relative_base) { struct strbuf scratch = STRBUF_INIT; int ret; if (is_absolute_path(path)) strbuf_grow(sb, strlen(path)); else { if (relative_path) strbuf_addstr(&scratch, path); else { if (strbuf_getcwd(&scratch)) return -1; } strbuf_addch(&scratch, '/'); strbuf_addstr(&scratch, path); strbuf_grow(sb, scratch.len); path = scratch.buf; } ret = normalize_path_copy(sb.buf + sb.len, path); strbuf_release(&scratch); return ret; } I don't think its worth the complexity of interface for the spots in this series, but maybe there are other places that could use it. I'll leave that to somebody else to explore if the ywant to. -Peff
Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
Am 03.10.2016 um 22:34 schrieb Jeff King: When we add a new alternate to the list, we try to normalize out any redundant "..", etc. However, we do not look at the return value of normalize_path_copy(), and will happily continue with a path that could not be normalized. Worse, the normalizing process is done in-place, so we are left with whatever half-finished working state the normalizing function was in. Fortunately, this cannot cause us to read past the end of our buffer, as that working state will always leave the NUL from the original path in place. And we do tend to notice problems when we check is_directory() on the path. But you can see the nonsense that we feed to is_directory with an entry like: this/../../is/../../way/../../too/../../deep/../../to/../../resolve in your objects/info/alternates, which yields: error: object directory /to/e/deep/too/way//ects/this/../../is/../../way/../../too/../../deep/../../to/../../resolve does not exist; check .git/objects/info/alternates. We can easily fix this just by checking the return value. But that makes it hard to generate a good error message, since we're normalizing in-place and our input value has been overwritten by cruft. Instead, let's provide a strbuf helper that does an in-place normalize, but restores the original contents on error. This uses a second buffer under the hood, which is slightly less efficient, but this is not a performance-critical code path. Hmm, in-place functions are quite rare in the strbuf collection. It looks like a good fit for the two callers and makes sense in general, though.
Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
Jeff King writes: > +int strbuf_normalize_path(struct strbuf *src) > +{ > + struct strbuf dst = STRBUF_INIT; > + > + strbuf_grow(&dst, src->len); > + if (normalize_path_copy(dst.buf, src->buf) < 0) { > + strbuf_release(&dst); > + return -1; > + } > + > + /* > + * normalize_path does not tell us the new length, so we have to > + * compute it by looking for the new NUL it placed > + */ > + strbuf_setlen(&dst, strlen(dst.buf)); > + strbuf_swap(src, &dst); > + strbuf_release(&dst); > + return 0; > +} Makes sense.
Re: [PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
On Mon, Oct 3, 2016 at 1:34 PM, Jeff King wrote: > When we add a new alternate to the list, we try to normalize > out any redundant "..", etc. However, we do not look at the > return value of normalize_path_copy(), and will happily > continue with a path that could not be normalized. Worse, > the normalizing process is done in-place, so we are left > with whatever half-finished working state the normalizing > function was in. > > Fortunately, this cannot cause us to read past the end of > our buffer, as that working state will always leave the > NUL from the original path in place. And we do tend to > notice problems when we check is_directory() on the path. > But you can see the nonsense that we feed to is_directory > with an entry like: > > this/../../is/../../way/../../too/../../deep/../../to/../../resolve > > in your objects/info/alternates, which yields: > > error: object directory > > /to/e/deep/too/way//ects/this/../../is/../../way/../../too/../../deep/../../to/../../resolve > does not exist; check .git/objects/info/alternates. > Yikes, that doesn't seem helpful. > We can easily fix this just by checking the return value. > But that makes it hard to generate a good error message, > since we're normalizing in-place and our input value has > been overwritten by cruft. Right. Definitely want to check the return value here... > > Instead, let's provide a strbuf helper that does an in-place > normalize, but restores the original contents on error. This > uses a second buffer under the hood, which is slightly less > efficient, but this is not a performance-critical code path. > I agree, I don't think this duplication is really a big deal, since it helps ensure that the function doesn't modify its arguments on error. > The strbuf helper can also properly set the "len" parameter > of the strbuf before returning. Just doing: > > normalize_path_copy(buf.buf, buf.buf); > > will shorten the string, but leave buf.len at the original > length. That may be confusing to later code which uses the > strbuf. > Makes sense here. Properly setting the length will help prevent future issues. Thanks, Jake > Signed-off-by: Jeff King > --- > sha1_file.c | 11 +-- > strbuf.c| 20 > strbuf.h| 8 > 3 files changed, 37 insertions(+), 2 deletions(-) > > diff --git a/sha1_file.c b/sha1_file.c > index b9c1fa3..68571bd 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -263,7 +263,12 @@ static int link_alt_odb_entry(const char *entry, const > char *relative_base, > } > strbuf_addstr(&pathbuf, entry); > > - normalize_path_copy(pathbuf.buf, pathbuf.buf); > + if (strbuf_normalize_path(&pathbuf) < 0) { > + error("unable to normalize alternate object path: %s", > + pathbuf.buf); > + strbuf_release(&pathbuf); > + return -1; > + } > > pfxlen = strlen(pathbuf.buf); > > @@ -335,7 +340,9 @@ static void link_alt_odb_entries(const char *alt, int > len, int sep, > } > > strbuf_add_absolute_path(&objdirbuf, get_object_directory()); > - normalize_path_copy(objdirbuf.buf, objdirbuf.buf); > + if (strbuf_normalize_path(&objdirbuf) < 0) > + die("unable to normalize object directory: %s", > + objdirbuf.buf); > > alt_copy = xmemdupz(alt, len); > string_list_split_in_place(&entries, alt_copy, sep, -1); > diff --git a/strbuf.c b/strbuf.c > index b839be4..8fec657 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -870,3 +870,23 @@ void strbuf_stripspace(struct strbuf *sb, int > skip_comments) > > strbuf_setlen(sb, j); > } > + > +int strbuf_normalize_path(struct strbuf *src) > +{ > + struct strbuf dst = STRBUF_INIT; > + > + strbuf_grow(&dst, src->len); > + if (normalize_path_copy(dst.buf, src->buf) < 0) { > + strbuf_release(&dst); > + return -1; > + } > + > + /* > +* normalize_path does not tell us the new length, so we have to > +* compute it by looking for the new NUL it placed > +*/ And we can't correctly set the length inside normalize_path_copy because it just takes C strings directly and not actually a strbuf. Ok so it makes sense that we have to set it here. Thanks, Jake > + strbuf_setlen(&dst, strlen(dst.buf)); > + strbuf_swap(src, &dst); > + strbuf_release(&dst); > + return 0; > +} > diff --git a/strbuf.h b/strbuf.h > index ba8d5f1..2262b12 100644 > --- a/strbuf.h > +++ b/strbuf.h > @@ -443,6 +443,14 @@ extern int strbuf_getcwd(struct strbuf *sb); > */ > extern void strbuf_add_absolute_path(struct strbuf *sb, const char *path); >
[PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
When we add a new alternate to the list, we try to normalize out any redundant "..", etc. However, we do not look at the return value of normalize_path_copy(), and will happily continue with a path that could not be normalized. Worse, the normalizing process is done in-place, so we are left with whatever half-finished working state the normalizing function was in. Fortunately, this cannot cause us to read past the end of our buffer, as that working state will always leave the NUL from the original path in place. And we do tend to notice problems when we check is_directory() on the path. But you can see the nonsense that we feed to is_directory with an entry like: this/../../is/../../way/../../too/../../deep/../../to/../../resolve in your objects/info/alternates, which yields: error: object directory /to/e/deep/too/way//ects/this/../../is/../../way/../../too/../../deep/../../to/../../resolve does not exist; check .git/objects/info/alternates. We can easily fix this just by checking the return value. But that makes it hard to generate a good error message, since we're normalizing in-place and our input value has been overwritten by cruft. Instead, let's provide a strbuf helper that does an in-place normalize, but restores the original contents on error. This uses a second buffer under the hood, which is slightly less efficient, but this is not a performance-critical code path. The strbuf helper can also properly set the "len" parameter of the strbuf before returning. Just doing: normalize_path_copy(buf.buf, buf.buf); will shorten the string, but leave buf.len at the original length. That may be confusing to later code which uses the strbuf. Signed-off-by: Jeff King --- sha1_file.c | 11 +-- strbuf.c| 20 strbuf.h| 8 3 files changed, 37 insertions(+), 2 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index b9c1fa3..68571bd 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -263,7 +263,12 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, } strbuf_addstr(&pathbuf, entry); - normalize_path_copy(pathbuf.buf, pathbuf.buf); + if (strbuf_normalize_path(&pathbuf) < 0) { + error("unable to normalize alternate object path: %s", + pathbuf.buf); + strbuf_release(&pathbuf); + return -1; + } pfxlen = strlen(pathbuf.buf); @@ -335,7 +340,9 @@ static void link_alt_odb_entries(const char *alt, int len, int sep, } strbuf_add_absolute_path(&objdirbuf, get_object_directory()); - normalize_path_copy(objdirbuf.buf, objdirbuf.buf); + if (strbuf_normalize_path(&objdirbuf) < 0) + die("unable to normalize object directory: %s", + objdirbuf.buf); alt_copy = xmemdupz(alt, len); string_list_split_in_place(&entries, alt_copy, sep, -1); diff --git a/strbuf.c b/strbuf.c index b839be4..8fec657 100644 --- a/strbuf.c +++ b/strbuf.c @@ -870,3 +870,23 @@ void strbuf_stripspace(struct strbuf *sb, int skip_comments) strbuf_setlen(sb, j); } + +int strbuf_normalize_path(struct strbuf *src) +{ + struct strbuf dst = STRBUF_INIT; + + strbuf_grow(&dst, src->len); + if (normalize_path_copy(dst.buf, src->buf) < 0) { + strbuf_release(&dst); + return -1; + } + + /* +* normalize_path does not tell us the new length, so we have to +* compute it by looking for the new NUL it placed +*/ + strbuf_setlen(&dst, strlen(dst.buf)); + strbuf_swap(src, &dst); + strbuf_release(&dst); + return 0; +} diff --git a/strbuf.h b/strbuf.h index ba8d5f1..2262b12 100644 --- a/strbuf.h +++ b/strbuf.h @@ -443,6 +443,14 @@ extern int strbuf_getcwd(struct strbuf *sb); */ extern void strbuf_add_absolute_path(struct strbuf *sb, const char *path); + +/** + * Normalize in-place the path contained in the strbuf. See + * normalize_path_copy() for details. If an error occurs, the contents of "sb" + * are left untouched, and -1 is returned. + */ +extern int strbuf_normalize_path(struct strbuf *sb); + /** * Strip whitespace from a buffer. The second parameter controls if * comments are considered contents to be removed or not. -- 2.10.0.618.g82cc264