Re: [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths
On 09/28/2012 12:51 AM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: longest_ancestor_length() relies on a textual comparison of directory parts to find the part of path that overlaps with one of the paths in prefix_list. But this doesn't work if any of the prefixes involves a symbolic link, because the directories will look different even though they might logically refer to the same directory. So canonicalize the paths listed in prefix_list using real_path_if_valid() before trying to find matches. path is already in canonical form, so doesn't need to be canonicalized again. This fixes some problems with using GIT_CEILING_DIRECTORIES that contains paths involving symlinks, including t4035 if run with --root set to a path involving symlinks. Remove a number of tests of longest_ancestor_length(). It is awkward to test longest_ancestor_length() now, because its new path normalization behavior depends on the contents of the whole filesystem. But we can live without the tests, because longest_ancestor_length() is now built of reusable components that are themselves tested separately: string_list_split(), string_list_longest_prefix(), and real_path_if_valid(). Errr, components may be correct but the way to combine and construct could go faulty, so... I don't see a realistic alternative. Testing real_path() is itself is already quite awkward (see t0060), so testing longest_ancestor_length() would be even more so. Of course, the GIT_CEILING_DIRECTORIES tests indirectly test longest_ancestor_length(), though not systematically. If you have a better suggestion, please let me know. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- path.c| 17 -- t/t0060-path-utils.sh | 64 --- 2 files changed, 10 insertions(+), 71 deletions(-) diff --git a/path.c b/path.c index 5cace83..981bb06 100644 --- a/path.c +++ b/path.c @@ -570,22 +570,25 @@ int normalize_path_copy(char *dst, const char *src) static int normalize_path_callback(struct string_list_item *item, void *cb_data) { -char buf[PATH_MAX+2]; +char *buf; const char *ceil = item-string; -int len = strlen(ceil); +const char *realpath; +int len; -if (len == 0 || len PATH_MAX || !is_absolute_path(ceil)) +if (!*ceil || !is_absolute_path(ceil)) return 0; -memcpy(buf, ceil, len+1); -if (normalize_path_copy(buf, buf) 0) +realpath = real_path_if_valid(ceil); +if (!realpath) return 0; -len = strlen(buf); +len = strlen(realpath); +buf = xmalloc(len + 2); /* Leave space for possible trailing slash */ +strcpy(buf, realpath); if (len == 0 || buf[len-1] != '/') { buf[len++] = '/'; buf[len++] = '\0'; } Nice. I just noticed that the second len++ in the final if is misleading. I will fix that in v2. Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths
Michael Haggerty mhag...@alum.mit.edu writes: longest_ancestor_length() relies on a textual comparison of directory parts to find the part of path that overlaps with one of the paths in prefix_list. But this doesn't work if any of the prefixes involves a symbolic link, because the directories will look different even though they might logically refer to the same directory. So canonicalize the paths listed in prefix_list using real_path_if_valid() before trying to find matches. path is already in canonical form, so doesn't need to be canonicalized again. This fixes some problems with using GIT_CEILING_DIRECTORIES that contains paths involving symlinks, including t4035 if run with --root set to a path involving symlinks. Remove a number of tests of longest_ancestor_length(). It is awkward to test longest_ancestor_length() now, because its new path normalization behavior depends on the contents of the whole filesystem. But we can live without the tests, because longest_ancestor_length() is now built of reusable components that are themselves tested separately: string_list_split(), string_list_longest_prefix(), and real_path_if_valid(). Errr, components may be correct but the way to combine and construct could go faulty, so... Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- path.c| 17 -- t/t0060-path-utils.sh | 64 --- 2 files changed, 10 insertions(+), 71 deletions(-) diff --git a/path.c b/path.c index 5cace83..981bb06 100644 --- a/path.c +++ b/path.c @@ -570,22 +570,25 @@ int normalize_path_copy(char *dst, const char *src) static int normalize_path_callback(struct string_list_item *item, void *cb_data) { - char buf[PATH_MAX+2]; + char *buf; const char *ceil = item-string; - int len = strlen(ceil); + const char *realpath; + int len; - if (len == 0 || len PATH_MAX || !is_absolute_path(ceil)) + if (!*ceil || !is_absolute_path(ceil)) return 0; - memcpy(buf, ceil, len+1); - if (normalize_path_copy(buf, buf) 0) + realpath = real_path_if_valid(ceil); + if (!realpath) return 0; - len = strlen(buf); + len = strlen(realpath); + buf = xmalloc(len + 2); /* Leave space for possible trailing slash */ + strcpy(buf, realpath); if (len == 0 || buf[len-1] != '/') { buf[len++] = '/'; buf[len++] = '\0'; } Nice. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths
longest_ancestor_length() relies on a textual comparison of directory parts to find the part of path that overlaps with one of the paths in prefix_list. But this doesn't work if any of the prefixes involves a symbolic link, because the directories will look different even though they might logically refer to the same directory. So canonicalize the paths listed in prefix_list using real_path_if_valid() before trying to find matches. path is already in canonical form, so doesn't need to be canonicalized again. This fixes some problems with using GIT_CEILING_DIRECTORIES that contains paths involving symlinks, including t4035 if run with --root set to a path involving symlinks. Remove a number of tests of longest_ancestor_length(). It is awkward to test longest_ancestor_length() now, because its new path normalization behavior depends on the contents of the whole filesystem. But we can live without the tests, because longest_ancestor_length() is now built of reusable components that are themselves tested separately: string_list_split(), string_list_longest_prefix(), and real_path_if_valid(). Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- path.c| 17 -- t/t0060-path-utils.sh | 64 --- 2 files changed, 10 insertions(+), 71 deletions(-) diff --git a/path.c b/path.c index 5cace83..981bb06 100644 --- a/path.c +++ b/path.c @@ -570,22 +570,25 @@ int normalize_path_copy(char *dst, const char *src) static int normalize_path_callback(struct string_list_item *item, void *cb_data) { - char buf[PATH_MAX+2]; + char *buf; const char *ceil = item-string; - int len = strlen(ceil); + const char *realpath; + int len; - if (len == 0 || len PATH_MAX || !is_absolute_path(ceil)) + if (!*ceil || !is_absolute_path(ceil)) return 0; - memcpy(buf, ceil, len+1); - if (normalize_path_copy(buf, buf) 0) + realpath = real_path_if_valid(ceil); + if (!realpath) return 0; - len = strlen(buf); + len = strlen(realpath); + buf = xmalloc(len + 2); /* Leave space for possible trailing slash */ + strcpy(buf, realpath); if (len == 0 || buf[len-1] != '/') { buf[len++] = '/'; buf[len++] = '\0'; } free(item-string); - item-string = xstrdup(buf); + item-string = buf; return 1; } diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 4ef2345..c97bbf2 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -12,28 +12,6 @@ norm_path() { test \\$(test-path-utils normalize_path_copy '$1')\ = '$2' } -# On Windows, we are using MSYS's bash, which mangles the paths. -# Absolute paths are anchored at the MSYS installation directory, -# which means that the path / accounts for this many characters: -rootoff=$(test-path-utils normalize_path_copy / | wc -c) -# Account for the trailing LF: -if test $rootoff = 2; then - rootoff=# we are on Unix -else - rootoff=$(($rootoff-1)) -fi - -ancestor() { - # We do some math with the expected ancestor length. - expected=$3 - if test -n $rootoff test x$expected != x-1; then - expected=$(($expected+$rootoff)) - fi - test_expect_success longest ancestor: $1 $2 = $expected \ - actual=\$(test-path-utils longest_ancestor_length '$1' '$2') -test \\$actual\ = '$expected' -} - # Absolute path tests must be skipped on Windows because due to path mangling # the test program never sees a POSIX-style absolute path case $(uname -s) in @@ -93,48 +71,6 @@ norm_path /d1/s1//../s2/../../d2 /d2 POSIX norm_path /d1/.../d2 /d1/.../d2 POSIX norm_path /d1/..././../d2 /d1/d2 POSIX -ancestor / -1 -ancestor / / -1 -ancestor /foo -1 -ancestor /foo : -1 -ancestor /foo ::. -1 -ancestor /foo ::..:: -1 -ancestor /foo / 0 -ancestor /foo /fo -1 -ancestor /foo /foo -1 -ancestor /foo /foo/ -1 -ancestor /foo /bar -1 -ancestor /foo /bar/ -1 -ancestor /foo /foo/bar -1 -ancestor /foo /foo:/bar/ -1 -ancestor /foo /foo/:/bar/ -1 -ancestor /foo /foo::/bar/ -1 -ancestor /foo /:/foo:/bar/ 0 -ancestor /foo /foo:/:/bar/ 0 -ancestor /foo /:/bar/:/foo 0 -ancestor /foo/bar -1 -ancestor /foo/bar / 0 -ancestor /foo/bar /fo -1 -ancestor /foo/bar foo -1 -ancestor /foo/bar /foo 4 -ancestor /foo/bar /foo/ 4 -ancestor /foo/bar /foo/ba -1 -ancestor /foo/bar /:/fo 0 -ancestor /foo/bar /foo:/foo/ba 4 -ancestor /foo/bar /bar -1 -ancestor /foo/bar /bar/ -1 -ancestor /foo/bar /fo: -1 -ancestor /foo/bar :/fo -1 -ancestor /foo/bar /foo:/bar/ 4 -ancestor /foo/bar /:/foo:/bar/ 4 -ancestor /foo/bar /foo:/:/bar/ 4 -ancestor /foo/bar /:/bar/:/fo 0 -ancestor /foo/bar /:/bar/ 0 -ancestor /foo/bar .:/foo/. 4 -ancestor /foo/bar .:/foo/.:.: 4 -ancestor /foo/bar /foo/./:.:/bar 4 -ancestor /foo/bar .:/bar -1 - test_expect_success 'strip_path_suffix' ' test c:/msysgit