Re: [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths

2012-09-28 Thread Michael Haggerty
On 09/28/2012 12:51 AM, Junio C Hamano wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:
 
 longest_ancestor_length() relies on a textual comparison of directory
 parts to find the part of path that overlaps with one of the paths in
 prefix_list.  But this doesn't work if any of the prefixes involves a
 symbolic link, because the directories will look different even though
 they might logically refer to the same directory.  So canonicalize the
 paths listed in prefix_list using real_path_if_valid() before trying
 to find matches.

 path is already in canonical form, so doesn't need to be canonicalized
 again.

 This fixes some problems with using GIT_CEILING_DIRECTORIES that
 contains paths involving symlinks, including t4035 if run with --root
 set to a path involving symlinks.

 Remove a number of tests of longest_ancestor_length().  It is awkward
 to test longest_ancestor_length() now, because its new path
 normalization behavior depends on the contents of the whole
 filesystem.  But we can live without the tests, because
 longest_ancestor_length() is now built of reusable components that are
 themselves tested separately: string_list_split(),
 string_list_longest_prefix(), and real_path_if_valid().
 
 Errr, components may be correct but the way to combine and construct
 could go faulty, so...

I don't see a realistic alternative.  Testing real_path() itself is
already quite awkward (see t0060), so testing longest_ancestor_length()
would be even more so.  Of course, the GIT_CEILING_DIRECTORIES tests
indirectly test longest_ancestor_length(), though not systematically.
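
To illustrate why such a test depends on the filesystem, here is a
minimal sketch.  It uses POSIX realpath() as a stand-in for git's
real_path_if_valid(), and same_after_resolving() is a hypothetical
helper that is not part of this series:

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not git code: return 1 if a and b name the same
 * existing location once symlinks are resolved, 0 if they resolve to
 * different locations, and -1 if either path cannot be resolved.
 */
static int same_after_resolving(const char *a, const char *b)
{
	char *ra = realpath(a, NULL);	/* malloc'd canonical path, or NULL */
	char *rb = realpath(b, NULL);
	int result = (ra && rb) ? !strcmp(ra, rb) : -1;

	free(ra);
	free(rb);
	return result;
}
```

The answer depends entirely on what exists on disk: a test has to
create a real directory and a symlink to it before strcmp() on the two
names and same_after_resolving() can be seen to disagree.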

If you have a better suggestion, please let me know.

 Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
 ---
  path.c| 17 --
  t/t0060-path-utils.sh | 64 
 ---
  2 files changed, 10 insertions(+), 71 deletions(-)

 diff --git a/path.c b/path.c
 index 5cace83..981bb06 100644
 --- a/path.c
 +++ b/path.c
 @@ -570,22 +570,25 @@ int normalize_path_copy(char *dst, const char *src)
  
  static int normalize_path_callback(struct string_list_item *item, void *cb_data)
  {
 -	char buf[PATH_MAX+2];
 +	char *buf;
  	const char *ceil = item->string;
 -	int len = strlen(ceil);
 +	const char *realpath;
 +	int len;
  
 -	if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
 +	if (!*ceil || !is_absolute_path(ceil))
  		return 0;
 -	memcpy(buf, ceil, len+1);
 -	if (normalize_path_copy(buf, buf) < 0)
 +	realpath = real_path_if_valid(ceil);
 +	if (!realpath)
  		return 0;
 -	len = strlen(buf);
 +	len = strlen(realpath);
 +	buf = xmalloc(len + 2); /* Leave space for possible trailing slash */
 +	strcpy(buf, realpath);
  	if (len == 0 || buf[len-1] != '/') {
  		buf[len++] = '/';
  		buf[len++] = '\0';
  	}
 
 Nice.

I just noticed that the second len++ in the final if is misleading:
it leaves len one greater than the length of the string.  I will fix
that in v2.
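
For illustration only, a minimal sketch of how the trailing-slash step
can be written without the extra increment.  This is not the actual v2
code: dup_with_trailing_slash() is a hypothetical name, and plain
malloc() stands in for git's xmalloc():

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: return a newly allocated
 * copy of path that is guaranteed to end in '/'.  The caller frees it.
 * Returns NULL if the allocation fails.
 */
static char *dup_with_trailing_slash(const char *path)
{
	size_t len = strlen(path);
	char *buf = malloc(len + 2);	/* room for a '/' and the NUL */

	if (!buf)
		return NULL;
	memcpy(buf, path, len + 1);	/* copies the terminating NUL too */
	if (len == 0 || buf[len - 1] != '/') {
		buf[len++] = '/';
		buf[len] = '\0';	/* no increment: len remains the string length */
	}
	return buf;
}
```

With this shape, len always equals strlen(buf) when the if block ends,
so later code can keep using it safely.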

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths

2012-09-27 Thread Junio C Hamano
Michael Haggerty mhag...@alum.mit.edu writes:

 longest_ancestor_length() relies on a textual comparison of directory
 parts to find the part of path that overlaps with one of the paths in
 prefix_list.  But this doesn't work if any of the prefixes involves a
 symbolic link, because the directories will look different even though
 they might logically refer to the same directory.  So canonicalize the
 paths listed in prefix_list using real_path_if_valid() before trying
 to find matches.

 path is already in canonical form, so doesn't need to be canonicalized
 again.

 This fixes some problems with using GIT_CEILING_DIRECTORIES that
 contains paths involving symlinks, including t4035 if run with --root
 set to a path involving symlinks.

 Remove a number of tests of longest_ancestor_length().  It is awkward
 to test longest_ancestor_length() now, because its new path
 normalization behavior depends on the contents of the whole
 filesystem.  But we can live without the tests, because
 longest_ancestor_length() is now built of reusable components that are
 themselves tested separately: string_list_split(),
 string_list_longest_prefix(), and real_path_if_valid().

Errr, components may be correct but the way to combine and construct
could go faulty, so...

 Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
 ---
  path.c| 17 --
  t/t0060-path-utils.sh | 64 
 ---
  2 files changed, 10 insertions(+), 71 deletions(-)

 diff --git a/path.c b/path.c
 index 5cace83..981bb06 100644
 --- a/path.c
 +++ b/path.c
 @@ -570,22 +570,25 @@ int normalize_path_copy(char *dst, const char *src)
  
  static int normalize_path_callback(struct string_list_item *item, void *cb_data)
  {
 -	char buf[PATH_MAX+2];
 +	char *buf;
  	const char *ceil = item->string;
 -	int len = strlen(ceil);
 +	const char *realpath;
 +	int len;
  
 -	if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
 +	if (!*ceil || !is_absolute_path(ceil))
  		return 0;
 -	memcpy(buf, ceil, len+1);
 -	if (normalize_path_copy(buf, buf) < 0)
 +	realpath = real_path_if_valid(ceil);
 +	if (!realpath)
  		return 0;
 -	len = strlen(buf);
 +	len = strlen(realpath);
 +	buf = xmalloc(len + 2); /* Leave space for possible trailing slash */
 +	strcpy(buf, realpath);
  	if (len == 0 || buf[len-1] != '/') {
  		buf[len++] = '/';
  		buf[len++] = '\0';
  	}

Nice.


[PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths

2012-09-26 Thread Michael Haggerty
longest_ancestor_length() relies on a textual comparison of directory
parts to find the part of path that overlaps with one of the paths in
prefix_list.  But this doesn't work if any of the prefixes involves a
symbolic link, because the directories will look different even though
they might logically refer to the same directory.  So canonicalize the
paths listed in prefix_list using real_path_if_valid() before trying
to find matches.
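
To make the problem concrete, here is a simplified sketch of a purely
textual ancestor search.  It is not git's implementation:
textual_longest_ancestor() is a hypothetical name, and the sketch only
strips trailing slashes instead of fully normalizing each entry (so
entries containing "." or ".." are not handled):

```c
#include <string.h>

/*
 * Hypothetical sketch, not git code: return the length of the longest
 * entry in the colon-separated prefix_list that is textually a proper
 * ancestor directory of path, or -1 if there is none.
 */
static int textual_longest_ancestor(const char *path, const char *prefix_list)
{
	int max_len = -1;
	const char *ceil = prefix_list;

	if (!prefix_list || !strcmp(path, "/"))
		return -1;

	for (;;) {
		const char *colon = strchr(ceil, ':');
		size_t len = colon ? (size_t)(colon - ceil) : strlen(ceil);

		/* Only non-empty, absolute entries are considered. */
		if (len > 0 && ceil[0] == '/') {
			/* Drop trailing slashes so "/foo/" behaves like "/foo". */
			while (len > 0 && ceil[len - 1] == '/')
				len--;
			/* The entry must be followed by '/' in path. */
			if (!strncmp(path, ceil, len) && path[len] == '/' &&
			    (int)len > max_len)
				max_len = (int)len;
		}
		if (!colon)
			break;
		ceil = colon + 1;
	}
	return max_len;
}
```

For path "/foo/bar" and the list "/foo" this returns 4, and for path
"/foo" with the list "/foo" it returns -1.  But if "/link" is a symlink
to "/real", the canonical path "/real/repo" compared against the list
"/link" also yields -1, because the two strings share no textual
prefix.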

path is already in canonical form, so doesn't need to be canonicalized
again.

This fixes some problems with using GIT_CEILING_DIRECTORIES that
contains paths involving symlinks, including t4035 if run with --root
set to a path involving symlinks.

Remove a number of tests of longest_ancestor_length().  It is awkward
to test longest_ancestor_length() now, because its new path
normalization behavior depends on the contents of the whole
filesystem.  But we can live without the tests, because
longest_ancestor_length() is now built of reusable components that are
themselves tested separately: string_list_split(),
string_list_longest_prefix(), and real_path_if_valid().

Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
---
 path.c| 17 --
 t/t0060-path-utils.sh | 64 ---
 2 files changed, 10 insertions(+), 71 deletions(-)

diff --git a/path.c b/path.c
index 5cace83..981bb06 100644
--- a/path.c
+++ b/path.c
@@ -570,22 +570,25 @@ int normalize_path_copy(char *dst, const char *src)
 
 static int normalize_path_callback(struct string_list_item *item, void *cb_data)
 {
-	char buf[PATH_MAX+2];
+	char *buf;
 	const char *ceil = item->string;
-	int len = strlen(ceil);
+	const char *realpath;
+	int len;
 
-	if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
+	if (!*ceil || !is_absolute_path(ceil))
 		return 0;
-	memcpy(buf, ceil, len+1);
-	if (normalize_path_copy(buf, buf) < 0)
+	realpath = real_path_if_valid(ceil);
+	if (!realpath)
 		return 0;
-	len = strlen(buf);
+	len = strlen(realpath);
+	buf = xmalloc(len + 2); /* Leave space for possible trailing slash */
+	strcpy(buf, realpath);
 	if (len == 0 || buf[len-1] != '/') {
 		buf[len++] = '/';
 		buf[len++] = '\0';
 	}
 	free(item->string);
-	item->string = xstrdup(buf);
+	item->string = buf;
 	return 1;
 }
 
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 4ef2345..c97bbf2 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -12,28 +12,6 @@ norm_path() {
 	"test \"\$(test-path-utils normalize_path_copy '$1')\" = '$2'"
 }
 
-# On Windows, we are using MSYS's bash, which mangles the paths.
-# Absolute paths are anchored at the MSYS installation directory,
-# which means that the path / accounts for this many characters:
-rootoff=$(test-path-utils normalize_path_copy / | wc -c)
-# Account for the trailing LF:
-if test $rootoff = 2; then
-	rootoff=	# we are on Unix
-else
-	rootoff=$(($rootoff-1))
-fi
-
-ancestor() {
-	# We do some math with the expected ancestor length.
-	expected=$3
-	if test -n "$rootoff" && test "x$expected" != x-1; then
-		expected=$(($expected+$rootoff))
-	fi
-	test_expect_success "longest ancestor: $1 $2 => $expected" \
-	"actual=\$(test-path-utils longest_ancestor_length '$1' '$2') &&
-	 test \"\$actual\" = '$expected'"
-}
-
 # Absolute path tests must be skipped on Windows because due to path mangling
 # the test program never sees a POSIX-style absolute path
 case $(uname -s) in
@@ -93,48 +71,6 @@ norm_path /d1/s1//../s2/../../d2 /d2 POSIX
 norm_path /d1/.../d2 /d1/.../d2 POSIX
 norm_path /d1/..././../d2 /d1/d2 POSIX
 
-ancestor / "" -1
-ancestor / / -1
-ancestor /foo "" -1
-ancestor /foo : -1
-ancestor /foo ::. -1
-ancestor /foo ::..:: -1
-ancestor /foo / 0
-ancestor /foo /fo -1
-ancestor /foo /foo -1
-ancestor /foo /foo/ -1
-ancestor /foo /bar -1
-ancestor /foo /bar/ -1
-ancestor /foo /foo/bar -1
-ancestor /foo /foo:/bar/ -1
-ancestor /foo /foo/:/bar/ -1
-ancestor /foo /foo::/bar/ -1
-ancestor /foo /:/foo:/bar/ 0
-ancestor /foo /foo:/:/bar/ 0
-ancestor /foo /:/bar/:/foo 0
-ancestor /foo/bar "" -1
-ancestor /foo/bar / 0
-ancestor /foo/bar /fo -1
-ancestor /foo/bar foo -1
-ancestor /foo/bar /foo 4
-ancestor /foo/bar /foo/ 4
-ancestor /foo/bar /foo/ba -1
-ancestor /foo/bar /:/fo 0
-ancestor /foo/bar /foo:/foo/ba 4
-ancestor /foo/bar /bar -1
-ancestor /foo/bar /bar/ -1
-ancestor /foo/bar /fo: -1
-ancestor /foo/bar :/fo -1
-ancestor /foo/bar /foo:/bar/ 4
-ancestor /foo/bar /:/foo:/bar/ 4
-ancestor /foo/bar /foo:/:/bar/ 4
-ancestor /foo/bar /:/bar/:/fo 0
-ancestor /foo/bar /:/bar/ 0
-ancestor /foo/bar .:/foo/. 4
-ancestor /foo/bar .:/foo/.:.: 4
-ancestor /foo/bar /foo/./:.:/bar 4
-ancestor /foo/bar .:/bar -1
-
 test_expect_success 'strip_path_suffix' '
test c:/msysgit