Re: t3010 broken by 2eac2a4

2013-08-23 Thread Eric Sunshine
On Fri, Aug 23, 2013 at 1:36 AM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I sent a patch [1] which resolves the problem, although the solution
 is not especially pretty (due to some ugliness in the existing
 implementation).

 Yeah, thanks.

 I tend to agree with you that fixing the icase callee not to rely
 on having the trailing slash (which is looking past the end of the
 given string), instead of working that breakage around on the
 caller's side like your patch did, would be a better alternative,
 though.

My concern with fixing directory_exists_in_index_icase() to add the
'/' itself was that it would have to copy the string to make space for
the '/', which could be expensive. However, I reworked the code so
that the existing strbufs now get passed to
directory_exists_in_index_icase(), which allows it to add its needed
'/' without duplicating the string. So, the trailing '/' requirement
of directory_exists_in_index_icase() is now a private implementation
detail, placing no burden on the caller.

I'll send the revised patch series later today since the commit
message needs a rewrite. Also, I'd like to try to split the change
into a couple patches -- one to pass around strbufs, which has a noisy
diff, and one to fix the actual bug -- though I fear that splitting
may not be possible.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-23 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 On Fri, Aug 23, 2013 at 1:36 AM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I sent a patch [1] which resolves the problem, although the solution
 is not especially pretty (due to some ugliness in the existing
 implementation).

 Yeah, thanks.

 I tend to agree with you that fixing the icase callee not to rely
 on having the trailing slash (which is looking past the end of the
 given string), instead of working that breakage around on the
 caller's side like your patch did, would be a better alternative,
 though.

 My concern with fixing directory_exists_in_index_icase() to add the
 '/' itself was that it would have to copy the string to make space for
 the '/', which could be expensive. However, I reworked the code so
 that the existing strbufs now get passed to
 directory_exists_in_index_icase(), which allows it to add its needed
 '/' without duplicating the string. So, the trailing '/' requirement
 of directory_exists_in_index_icase() is now a private implementation
 detail, placing no burden on the caller.

When 5102c617 (Add case insensitivity support for directories when
using git status, 2010-10-03) added the directories to the name-hash
with trailing slash, there was only a single name hash table to
which both real cache entries and leading directory prefixes are
registered, so it made some sense to register them with trailing
slashes so that we can tell what kind of entry is being returned.

But since 2092678c (name-hash.c: fix endless loop with
core.ignorecase=true, 2013-02-28), these directory entries that are
not the cache entries are kept track of in a separate hashtable,
which makes me wonder if it still makes sense to register
directories with trailing slashes.

And if we stop doing that (and instead if we shrunk the namelen when
an unconverted caller asks for a name with a trailing slash to see
if a directory exists in the index), wouldn't it automatically fix
the directory_exists_in_index_icase()?  It does not need to assume
that dirname[len] has '/'; after all, it may not even be a valid
memory location in the first place.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-23 Thread Jeff King
On Fri, Aug 23, 2013 at 10:15:55AM -0700, Junio C Hamano wrote:

 When 5102c617 (Add case insensitivity support for directories when
 using git status, 2010-10-03) added the directories to the name-hash
 with trailing slash, there was only a single name hash table to
 which both real cache entries and leading directory prefixes are
 registered, so it made some sense to register them with trailing
 slashes so that we can tell what kind of entry is being returned.
 
 But since 2092678c (name-hash.c: fix endless loop with
 core.ignorecase=true, 2013-02-28), these directory entries that are
 not the cache entries are kept track of in a separate hashtable,
 which makes me wonder if it still makes sense to register
 directories with trailing slashes.
 
 And if we stop doing that (and instead if we shrunk the namelen when
 an unconverted caller asks for a name with a trailing slash to see
 if a directory exists in the index), wouldn't it automatically fix
 the directory_exists_in_index_icase()?  It does not need to assume
 that dirname[len] has '/'; after all, it may not even be a valid
 memory location in the first place.

Yeah, I think that is sane overall direction. When I did the sketch that
eventually turned into Karsten's 2092678c, that was one of the goals.
But I did not keep up with his response and the final patch, and I'm not
sure if the slashes still serve some function. So it would definitely
need somebody looking carefully at the current logic.

More details are in this sub-thread:

  http://thread.gmane.org/gmane.comp.version-control.git/215820/focus=216284

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Wed, Aug 21, 2013 at 5:41 PM, Junio C Hamano gits...@pobox.com wrote:
 Brian Gernhardt br...@gernhardtsoftware.com writes:

 With 2eac2a4: ls-files -k: a directory only can be killed if the index has 
 a non-directory applied, t3010 fails test 3 validate git ls-files -k 
 output.  It ends up missing the pathx/ju/nk file.

 OS X 10.8.4
 Xcode 4.6.3
 clang Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)

 Very interesting, as it obviously does not reproduce for me.

I can confirm this failure on OS X, however, I am somewhat confused by
the follow-up t3010 changes in 3c56875176390eee. Are the t3010 changes
supposed to fail without 2eac2a4cc4bdc8d7 applied? For me, on Linux,
the tests succeed whether 2eac2a4cc4bdc8d7 is applied or not. On OS X,
the tests succeed without 2eac2a4cc4bdc8d7 but fail with it applied.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however, I am somewhat confused by
 the follow-up t3010 changes in 3c56875176390eee. Are the t3010 changes
 supposed to fail without 2eac2a4cc4bdc8d7 applied? For me, on Linux,
 the tests succeed whether 2eac2a4cc4bdc8d7 is applied or not. On OS X,
 the tests succeed without 2eac2a4cc4bdc8d7 but fail with it applied.

The 2eac2a4c (ls-files -k: a directory only can be killed if the
index has a non-directory, 2013-08-15) is NOT a correctness fix.

It is an optimization to avoid scanning directories that are known
not to be killed when ls-files -k is asked to list killed
paths. The original code without the patch is correct already; it
just is too inefficient because it scans all the directories.  It is
not surprising if the test added by 3c568751 (t3010: update to
demonstrate ls-files -k optimization pitfalls, 2013-08-15) passes
without 2eac2a4c.

As its log message explains, 3c568751 (t3010: update to demonstrate
ls-files -k optimization pitfalls, 2013-08-15) is to catch a case
where an earlier something like this patch (which is the draft for
2eac2a4c) posted to the list would have broken.  That draft patch
was correct only for the case where the top-level directory is
killed, but was broken when a subdirectory (e.g. pathx/ju) is
killed.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 5:16 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however, I am somewhat confused by
 the follow-up t3010 changes in 3c56875176390eee. Are the t3010 changes
 supposed to fail without 2eac2a4cc4bdc8d7 applied? For me, on Linux,
 the tests succeed whether 2eac2a4cc4bdc8d7 is applied or not. On OS X,
 the tests succeed without 2eac2a4cc4bdc8d7 but fail with it applied.

 The 2eac2a4c (ls-files -k: a directory only can be killed if the
 index has a non-directory, 2013-08-15) is NOT a correctness fix.

 It is an optimization to avoid scanning directories that are known
 not to be killed when ls-files -k is asked to list killed
 paths. The original code without the patch is correct already; it
 just is too inefficient because it scans all the directories.  It is
 not surprising if the test added by 3c568751 (t3010: update to
 demonstrate ls-files -k optimization pitfalls, 2013-08-15) passes
 without 2eac2a4c.

 As its log message explains, 3c568751 (t3010: update to demonstrate
 ls-files -k optimization pitfalls, 2013-08-15) is to catch a case
 where an earlier something like this patch (which is the draft for
 2eac2a4c) posted to the list would have broken.  That draft patch
 was correct only for the case where the top-level directory is
 killed, but was broken when a subdirectory (e.g. pathx/ju) is
 killed.

Thanks for the explanation.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 On Thu, Aug 22, 2013 at 5:16 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however,...

 Thanks for the explanation.

Now, I am curious how it breaks on OS X.

My suspition is that ignore_case may have something to do with it,
but what 2eac2a4c (ls-files -k: a directory only can be killed if
the index has a non-directory, 2013-08-15) uses are the bog-standard
cache_name_exists() and directory_exists_in_index(), so one of these
internal API implementation has trouble on case insensitive
filesystems, perhaps?  I dunno.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 5:32 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 On Thu, Aug 22, 2013 at 5:16 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however,...

 Thanks for the explanation.

 Now, I am curious how it breaks on OS X.

 My suspition is that ignore_case may have something to do with it,
 but what 2eac2a4c (ls-files -k: a directory only can be killed if
 the index has a non-directory, 2013-08-15) uses are the bog-standard
 cache_name_exists() and directory_exists_in_index(), so one of these
 internal API implementation has trouble on case insensitive
 filesystems, perhaps?  I dunno.

That's exactly my suspicion at the moment. It's an obvious difference
between Linux and OS X. I'm just in the process of trying to compare
between the two platforms.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 On Thu, Aug 22, 2013 at 5:32 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 On Thu, Aug 22, 2013 at 5:16 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 I can confirm this failure on OS X, however,...

 Thanks for the explanation.

 Now, I am curious how it breaks on OS X.

 My suspition is that ignore_case may have something to do with it,
 but what 2eac2a4c (ls-files -k: a directory only can be killed if
 the index has a non-directory, 2013-08-15) uses are the bog-standard
 cache_name_exists() and directory_exists_in_index(), so one of these
 internal API implementation has trouble on case insensitive
 filesystems, perhaps?  I dunno.

 That's exactly my suspicion at the moment. It's an obvious difference
 between Linux and OS X. I'm just in the process of trying to compare
 between the two platforms.

Or perhaps de-d_type does not exist?  In such a case, we end up
doing get_index_dtype() via get_dtype(), but in this codepath I
suspect that we do not want to.  We are interested in the type of
the entity on the filesystem.


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 5:43 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:
 On Thu, Aug 22, 2013 at 5:32 PM, Junio C Hamano gits...@pobox.com wrote:
 Now, I am curious how it breaks on OS X.

 My suspition is that ignore_case may have something to do with it,
 but what 2eac2a4c (ls-files -k: a directory only can be killed if
 the index has a non-directory, 2013-08-15) uses are the bog-standard
 cache_name_exists() and directory_exists_in_index(), so one of these
 internal API implementation has trouble on case insensitive
 filesystems, perhaps?  I dunno.

 That's exactly my suspicion at the moment. It's an obvious difference
 between Linux and OS X. I'm just in the process of trying to compare
 between the two platforms.

 Or perhaps de-d_type does not exist?  In such a case, we end up
 doing get_index_dtype() via get_dtype(), but in this codepath I
 suspect that we do not want to.  We are interested in the type of
 the entity on the filesystem.

de-d_type exists on both platforms. get_dtype() is never called.

However, I did discover that treat_path() is being invoked fewer times
on OSX than on Linux. For instance, in the repository created by
t3010, treat_path() is called 19 times on Linux, but only 17 times on
OSX.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 Status update: For the 'pathx' directory created by the t3010 test,
 directory_exists_in_index() returns false on OSX, but true is returned
 on Linux.

Because a regular pathx/ju is in the index at that point, the
correct answer directory_exists_in_index() should give for 'pathx'
is index_directory, not index_nonexistent, I think.



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 7:12 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 Status update: For the 'pathx' directory created by the t3010 test,
 directory_exists_in_index() returns false on OSX, but true is returned
 on Linux.

 Because a regular pathx/ju is in the index at that point, the
 correct answer directory_exists_in_index() should give for 'pathx'
 is index_directory, not index_nonexistent, I think.

directory_exists_in_index() and directory_exists_in_index_icase() are
behaving differently. You can replicate the problem on Linux by
enabling core.ignorecase in the test (sans gmail whitespace damage):

--8--
diff --git a/t/t3010-ls-files-killed-modified.sh b/t/t3010-ls-files-killed-modif
index 3120efd..8c76160 100755
--- a/t/t3010-ls-files-killed-modified.sh
+++ b/t/t3010-ls-files-killed-modified.sh
@@ -89,7 +89,7 @@ test_expect_success 'git ls-files -k to show killed files.' '
: path9 
touch path10 
pathx/ju/nk 
-   git ls-files -k .output
+   git -c core.ignorecase=true ls-files -k .output
 '
--8--
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Eric Sunshine
On Thu, Aug 22, 2013 at 7:15 PM, Eric Sunshine sunsh...@sunshineco.com wrote:
 On Thu, Aug 22, 2013 at 7:12 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 Status update: For the 'pathx' directory created by the t3010 test,
 directory_exists_in_index() returns false on OSX, but true is returned
 on Linux.

 Because a regular pathx/ju is in the index at that point, the
 correct answer directory_exists_in_index() should give for 'pathx'
 is index_directory, not index_nonexistent, I think.

 directory_exists_in_index() and directory_exists_in_index_icase() are
 behaving differently. You can replicate the problem on Linux by
 enabling core.ignorecase in the test (sans gmail whitespace damage):

 --8--
 diff --git a/t/t3010-ls-files-killed-modified.sh 
 b/t/t3010-ls-files-killed-modif
 index 3120efd..8c76160 100755
 --- a/t/t3010-ls-files-killed-modified.sh
 +++ b/t/t3010-ls-files-killed-modified.sh
 @@ -89,7 +89,7 @@ test_expect_success 'git ls-files -k to show killed files.' 
 '
 : path9 
 touch path10 
 pathx/ju/nk 
 -   git ls-files -k .output
 +   git -c core.ignorecase=true ls-files -k .output
  '
 --8--

I sent a patch [1] which resolves the problem, although the solution
is not especially pretty (due to some ugliness in the existing
implementation).

[1]: http://thread.gmane.org/gmane.comp.version-control.git/232796
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-22 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 I sent a patch [1] which resolves the problem, although the solution
 is not especially pretty (due to some ugliness in the existing
 implementation).

Yeah, thanks.

I tend to agree with you that fixing the icase callee not to rely
on having the trailing slash (which is looking past the end of the
given string), instead of working that breakage around on the
caller's side like your patch did, would be a better alternative,
though.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


t3010 broken by 2eac2a4

2013-08-21 Thread Brian Gernhardt
With 2eac2a4: ls-files -k: a directory only can be killed if the index has a 
non-directory applied, t3010 fails test 3 validate git ls-files -k output.  
It ends up missing the pathx/ju/nk file.

OS X 10.8.4
Xcode 4.6.3
clang Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn) 

~~ Brian Gernhardt

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-21 Thread Junio C Hamano
Brian Gernhardt br...@gernhardtsoftware.com writes:

 With 2eac2a4: ls-files -k: a directory only can be killed if the index has a 
 non-directory applied, t3010 fails test 3 validate git ls-files -k output. 
  It ends up missing the pathx/ju/nk file.

 OS X 10.8.4
 Xcode 4.6.3
 clang Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn) 

Very interesting, as it obviously does not reproduce for me.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html