Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-05-07 Thread Torsten Bögershausen
On 2014-05-05 23.46, Jeff King wrote: [] 2. Do all index filename comparisons under Mac OS X using a UTF-8 aware comparison function regardless if core.precomposeunicode is set. This would probably have bad performance, and somewhat defeats the point of converting the

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-05-06 Thread Erik Faye-Lund
On Mon, May 5, 2014 at 11:46 PM, Jeff King p...@peff.net wrote: On Sun, May 04, 2014 at 08:13:15AM +0200, Torsten Bögershausen wrote: 1. Tell everyone that NFD in the git repo is wrong, and they should make a new commit to normalize all their in-repo files to be precomposed.

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-05-05 Thread Jeff King
On Sun, May 04, 2014 at 08:13:15AM +0200, Torsten Bögershausen wrote: 1. Tell everyone that NFD in the git repo is wrong, and they should make a new commit to normalize all their in-repo files to be precomposed. This is probably not the right thing to do, because it

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-05-04 Thread Torsten Bögershausen
(Sorry for the late reply, I'm handicapped by some local IT problems) Peff, do you know if the fix below helps? On 2014-04-28 18.16, Jeff King wrote: If you have existing decomposed filenames in your git repository (e.g., that were created with older versions of git that did not precompose

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-05-04 Thread Torsten Bögershausen
On 2014-04-30 16.57, Torsten Bögershausen wrote: There is something wrong with the patch (test suite hangs or TC fail), so I need to come back later. Sorry for the noise.

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-30 Thread Torsten Bögershausen
On 29.04.14 20:02, Jeff King wrote: On Tue, Apr 29, 2014 at 10:12:52AM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: This patch just adds a test to demonstrate the breakage. Some possible fixes are: 1. Tell everyone that NFD in the git repo is wrong, and they should

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-29 Thread Torsten Bögershausen
On 04/29/2014 05:23 AM, Jeff King wrote: On Mon, Apr 28, 2014 at 10:49:30PM +0200, Torsten Bögershausen wrote: OK, thanks for the description. In theory we can make Git ignore composition by changing index_file_exists() in name-hash.c. (Both names must be precomposed first and then compared.)
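
The "precompose both names, then compare" idea in the message above can be sketched in a few lines. This is only an illustration in Python, not git's index_file_exists(); the helper name same_name_ignoring_composition is hypothetical, and NFC is assumed to be the precomposed form in question.

```python
import unicodedata


def same_name_ignoring_composition(a: bytes, b: bytes) -> bool:
    """Compare two UTF-8 encoded paths, treating NFC and NFD spellings as equal.

    Hypothetical helper for illustration only.
    """
    # Fast path: byte-identical names need no normalization at all.
    if a == b:
        return True
    try:
        ua, ub = a.decode("utf-8"), b.decode("utf-8")
    except UnicodeDecodeError:
        # Names that are not valid UTF-8 can only match byte-for-byte.
        return False
    # Precompose both sides first, then compare.
    return unicodedata.normalize("NFC", ua) == unicodedata.normalize("NFC", ub)
```

The fast path matters because, as the thread notes, decomposed entries are expected to be the minority, so most comparisons never pay for normalization.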

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-29 Thread Junio C Hamano
Jeff King p...@peff.net writes: This patch just adds a test to demonstrate the breakage. Some possible fixes are: 1. Tell everyone that NFD in the git repo is wrong, and they should make a new commit to normalize all their in-repo files to be precomposed. This is probably

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-29 Thread Jeff King
On Tue, Apr 29, 2014 at 10:12:52AM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: This patch just adds a test to demonstrate the breakage. Some possible fixes are: 1. Tell everyone that NFD in the git repo is wrong, and they should make a new commit to normalize

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-29 Thread Junio C Hamano
Jeff King p...@peff.net writes: I don't think we have a str_utf8_cmp that ignores normalizations (or maybe strcoll will do this?). But in theory we could use it everywhere we use strcasecmp for ignore_case. And then we would not need to have our readdir wrapper, maybe? I admit I haven't

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-29 Thread Jeff King
On Tue, Apr 29, 2014 at 11:49:30AM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: I don't think we have a str_utf8_cmp that ignores normalizations (or maybe strcoll will do this?). But in theory we could use it everywhere we use strcasecmp for ignore_case. And then we would

[PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Jeff King
If you have existing decomposed filenames in your git repository (e.g., that were created with older versions of git that did not precompose unicode), a modern git with core.precomposeunicode set does not handle them well. The problem is that we normalize the paths coming from the disk into their
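
A minimal sketch of the mismatch described above, in Python rather than git's C. The filename and the set standing in for the index are made up for illustration; the only assumption is that the on-disk name is converted to NFC before lookup while the stored entry is still NFD.

```python
import unicodedata

# Hypothetical filename, written with escapes so the form is unambiguous.
name_nfc = "\u00c4.txt"      # precomposed: U+00C4
name_nfd = "A\u0308.txt"     # decomposed: U+0041 followed by U+0308

# Stand-in for the index: an older git stored the decomposed form.
index_entries = {name_nfd.encode("utf-8")}

# With precomposition enabled, the name coming from disk is converted
# to NFC before it is looked up.
from_disk = unicodedata.normalize("NFC", name_nfd).encode("utf-8")

# The two spellings render identically but are different byte strings,
# so a bytewise lookup of the precomposed name misses the stored entry.
bytes_equal = name_nfc.encode("utf-8") == name_nfd.encode("utf-8")
found_in_index = from_disk in index_entries
```

Both bytes_equal and found_in_index come out False here, which is the kind of miss the test in this patch is meant to expose.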

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Junio C Hamano
Jeff King p...@peff.net writes: This patch just adds a test to demonstrate the breakage. Some possible fixes are: ... 2. Do all index filename comparisons using a UTF-8 aware comparison function when core.precomposeunicode is set. This would probably have bad performance, and

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Jeff King
On Mon, Apr 28, 2014 at 12:17:28PM -0700, Junio C Hamano wrote: 3. Convert index filenames to their precomposed form when we read the index from disk. This would be efficient, but we would have to be careful not to write the precomposed forms back out to disk. I think
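
Option 3 above (normalize when reading, but never write the normalized form back) can be sketched as a lookup key kept alongside the original bytes. This is a rough Python illustration under that assumption; IndexSketch and its methods are hypothetical and do not mirror git's cache_entry layout.

```python
import unicodedata


class IndexSketch:
    """Toy index: looks names up by NFC key, keeps the stored bytes untouched."""

    def __init__(self, stored_names):
        # Map precomposed key -> original on-disk index bytes.
        self._by_key = {self._key(n): n for n in stored_names}

    @staticmethod
    def _key(name: bytes) -> bytes:
        # The normalized form is used only for comparison.
        text = name.decode("utf-8")
        return unicodedata.normalize("NFC", text).encode("utf-8")

    def lookup(self, name: bytes):
        # Returns the entry exactly as stored, or None if absent.
        return self._by_key.get(self._key(name))

    def names_to_write(self):
        # Writing back uses the original bytes, never the normalized key.
        return sorted(self._by_key.values())
```

Keeping the original bytes separate from the key is what avoids silently rewriting decomposed names the next time the index is saved.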

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Torsten Bögershausen
On 28.04.14 21:35, Jeff King wrote: On Mon, Apr 28, 2014 at 12:17:28PM -0700, Junio C Hamano wrote: 3. Convert index filenames to their precomposed form when we read the index from disk. This would be efficient, but we would have to be careful not to write the precomposed

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Jeff King
On Mon, Apr 28, 2014 at 09:52:07PM +0200, Torsten Bögershausen wrote: To my knowledge repos with decomposed unicode should be rare in practice. I can only speak for European (or Latin-based) or Cyrillic languages myself: I've run across several cases in the past few months, but only just

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Torsten Bögershausen
On 2014-04-28 22.03, Jeff King wrote: On Mon, Apr 28, 2014 at 09:52:07PM +0200, Torsten Bögershausen wrote: To my knowledge repos with decomposed unicode should be rare in practice. I can only speak for European (or Latin-based) or Cyrillic languages myself: I've run across several cases in

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Jeff King
On Mon, Apr 28, 2014 at 03:35:02PM -0400, Jeff King wrote: Since such entries are in the minority, and because cache_entry is already a variable-length struct, I think you could get away with sticking it after the name field, and then comparing like: const char *ce_normalized_name(struct

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Jeff King
On Mon, Apr 28, 2014 at 10:49:30PM +0200, Torsten Bögershausen wrote: OK, thanks for the description. In theory we can make Git ignore composition by changing index_file_exists() in name-hash.c. (Both names must be precomposed first and then compared.) Yeah, we could perhaps get away