2012/2/6 Stefan Sperling <s...@elego.de>:
> On Mon, Feb 06, 2012 at 02:28:40PM +0100, Branko Čibej wrote:
>> On 06.02.2012 14:10, Hiroaki Nakamura wrote:
>> > Hi, all.
>> >
>> > It seems there is no further discussion.
>> >
>> > I think the conclusion for the short term solution is:
>> > We convert unnormalized paths to NFC normalized paths on clients only,
>> > that is, svn_path_cstring_to_utf8.
>> >
>> > It is the same approach as utf8precompose_macosx_2.patch in
>> > http://subversion.tigris.org/issues/show_bug.cgi?id=2464
>> >
>> > It is proven to work as it is included in MacPorts unicode_path variant
>> > and Homebrew --unicode-path option.
>>
>> You'll note that MacPorts also warns you that using this option may
>> cause interoperability issues with other clients that aren't using it,
>> right? So this is hardly a universal solution that will not affect
>> existing users and repositories.
>
> Exactly. This is what I meant when I said that we cannot apply the
> submitted patch as it is, at the very beginning of this thread.
> The submitted patch simply copies the MacPorts solution and has
> the same compatibility problems.
>
> I think the discussion made clear that there are two ways
> to move forward:
>
> 1) Implement a client-side mapping table which maps server-provided
> paths to local filesystem paths. It translates between one or more
> server-side and local representations of the same path. This could
> be done only on Mac OS X (or, preferrably, only on HFS+ filesystems)
> because only Mac OS X has problems.
> The idea here is to not change existing paths in repositories at all,
> no matter which way they are encoded, and to teach Mac OS X clients
> to cope with the problem locally. This way, other existing clients
> won't notice a difference. The only thing that won't work is to create
> a working copy on Mac OS X which contains the same name multiple times,
> in NFD and in some other normalised or non-normalised form.
> This approach was suggested by Peter.
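
To make option 1 a bit more concrete, here is a rough sketch of what such
a mapping table might look like. It is only an illustration in Python, not
Subversion code: the PathMap class and its method names are hypothetical,
NFC is assumed as the lookup key, and standard Unicode NFD is used as a
stand-in for the decomposition that HFS+ actually applies (which is not
exactly the same).

```python
import unicodedata


class PathMap:
    """Hypothetical client-side table mapping local names to server names."""

    def __init__(self):
        # Lookup key (NFC form) -> path exactly as the server provided it.
        self._server_by_key = {}

    @staticmethod
    def _key(path):
        # Assumption for this sketch: NFC serves as the canonical lookup key.
        return unicodedata.normalize("NFC", path)

    @staticmethod
    def to_local(server_path):
        # Assumption: plain NFD approximates what an HFS+ volume would store.
        return unicodedata.normalize("NFD", server_path)

    def add(self, server_path):
        """Record a server-provided path and return its local form."""
        key = self._key(server_path)
        existing = self._server_by_key.get(key)
        if existing is not None and existing != server_path:
            # Same name in two different forms: the case that cannot be
            # represented in a working copy on Mac OS X.
            raise ValueError(
                "canonically equivalent paths: %r and %r" % (existing, server_path)
            )
        self._server_by_key[key] = server_path
        return self.to_local(server_path)

    def to_server(self, local_path):
        """Translate a local path back to the form the server knows."""
        return self._server_by_key.get(self._key(local_path), local_path)
```

The point of the sketch is that the repository path is never rewritten:
the client only remembers which form the server used and translates at
the filesystem boundary.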

The Unicode Standard says canonically equivalent sequences should be
interpreted the same way.

* 1.1 Canonical and Compatibility Equivalence
  http://unicode.org/reports/tr15/#Canonical_Equivalence
* 2.12 Equivalent Sequences and Normalization
  http://www.unicode.org/versions/Unicode6.0.0/ch02.pdf

So we should not have the same name multiple times in repositories and
working copies. Therefore Subversion servers and clients do not need to
handle them. Rather, I think we should fix Subversion to reject a name
that already exists in a different form.

To handle existing repositories and working copies, maybe we should
create a tool which checks whether repositories and working copies
contain the same name multiple times. If they do, users must rename the
files manually. In practice, I think this is extremely rare.

> We'd need either a working patch or a more detailed implementation
> design document to move forward here.

OK. Peter, or somebody else, please give us either one of them.

>
> 2) Do something else that effects repositories, too, and provide
> a clean upgrade path for everyone (servers and clients).
> AFAIK nobody has made a suggestion as to what could be done here.

What do you mean by a clean upgrade? Is it clean if we do dump and
load for repositories and re-checkout for working copies?

-- 
)Hiroaki Nakamura) hnaka...@gmail.com
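
P.S. The check proposed above for existing repositories and working
copies could start from something like the following. It is a minimal
sketch in Python, not an existing Subversion tool: the function name
find_normalization_collisions is hypothetical, and it assumes the caller
passes in the entry names of a single directory.

```python
import unicodedata
from collections import defaultdict


def find_normalization_collisions(names):
    """Group directory entries that are canonically equivalent.

    Returns a dict mapping the NFC form to the sorted list of distinct
    spellings found, for every name that appears in more than one form.
    """
    groups = defaultdict(set)
    for name in names:
        # Canonically equivalent names share the same NFC form.
        groups[unicodedata.normalize("NFC", name)].add(name)
    return {
        key: sorted(forms)
        for key, forms in groups.items()
        if len(forms) > 1
    }
```

An empty result means the directory has no names that differ only in
normalization form; anything reported would have to be renamed by hand.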