I've granted you write access to the wiki.

Thomas Åkesson wrote on Tue, Feb 14, 2012 at 12:36:23 +0100:
> Thanks Julian and Markus for providing feedback.
>
> I am not commenting below because all the feedback is very good and I will
> try to address it as best I can in the next iteration. Describing the
> behaviour changes to the WC is the most challenging since I lack that kind of
> detailed knowledge. I will instead try to draft the structure of that section
> to make it easier for someone with that level of detail to assist.
>
> Regarding use cases, what can I say... it was towards the end of a long
> stretch.
>
> I think it would help with the upcoming iterations if I could move this
> "document" into the wiki. If you find that this first draft shows promise,
> please consider granting edit access in the wiki. My user name is "Thomas
> Åkesson", which exercises the Unicode awareness of MoinMoin...
>
> /Thomas Å.
>
>
> On 14 feb 2012, at 11:25, Julian Foad wrote:
>
> > Hi Thomas. It's fantastic that you're taking the trouble to write up this
> > proposal. That's just what we need. Just a few initial comments below...
> >
> > Thomas Åkesson wrote:
> >
> >> Context
> >> ===
> >>
> >> [...] A unicode string (e.g. a file name) can be represented
> >> in 2 normalized forms (NFC/NFD) or mixed, i.e. multiple such
> >> characters where some are composed and others decomposed (rare).
> >
> >
> > What's "rare"? We have to assume that input is in mixed composition in any
> > system that doesn't explicitly normalize it, which (I think) includes most
> > operating systems. While it may be rare for any single string to contain
> > characters in both compositions, it is very common to be processing a
> > string that *might* have characters in both compositions -- in other words,
> > that is not guaranteed to be normalized. I think it would be clearer to
> > drop the "(rare)" and just say "... normalized forms (NFC/NFD) or mixed
> > (not normalized).".
> >
> >
> >> A minority of file systems (currently Mac OS X HFS+ only) will
> >> normalize the paths. In the case of HFS+, the path will be
> >> normalized into NFD and it will even be given back that way when
> >> listing the filesystem.
> >
> >
> > Drop the word "even"? The statement is not surprising.
> >
> >
> > [...]
> >
> >> Similarities to case-sensitivity
> >> ===
> >>
> >> - If two Unicode strings differ only by letter case/composition,
> >
> > Drop "/composition" -- it's the subject of the following sentence.
> >
> >> on some computer systems they refer to the same file, while on
> >> other systems they refer to different files. The same applies
> >> if two Unicode strings differ only by composition.
> >
> >
> >> [...]
> >
> >> Client Changes
> >> ===
> >>
> >> [...] An abstraction between the repository path and the file
> >> system path can be achieved by ensuring that there is a column
> >> in wc.db that contains the file system path in exactly the same
> >> form that the file system gives back. APIs in wc needs to be
> >> extended to ensure that all interaction with the file system is
> >> performed with the file system path.
> >
> > [...]
> >
> > This part seems to be the heart of the whole proposal. You describe the
> > data that we need, but the behaviour will also need to be described in
> > detail. Presumably much of the behaviour is boring and obvious (when we
> > check out a new path and create it on disk, we store the disk path), but
> > I'm sure there will be some less obvious parts (do we need to find out what
> > the disk path of an 'excluded' node would be, even though we're not
> > actually creating it on disk, for example).
> >
> >
> >> Use Cases
> >> ===
> >>
> >> This change will only affect use cases which rely on creating
> >> paths that look like duplicates but use different unicode
> >> composition. It is highly unlikely anyone is relying on this..
> >
> >
> > Uh... it sounds like you are saying there are no interesting use cases for
> > this proposal! No, on the contrary, this proposal also affects checking
> > out and using a WC on Mac HFS+ where the repository paths were created on
> > another system and are not in NFD, and it allows that case to work. That's
> > the more interesting use case, is it not? It's definitely worth writing
> > out the interesting case in full, including steps like checkout (or update)
> > that brings in a non-NFD path, create a new file on the Mac, and commit.
> >
> > - Julian
> >
>
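
For anyone following the thread who wants to see the NFC/NFD distinction concretely, here is a minimal Python sketch. It is only an illustration of Unicode composition using the standard `unicodedata` module, not Subversion code; the helper name `same_name` and the sample strings are made up for this example.

```python
import unicodedata

# The same visible name in two compositions (sample strings are illustrative).
composed = "\u00c5kesson"      # "Å" as one precomposed code point (NFC form)
decomposed = "A\u030akesson"   # "A" followed by COMBINING RING ABOVE (NFD form)


def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def is_in_form(form: str, s: str) -> bool:
    """Return True if s is already in the given normalization form."""
    return unicodedata.normalize(form, s) == s


# A string holding both compositions is in neither normalized form ("mixed").
mixed = composed + "/" + decomposed

print(composed == decomposed)            # code-point comparison of the raw strings
print(same_name(composed, decomposed))   # comparison after normalization
print(is_in_form("NFC", mixed), is_in_form("NFD", mixed))
```

A file system that normalizes to NFD, as described for HFS+ in the quoted proposal, would hand back the `decomposed` form even when the `composed` form was used to create the path, which is why a plain string comparison against the repository path is not enough.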