https://bz.mercurial-scm.org/show_bug.cgi?id=6968
Bug ID: 6968
Summary: Unicode normalization in file names
Product: Mercurial
Version: 6.8.1
Hardware: PC
OS: NetBSD
Status: UNCONFIRMED
Severity: feature
Priority: wish
Component: Mercurial
Assignee: bugzi...@mercurial-scm.org
Reporter: mercurial-bugzi...@campbell.mumble.net
CC: mercurial-de...@mercurial-scm.org
Python Version: ---

I would like to make sure a repository is safe for use on multiple file
systems and multiple operating systems, including ufs/ffs and similar
Unix-oriented file systems (bag of bytes), zfs with utf8only, Apple HFS+,
Apple APFS, and others. I am willing to accept some constraints, set some
configuration options, and install hooks that enforce rules.

Here is how I think I would like it to work, but I haven't tested yet:

1. The repository stores only files with paths that are valid UTF-8
   strings in NFC, internally.

2. When hg operates on a file in the file system, it uses NFC paths.

3. When hg lists directories to discover new files, it normalizes them
   into NFC (and rejects/ignores files whose names have invalid UTF-8).

4. No tree can have two paths that are equivalent modulo normalization
   and case.

I reviewed https://wiki.mercurial-scm.org/EncodingStrategy and I'm not
sure it addresses how to achieve this.

I believe the git option core.precomposeUnicode=true will do (3):
https://git-scm.com/docs/git-config/2.47.1#Documentation/git-config.txt-coreprecomposeUnicode

Some constraints that may make this simpler than a grand unified theory
of pathname encoding questions:

- The makefile issue is not relevant at present -- non-ASCII file names
  won't appear in makefiles.

- Users will use a central server to enforce rules on changesets when
  pushing.

- Users can be asked to use particular .hg/hgrc configuration and hooks
  (though ideally it would be just a .hg/hgrc config line).
- I can rewrite the complete existing history for now (though that will
  no longer be an option once the flag day of conversion comes, so I want
  careful -- and thoroughly tested -- input validation to make sure this
  doesn't become a problem in the future).

So: Are there any existing hg config options I can enable for this, or
for a similar goal that prior experience suggests is better?
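
As an illustration of rules 1 and 4 above, the kind of validation a
server-side hook might perform could look roughly like the sketch below.
It is plain Python using only the standard unicodedata module and does not
call any Mercurial API. The function name check_paths and its
return format are hypothetical, and the collision key (NFD, then casefold,
then NFD again) is only an assumed approximation of how case-insensitive,
normalizing file systems compare names.

```python
import unicodedata


def check_paths(paths):
    """Check an iterable of byte-string paths (hypothetical helper).

    Returns a list of (path, reason) tuples, one per violation:
    invalid UTF-8, not in NFC, or colliding with an earlier path
    modulo normalization and case.
    """
    problems = []
    seen = {}  # collision key -> first path that produced it

    for raw in paths:
        # Rule 1a: the path must decode as strict UTF-8.
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            problems.append((raw, "invalid UTF-8"))
            continue

        # Rule 1b: the path must already be in NFC.
        if unicodedata.normalize("NFC", text) != text:
            problems.append((raw, "not NFC"))

        # Rule 4: no two paths may be equal after normalization and
        # case folding. This key is an assumption, not what any
        # particular file system is guaranteed to do.
        folded = unicodedata.normalize("NFD", text).casefold()
        key = unicodedata.normalize("NFD", folded)
        if key in seen:
            problems.append((raw, "collides with %r" % (seen[key],)))
        else:
            seen[key] = raw

    return problems
```

In a real deployment something like this would presumably be fed the file
list of each incoming changeset from a hook on the central server (for
example a pretxnchangegroup hook); that wiring is not shown here.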