https://bz.mercurial-scm.org/show_bug.cgi?id=6970

            Bug ID: 6970
           Summary: case-folding collision warning misses cases
           Product: Mercurial
           Version: 6.8.1
          Hardware: PC
                OS: NetBSD
            Status: UNCONFIRMED
          Severity: feature
          Priority: wish
         Component: Mercurial
          Assignee: bugzi...@mercurial-scm.org
          Reporter: mercurial-bugzi...@campbell.mumble.net
                CC: mercurial-de...@mercurial-scm.org
    Python Version: ---

On macOS, the APFS file system (roughly) treats paths as equivalent modulo
case-folding.  Thus, the path `hello' is equivalent to the path `HELLO'; «
éclair » is equivalent to « ÉCLAIR »; and „Nußschnecke”, „NUSSSCHNECKE”, and
„nussschnecke” are all equivalent.
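
To make `roughly' concrete, here is a minimal sketch of the equivalence in
Python.  It approximates the file system's rule with Unicode canonical
caseless matching (normalize, case-fold, normalize again); the helper names
fold_key and equivalent are made up for illustration, and APFS's real rules
may differ in the details.

```python
import unicodedata


def fold_key(name):
    """Comparison key: NFD, then full case folding, then NFD again.

    This is an approximation for illustration, not APFS's exact algorithm.
    """
    decomposed = unicodedata.normalize('NFD', name)
    return unicodedata.normalize('NFD', decomposed.casefold())


def equivalent(a, b):
    """True if the two names compare equal under fold_key."""
    return fold_key(a) == fold_key(b)


# The three examples from the paragraph above.
print(equivalent('hello', 'HELLO'))
print(equivalent('éclair', 'ÉCLAIR'))
print(equivalent('Nußschnecke', 'NUSSSCHNECKE'))
print(equivalent('Nußschnecke', 'nussschnecke'))
```

The normalization step is there so that precomposed and decomposed spellings
of `é' compare equal as well; plain str.casefold() alone would not do that.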

Mercurial has logic to warn when adding paths that might collide like this --
but while the warning message says `case-folding collision', what it actually
checks for is equivalence modulo _lowercasing_, not modulo _case-folding_:

class casecollisionauditor:
...
        fl = encoding.lower(f)
        if fl in self._loweredfiles and f not in self._dirstate:
            msg = _(b'possible case-folding collision for %s') % f
            if self._abort:
                raise error.StateError(msg)
            self._ui.warn(_(b"warning: %s\n") % msg)
        self._loweredfiles.add(fl)


https://repo.mercurial-scm.org/hg/file/af86422250b2/mercurial/scmutil.py#l354

Thus, while Mercurial warns of a possible collision between „Nußschnecke” and
„nußschnecke”, it fails to detect a possible collision between „Nußschnecke”
and either „NUSSSCHNECKE” or „nussschnecke”, because:

>>> 'Nußschnecke'.lower()
'nußschnecke'
>>> 'Nußschnecke'.casefold()
'nussschnecke'
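
The difference can be seen in a toy model of the auditor.  This is only a
sketch, not Mercurial's code: the class CollisionAuditor and its `key'
parameter are hypothetical, and it works on str rather than on the bytes
paths Mercurial uses internally.

```python
class CollisionAuditor:
    """Toy stand-in for a case-collision check (hypothetical, not hg's class)."""

    def __init__(self, key):
        self._key = key    # function mapping a path to its comparison key
        self._seen = {}    # comparison key -> first path seen with that key

    def check(self, path):
        """Return the earlier path that `path` may collide with, or None."""
        k = self._key(path)
        earlier = self._seen.get(k)
        if earlier is not None and earlier != path:
            return earlier
        self._seen.setdefault(k, path)
        return None


by_lower = CollisionAuditor(str.lower)     # compares modulo lowercasing
by_fold = CollisionAuditor(str.casefold)   # compares modulo case-folding
for auditor in (by_lower, by_fold):
    auditor.check('Nußschnecke')

print(by_lower.check('NUSSSCHNECKE'))  # None: lowercasing misses it
print(by_fold.check('NUSSSCHNECKE'))   # Nußschnecke: case-folding catches it
```

With str.lower as the key the two names get different keys, so nothing is
reported; with str.casefold they share a key and the earlier name is returned.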

I'm guessing the casecollisionauditor logic was written before APFS, when the
dominant file system in the Apple world was HFS+, which does use equivalence
modulo lowercasing rather than equivalence modulo case-folding.  (The story is
a little more complicated; see
https://web.archive.org/web/20220610125250/https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/FAQ/FAQ.html
for the gruesome details.)

The following test case demonstrates this:

#!/usr/bin/env cram

  $ export HGRCPATH="$(pwd)/.hgrc"
  $ cat <<EOF >$HGRCPATH
  > [ui]
  > username = al...@example.com
  > interactive = no
  > quiet = yes
  > verbose = no
  > EOF
  $ hg init repo

hg successfully warns of possible collision between éclair and ÉCLAIR:

  $ echo chocolate >repo/éclair
  $ hg -R repo add
  $ hg -R repo commit -d '1 0' -m foo
  $ rm -f repo/éclair
  $ echo CHOCOLATE >repo/ÉCLAIR
  $ hg -R repo add
  warning: possible case-folding collision for \xc3\x89CLAIR (esc)
  $ hg -R repo commit -d '2 0' -m bar

But hg fails to warn about Nußschnecke and NUSSSCHNECKE:

  $ echo om nom nom >repo/Nußschnecke
  $ hg -R repo add
  $ hg -R repo commit -d '3 0' -m baz
  $ rm -f repo/Nußschnecke
  $ echo NOM NOM >repo/NUSSSCHNECKE
  $ hg -R repo add
  $ hg -R repo commit -d '4 0' -m quux
