Interesting that there appears to be no setting that makes all names "normalized on-disk" (the way MacOS does it). I vaguely recall that being discussed (long ago) but I guess that was never implemented.
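
For anyone following along, here is a rough sketch of the lookup behaviour described in the thread quoted below: normalize both names, then compare them as byte sequences. This is not ZFS code. `names_match` is a made-up helper, and Python's `unicodedata` module only stands in for whatever the kernel does internally. As far as I understand Unicode canonical equivalence, two strings have the same NFD form exactly when they have the same NFC form, which would explain why formC and formD look the same from the outside.

```python
import unicodedata
from typing import Optional


def names_match(a: str, b: str, form: Optional[str]) -> bool:
    """Hypothetical helper: normalize both names, then compare bytes.

    form=None models normalization=none (plain byte comparison);
    "NFD" and "NFC" model formD and formC respectively.
    """
    if form is not None:
        a = unicodedata.normalize(form, a)
        b = unicodedata.normalize(form, b)
    return a.encode("utf-8") == b.encode("utf-8")


# The same visible name in two normalization forms.
nfc_name = "caf\u00e9"    # precomposed "e with acute" (NFC)
nfd_name = "cafe\u0301"   # "e" followed by combining acute (NFD)

print(names_match(nfc_name, nfd_name, None))    # False: the bytes differ
print(names_match(nfc_name, nfd_name, "NFD"))   # True
print(names_match(nfc_name, nfd_name, "NFC"))   # True
```

Note that nothing here rewrites the stored name. Only the comparison is normalized, which is the gap I was pointing at above.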

On Sun, Nov 15, 2015 at 2:46 AM, Yuri Pankov <yuri.pan...@nexenta.com> wrote:
> On Sun, 15 Nov 2015 01:10:43 -0600, Richard Laager wrote:
>>
>> On 11/14/2015 09:34 PM, Yuri Pankov wrote:
>>>
>>> So we are NOT normalization-insensitive by default, and treat filenames
>>> as just byte sequences with normalization=none?
>>
>> ZFS *is* normalization-insensitive (treats filenames as opaque byte
>> sequences) by default, because the default is normalization=none.
>
> I'd call it normalization-sensitive - when we treat 2 unicode-equivalent
> strings normalized to different forms as different. Compare to the
> casesensitivity - there with "insensitive" setting we treat strings as the
> same, no matter the case.
>
>>>> normalize on lookup.
>>>
>>> That was my question - WHAT do we normalize? eg, we have a filename
>>> stored in NFC, and there's a request to delete same filename, but in
>>> NFD, do we normalize stored filename? do we normalize the one in
>>> request? do we normalize both (that would make no sense to have
>>> different normalization values then)?
>>
>> The two strings are normalized before doing the comparison.
>
> OK.
>
>>> Given the above answers, what is the difference between formC and formD
>>> setting then?
>>
>> In terms of observable effects on the lookup, they should be
>> functionally equivalent. formC requires an extra step, so it may be
>> slightly slower.
>
> OK.
>
>> Honestly, I'm not sure why formC and formKC exist as choices.
>
> So my understanding now is:
>
> - none - normalization-sensitive, names are compared as byte sequences.
> - formD - normalization-insensitive, both names are normalized to NFD (fully
> decomposed), and then compared as byte sequences - should be used eg for
> interoperability between Windows (which doesn't do any normalization on
> filenames) and MacOS X (which does normalize filenames and lookups to NFD).
> - formC - normalization-insensitive, both names are normalized to NFC (NFD +
> fully composed), not sure when/why it should be used instead of NFD. Are
> there any strings that are equivalent when normalized to NFD and are NOT
> equivalent when normalized to NFC?
>
> Note that I'm trying to be end-user here, not looking into any
> implementation details or the code.

_______________________________________________
developer mailing list
developer@open-zfs.org
http://lists.open-zfs.org/mailman/listinfo/developer