Interesting that there appears to be no setting that makes all names
"normalized on-disk" (the way MacOS does it).  I vaguely recall that
being discussed (long ago) but I guess that was never implemented.


On Sun, Nov 15, 2015 at 2:46 AM, Yuri Pankov <yuri.pan...@nexenta.com> wrote:
> On Sun, 15 Nov 2015 01:10:43 -0600, Richard Laager wrote:
>>
>> On 11/14/2015 09:34 PM, Yuri Pankov wrote:
>>>
>>> So we are NOT normalization-insensitive by default, and treat filenames
>>> as just byte sequences with normalization=none?
>>
>>
>> ZFS *is* normalization-insensitive (treats filenames as opaque byte
>> sequences) by default, because the default is normalization=none.
>
>
> I'd call it normalization-sensitive: two Unicode-equivalent strings
> normalized to different forms are treated as different. Compare with
> casesensitivity, where the "insensitive" setting treats strings as the
> same regardless of case.
>
>>>> normalize on lookup.
>>>
>>>
>>> That was my question - WHAT do we normalize?  E.g., we have a filename
>>> stored in NFC, and there's a request to delete the same filename, but in
>>> NFD. Do we normalize the stored filename? Do we normalize the one in the
>>> request? Do we normalize both (in which case having different
>>> normalization values would make no sense)?
>>
>>
>> The two strings are normalized before doing the comparison.
>
>
> OK.
>
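
To make the "normalize both, then compare" answer concrete, here is a
minimal sketch in Python. It uses the standard unicodedata module, not
the ZFS code; lookup() and the in-memory directory list are hypothetical
names made up for illustration. The stored name is only normalized for
the comparison and is returned in whatever form it was created with.

```python
import unicodedata

def lookup(directory, requested, form="NFD"):
    """Return the stored name that matches `requested`, or None.

    Both names are normalized to `form` only for the comparison;
    the stored name itself is never rewritten.
    """
    wanted = unicodedata.normalize(form, requested)
    for stored in directory:
        if unicodedata.normalize(form, stored) == wanted:
            return stored
    return None

nfc_name = "caf\u00e9.txt"    # e-acute as a single code point (NFC)
nfd_name = "cafe\u0301.txt"   # "e" followed by a combining acute (NFD)

directory = [nfc_name]              # the file was created with the NFC spelling
found = lookup(directory, nfd_name) # the request arrives with the NFD spelling
```

Here the NFD request finds the entry, and what comes back is still the
NFC spelling that was stored.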
>>> Given the above answers, what is the difference between formC and formD
>>> setting then?
>>
>>
>> In terms of observable effects on the lookup, they should be
>> functionally equivalent. formC requires an extra step, so it may be
>> slightly slower.
>
>
> OK.
>
>> Honestly, I'm not sure why formC and formKC exist as choices.
>
>
> So my understanding now is:
>
> - none - normalization-sensitive, names are compared as byte sequences.
> - formD - normalization-insensitive, both names are normalized to NFD (fully
> decomposed) and then compared as byte sequences. Should be used e.g. for
> interoperability between Windows (which doesn't do any normalization on
> filenames) and Mac OS X (which does normalize filenames and lookups to NFD).
> - formC - normalization-insensitive, both names are normalized to NFC
> (decomposed as in NFD, then canonically recomposed). Not sure when/why it
> should be used instead of NFD. Are there any strings that are equivalent
> when normalized to NFD and are NOT equivalent when normalized to NFC?
>
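
On the last question: as far as the Unicode definitions go, NFD and NFC
both characterize canonical equivalence, so two strings have the same
NFD exactly when they have the same NFC. A rough way to poke at that,
again with Python's unicodedata rather than the ZFS code (same_name()
and the sample list are hypothetical, only the property values "none",
"formC" and "formD" come from this thread):

```python
import unicodedata

def same_name(a, b, normalization="none"):
    """Compare two names the way each setting is described in this thread."""
    if normalization == "none":
        # Plain byte-sequence comparison.
        return a.encode("utf-8") == b.encode("utf-8")
    form = {"formC": "NFC", "formD": "NFD"}[normalization]
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

samples = [
    "\u00c5",    # A with ring above, precomposed
    "A\u030a",   # "A" followed by a combining ring above
    "\u212b",    # ANGSTROM SIGN, canonically equivalent to the two above
    "A",
    "\ufb01",    # "fi" ligature, only compatibility-equivalent to "fi"
    "fi",
]

# Pairs on which formC and formD would give different answers.
disagreements = [
    (a, b)
    for a in samples
    for b in samples
    if same_name(a, b, "formC") != same_name(a, b, "formD")
]
```

On these samples the list of disagreements stays empty. The ligature
pair is a case where the K forms would behave differently, since neither
NFC nor NFD touches compatibility characters.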
> Note that I'm trying to be end-user here, not looking into any
> implementation details or the code.
_______________________________________________
developer mailing list
developer@open-zfs.org
http://lists.open-zfs.org/mailman/listinfo/developer
