On Fri, 27 Jun 2014, Jan Cholasta wrote:
>On 27.6.2014 10:58, Alexander Bokovoy wrote:
>>On Fri, 27 Jun 2014, Jan Cholasta wrote:
>>>On 27.6.2014 10:29, Alexander Bokovoy wrote:
>>>>On Fri, 27 Jun 2014, Jan Cholasta wrote:
>>>>>On 27.6.2014 10:15, Alexander Bokovoy wrote:
>>>>>>On Fri, 20 Jun 2014, Martin Basti wrote:
>>>>>>>On Fri, 2014-06-20 at 10:32 +0200, Jan Cholasta wrote:
>>>>>>>>On 18.6.2014 16:49, Martin Basti wrote:
>>>>>>>>>Due to compability with older versions, only IDNA domains should be
>>>>>>>>>checked
>>>>>>>>>Patch attached.
>>>>>>>>
>>>>>>>>I'm not particularly happy about the u'\xdf' special case. Isn't
>>>>>>>>there a
>>>>>>>>better way to do this check?
>>>>>>>I cant find better way. u'\xdf' is mapped to ss, and ss is not IDN
>>>>>>>string.
>>>>>>>
>>>>>>>Or just remove this validation.
>>>>>>>
>>>>>>>>(BTW I really think this should be a warning, not an error, but that
>>>>>>>>would require larger amount of work, so I guess it's OK for now.)
>>>>>>>(More pain than gain)
>>>>>>Main thing in this patch is that the check should not be done against
>>>>>>non-IDN strings. I want this version of the patch to go in for that
>>>>>>reason as currently you cannot even complete ipa-adtrust-install
>>>>>>run due
>>>>>>to IDN normalisation check being applied to non-IDN domains.
>>>>>
>>>>>On non-IDN domains, the only effect of IDN normalization is that it
>>>>>lower-cases the names (right?), so the check should compare
>>>>>lower-cased original name with the normalized name, instead of
>>>>>special-casing certain characters etc.
>>>>.. what's the reason to do such comparison then? lower-cased non-IDN
>>>>name will be equal to lower-cased normalized non-IDN name by definition,
>>>>so the check is not needed in this case, at all.
>>>
>>>The point is that it works for both IDN and non-IDN, without
>>>u'\xdf'-style hacks.
>>No, your proposal of comparing low-cased value and normalized value is
>>not going to work because low-cased value is in general not equal to
>>normalized value for IDN names, only for non-IDN ones, due to the fact
>>that lower case for non-ASCII Unicode character may map to a completely
>>different character than in normalization situation. Take, for example,
>>Turkish alphabet where there are six letters with different case rules
>>(uppercase dotted i, dottless lowercase i, upper- and lowercase G with
>>breve accent, and upper- and lowercase S with cedilla), which will break
>>your generalized check.
>>So you'll anyway will need to split these cases.
>>
>
>I see.
>
>I'm still not comfortable with carrying the bit of knowledge about
>u'\xdf' in this particular spot. Can we check that a name is IDN some
>other way than "domain_name.is_idn() or u'\xdf' in value"?
Attached patch works for me. Note that nameprep() use should be
per-label, not the whole DNS name, and we can use
encodings.idna.ToASCII() per lable to check if the label is IDN. If any
of the label is IDN, we force strict check.