On 6/25/18 1:35 PM, swchang10--- via dev-security-policy wrote:
> On Friday, August 11, 2017 at 6:54:22 AM UTC-7, Peter Bowen wrote:
>> On Thu, Aug 10, 2017 at 1:22 PM, Jonathan Rudenberg via
>> dev-security-policy <[email protected]> wrote:
>>> RFC 5280 section 7.2 and the associated IDNA RFC requires that 
>>> Internationalized Domain Names are normalized before encoding to punycode.
>>>
>>> Let’s Encrypt appears to have issued at least three certificates that have 
>>> at least one dnsName without the proper Unicode normalization applied.
>>>
>>> It’s also worth noting that RFC 3491 (referenced by RFC 5280 via RFC 3490) 
>>> requires normalization form KC, but RFC 5891 which replaces RFC 3491 
>>> requires normalization form C. I believe that the BRs and/or RFC 5280 
>>> should be updated to reference RFC 5890 and by extension RFC 5891 instead.
>>
>> I did some reading on Unicode normalization today, and it strongly
>> appears that any string that has been normalized to normalization form
>> KC is by definition also in normalization form C.  Normalization is
>> idempotent, so doing toNFKC(toNFKC()) will result in the same string
>> as just doing toNFKC() and toNFC(toNFC()) is the same as toNFC().
>> Additionally toNFKC is the same as toNFC(toK()).
>>
>> This means that checking that a string matches the result of
>> toNFC(string) is a valid check regardless of whether using the 349* or
>> 589* RFCs.  It does mean that Certlint will not catch strings that are
>> in NFC but not in NFKC.
>>
>> Thanks,
>> Peter
>>
>> P.S. I've yet to find a registered domain name not in NFC, and that
>> includes checking every name in the zone files for all ICANN gTLDs
>> and a few ccTLDs
> 
> Hi,
> I have an example international domain that is NFC but not NFKC, 
> "xn--ttt-8fa.pumesa.com" (this is a fake domain and my focus is on the 
> general pattern).
> The pattern that will cause a domain to be NFC but not NFKC in Golang is: 
> "xn--" followed by any same three letters followed by a single "-" followed 
> by any single digit number followed by "fa"; now I know this pattern doesn't 
> describe real unicode, however the behavior in the programming language is 
> curious (below).
> The pattern described above causes strings to be in NFC but not in NFKC in 
> Golang. I ran a few tests using Golang (version go1.10.3 darwin) and Java 
> (version "1.8.0_60"); here are the key parts of the code I used:
> 1) Golang (Used "ToUnicode" to mimic how Zlint tests):
> package main
> import (
> "fmt"
> "golang.org/x/net/idna"
> "golang.org/x/text/unicode/norm"
> )
> func main(){
>       str := "xn--xxx-7fa.pumesa.com"
>       punycode,err := idna.ToUnicode(str)
>       if err != nil {
>               fmt.Println(err)
>       }
>       fmt.Println("Is NFC ", norm.NFC.IsNormalString(punycode))
>       fmt.Println("Is NFKC ", norm.NFKC.IsNormalString(punycode))
> }
> 
> The last NFKC check is what causes Zlint to throw an error stating that the 
> Unicode is not in compliance. It seems that Zlint needs to be updated to 
> follow the latest BR (RFC 5891), meaning it should check whether the Unicode 
> in question is NFC compliant rather than NFKC compliant?
> 
> Below is something even more interesting.
> 
> 2) Java:
> import java.net.IDN;
> import java.text.Normalizer;
> public class Main{
> public static void main(String args[]){
>       String cn = "xn--www-0xx.pumesa.com";
>       String punycode = IDN.toASCII(cn);
>         //punycode = IDN.toUnicode(punycode);
>         System.out.println("is NFC " + Normalizer.isNormalized(punycode, 
> Normalizer.Form.NFC));
>       System.out.println("is NFKC " + Normalizer.isNormalized(punycode, 
> Normalizer.Form.NFKC));
> }
> }
> 
> Per the Oracle documentation, java.net.IDN.toASCII conforms to RFC 3490, and 
> it throws no error. This can be double-checked within the language by 
> converting the punycode back to Unicode. Both print statements return true.
> 
> So to reiterate, the two main questions are:
> 1) Should there be a discussion about why Oracle Java and Golang don't agree 
> on whether this pattern causes unicode to be NFKC compliant?
> The potential impact is that results obtained from a Java system may not be 
> Zlint compliant.
> 2) Should Zlint be updated to the latest BR (RFC 5891) regardless of question 
> #1?

Probably. However, please be aware that the change from RFC 3490 (IDNA)
to RFC 5891 (IDNA2008) involved more than just a change from Unicode
normalization form KC to Unicode normalization form C.

Also relevant:

https://tools.ietf.org/html/rfc8399
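
On the quoted xn--xxx-7fa example: decoding the label by hand suggests the
pattern does describe real Unicode. The rough sketch below is a stripped-down
RFC 3492 decoder using only the Go standard library. The names decodePunycode,
digitValue and adapt are made up for illustration, there is no overflow
checking, and this is not the code that x/net/idna or Zlint actually run.
Under those assumptions, "xxx-7fa" decodes to U+00B4 ACUTE ACCENT followed by
"xxx". U+00B4 has a compatibility decomposition (U+0020 U+0301) but no
canonical one, which would explain a string that is in NFC but not in NFKC.

```go
package main

import (
	"fmt"
	"strings"
)

// Punycode parameters from RFC 3492, section 5.
const (
	base        = 36
	tMin        = 1
	tMax        = 26
	skew        = 38
	damp        = 700
	initialBias = 72
	initialN    = 128
)

// digitValue maps a Punycode digit to its value: a-z = 0..25, 0-9 = 26..35.
func digitValue(c byte) (int, bool) {
	switch {
	case c >= 'a' && c <= 'z':
		return int(c - 'a'), true
	case c >= 'A' && c <= 'Z':
		return int(c - 'A'), true
	case c >= '0' && c <= '9':
		return int(c-'0') + 26, true
	}
	return 0, false
}

// adapt is the bias adaptation function from RFC 3492, section 6.1.
func adapt(delta, numPoints int, first bool) int {
	if first {
		delta /= damp
	} else {
		delta /= 2
	}
	delta += delta / numPoints
	k := 0
	for delta > ((base-tMin)*tMax)/2 {
		delta /= base - tMin
		k += base
	}
	return k + (base-tMin+1)*delta/(delta+skew)
}

// decodePunycode decodes one label with the "xn--" prefix already removed.
// Illustrative only: no overflow checks, basic part assumed to be ASCII.
func decodePunycode(s string) ([]rune, error) {
	var out []rune
	rest := s
	if i := strings.LastIndexByte(s, '-'); i >= 0 {
		out = []rune(s[:i]) // literal (basic) code points
		rest = s[i+1:]      // encoded deltas
	}
	n, i, bias := initialN, 0, initialBias
	for pos := 0; pos < len(rest); {
		oldI, w := i, 1
		for k := base; ; k += base {
			if pos >= len(rest) {
				return nil, fmt.Errorf("truncated input %q", s)
			}
			d, ok := digitValue(rest[pos])
			pos++
			if !ok {
				return nil, fmt.Errorf("invalid digit in %q", s)
			}
			i += d * w
			t := k - bias // clamp the threshold to [tMin, tMax]
			if t < tMin {
				t = tMin
			} else if t > tMax {
				t = tMax
			}
			if d < t {
				break
			}
			w *= base - t
		}
		bias = adapt(i-oldI, len(out)+1, oldI == 0)
		n += i / (len(out) + 1)
		i %= len(out) + 1
		// Insert code point n at position i.
		out = append(out[:i], append([]rune{rune(n)}, out[i:]...)...)
		i++
	}
	return out, nil
}

func main() {
	for _, label := range []string{"xxx-7fa", "ttt-8fa", "www-0fa"} {
		runes, err := decodePunycode(label)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%-8s ->", label)
		for _, r := range runes {
			fmt.Printf(" %U", r)
		}
		fmt.Println()
	}
	// xxx-7fa  -> U+00B4 U+0078 U+0078 U+0078
	// ttt-8fa  -> U+0074 U+00B4 U+0074 U+0074
	// www-0fa  -> U+0077 U+00B2 U+0077 U+0077
}
```

Note also that the Java snippet as quoted leaves the IDN.toUnicode call
commented out, so Normalizer only ever sees an all-ASCII string, and an
ASCII string is trivially normalized in every form.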

Peter


_______________________________________________
dev-security-policy mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-security-policy
