On Friday, August 11, 2017 at 6:54:22 AM UTC-7, Peter Bowen wrote:
> On Thu, Aug 10, 2017 at 1:22 PM, Jonathan Rudenberg via
> dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:
> > RFC 5280 section 7.2 and the associated IDNA RFC require that
> > Internationalized Domain Names are normalized before encoding to punycode.
> >
> > Let’s Encrypt appears to have issued at least three certificates that have 
> > at least one dnsName without the proper Unicode normalization applied.
> >
> > It’s also worth noting that RFC 3491 (referenced by RFC 5280 via RFC 3490) 
> > requires normalization form KC, but RFC 5891 which replaces RFC 3491 
> > requires normalization form C. I believe that the BRs and/or RFC 5280 
> > should be updated to reference RFC 5890 and by extension RFC 5891 instead.
> 
> I did some reading on Unicode normalization today, and it strongly
> appears that any string that has been normalized to normalization form
> KC is by definition also in normalization form C.  Normalization is
> idempotent, so doing toNFKC(toNFKC()) will result in the same string
> as just doing toNFKC() and toNFC(toNFC()) is the same as toNFC().
> Additionally toNFKC is the same as toNFC(toK()).
> 
> This means that checking that a string matches the result of
> toNFC(string) is a valid check regardless of whether using the 349* or
> 589* RFCs.  It does mean that Certlint will not catch strings that are
> in NFC but not in NFKC.
> 
> Thanks,
> Peter
> 
> P.S. I've yet to find a registered domain name not in NFC, and that
> includes checking every name in the zone files for all ICANN gTLDs
> and a few ccTLDs.

Hi,
I have an example internationalized domain name that is NFC but not NFKC:
"xn--ttt-8fa.pumesa.com" (this is a fake domain; my focus is on the general
pattern).
The pattern that makes a domain NFC but not NFKC in Golang is: "xn--",
followed by any three identical letters, followed by a single "-", followed by
any single digit, followed by "fa". I know this pattern doesn't describe real
Unicode; however, the behavior in the programming language is curious (below).
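
To see which code points a label of this shape actually carries, the Punycode
can be decoded by hand. The following is a minimal sketch in Java of the
RFC 3492 decoding loop, without the overflow and validity checks a production
decoder needs. The class name PunycodeSketch and its decode helper are
hypothetical names used only for illustration. It decodes the label body after
the "xn--" prefix and runs the result through java.text.Normalizer.

```java
import java.text.Normalizer;

class PunycodeSketch {
    // Bootstring parameters for Punycode, as given in RFC 3492 section 5.
    static final int BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;
    static final int INITIAL_BIAS = 72, INITIAL_N = 128;

    // Bias adaptation function from RFC 3492 section 6.1.
    static int adapt(int delta, int numPoints, boolean first) {
        delta = first ? delta / DAMP : delta / 2;
        delta += delta / numPoints;
        int k = 0;
        while (delta > ((BASE - TMIN) * TMAX) / 2) {
            delta /= BASE - TMIN;
            k += BASE;
        }
        return k + (BASE * delta) / (delta + SKEW);
    }

    // Map a Punycode digit to its value: a-z = 0..25, 0-9 = 26..35.
    static int digit(char c) {
        if (c >= 'a' && c <= 'z') return c - 'a';
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= '0' && c <= '9') return c - '0' + 26;
        throw new IllegalArgumentException("bad digit: " + c);
    }

    // Decode a label body (the part after "xn--"). Sketch only: no overflow
    // checks and no IDNA validation of the result.
    static String decode(String input) {
        StringBuilder out = new StringBuilder();
        int lastDash = input.lastIndexOf('-');
        if (lastDash > 0) out.append(input, 0, lastDash); // basic code points
        int pos = lastDash > 0 ? lastDash + 1 : 0;
        int n = INITIAL_N, i = 0, bias = INITIAL_BIAS;
        while (pos < input.length()) {
            int oldI = i, w = 1;
            // Read one variable-length integer (a delta).
            for (int k = BASE; ; k += BASE) {
                int d = digit(input.charAt(pos++));
                i += d * w;
                int t = k <= bias ? TMIN : (k >= bias + TMAX ? TMAX : k - bias);
                if (d < t) break;
                w *= BASE - t;
            }
            int len = out.codePointCount(0, out.length()) + 1;
            bias = adapt(i - oldI, len, oldI == 0);
            n += i / len;   // code point to insert
            i %= len;       // position (in code points) to insert it at
            out.insert(out.offsetByCodePoints(0, i), Character.toChars(n));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("ttt-8fa");
        decoded.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
        System.out.println("is NFC  "
                + Normalizer.isNormalized(decoded, Normalizer.Form.NFC));
        System.out.println("is NFKC "
                + Normalizer.isNormalized(decoded, Normalizer.Form.NFKC));
    }
}
```

For "ttt-8fa" the decoded label contains U+00B4 (ACUTE ACCENT), a character
with a compatibility decomposition, which would account for a result of NFC
but not NFKC.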
The pattern described above causes strings to be NFC but not NFKC in Golang.
Furthermore, I ran a few tests using Golang (version go1.10.3 darwin) and Java
(version "1.8.0_60"); here are the key parts of the code I used:
1) Golang (Used "ToUnicode" to mimic how Zlint tests):
package main

import (
        "fmt"

        "golang.org/x/net/idna"
        "golang.org/x/text/unicode/norm"
)

func main() {
        str := "xn--xxx-7fa.pumesa.com"
        // Decode the Punycode name to its Unicode form.
        decoded, err := idna.ToUnicode(str)
        if err != nil {
                fmt.Println(err)
                return
        }
        // Check the decoded name against both normalization forms.
        fmt.Println("Is NFC ", norm.NFC.IsNormalString(decoded))
        fmt.Println("Is NFKC ", norm.NFKC.IsNormalString(decoded))
}

The last NFKC check is what causes Zlint to throw an error stating that the
Unicode is not in compliance. It seems that Zlint needs to be updated to follow
the newer RFC 5891, meaning it should check whether the Unicode in question is
NFC compliant rather than NFKC compliant?
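
As a rough illustration of the difference between the two checks, here is a
minimal sketch in Java using java.text.Normalizer. It is not Zlint's code; the
class NfcCheckSketch and the helpers isNfc and isNfkc are hypothetical names.
It also exercises the point from the quoted message that a string already in
NFKC passes an NFC check.

```java
import java.text.Normalizer;

class NfcCheckSketch {
    // True when the decoded (Unicode) label is unchanged by NFC normalization.
    static boolean isNfc(String label) {
        return label.equals(Normalizer.normalize(label, Normalizer.Form.NFC));
    }

    // True when the decoded (Unicode) label is unchanged by NFKC normalization.
    static boolean isNfkc(String label) {
        return label.equals(Normalizer.normalize(label, Normalizer.Form.NFKC));
    }

    public static void main(String[] args) {
        // U+00B4 has a compatibility decomposition: NFC leaves it alone,
        // NFKC rewrites it.
        String compat = "t\u00B4tt";
        // "e" followed by a combining acute accent: NFC composes it to U+00E9.
        String decomposed = "e\u0301";
        // The NFKC form of the first string.
        String folded = Normalizer.normalize(compat, Normalizer.Form.NFKC);

        System.out.println("compat:     NFC=" + isNfc(compat)
                + " NFKC=" + isNfkc(compat));
        System.out.println("decomposed: NFC=" + isNfc(decomposed)
                + " NFKC=" + isNfkc(decomposed));
        System.out.println("folded:     NFC=" + isNfc(folded)
                + " NFKC=" + isNfkc(folded));
    }
}
```

A lint that only calls isNfc would accept the first string, while one that
calls isNfkc would reject it. Both reject the second string.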

Below is something even more interesting.

2) Java:
import java.net.IDN;
import java.text.Normalizer;

public class Main {
        public static void main(String[] args) {
                String cn = "xn--www-0xx.pumesa.com";
                // toASCII returns the ASCII (Punycode) form of the name.
                String punycode = IDN.toASCII(cn);
                // Uncomment to run the checks on the Unicode form instead.
                //punycode = IDN.toUnicode(punycode);
                System.out.println("is NFC "
                                + Normalizer.isNormalized(punycode, Normalizer.Form.NFC));
                System.out.println("is NFKC "
                                + Normalizer.isNormalized(punycode, Normalizer.Form.NFKC));
        }
}

Per the Oracle documentation, java.net.IDN.toASCII conforms to RFC 3490, and it
throws no error. This can be double-checked within the language by converting
the Punycode back to Unicode; both print statements return true.
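
One possible reading of the Java result, offered as an assumption rather than
a confirmed diagnosis: the normalization checks in the Java snippet run on the
output of IDN.toASCII, which is pure ASCII, and any pure ASCII string is
already in both NFC and NFKC. The sketch below makes that visible and also
prints what IDN.toUnicode returns, so it can be compared with the input. The
class AsciiCheckSketch and its helpers are hypothetical names.

```java
import java.net.IDN;
import java.text.Normalizer;

class AsciiCheckSketch {
    // True when every UTF-16 unit in the string is in the ASCII range.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 0x80);
    }

    // True when the string is normalized in both NFC and NFKC.
    static boolean inBothForms(String s) {
        return Normalizer.isNormalized(s, Normalizer.Form.NFC)
                && Normalizer.isNormalized(s, Normalizer.Form.NFKC);
    }

    // Print the string with its ASCII and normalization status.
    static void report(String label, String s) {
        System.out.println(label + ": \"" + s + "\" ascii=" + isAscii(s)
                + " NFC+NFKC=" + inBothForms(s));
    }

    public static void main(String[] args) {
        String name = "xn--ttt-8fa.pumesa.com";
        String ascii = IDN.toASCII(name);
        String unicode = IDN.toUnicode(name);

        report("input    ", name);
        report("toASCII  ", ascii);
        report("toUnicode", unicode);
        // If toUnicode hands back the input unchanged, the checks above are
        // still being run on an ASCII string, not on decoded Unicode.
        System.out.println("toUnicode changed the name: " + !unicode.equals(name));
    }
}
```

If the last line prints false, the Java checks never saw the decoded label,
which would be enough to explain the disagreement with the Golang result.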

So to reiterate, the two main questions are:
1) Should there be a discussion about why Oracle Java and Golang don't agree on
whether this pattern makes the Unicode NFKC compliant? The potential impact is
that results obtained from a Java system may not be Zlint compliant.
2) Should Zlint be updated to follow the newer RFC 5891 regardless of
question #1?
_______________________________________________
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy