On 9/19/2016 9:46 AM, Wang Weijun wrote:
Normalization is not a part of RFC 2253. The spec is described in
I am not sure of this change for several reasons:
1. I cannot find any mention of normalization in RFC 2253 (or its newer
versions). Do you know of one elsewhere?
ASN.1 and RFC 2253 require UTF-8 encoding. "Hello, world!" is encoded
as "Hello%2C%20world%21". "Hello， world!" is encoded as
"Hello%EF%BC%8C%20world%21". The two encodings are different.
When signing a certificate, "，" is not converted to ",", so I don't think
it is right to convert it while parsing the field.
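
To make the byte-level difference concrete, here is a minimal sketch that percent-encodes the UTF-8 bytes of both strings. The class name Utf8PercentDemo and the helper percentEncode are hypothetical, written only for this illustration; they are not part of the JDK or of the fix under review.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK or of the proposed fix.
class Utf8PercentDemo {

    // Percent-encode every UTF-8 byte that is not an ASCII letter or digit.
    static String percentEncode(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean alnum = (v >= 'A' && v <= 'Z')
                    || (v >= 'a' && v <= 'z')
                    || (v >= '0' && v <= '9');
            if (alnum) {
                sb.append((char) v);
            } else {
                sb.append(String.format("%%%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ASCII comma U+002C
        System.out.println(percentEncode("Hello, world!"));
        // Fullwidth comma U+FF0C, written as an escape to avoid source-encoding issues
        System.out.println(percentEncode("Hello\uFF0C world!"));
    }
}
```

The ASCII comma becomes the single byte 2C, while the fullwidth comma becomes the three bytes EF BC 8C, so the two encoded values cannot compare equal byte for byte.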
2. It's not obvious that "Hello, world!" and "Hello， world!" should be
different if NFKD treats them as the same.
Actually, I'm not sure why normalization is required here, so I don't
want to change the code too much. The previous form is NFKD. Removing
the "compatibility" part leaves NFD.
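
The difference between the two forms can be checked directly with java.text.Normalizer. The following is only a sketch; the class name NormalizationFormDemo and its helpers are made up for the example and do not reflect how the X.509 code is structured.

```java
import java.text.Normalizer;

// Hypothetical demo class; it only exercises java.text.Normalizer.
class NormalizationFormDemo {

    static final String ASCII = "Hello, world!";
    // U+FF0C FULLWIDTH COMMA, written as an escape to avoid source-encoding issues
    static final String FULLWIDTH = "Hello\uFF0C world!";

    // Compatibility decomposition: folds the fullwidth comma into U+002C.
    static String nfkd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKD);
    }

    // Canonical decomposition: leaves the fullwidth comma untouched.
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        System.out.println(nfkd(FULLWIDTH).equals(ASCII)); // true
        System.out.println(nfd(FULLWIDTH).equals(ASCII));  // false
    }
}
```

Under NFKD the two strings collapse into one, while under NFD they stay distinct.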
3. Why not NFC? Although I didn't find normalization of X500 names in RFC 5280,
I do see NFV used in several other cases.
What form is NFV? Is it a typo?
Yes. I thought about this case. The current fix comes from the fact
that UTF-8 "Hello, world!" and "Hello， world!" should be different.
Parsing them as the same thing may result in unexpected, serious issues.
4. Is it possible to perform normalization before escaping special characters?
I have to say I agree with this point. I don't see the point of using
normalization. But I'm not sure I have the full information needed to
remove it. I don't want to fix it until it is broken.
5. Why is normalization necessary? At least in RFC 5280 4.1.2.6, it says
When the subject of the certificate is a CA, the subject
field MUST be encoded in the same way as it is encoded in the
issuer field (Section 4.1.2.4) in all certificates issued by
the subject CA.
which implies comparison should be on encoding instead of toString.
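
As a rough illustration of comparing on the encoding rather than on a string form, the sketch below compares the DER bytes returned by X500Principal.getEncoded(). The class EncodedNameCompare and the method sameEncoding are hypothetical names for this example; this is not how the JDK compares names internally.

```java
import java.util.Arrays;
import javax.security.auth.x500.X500Principal;

// Hypothetical demo class; only getEncoded() and Arrays.equals are real JDK APIs here.
class EncodedNameCompare {

    // Two names match only if their DER encodings are byte-for-byte identical.
    static boolean sameEncoding(X500Principal a, X500Principal b) {
        return Arrays.equals(a.getEncoded(), b.getEncoded());
    }

    public static void main(String[] args) {
        // The ASCII comma must be escaped in an RFC 2253 string.
        X500Principal ascii = new X500Principal("CN=Hello\\, world!");
        // U+FF0C FULLWIDTH COMMA is not a separator, so it needs no escape.
        X500Principal fullwidth = new X500Principal("CN=Hello\uFF0C world!");

        System.out.println(sameEncoding(ascii, fullwidth)); // false
    }
}
```

Because no string normalization is involved, the two names stay distinct regardless of which Normalizer form the string-based comparison uses.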
On Sep 15, 2016, at 8:09 AM, Xuelei Fan <xuelei....@oracle.com> wrote:
Please review this fix:
Normalizer.Form.NFKD is used to normalize attribute-value assertions in X.509 certificate processing. The normalizer may convert some
UTF-8 characters into ASCII. For example, "，" (two bytes) will be converted to "," (one byte), and
"Hello， world!" is normalized to "Hello, world!". However, "Hello, world!" and "Hello， world!" should be
different because the comma code differs. This conversion may result in unexpected behavior in name comparison and conversion.
This fix switches to "Normalizer.Form.NFD".