UTF-8 *IS* perfectly valid Unicode -- it's one of the main Unicode encodings, and seems entirely appropriate for use in certs, although I personally have no knowledge of the support in OpenSSL or the X.509 standard. UTF-8 is a variable-length encoding in which each valid character takes from 1 to 4 bytes (the original design allowed sequences of up to 6 bytes, but RFC 3629 restricts it to 4).
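To make the variable-length point concrete, here is a minimal Python sketch. The sample characters and the helper name describe() are my own illustrative choices, not anything from OpenSSL. It reports how many bytes each character takes in UTF-8 and whether a zero byte shows up in the UTF-8 and UTF-16 (big-endian) encodings, which bears on the null-termination point that follows.

```python
# Sample characters are illustrative choices, not taken from any certificate:
# an ASCII letter, an accented Latin letter, a Chinese character, and an emoji.
samples = ["A", "\u00e9", "\u5e7f", "\U0001F600"]


def describe(ch):
    """Return (utf8_length, utf8_has_zero_byte, utf16_has_zero_byte) for one character."""
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # big-endian, no byte-order mark
    return len(utf8), 0 in utf8, 0 in utf16


for ch in samples:
    length, zero_in_utf8, zero_in_utf16 = describe(ch)
    print(
        f"U+{ord(ch):04X}: {length} UTF-8 byte(s), "
        f"zero byte in UTF-8: {zero_in_utf8}, in UTF-16: {zero_in_utf16}"
    )
```

For non-null characters the UTF-8 column never reports a zero byte, while the UTF-16 column does for plain ASCII such as "A".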
UTF-8 encodes the first 128 ASCII characters identically to 7-bit ASCII, and UTF-8 strings preserve the notion of a null-terminated character string: the zero byte terminates a UTF-8 string compatibly with ASCII null-terminated strings. So the warning that a null character is not allowed in a string really means it can't be embedded in the 'middle' of a string, since the null will be interpreted to *terminate* the string.

This is NOT the case with UTF-16. Individual bytes in UTF-16 encoding may certainly be zero, and they do NOT terminate a string. So it makes sense that UTF-16 would not be supported in the Issuer and Subject fields. But UTF-8 seems like an excellent fit to me.

The trick is getting the native characters from the user converted to UTF-8 for storage in the certificate. Presumably the user enters the Issuer and Subject data in a GUI or at a command line in a shell that is using Big5 or GB18030 character encoding. The application must convert the entered data into UTF-8 to pass to the cert creation process. There are a million ways to do that conversion (ICU is an excellent tool for it).

Fascinating. Good luck with it. I'd like to hear what your progress is.

+-+-+-+-+-+-+
Dave McLellan, Symmetrix Software
EMC Corporation, 228 South St, Hopkinton MA
Mail Stop LL/AA-24
office 508-249-1257, fax 508-544-2129
cell 978-500-2546, IM: mclellan_d...@yahoo.com
+-+-+-+-+-+-+

-----Original Message-----
From: owner-openssl-us...@openssl.org [mailto:owner-openssl-us...@openssl.org] On Behalf Of Shaw Graham George
Sent: Thursday, November 19, 2009 8:08 AM
To: openssl-users@openssl.org
Subject: Creating a certificate with Unicode characters in Issuer and Subject

Hi,

I have a requirement to make some test keys/certificates that contain Unicode (Chinese) data in the Issuer and Subject fields.

Print-out from an example certificate using "openssl x509" is:

Issuer: C=\x00C\x00N, ST=\x00G\x00u\x00a\x00n\x00g\x00d\x00o\x00n\x00g, L=\x00G\x00u\x00a\x00n\x00g\x00z\x00h\x00o\x00u, O=\x00G\x00D\x00C\x00A\x00 \x00C\x00e\x00r\x00t\x00i\x00f\x00i\x00c\x00a\x00t\x00e\x00 \x00A\x00u\x00t\x00h\x00o\x00r\x00i\x00t\x00y
Subject: C=\x00C\x00N, ST=^\x7FN\x1Cw\x01, L=^\x7F]\xDE^\x02, ...

Is this at all possible using the openssl tool? From the manual pages it seems that UTF-8 is supported, but not Unicode - for example the config man page says that null characters in strings are not allowed.

If not, then does anybody know of any other tools that I could use to make my test keys/certificates?

Thanks in advance,
George.

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org