2011/7/29 Howard Chu <[email protected]>:
> Howard Chu wrote:
>> Erwann ABALEA wrote:
>>> Do you have any document or pointer to understand the task of
>>> converting to/from T.61, and incompatible character sets you talked
>>> about? I Googled for this, but I'm not sure of what I found (what I
>>> found reminds me of old character sets we used many years ago in
>>> France for the Minitel, with G1/G2 character groups, etc, not that far
>>> from VT consoles).
>>
>> You can reference this old draft; I wrote Appendix A and B to document the
>> mapping as we understood it at that time. These Appendices were dropped from
>> the final version because it was considered futile to attempt to document the
>> T.61 character encoding rules.
>>
>> http://tools.ietf.org/html/draft-ietf-ldapbis-strprep-00#appendix-A
>>
>> You can also read libldap/t61.c; the code has been present in every OpenLDAP
>> release since 2002 but is not compiled or used.
>>
> This Guide has a pretty good discussion of the issues.
>
> http://www.cs.auckland.ac.nz/~pgut001/pubs/x509guide.txt
>
> The section on "Character Sets" is particularly relevant. The section on
> "Comparing DNs" is somewhat relevant, though in fact OpenLDAP has already
> solved this problem (for all the string types besides T61String) by doing
> all matching in UTF-8.
Thank you for the pointers. I appreciate Peter's writings and had already read this text some time ago, but I wasn't focused on T.61 then. OpenSSL 1.0.0 internally stores names in a "semi-normalized" UTF-8 form (superfluous spaces removed, everything converted to lowercase, but no NFC/NFD normalization).

I'm now reading libldap/t61.c. I just read the IETF draft and its numerous tables... What a mess. X.680 references the T.61 Recommendation, which was deleted some years ago, and I'm not clever enough to make Google find a copy of the standard. It can't be bought from the ITU anymore, yet later standards still reference it. Nice.

Meanwhile, I still haven't found the Czech CSCA certificate, but I know what to do with the remaining 1% uncertainty. The CN field is encoded as a T61String holding the value "CSCA_CZ". That fits well within the 7-bit limit.

If everything is internally converted to UTF-8, and t61.c seems to provide a lossless T.61 to UTF-8 conversion, why isn't it used?

--
Erwann.
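
P.S. To make "semi-normalized" concrete, here is a rough Python sketch of the idea as described above. It is not OpenSSL's code: the function names semi_normalize and fits_in_7_bits are made up for illustration, and the exact whitespace and case rules are assumptions. It only shows that trimming spaces and lowercasing leaves composed (NFC) and decomposed (NFD) spellings unequal, and that a value such as "CSCA_CZ" stays within 7 bits.

```python
def semi_normalize(name: str) -> str:
    """Hypothetical helper: trim, collapse whitespace runs, lowercase.

    Deliberately performs no NFC/NFD normalization, mirroring the
    "semi-normalized" form described in the message (an assumption,
    not a copy of any library's rules).
    """
    collapsed = " ".join(name.split())  # strip ends, squeeze inner runs
    return collapsed.lower()


def fits_in_7_bits(value: str) -> bool:
    """Hypothetical helper: true if every code point is below 0x80."""
    return all(ord(ch) < 0x80 for ch in value)


composed = "Caf\u00e9"      # 'e with acute' as one code point (NFC)
decomposed = "Cafe\u0301"   # 'e' followed by a combining acute (NFD)

print(semi_normalize("  CSCA_CZ  "))
print(semi_normalize(composed) == semi_normalize(decomposed))
print(fits_in_7_bits("CSCA_CZ"))
```

The comparison in the middle prints False, which is the gap left by skipping NFC/NFD: two names that render identically can still fail to match.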
