The following text is quoted from:
draft-ietf-krb-wg-kerberos-clarifications-04.txt
5.2.1. KerberosString
The original specification of the Kerberos protocol in RFC 1510 uses GeneralString in numerous places for human-readable string data. Historical implementations of Kerberos cannot utilize the full power of GeneralString. This ASN.1 type requires the use of designation and invocation escape sequences as specified in ISO-2022/ECMA-35 [ISO-2022/ECMA-35] to switch character sets, and the default character set that is designated as G0 is the ISO-646/ECMA-6 [ISO-646,ECMA-6] International Reference Version (IRV) (aka U.S. ASCII), which mostly works.
ISO-2022/ECMA-35 defines four character-set code elements (G0..G3) and two Control-function code elements (C0..C1). DER prohibits the designation of character sets as any but the G0 and C0 sets. Unfortunately, this seems to have the side effect of prohibiting the use of ISO-8859 (ISO Latin) [ISO-8859] character-sets or any other character-sets that utilize a 96-character set, since it is prohibited by ISO-2022/ECMA-35 to designate them as the G0 code element. This side effect is being investigated in the ASN.1 standards community.
In practice, many implementations treat GeneralStrings as if they were 8-bit strings of whichever character set the implementation defaults to, without regard for correct usage of character-set designation escape sequences. The default character set is often determined by the current user's operating system dependent locale. At least one major implementation places unescaped UTF-8 encoded Unicode characters in the GeneralString. This failure to adhere to the GeneralString specifications results in interoperability issues when conflicting character encodings are utilized by the Kerberos clients, services, and KDC.
This unfortunate situation is the result of improper documentation of the restrictions of the ASN.1 GeneralString type in prior Kerberos specifications.
The new (post-RFC 1510) type KerberosString, defined below, is a GeneralString that is constrained to contain only characters in IA5String.
KerberosString ::= GeneralString (IA5String)
US-ASCII control characters should in general not be used in KerberosString, except for cases such as newlines in lengthy error messages. Control characters SHOULD NOT be used in principal names or realm names.
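As a rough illustration of the constraint above, the following Python sketch checks whether the content octets of a string fit within IA5String (octets 0x00 through 0x7F) and flags control characters. The function names are hypothetical and are not taken from any Kerberos implementation. Treating DEL (0x7F) as a control character is an assumption made here.

```python
def is_ia5(data: bytes) -> bool:
    """Return True if every octet is in the IA5String range 0x00-0x7F."""
    return all(b < 0x80 for b in data)


def has_control_chars(data: bytes) -> bool:
    """Return True if any octet is a US-ASCII control character.

    Assumption: C0 controls (0x00-0x1F) and DEL (0x7F) both count.
    """
    return any(b < 0x20 or b == 0x7F for b in data)


def acceptable_name_component(data: bytes) -> bool:
    """Hypothetical check for a principal or realm name component:
    IA5 only, and no control characters (the SHOULD NOT above)."""
    return is_ia5(data) and not has_control_chars(data)
```

A lengthy error message containing a newline would pass is_ia5 but fail acceptable_name_component, which matches the distinction the text draws between general strings and names.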
For compatibility, implementations MAY choose to accept GeneralString values that contain characters other than those permitted by IA5String, but they should be aware that character set designation codes will likely be absent, and that the encoding should probably be treated as locale-specific in almost every way. Implementations MAY also choose to emit GeneralString values that are beyond those permitted by IA5String, but should be aware that doing so is extraordinarily risky from an interoperability perspective.
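One possible shape for the lenient acceptance path described above is sketched below in Python. It is a heuristic under stated assumptions, not what any particular implementation does: the function name decode_lenient, the UTF-8-first ordering, and the latin-1 default for the fallback are all choices made for this example.

```python
def decode_lenient(data: bytes, fallback: str = "latin-1") -> str:
    """Decode GeneralString content octets that may not be IA5.

    Assumption: no designation escapes are present, so the octets are
    tried as UTF-8 first (at least one implementation emits it) and
    then as a caller-chosen 8-bit, locale-style encoding.
    """
    if all(b < 0x80 for b in data):
        # Pure IA5 content: the only case the constraint permits.
        return data.decode("ascii")
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: treat as locale-specific 8-bit text.
        return data.decode(fallback)
```

The guess can be wrong. For example, two latin-1 octets that happen to form a valid UTF-8 sequence will be decoded as a single character, which is one concrete form of the interoperability risk the text warns about.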
Some existing implementations use GeneralString to encode unescaped locale-specific characters. This is a violation of the ASN.1 standard. Most of these implementations encode US-ASCII in the left-hand half, so as long as the implementation transmits only US-ASCII, the ASN.1 standard is not violated in this regard. As soon as such an implementation encodes unescaped locale-specific characters with the high bit set, it violates the ASN.1 standard.
Other implementations have been known to use GeneralString to contain a UTF-8 encoding. This also violates the ASN.1 standard, since UTF-8 is a different encoding, not a 94 or 96 character "G" set as defined by ISO 2022. It is believed that these implementations do not even use the ISO 2022 escape sequence to change the character encoding. Even if implementations were to announce the change of encoding by using that escape sequence, the ASN.1 standard prohibits the use of any escape sequences other than those used to designate/invoke "G" or "C" sets allowed by GeneralString.
Future revisions to this protocol will almost certainly allow for a more interoperable representation of principal names, probably including UTF8String.
Note that applying a new constraint to a previously unconstrained type constitutes creation of a new ASN.1 type. In this particular case, the change does not result in a changed encoding under DER.
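To make the point about the unchanged encoding concrete, here is a minimal Python sketch of DER-encoding a KerberosString value. The function names are hypothetical. The tag octet stays that of GeneralString (UNIVERSAL 27, 0x1B) and does not become that of IA5String (UNIVERSAL 22, 0x16); the constraint only limits which content octets may appear.

```python
GENERAL_STRING_TAG = 0x1B  # UNIVERSAL 27, primitive encoding
IA5_STRING_TAG = 0x16      # UNIVERSAL 22, shown only for contrast; not used


def der_length(n: int) -> bytes:
    """Encode a definite length: short form below 128, else long form."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_kerberos_string(text: str) -> bytes:
    """Encode text as a GeneralString restricted to IA5 characters."""
    # The "ascii" codec covers exactly 0x00-0x7F and raises
    # UnicodeEncodeError for anything outside that range.
    content = text.encode("ascii")
    return bytes([GENERAL_STRING_TAG]) + der_length(len(content)) + content
```

For the input "krbtgt" this produces the octets 1b 06 6b 72 62 74 67 74, the same bytes an unconstrained GeneralString encoder would emit for that ASCII text.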
Gustavo Rios wrote:
Dear gentleman/madam,
I am studying the Kerberos V (RFC 1510) protocol specification. Some data types used for communication are specified with GeneralString encoding. I then started studying ASN.1. To my surprise, not only the documentation sources but also the ITU standard itself advise against the use of GeneralString. I therefore respectfully request your clarification on this.
How do current Kerberos implementations deal with this? I mean, how is the encoding currently performed, and what are the valid characters used by the implementations currently on the market? (I am considering at least the MIT and Heimdal implementations; if you know of any others, let me know.)
Thanks a lot for your time.
PS: My source of information are:
http://asn1.elibel.tm.fr/en/book/index.htm
http://asn1.elibel.tm.fr/en/standards/index.htm
