Re: [Emu] EAP, RADIUS, UTF-8, RFC 4282 and SASLPREP: the interop nightmare
Hi,

> Unfortunately, in at least some implementations, this is not the case. However, I'd be interested if there exist implementations that handle UTF-8 usernames. That would provide a reference to test a fix against.

Indeed. After some more tests:

Lancom Client Utility (same Windows XP instance):
- behaviour is the same as the built-in supplicant: encoded on the wire in the locale encoding; Cyrillic input is possible but transcribed to "?".

KNetworkManager (openSUSE Linux 11.0, 32-bit):
- encoding of @müller.de to @m[0xC3][0xBC]ller.de (UTF-8, no punycode)
- encoding of Cyrillic characters to 2-byte sequences starting with d0 and d1; looks like the Cyrillic range of UTF-8, no punycode in the realm

That looks like a good UTF-8 test case. KNetworkManager uses wpa_supplicant as a backend.

Greetings,

Stefan Winter

P.S.: add $OPEN_SOURCE_SALES_PITCH_FOR_WPA_SUPPLICANT here ;-)

--
Stefan WINTER
Ingenieur de Recherche
Fondation RESTENA - Réseau Téléinformatique de l'Education Nationale et de la Recherche
6, rue Richard Coudenhove-Kalergi
L-1359 Luxembourg
Tel: +352 424409 1
Fax: +352 422473
___
Emu mailing list
Emu@ietf.org
https://www.ietf.org/mailman/listinfo/emu
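The byte sequences reported in these tests can be reproduced with a few lines of Python. This is only a sketch: it assumes ISO-8859-15 as the Windows locale encoding (the tests only narrow it down to "ISO-8859-15 or similar"), and the Cyrillic sample string is made up for illustration.

```python
# Reproduce the on-the-wire encodings reported in the supplicant tests.
# Assumption: the "locale" encoding is ISO-8859-15.
# The Cyrillic sample string is illustrative, not taken from the tests.

realm = "@müller.de"
cyrillic = "пример"

# UTF-8 (as sent by KNetworkManager): ü becomes the two bytes C3 BC.
utf8_realm = realm.encode("utf-8")

# ISO-8859-15 (as sent by the Windows supplicants): ü becomes the byte FC.
latin_realm = realm.encode("iso-8859-15")

# Cyrillic in UTF-8: each of these letters is two bytes, lead byte D0 or D1.
utf8_cyrillic = cyrillic.encode("utf-8")
lead_bytes = {utf8_cyrillic[i] for i in range(0, len(utf8_cyrillic), 2)}

# Cyrillic forced into ISO-8859-15: nothing maps, so each letter becomes "?".
lossy_cyrillic = cyrillic.encode("iso-8859-15", errors="replace")

print(utf8_realm.hex(" "))
print(latin_realm.hex(" "))
print(sorted(hex(b) for b in lead_bytes))
print(lossy_cyrillic)
```

The last line shows why the "?" transcription is unhelpful: every Cyrillic user name of the same length collapses to the same bytes.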
Re: [Emu] EAP, RADIUS, UTF-8, RFC 4282 and SASLPREP: the interop nightmare
Stefan Winter wrote:
> KNetworkManager (openSUSE Linux 11.0, 32-bit):
> - encoding of @müller.de to @m[0xC3][0xBC]ller.de (UTF-8, no punycode)
> - encoding of Cyrillic characters to 2-byte sequences starting with d0 and d1; looks like the Cyrillic range of UTF-8, no punycode in the realm
>
> That looks like a good UTF-8 test case. KNetworkManager uses wpa_supplicant as a backend.

It appears that KNetworkManager is responsible for encoding the name as UTF-8. Many Linux distributions have bypassed the various non-UTF-8 encodings, and just use UTF-8 everywhere. This makes conversion easy.

> P.S.: add $OPEN_SOURCE_SALES_PITCH_FOR_WPA_SUPPLICANT here ;-)

Nice, but it's not related to open source. wpa_supplicant is just inheriting the encoding used by the host OS, which is UTF-8:

http://lists.shmoo.com/pipermail/hostap/2008-August/018219.html

  ... wpa_supplicant does not really care about the encoding of the identity field, i.e., it is just sent out as arbitrary binary data. ... In addition, you can set the identity value as a hex string (identity=68656c6c6f); of course this is assuming that you know what binary data the authentication server expects to see.

Checking the source, there are no references to UTF-8 in anything other than comments.

Alan DeKok.
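The hex-string form mentioned in the quoted hostap post is easy to generate and check. A minimal sketch follows; the helper names and the sample identity are hypothetical, and it assumes the authentication server expects UTF-8 bytes.

```python
# Convert between a textual identity and the hex form quoted above
# (identity=68656c6c6f). Helper names and the sample identity are
# hypothetical; UTF-8 as the server's expected encoding is an assumption.

def identity_to_hex(identity: str, encoding: str = "utf-8") -> str:
    """Return the identity's encoded bytes as a lowercase hex string."""
    return identity.encode(encoding).hex()

def hex_to_bytes(hex_identity: str) -> bytes:
    """Return the raw bytes that a hex identity string stands for."""
    return bytes.fromhex(hex_identity)

# The example value from the quoted post decodes to plain ASCII.
print(hex_to_bytes("68656c6c6f"))

# A non-ASCII identity, pinned to UTF-8 regardless of the host locale.
print("identity=" + identity_to_hex("user@müller.de"))
```

Writing the identity as hex sidesteps the host locale entirely, at the cost of having to know in advance which bytes the server will accept.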
Re: [Emu] EAP, RADIUS, UTF-8, RFC 4282 and SASLPREP: the interop nightmare
Alan DeKok said:
> Or, it was easier to say 'ASCII', and to avoid any unknowns that might occur if 8-bit data is used. Given Stefan's test of MS-CHAP ISO-8859-15 encodings, I think the ASCII limitation in the spec is not matched by any similar limitations in the code.

Unfortunately, in at least some implementations, this is not the case. However, I'd be interested if there exist implementations that handle UTF-8 usernames. That would provide a reference to test a fix against.

> The CUI is often created as [EMAIL PROTECTED], i.e. based off of the User-Name. So it's worth double-checking the effects of changing User-Name on all down-stream uses.

Presumably the hash can be calculated on UTF-8 as well as ASCII, no?
Re: [Emu] EAP, RADIUS, UTF-8, RFC 4282 and SASLPREP: the interop nightmare
Bernard Aboba wrote:
>> The CUI is often created as [EMAIL PROTECTED], i.e. based off of the User-Name. So it's worth double-checking the effects of changing User-Name on all down-stream uses.
>
> Presumably the hash can be calculated on UTF-8 as well as ASCII, no?

Yes. If the example.com portion is interpreted by any party, it has to be dealt with the same as the corresponding portion of the User-Name.

Alan DeKok.
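A hash is computed over bytes, not characters, so it works on UTF-8 just as well as on ASCII, provided every party hashes the same encoding. The sketch below illustrates this; SHA-256, the helper name, and the sample names are arbitrary choices for illustration, not something the thread specifies.

```python
import hashlib

# A digest over a User-Name depends on the byte encoding that is hashed.
# SHA-256 and the sample names are arbitrary, illustrative choices.

def name_digest(user_name: str, encoding: str) -> str:
    """Hash the encoded bytes of a User-Name and return the hex digest."""
    return hashlib.sha256(user_name.encode(encoding)).hexdigest()

ascii_name = "mueller@example.com"
intl_name = "müller@example.com"

# ASCII-only names encode identically in UTF-8 and ISO-8859-15.
same_for_ascii = (
    name_digest(ascii_name, "utf-8") == name_digest(ascii_name, "iso-8859-15")
)

# Non-ASCII names do not, so the digests diverge between encodings.
same_for_intl = (
    name_digest(intl_name, "utf-8") == name_digest(intl_name, "iso-8859-15")
)

print(same_for_ascii, same_for_intl)
```

This is the down-stream effect worth double-checking: a value derived from the User-Name changes as soon as the encoding of the User-Name changes.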
Re: [Emu] EAP, RADIUS, UTF-8, RFC 4282 and SASLPREP: the interop nightmare
Hi,

>> * User-Name in GUI: some Cyrillic letters
>> * encoded on wire: all transcribed to the same symbol ? in ISO-8859-15 or similar encoding (which is not very helpful!)
>>
>> To get to the Cyrillic letters, I installed multi-language support and complex IMEs, i.e. everything I could find in System Settings, thinking that it may help the system to move to UTF-8 encodings.
>
> [BA] What version of Windows was this? XP? Vista?

Ah, sorry: XP SP3.

> Stefan Winter said:
>> So... if for MS-CHAPv2, the behaviour for non-ASCII is unspecified, then it's alright for it to transcribe unexpected input to whatever character it likes. So it's not the supplicant that is to blame, but rather the fact of life that MS-CHAPv2 lives in an ASCII world. Hmmm... is an update to 2759 in any way feasible? Considering its deployed base that appears difficult at best.
>
> [BA] I'm trying to understand why the ASCII limitation exists in the first place. Presumably there are security protocols out there that utilize UTF-8 encoded usernames or NAIs (perhaps after some normalization procedure), right?

I don't have any insight into how widely non-ASCII NAIs are used. For eduroam I can say: no usage known, and as of last week I will heavily discourage anyone from deploying them until the situation gets better.

Greetings,

Stefan
Re: [Emu] EAP, RADIUS, UTF-8, RFC 4282 and SASLPREP: the interop nightmare
Bernard Aboba wrote:
> [BA] RFC 4282 actually proposes that the realm portion of the NAI be encoded in punycode, not UTF-8.

That's just wrong. No AAA client or server does that.

At the last IETF, I had proposed, in a hallway conversation, to update portions of RFC 4282 to describe what implementors actually do. It looks like it's time for that document to get written.

> ...it is hard for me to see how the NAI in EAP or RADIUS could be encoded in anything other than UTF-8.

I agree. RFC 5335 Section 4.4 defines a utf8-addr-spec, which is:

  utf8-local-part @ utf8-domain

That's probably a good start for this document.

> realm portion of the NAI. It **is** reasonable to say that if and when the realm is included in a DNS query that it should be converted to punycode (e.g. an A-label) beforehand.

Yes.

> [BA] The more I’ve looked into this, the more likely it seems that this problem is real and potentially wide in scope, affecting not only EAP, RADIUS, Diameter but also EAP methods. For example, RFC 2759 (MS-CHAPv2) Section 4 states:

Potentially anywhere a user identifier is used. User-Name, CUI, and other protocols such as Kerberos.

> [BA] So what do we do about this?
> ...
> a. A document on NAI internationalization, updating RFC 4282. This would address the (IMHO incorrect) punycode encoding of the realm portion.

I'll start on that.

> b. A document on EAP internationalization, updating RFC 3748. This would cover the EAP-Response/Identity as well as potentially giving advice on issues such as password internationalization and internationalization of the EAP Peer-Id and Server-Ids.

I'll stay away from that. :(

Alan DeKok.
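The split discussed here (UTF-8 on the wire, punycode only when the realm goes into a DNS query) can be sketched as below. The function names and the sample NAI are hypothetical. Note also that Python's built-in `idna` codec implements the older IDNA 2003 rules, which is an assumption a real implementation would need to revisit.

```python
# Keep the NAI as UTF-8 on the wire; convert the realm to an A-label only
# at DNS lookup time. Function names and the sample NAI are hypothetical.
# Python's built-in "idna" codec follows IDNA 2003 (RFC 3490).

def split_nai(nai: str):
    """Split an NAI at the last '@' into (username, realm)."""
    user, sep, realm = nai.rpartition("@")
    if not sep:
        return nai, ""  # no realm present
    return user, realm

def wire_form(nai: str) -> bytes:
    """Bytes for EAP-Response/Identity or RADIUS User-Name: plain UTF-8."""
    return nai.encode("utf-8")

def dns_name(realm: str) -> str:
    """A-label (punycode) form of the realm, for DNS queries only."""
    return realm.encode("idna").decode("ascii")

nai = "user@müller.de"
user, realm = split_nai(nai)

print(wire_form(nai))   # what goes on the wire
print(dns_name(realm))  # what goes into a DNS query
```

Keeping the conversion inside `dns_name` means nothing else in the AAA path ever sees punycode, which matches what the thread says implementations do today.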
Re: [Emu] EAP, RADIUS, UTF-8, RFC 4282 and SASLPREP: the interop nightmare
Alan DeKok said:
>> [BA] RFC 4282 actually proposes that the realm portion of the NAI be encoded in punycode, not UTF-8.
>
> That's just wrong.

[BA] I agree. I don't know of any EAP peers that encode the NAI this way (although, based on Stefan's tests, they may not use UTF-8 either).

>> ...it is hard for me to see how the NAI in EAP or RADIUS could be encoded in anything other than UTF-8.
>
> I agree. RFC 5335 Section 4.4 defines a utf8-addr-spec, which is:
>
>   utf8-local-part @ utf8-domain
>
> That's probably a good start for this document.

[BA] Interesting. NAIs and e-mail addresses are similar; one of the reasons that we got in trouble with RFC 4282 was perhaps that we didn't wait until the EAI discussion was further along. At this point, in 8-bit clean situations, my understanding is that EAI utilizes UTF-8 for both the username and realm portion. Since both EAP Identity and RADIUS User-Name are 8-bit clean, the same logic (and probably, much of the ABNF) would seem to apply here.

Stefan Winter said:
> Windows built-in supplicant
> ---
> * User-Name in GUI: @müller.de
> * encoded on wire: ü ::= 0xFC (ISO-8859-15/Windows-1252 of ü)
> * User-Name in GUI: some Cyrillic letters
> * encoded on wire: all transcribed to the same symbol ? in ISO-8859-15 or similar encoding (which is not very helpful!)
>
> To get to the Cyrillic letters, I installed multi-language support and complex IMEs, i.e. everything I could find in System Settings, thinking that it may help the system to move to UTF-8 encodings.

[BA] What version of Windows was this? XP? Vista?

Stefan Winter said:
> So... if for MS-CHAPv2, the behaviour for non-ASCII is unspecified, then it's alright for it to transcribe unexpected input to whatever character it likes. So it's not the supplicant that is to blame, but rather the fact of life that MS-CHAPv2 lives in an ASCII world. Hmmm... is an update to 2759 in any way feasible? Considering its deployed base that appears difficult at best.

[BA] I'm trying to understand why the ASCII limitation exists in the first place. Presumably there are security protocols out there that utilize UTF-8 encoded usernames or NAIs (perhaps after some normalization procedure), right?

> Potentially anywhere a user identifier is used. User-Name, CUI, and other protocols such as Kerberos.

RFC 4372 (CUI) Section 2.2 doesn't say anything at all about internationalization:

  String:

  The string identifies the CUI of the end-user. This string value is a reference to a particular user. The format and content of the string value are determined by the Home RADIUS server. The binding lifetime of the reference to the user is determined based on business agreements. For example, the lifetime can be set to one billing period. RADIUS entities other than the Home RADIUS server MUST treat the CUI content as an opaque token, and SHOULD NOT perform operations on its content other than a binary equality comparison test, between two instances of CUI.

  In cases where the attribute is used to indicate the NAS support for the CUI, the string value contains a nul character.
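Because RFC 4372 only permits a binary equality test on the CUI, any text-derived CUI depends on the exact bytes the home server started from. The sketch below shows two renderings of the same visible name that fail a binary comparison until they are normalized. NFC is an assumed choice here, since the thread only speaks of "some normalization procedure", and the helper names are hypothetical.

```python
import unicodedata

# Binary equality on identifiers that look identical but differ in bytes.
# NFC is an assumed normalization form; helper names are hypothetical.

composed = "m\u00fcller"     # ü as a single code point
decomposed = "mu\u0308ller"  # u followed by a combining diaeresis

def binary_equal(a: bytes, b: bytes) -> bool:
    """The only comparison a non-home RADIUS entity should perform."""
    return a == b

def nfc_bytes(text: str) -> bytes:
    """Normalize to NFC, then encode as UTF-8."""
    return unicodedata.normalize("NFC", text).encode("utf-8")

raw_match = binary_equal(composed.encode("utf-8"), decomposed.encode("utf-8"))
normalized_match = binary_equal(nfc_bytes(composed), nfc_bytes(decomposed))

print(raw_match, normalized_match)
```

Under this reading, any normalization would have to happen at the home server before the CUI is generated, since intermediate entities treat the value as opaque.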
Re: [Emu] EAP, RADIUS, UTF-8, RFC 4282 and SASLPREP: the interop nightmare
Bernard Aboba wrote:
> [BA] I agree. I don't know of any EAP peers that encode the NAI this way (although, based on Stefan's tests, they may not use UTF-8 either).

I think the correct term is memcpy.

> [BA] Interesting. NAIs and e-mail addresses are similar; ...

Often the same. Leveraging EAI would be beneficial.

> Since both EAP Identity and RADIUS User-Name are 8-bit clean, the same logic (and probably, much of the ABNF) would seem to apply here.

I would like very much to know if anyone thinks that they *cannot* be applied here.

> [BA] I'm trying to understand why the ASCII limitation exists in the first place. Presumably there are security protocols out there that utilize UTF-8 encoded usernames or NAIs (perhaps after some normalization procedure), right?

Or, it was easier to say ASCII, and to avoid any unknowns that might occur if 8-bit data is used. Given Stefan's test of MS-CHAP ISO-8859-15 encodings, I think the ASCII limitation in the spec is not matched by any similar limitations in the code.

>> Potentially anywhere a user identifier is used. User-Name, CUI, and other protocols such as Kerberos.
>
> RFC 4372 (CUI) Section 2.2 doesn't say anything at all about internationalization:

The CUI is often created as [EMAIL PROTECTED], i.e. based off of the User-Name. So it's worth double-checking the effects of changing User-Name on all down-stream uses.

Alan DeKok.
Re: [Emu] EAP, RADIUS, UTF-8, RFC 4282 and SASLPREP: the interop nightmare
Hi Bernard,

thanks for providing more insight. What a mess.

>> I got an encoding of ü ::= 0xfc, which hinted that the supplicant was not using UTF-8 but some locale (I expect it to be either ISO-8859-15 or Windows-1252, not that this matters).
>
> [BA] Can you provide more details on the EAP implementation/operating system on which the test was conducted?

I tried:

Intel supplicant
---
* User-Name in GUI: @müller.de
* encoded on wire: ü ::= 0xFC (ISO-8859-15/Windows-1252 of ü)
* User-Name in GUI: tried Cyrillic letters; couldn't even enter them in the dialog box in spite of Uzbek (Cyrillic) IME

Windows built-in supplicant
---
* User-Name in GUI: @müller.de
* encoded on wire: ü ::= 0xFC (ISO-8859-15/Windows-1252 of ü)
* User-Name in GUI: some Cyrillic letters
* encoded on wire: all transcribed to the same symbol ? in ISO-8859-15 or similar encoding (which is not very helpful!)

To get to the Cyrillic letters, I installed multi-language support and complex IMEs, i.e. everything I could find in System Settings, thinking that it may help the system to move to UTF-8 encodings.

The transcription to ? now makes at least a little bit of sense to me, after your statement:

> EAP methods. For example, RFC 2759 (MS-CHAPv2) Section 4 states:
>
> "The Name field is a string of 0 to (theoretically) 256 case-sensitive ASCII characters which identifies the peer's user account name."
>
> Yup. ASCII, **not** UTF-8! This actually can cause an authentication failure for a user with an NAI of [EMAIL PROTECTED].

So... if for MS-CHAPv2, the behaviour for non-ASCII is unspecified, then it's alright for it to transcribe unexpected input to whatever character it likes. So it's not the supplicant that is to blame, but rather the fact of life that MS-CHAPv2 lives in an ASCII world. Hmmm... is an update to 2759 in any way feasible? Considering its deployed base that appears difficult at best.

> [BA] So what do we do about this? Some of the following may be needed to fix the problem:
>
> a. A document on NAI internationalization, updating RFC 4282. This would address the (IMHO incorrect) punycode encoding of the realm portion.
>
> b. A document on EAP internationalization, updating RFC 3748. This would cover the EAP-Response/Identity as well as potentially giving advice on issues such as password internationalization and internationalization of the EAP Peer-Id and Server-Ids.

I hadn't noticed so far that 4282 both allows UTF-8 characters AND demands punycode conversion on the realm part. That adds another bit to the confusion indeed.

I also think the punycode translation is wrong at this place. It should rather be done by an application that needs to look up the realm in DNS, at the time of the lookup, and not beforehand on the wire. Especially since 4282 does allow UTF-8 encoding to be transported literally.

NAIs can also be used outside of EAP (right?), so the issue of fixing punycode in NAI is independent of fixing the character encoding in EAP.

Fixing EAP character encoding for proper internationalisation is also needed IMHO, for all the reasons outlined in the thread before.

So, in short, both a) and b) seem necessary to me.

UTF-8 encoded Grüße,

Stefan Winter
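When repeating tests like the ones above, it helps to classify the identity bytes seen on the wire. The following is a sketch of such a check, with a hypothetical function name and labels. It cannot tell which legacy encoding was used, only whether the bytes are pure ASCII, valid UTF-8, or neither.

```python
# Classify identity bytes captured on the wire. The function name and the
# returned labels are hypothetical. A lone 0xFC byte (ü in ISO-8859-15 or
# Windows-1252) is never valid UTF-8, which is what makes the check useful.

def classify_identity(raw: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'not-utf8' for captured identity bytes."""
    if raw.isascii():
        return "ascii"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return "not-utf8"
    return "utf-8"

samples = {
    "KNetworkManager": b"@m\xc3\xbcller.de",
    "Windows built-in": b"@m\xfcller.de",
    "Cyrillic transcribed": b"??????",
}

for name, raw in samples.items():
    print(name, classify_identity(raw))
```

The third sample shows the limit of any wire-side check: once input has been transcribed to "?", the result is valid ASCII and the loss is invisible to the server.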