Er, I was distracted by being called for supper. The astute reader will note that the bytes I gave as the UTF-8 encoding were actually UCS2-LE.
Which doesn't negate the fact that we still need to care about legacy character sets unless we just want to be buggy for all non-ASCII content, surely?

On 20 July 2022 18:05:41 BST, David Woodhouse via curl-library <curl-library@lists.haxx.se> wrote:
>On Wed, 2022-07-20 at 12:29 +0200, Patrick Monnerat via curl-library wrote:
>> On 7/20/22 12:11, Daniel Stenberg via curl-library wrote:
>> > On Wed, 20 Jul 2022, Stephan Mühlstrasser via curl-library wrote:
>> > > Do I understand it correctly that EBCDIC support is gone now in general?
>> >
>> > Yes, that is correct.
>>
>> This is true: libcurl internally only works on ASCII-based character sets.
>
>Hm, I don't understand how "works on ASCII-based character sets" is a
>thing that makes sense to say.
>
>If there is *any* legacy 8-bit character set in use, whether it's
>ASCII-based or not, we still have to take care with conversions,
>surely?
>
>In OpenConnect I have specific tests for this kind of thing. I have a
>test client cert with a password 'ĂŻ', which is two characters:
>
> U+0102 LATIN CAPITAL LETTER A WITH BREVE
> U+017B LATIN CAPITAL LETTER Z WITH DOT ABOVE
>
>In UTF-8 that's represented by the bytes 0x02 0x01 0x7b 0x01.
>In ISO8859-2 it would just be two bytes: 0xc3 0xaf.
>
>I have tests which set $LANG and try to use the correct password. You
>can't just say they're both "ASCII-based" and pray to the deity of your
>choice, surely?
>
>In any setup that actually *works,* EBCDIC is just a trivially-special
>case of the general case of needing conversion.
>
>If it only works for 7-bit ASCII characters, that isn't "works".
>
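
For anyone who wants to check the bytes, here is a minimal Python sketch of the encodings discussed above. The names `password`, `hex_bytes`, `encodings` and `misread` are illustrative only and are not taken from the OpenConnect tests. Python has no dedicated UCS-2 codec, so the sketch uses `utf-16-le`, which yields the same bytes for characters in the Basic Multilingual Plane.

```python
# The two-character test password: U+0102 followed by U+017B.
password = "\u0102\u017b"


def hex_bytes(data: bytes) -> str:
    """Format a byte string as space-separated 0xNN values."""
    return " ".join(f"0x{b:02x}" for b in data)


# Encode the same two characters in each character set under discussion.
encodings = {
    name: hex_bytes(password.encode(name))
    for name in ("utf-8", "iso8859-2", "utf-16-le")
}

for name, encoded in encodings.items():
    print(f"{name:10} {encoded}")

# What happens without conversion: the ISO8859-2 bytes, misread as UTF-8,
# decode cleanly to a different, single character.
misread = password.encode("iso8859-2").decode("utf-8")
print(f"ISO8859-2 bytes read as UTF-8: U+{ord(misread):04X}")
```

The UTF-8 form is four bytes (0xc4 0x82 0xc5 0xbb), while 0x02 0x01 0x7b 0x01 is the little-endian 16-bit form. The ISO8859-2 bytes 0xc3 0xaf also happen to be valid UTF-8 for U+00EF, so a missing conversion does not raise an error; it silently produces the wrong password.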