Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.
On Tue, 2016-08-30 at 12:38 +0200, Andy Polyakov wrote:
> > So for other applications to try to read OpenSSL's PKCS#12 files, what
> > we need to do is first convert the sane Unicode rendition of the
> > password into the native locale charset (e.g. Windows-1252), then take
> > that bytestream and *pretend* it's ISO8859-1, and convert to UTF-16BE
> > to check the MAC. Then if that fails, take the same Windows-1252
> > bytestream and *pretend* it's UTF-8, and convert *that* to UTF-16BE to
> > see if it works.
>
> Are you sure you want to know? I mean you already said "I wish I hadn't
> asked", and now you are bringing Windows into conversation :-) :-) :-)

I concede, I am a masochist. :)

> Yes, it's as bad as you can possibly imagine, and trouble is that there
> is no "right thing" to do *if* you formulate "keep old data accessible"
> as goal.

Yeah, if you want to be able to create PKCS#12 files that'll work (in
the same locale) with older versions of OpenSSL, you're kind of stuck.

I can live with that historical accident, and the workaround of
"convert to the locale charset, then pretend that's ISO8859-1 and try
again". I can even live with the fact that, for the reasons you've
stated, OpenSSL *still* produces files which need that workaround.

But I *really* wish you hadn't added *another* bug, and required
another workaround...

> At the same time a way to access and generate
> specification-compliant data is provided (on *ix it takes *an* UTF-8
> locale, which is the default in the absolute majority of cases by now,
> and on Windows it's an *opt-in* environment variable for the time
> being).

... so instead of doing the UTF-8 thing whenever the incoming
bytestream happens to be interpretable as valid UTF-8, why not do it
only if LC_CTYPE actually *is* UTF-8?

So my (admittedly contrived and unlikely) example of "ĂŻ" in an
ISO8859-2 locale would only continue to do the *old* wrong thing, and
not a *new* wrong thing that needs an additional workaround :)

-- 
dwmw2
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev
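The reader-side workaround described in this message can be sketched as follows. This is an illustrative Python model, not OpenSSL code or API; the function name is made up, and the trailing 00 00 terminator that PKCS#12 appends to the BMPString password is omitted to match the byte sequences quoted in the thread.

```python
# Sketch of the reader-side workaround described above: given the
# password as proper Unicode plus the legacy locale charset it was
# typed in, build the candidate UTF-16BE byte strings that a reader
# must try against the PKCS#12 MAC, in order of preference.

def candidate_mac_passwords(password, locale_charset):
    local = password.encode(locale_charset)        # bytes OpenSSL saw
    cands = [password.encode('utf-16-be')]         # spec-compliant
    # Old OpenSSL bug: local bytes *pretended* to be ISO8859-1.
    cands.append(local.decode('iso8859-1').encode('utf-16-be'))
    # New OpenSSL bug: local bytes *pretended* to be UTF-8.
    try:
        cands.append(local.decode('utf-8').encode('utf-16-be'))
    except UnicodeDecodeError:
        pass                                       # not valid UTF-8
    return cands

# The "ĂŻ" (U+0102 U+017B) in ISO8859-2 example yields three distinct
# candidates: 0102 017b, 00c3 00af and 00ef.
print(candidate_mac_passwords('\u0102\u017b', 'iso8859-2'))
```

A reader would attempt MAC verification with each candidate in turn and stop at the first that verifies.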
Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.
> Hm, words fail me.
>
> Well, that's not entirely true. But *polite* words fail me...

:-)

> Let me try to understand this... you have always ignored, and still
> ignore, the actual LC_CTYPE which tells you the character set in which
> the password was provided from the user.
>
> You *used* to convert it through your horribly misnamed
> OPENSSL_asc2uni() function, which actually converts ISO8859-1 to
> UTF16BE by simply expanding it and inserting 0x00 between each byte of
> the original. So effectively you were "assuming" the input was
> ISO8859-1.

Unfortunately yes.

> Now you assume it's UTF-8, and convert it with OPENSSL_utf8touni().
> (And now what happens when the locale input *isn't* valid UTF-8,
> because it's a legacy 8-bit charset? That conversion should fail,
> right?)

Right. Though the more correct formulation is probably "is likely to
fail" rather than "should fail".

> Your latest workaround (from which I started this thread) is addressing
> the first problem *purely* for the case of the UTF-8 locale. It checks
> for the "we treated UTF-8 as if it were ISO8859-1" case, which is what
> leads to the 006e 0061 00c3 00af 0076 0065 example you gave above.

Yes.

> But you *haven't* actually implemented any workaround for the other
> whole set of "we treated locale X as if it were ISO8859-1" bugs
> that your original code had. Or the whole set of "we treated locale
> X as if it were UTF-8" bugs that the new code has.

Yes.

> So for other applications to try to read OpenSSL's PKCS#12 files, what
> we need to do is first convert the sane Unicode rendition of the
> password into the native locale charset (e.g. Windows-1252), then take
> that bytestream and *pretend* it's ISO8859-1, and convert to UTF-16BE
> to check the MAC. Then if that fails, take the same Windows-1252
> bytestream and *pretend* it's UTF-8, and convert *that* to UTF-16BE to
> see if it works.

Are you sure you want to know? I mean you already said "I wish I hadn't
asked", and now you are bringing Windows into conversation :-) :-) :-)

Yes, it's as bad as you can possibly imagine, and the trouble is that
there is no "right thing" to do *if* you formulate "keep old data
accessible" as the goal. I mean the right thing to do is to perform all
due conversions to obtain locale-neutral big-endian UTF-16 (though it's
not obvious that it's actually desirable to bring a dependency on
something like iconv into libcrypto), but the trouble is that it will
more than likely render old data inaccessible. That's why a somewhat
opportunistic approach is taken in this version, unfortunate as it is.
The main goal is that, given otherwise unchanged circumstances, the new
version *has to* be able to read old data generated by the previous
version on the *same* system, regardless of how broken it is. At the
same time a way to access and generate specification-compliant data is
provided (on *ix it takes a UTF-8 locale, which is the default in the
absolute majority of cases by now, and on Windows it's an *opt-in*
environment variable for the time being).
Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.
On Mon, 2016-08-29 at 23:01 +0200, Andy Polyakov wrote:
> > So let's try a better example. The password is "ĂŻ" (U+0102 U+017b),
> > and the locale (not that it *should* matter) is ISO8859-2.
>
> When it comes to locale in *this* case you should rather wonder what
> your terminal emulator program does and how it interacts with your
> shell. Because these two are responsible for composing the string
> that OPENSSL_[asc|utf8]2uni gets.

Right. And for the purpose of my example I have moved to eastern Europe
and fallen through a time-warp into the 20th century, so I'm using an
ISO8859-2 locale.

> > The correct rendition is 01 02 01 7b.
>
> Yes. And now note that it's passed as 'c4 82 c5 bb' to openssl pkcs12
> as argument or console input under a UTF-8 locale. Otherwise it would
> have been passed as 'c3 af'.

No. My locale is ISO8859-2, so the password "ĂŻ" *is* passed as
'c3 af'.

Old OpenSSL will ignore the fact that the locale is ISO8859-2, and
perform an ISO8859-1 to UTF-16BE conversion using the doubly-misnamed
OPENSSL_asc2uni() function. So it'll use '00 c3 00 af'.

New OpenSSL will ignore the fact that the locale is ISO8859-2, and
perform a UTF-8 to UTF-16BE conversion using the only singly misnamed
OPENSSL_utf82uni() function. So it'll use '00 ef'.

You had a bug because you made assumptions about the input data and
didn't properly convert from the locale charset to UTF-16BE as you
should have done. Instead of fixing the bug, you just changed the
assumption you made to one that's slightly more likely to be valid in
today's world. Leaving the rest of us with *two* buggy behaviours to
try to work around.

> > The "old buggy OpenSSL" rendition, in the ISO8859-2 locale, would be
> > to *convert* to ISO8859-2 (giving c3 af).
>
> No, no conversion from UTF-8 to ISO8859-x or any other conversion was
> *ever* performed. Nor is it performed now. It was and still is all
> about how the string is composed by *somebody else*. That's why I said
> that "we assume that you don't change locale when you upgrade
> OpenSSL".

I'm talking about how other libraries should attempt to read the
PKCS#12 files created by OpenSSL. In my code I have the string "ĂŻ"
which the user has provided as the password. It doesn't matter what
charset it's in, as long as I *know* what charset it's in, and haven't
wantonly thrown that information away and started making *assumptions*
about how to interpret the stream of bytes.

So in order to try to emulate the old OpenSSL bug and read the file, I
need to convert to ISO8859-2, then "forget" that it's ISO8859-2 and
treat it as ISO8859-1 for the purpose of converting to UTF-16BE and
trying to decrypt. Which gives me the '00 c3 00 af' as above.

And in order to emulate the new OpenSSL bug and read the file, I need
to convert to ISO8859-2, then "forget" that it's ISO8859-2 and treat it
as UTF-8 for the purpose of converting to UTF-16BE and trying to
decrypt. Which gives me the '00 ef' that current OpenSSL will use.

This latter buggy behaviour hasn't actually been released yet, has it?
Is it too late to fix it properly?

-- 
dwmw2
Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.
> So let's try a better example. The password is "ĂŻ" (U+0102 U+017b),
> and the locale (not that it *should* matter) is ISO8859-2.

When it comes to locale in *this* case you should rather wonder what
your terminal emulator program does and how it interacts with your
shell. Because these two are responsible for composing the string that
OPENSSL_[asc|utf8]2uni gets.

> The correct rendition is 01 02 01 7b.

Yes. And now note that it's passed as 'c4 82 c5 bb' to openssl pkcs12
as argument or console input under a UTF-8 locale. Otherwise it would
have been passed as 'c3 af'.

> The "old buggy OpenSSL" rendition, in the ISO8859-2 locale, would be
> to *convert* to ISO8859-2 (giving c3 af).

No, no conversion from UTF-8 to ISO8859-x or any other conversion was
*ever* performed. Nor is it performed now. It was and still is all
about how the string is composed by *somebody else*. That's why I said
that "we assume that you don't change locale when you upgrade OpenSSL".

> Then to interpret those bytes
> as ISO8859-1 (in which they would mean "Ã¯") and convert *that* to
> UTF-16BE to give 00 c3 00 af

Previously OpenSSL would convert 'c4 82 c5 bb' to '00 c4 00 82 00 c5
00 bb'. Now it converts it to '01 02 01 7b', but attempts even '00 c4
00 82 00 c5 00 bb' for "old times' sake". And for 'c3 af' the question
is whether it's a valid UTF-8 encoding. If it is (it is in this
example), then it would attempt '00 ef' (in this example) and then
'00 c3 00 af', and if not, then it would go straight to '00 c3 00 af'.
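The behaviour of the two conversion routines discussed here can be modelled in a few lines of Python. This is a sketch of what the functions effectively do, not the OpenSSL code itself; the real functions also NUL-terminate the resulting UTF-16 string, which is omitted here to match the byte sequences quoted in the thread.

```python
# A Python model of the two password conversions discussed above.

def asc2uni(b):
    # OPENSSL_asc2uni(): treats input as ISO8859-1 and widens each
    # byte to a big-endian 16-bit unit, i.e. inserts a 0x00 before it.
    return b.decode('iso8859-1').encode('utf-16-be')

def utf82uni(b):
    # OPENSSL_utf82uni(): treats input as UTF-8 and converts the
    # decoded characters to big-endian UTF-16; raises on invalid UTF-8.
    return b.decode('utf-8').encode('utf-16-be')

# "ĂŻ" typed under a UTF-8 locale arrives as c4 82 c5 bb:
assert asc2uni(b'\xc4\x82\xc5\xbb') == b'\x00\xc4\x00\x82\x00\xc5\x00\xbb'
assert utf82uni(b'\xc4\x82\xc5\xbb') == b'\x01\x02\x01\x7b'

# The same password typed under ISO8859-2 arrives as c3 af:
assert asc2uni(b'\xc3\xaf') == b'\x00\xc3\x00\xaf'
assert utf82uni(b'\xc3\xaf') == b'\x00\xef'
```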
Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.
On Mon, 2016-08-29 at 18:40 +0100, David Woodhouse wrote:
> So... let's have a password 'nałve', in an ISO8859-2 environment where
> that is represented by the bytes 6e 61 b3 76 65.
>
> First I should try the correct version according to the spec:
> 006e 0061 0142 0076 0065
>
> Then we try the "OpenSSL assumed it was ISO8859-1" version:
> 006e 0061 00b3 0076 0065
>
> And finally we try the "OpenSSL assumed it was UTF-8" version:
> 006e 0061 ... actually you *can't* convert that byte sequence "from"
> UTF-8 since it isn't valid UTF-8. So what will new OpenSSL do here,
> again?

In this specific example (the byte stream is not valid UTF-8) it looks
like OPENSSL_utf8touni() will assume it's actually ISO8859-1, and thus
the final case will be identical to the previous one.

So let's try a better example. The password is "ĂŻ" (U+0102 U+017b),
and the locale (not that it *should* matter) is ISO8859-2.

The correct rendition is 01 02 01 7b.

The "old buggy OpenSSL" rendition, in the ISO8859-2 locale, would be to
*convert* to ISO8859-2 (giving c3 af). Then to interpret those bytes as
ISO8859-1 (in which they would mean "Ã¯") and convert *that* to
UTF-16BE to give 00 c3 00 af.

The "new buggy OpenSSL" rendition, again in the ISO8859-2 locale, would
again be to convert to ISO8859-2 (c3 af). Then to interpret those bytes
as UTF-8 (in which they would mean "ï") and convert *that* to UTF-16BE
to give 00 ef.

Right?

-- 
dwmw2
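The 'nałve' example quoted at the top of this message can be checked mechanically. Below is a Python model of the conversions under discussion (illustrative only, not OpenSSL code); ł is U+0142, which ISO8859-2 encodes as 0xb3, a byte that can never begin or stand alone in valid UTF-8.

```python
# The three candidate renditions of 'nałve' typed in an ISO8859-2
# locale, where it arrives as the bytes 6e 61 b3 76 65.

pw = 'na\u0142ve'
local = pw.encode('iso8859-2')                       # 6e 61 b3 76 65

# Correct, spec-compliant rendition:
spec = pw.encode('utf-16-be')                        # 006e 0061 0142 0076 0065

# "OpenSSL assumed it was ISO8859-1" rendition:
old = local.decode('iso8859-1').encode('utf-16-be')  # 006e 0061 00b3 0076 0065

# "OpenSSL assumed it was UTF-8" rendition: 0xb3 is a lone
# continuation byte, so this decode fails outright.
try:
    new = local.decode('utf-8').encode('utf-16-be')
except UnicodeDecodeError:
    new = None
```

The failed decode in the last step is exactly the case the message asks about: the byte stream cannot be converted "from" UTF-8 at all.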
Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.
On Mon, 2016-08-29 at 15:25 +0200, Andy Polyakov wrote:
> First of all. *Everything* that is said below (and what the
> modifications in question are about) applies to non-ASCII passwords.
> ASCII-only passwords are not affected by this and keep working as
> they were.
>
> > > commit 647ac8d3d7143e3721d55e1f57730b6f26e72fc9
> > >
> > > OpenSSL versions before 1.1.0 didn't convert non-ASCII
> > > UTF8 PKCS#12 passwords to Unicode correctly.
> > >
> > > To correctly decrypt older files, if MAC verification fails
> > > with the supplied password attempt to use the broken format
> > > which is compatible with earlier versions of OpenSSL.
> > >
> > > Reviewed-by: Richard Levitte
> >
> > Hm, this sounds like something that other crypto libraries also
> > ought to try to work around.
> >
> > Please could you elaborate on the specific problem, and/or show a
> > test case?
>
> Note that there is 80-test_pkcs12.t that refers to shibboleth.pfx.

Thanks.

> > I'm not quite sure how to interpret the patch itself. You pass the
> > password through OPENSSL_asc2uni() and then OPENSSL_uni2utf8() —
> > which is essentially converting ISO8859-1 to UTF-8.
>
> You make it sound like these two are called one after another. But
> that's not what happens. It *tries* to access data with passwords
> converted with OPENSSL_asc2uni *or* OPENSSL_utf82uni, effectively
> independently of each other, in order to see which one is right. So
> that you can *transparently* access old broken data, as well as
> specification-compliant data.

Hm... at line 541 of pkcs12.c we call PKCS12_verify_mac(…mpass…) with
the original provided password. Which is in the locale-native character
set, is it not? No attempt to convert to anything known.

Then if that *fails* we do indeed convert it via OPENSSL_asc2uni() and
OPENSSL_uni2utf8() to make 'badpass' and try again:

    } else if (!PKCS12_verify_mac(p12, mpass, -1)) {
        /*
         * May be UTF8 from previous version of OpenSSL:
         * convert to a UTF8 form which will translate
         * to the same Unicode password.
         */
        unsigned char *utmp;
        int utmplen;
        utmp = OPENSSL_asc2uni(mpass, -1, NULL, &utmplen);
        if (utmp == NULL)
            goto end;
        badpass = OPENSSL_uni2utf8(utmp, utmplen);
        OPENSSL_free(utmp);
        if (!PKCS12_verify_mac(p12, badpass, -1)) {
            BIO_printf(bio_err, "Mac verify error: invalid password?\n");
            ERR_print_errors(bio_err);
            goto end;
        } else {
            BIO_printf(bio_err, "Warning: using broken algorithm\n");
            if (!twopass)
                cpass = badpass;
        }

The shibboleth.pfx test seems to pass on the *first* call to
PKCS12_verify_mac() — it *isn't* testing this fallback. If I add a
space to the end of the correct password and stick a breakpoint on
PKCS12_verify_mac(), I see the following calls:

    #0  PKCS12_verify_mac (p12=0x956e40,
        pass=0x956a30 "σύνθημα γνώρισμα ", passlen=-1)
        at crypto/pkcs12/p12_mutl.c:152
    #1  0x00425567 in pkcs12_main (argc=0, argv=0x7fffdd90)
        at apps/pkcs12.c:541

And then it converts that string from ISO8859-1 (which it wasn't) to
UTF-8, and calls PKCS12_verify_mac() again:

    #0  PKCS12_verify_mac (p12=0x956e40,
        pass=0x9597e0 "Ï\302\203Ï\302\215νθημα γνÏ\302\216Ï\302\201ιÏ\302\203μα ",
        passlen=-1) at crypto/pkcs12/p12_mutl.c:152
    #1  0x004255fc in pkcs12_main (argc=0, argv=0x7fffdd90)
        at apps/pkcs12.c:554

> > So, if my password is "naïve". In UTF-8 that's 6e 61 c3 af 76 65,
> > which is the correct sequence of bytes to use for the password?
>
> 00 6e 00 61 00 ef 00 76 00 65, big-endian UTF-16.

Is that conversion done inside PKCS12_verify_mac()? Because we
definitely aren't passing UTF-16BE into PKCS12_verify_mac().

> > And you now convert that sequence of bytes to 6e 61 c3 83 c2 af 76
> > 65 by assuming it's ISO8859-1 (which would be 'naïve') and
> > converting to UTF-8?
>
> I don't follow. Why would it have to be converted to this sequence?

That's what it's doing. Changing the example above and showing the same
breakpoints as they get hit again...

    Starting program: /home/dwmw2/git/openssl/apps/openssl pkcs12
        -noout -password pass:naïve -in test/shibboleth.pfx
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib64/libthread_db.so.1".

    Breakpoint 1, PKCS12_verify_mac (p12=0x956e30,
        pass=0x956a30 "naïve", passlen=-1)
        at crypto/pkcs12/p12_mutl.c:152
    152         if (p12->mac == NULL) {
    (gdb) x/7bx pass
    0x956a30:  0x6e  0x61  0xc3  0xaf  0x76  0x65  0x00
    (gdb) c
    Continuing.

    Breakpoint 1, PKCS12_verify_mac (p12=0x956e30,
        pass=0x959960 "naïve", passlen=-1)
        at crypto/pkcs12/p12_mutl.c:152
    152         if (p12->mac == NULL) {
    (gdb) x/8bx pass
    0x959960:  0x6e  0x61  0xc3  0x83
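The 'badpass' fallback traced in this message can be modelled concisely. The sketch below is a Python rendering of what the two chained calls effectively compute, not the OpenSSL code itself.

```python
# Model of the badpass fallback: composing OPENSSL_asc2uni() with
# OPENSSL_uni2utf8() amounts to "decode the bytes as ISO8859-1,
# re-encode as UTF-8", which is a double encoding when the input was
# already UTF-8. It reproduces the bytes seen at the second breakpoint.

def badpass(mpass):
    uni = mpass.decode('iso8859-1').encode('utf-16-be')  # asc2uni
    return uni.decode('utf-16-be').encode('utf-8')       # uni2utf8

# 'naïve' as UTF-8 is 6e 61 c3 af 76 65; the fallback turns it into
# 6e 61 c3 83 c2 af 76 65, the "naïve" mojibake.
print(badpass(b'na\xc3\xafve').hex())
```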
Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.
First of all. *Everything* that is said below (and what the
modifications in question are about) applies to non-ASCII passwords.
ASCII-only passwords are not affected by this and keep working as they
were.

>> commit 647ac8d3d7143e3721d55e1f57730b6f26e72fc9
>>
>> OpenSSL versions before 1.1.0 didn't convert non-ASCII
>> UTF8 PKCS#12 passwords to Unicode correctly.
>>
>> To correctly decrypt older files, if MAC verification fails
>> with the supplied password attempt to use the broken format
>> which is compatible with earlier versions of OpenSSL.
>>
>> Reviewed-by: Richard Levitte
>
> Hm, this sounds like something that other crypto libraries also ought
> to try to work around.
>
> Please could you elaborate on the specific problem, and/or show a test
> case?

Note that there is 80-test_pkcs12.t that refers to shibboleth.pfx.

> I'm not quite sure how to interpret the patch itself. You pass the
> password through OPENSSL_asc2uni() and then OPENSSL_uni2utf8() — which
> is essentially converting ISO8859-1 to UTF-8.

You make it sound like these two are called one after another. But
that's not what happens. It *tries* to access data with passwords
converted with OPENSSL_asc2uni *or* OPENSSL_utf82uni, effectively
independently of each other, in order to see which one is right. So
that you can *transparently* access old broken data, as well as
specification-compliant data.

> So, if my password is "naïve". In UTF-8 that's 6e 61 c3 af 76 65,
> which is the correct sequence of bytes to use for the password?

00 6e 00 61 00 ef 00 76 00 65, big-endian UTF-16.

> And you now convert that sequence of bytes to 6e 61 c3 83 c2 af 76 65
> by assuming it's ISO8859-1 (which would be 'naïve') and converting
> to UTF-8?

I don't follow. Why would it have to be converted to this sequence?

> So... what was the bug that was actually being worked around?

6e 61 c3 af 76 65 was expanded to 00 6e 00 61 00 c3 00 af 00 76 00 65,
in violation of the specification.

> That older versions were converting from the local charset to UTF-8
> twice in a row? So you've implemented a "convert ISO8859-1 to UTF-8"
> fallback which will cope with that in *some* locales but not all...?

Yeah, something like that. Note that if you have created a PKCS#12
file with a non-UTF-8 locale, it's not given that you can read it with
another locale, UTF-8 or not. This was *always* the case. And that's
*not* what we try to address here. We assume that you don't change
locale when you upgrade your OpenSSL version. The ultimate goal is to
make it compliant with the specification and therefore interoperable
with other compliant implementations. But we can't just switch, because
then it stops interoperating with older OpenSSL versions. This is the
reason why it looks messy. It's about having it both ways...

Though there is one ambiguity in this. Said interoperability assumes a
UTF-8 locale (on *ix, or the OPENSSL_WIN32_UTF8 opt-in environment
variable on Windows). I.e. it's not given that users of a non-UTF-8
locale can actually interoperate with other implementations. It's
assumed that such users are not actually interested in doing so, and if
they express interest, they will be advised to switch to a UTF-8 locale
and convert their data to the interoperable format.

Is this helpful?
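The original bug, stated above in bytes, is that old OpenSSL took the UTF-8 encoding of the password and widened it byte-by-byte into UTF-16BE instead of decoding it first. A Python model (not the OpenSSL code):

```python
# "naïve" as the user means it, and its UTF-8 encoding as OpenSSL
# received it from a UTF-8 locale: 6e 61 c3 af 76 65.
utf8 = 'na\u00efve'.encode('utf-8')

# What old OpenSSL produced: each raw byte prefixed with 0x00,
# as if the input were ISO8859-1.
buggy = b''.join(bytes([0, b]) for b in utf8)

# What the specification requires: decode the UTF-8 first, then
# emit big-endian UTF-16.
correct = 'na\u00efve'.encode('utf-16-be')
```

Note how the two differ only at the ï: the buggy form carries '00 c3 00 af' where the correct form carries '00 ef'.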
Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.
On Wed, 2016-08-24 at 18:55 +0100, Dr. Stephen Henson wrote:
> commit 647ac8d3d7143e3721d55e1f57730b6f26e72fc9
>
> OpenSSL versions before 1.1.0 didn't convert non-ASCII
> UTF8 PKCS#12 passwords to Unicode correctly.
>
> To correctly decrypt older files, if MAC verification fails
> with the supplied password attempt to use the broken format
> which is compatible with earlier versions of OpenSSL.
>
> Reviewed-by: Richard Levitte

Hm, this sounds like something that other crypto libraries also ought
to try to work around.

Please could you elaborate on the specific problem, and/or show a test
case? I'm not quite sure how to interpret the patch itself. You pass
the password through OPENSSL_asc2uni() and then OPENSSL_uni2utf8() —
which is essentially converting ISO8859-1 to UTF-8.

So, if my password is "naïve". In UTF-8 that's 6e 61 c3 af 76 65, which
is the correct sequence of bytes to use for the password?

And you now convert that sequence of bytes to 6e 61 c3 83 c2 af 76 65
by assuming it's ISO8859-1 (which would be 'naïve') and converting to
UTF-8?

So... what was the bug that was actually being worked around? That
older versions were converting from the local charset to UTF-8 twice in
a row? So you've implemented a "convert ISO8859-1 to UTF-8" fallback
which will cope with that in *some* locales but not all...?

I don't really understand. Thanks for any light you can shed on it!

/me goes off to add non-ASCII passwords to the growing torture test
suite at
http://git.infradead.org/users/dwmw2/openconnect.git/blob/HEAD:/tests/Makefile.am

-- 
dwmw2