Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.

2016-08-30 Thread David Woodhouse
On Tue, 2016-08-30 at 12:38 +0200, Andy Polyakov wrote:
> > So for other applications to try to read OpenSSL's PKCS#12 files, what
> > we need to do is first convert the sane Unicode rendition of the
> > password into the native locale charset (e.g. Windows-1252), then take
> > that bytestream and *pretend* it's ISO8859-1, and convert to UTF-16BE
> > to check the MAC. Then if that fails, take the same Windows-1252
> > bytestream and *pretend* it's UTF-8, and convert *that* to UTF-16BE to
> > see if it works.
> 
> Are you sure you want to know? I mean you already said "I wish I hadn't
> asked", and now you are bringing Windows into the conversation :-) :-) :-)

I concede, I am a masochist. :)

> Yes, it's as bad as you can possibly imagine, and the trouble is that
> there is no "right thing" to do *if* you formulate "keep old data
> accessible" as a goal.

Yeah, if you want to be able to create PKCS#12 files that'll work (in
the same locale) with older versions of OpenSSL, you're kind of stuck.

I can live with that historical accident, and the workaround of
"convert to the locale charset, then pretend that's ISO8859-1 and try
again". I can even live with the fact that, for the reasons you've
stated, OpenSSL *still* produces files which need that workaround.
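For the record, that read-side workaround can be sketched in Python (the
helper name is mine; the trailing 16-bit NUL that PKCS#12 BMPString
passwords actually carry is omitted for brevity):

```python
def mac_candidates(password, locale_charset):
    """UTF-16BE password candidates to try against a PKCS#12 MAC,
    emulating old (assumed-ISO8859-1) and new (assumed-UTF-8) OpenSSL."""
    raw = password.encode(locale_charset)  # the bytes OpenSSL actually saw
    # Old behaviour: widen each byte as if it were ISO8859-1.
    cands = [raw.decode('latin-1').encode('utf-16-be')]
    # New behaviour: reinterpret the same bytes as UTF-8, if they decode.
    try:
        cands.append(raw.decode('utf-8').encode('utf-16-be'))
    except UnicodeDecodeError:
        pass  # not valid UTF-8, so only the old form can exist
    return cands

# "naïve" typed in a Windows-1252 locale: the bytes are not valid UTF-8,
# so there is only one candidate.
cands_naive = mac_candidates('na\u00efve', 'cp1252')
# "Ã¯" typed in the same locale: the bytes c3 af happen to be valid
# UTF-8 as well, so both interpretations have to be tried.
cands_ambig = mac_candidates('\u00c3\u00af', 'cp1252')
```

Real code would of course append the two-byte NUL terminator before
computing the MAC.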

But I *really* wish you hadn't added *another* bug, and required
another workaround ...

> At the same time a way to access and generate
> specification-compliant data is provided (on *ix it takes *a* UTF-8
> locale, which is the default in the absolute majority of cases by now,
> and on Windows it's an *opt-in* environment variable for the time
> being).

... so instead of doing the UTF-8 thing whenever the incoming
bytestream happens to be interpretable as valid UTF-8, why not do it
only if LC_CTYPE actually *is* UTF-8?

So my (admittedly contrived and unlikely) example of "ĂŻ" in an
ISO8859-2 locale would only continue to do the *old* wrong thing, and
not a *new* wrong thing that needs an additional workaround :)

-- 
dwmw2

smime.p7s
Description: S/MIME cryptographic signature
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.

2016-08-30 Thread Andy Polyakov
> Hm, words fail me.
> 
> Well, that's not entirely true. But *polite* words fail me...

:-)

> Let me try to understand this... you have always ignored, and still
> ignore, the actual LC_CTYPE which tells you the character set in which
> the password was provided by the user.
> 
> You *used* to convert it through your horribly misnamed
> OPENSSL_asc2uni() function, which actually converts ISO8859-1 to
> UTF16BE by simply expanding it and inserting 0x00 before each byte of
> the original. So effectively you were "assuming" the input was
> ISO8859-1.

Unfortunately yes.

> Now you assume it's UTF-8, and convert it with OPENSSL_utf82uni().
> (And now what happens when the locale input *isn't* valid UTF-8,
> because it's a legacy 8-bit charset? That conversion should fail,
> right?)

Right. Though a more correct formulation is probably "is likely to
fail" rather than "should fail".

> Your latest workaround (from which I started this thread) is addressing
> the first problem *purely* for the case of the UTF-8 locale. It checks
> for the "we treated UTF-8 as if it were ISO8859-1" case, which is what
> leads to the 006e 0061 00c3 00af 0076 0065 example you gave above.

Yes.

> But you *haven't* actually implemented any workaround for the other
> whole set of "we treated locale X as if it were ISO8859-1" bugs
> that your original code had. Or the whole set of "we treated locale
> X as if it were UTF-8" bugs that the new code has.

Yes.

> So for other applications to try to read OpenSSL's PKCS#12 files, what
> we need to do is first convert the sane Unicode rendition of the
> password into the native locale charset (e.g. Windows-1252), then take
> that bytestream and *pretend* it's ISO8859-1, and convert to UTF-16BE
> to check the MAC. Then if that fails, take the same Windows-1252
> bytestream and *pretend* it's UTF-8, and convert *that* to UTF-16BE to
> see if it works.

Are you sure you want to know? I mean you already said "I wish I hadn't
asked", and now you are bringing Windows into the conversation :-) :-) :-)
Yes, it's as bad as you can possibly imagine, and the trouble is that
there is no "right thing" to do *if* you formulate "keep old data
accessible" as a goal. I mean the right thing to do is to perform all
due conversions to obtain locale-neutral big-endian UTF-16 (though it's
not obvious whether it's actually desirable to bring a dependency on
something like iconv into libcrypto), but the trouble is that it would
more than likely render old data inaccessible. That's why a somewhat
opportunistic approach is taken in this version, unfortunate as it is.
The main goal is that, given otherwise unchanged circumstances, the new
version *has to* be able to read old data generated by the previous
version on the *same* system, regardless of how broken it is. At the
same time a way to access and generate specification-compliant data is
provided (on *ix it takes *a* UTF-8 locale, which is the default in the
absolute majority of cases by now, and on Windows it's an *opt-in*
environment variable for the time being).



Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.

2016-08-30 Thread David Woodhouse
On Mon, 2016-08-29 at 23:01 +0200, Andy Polyakov wrote:
> > 
> > So let's try a better example. The password is "ĂŻ" (U+0102 U+017b),
> > and the locale (not that it *should* matter) is ISO8859-2.
> When it comes to locale in *this* case you should rather wonder what
> your terminal emulator program does and how it interacts with your
> shell. Because these two are responsible for composing the string
> that OPENSSL_[asc|utf8]2uni gets.

Right. And for the purpose of my example I have moved to eastern Europe
and fallen through a time-warp into the 20th century, so I'm using an
ISO8859-2 locale.

> > The correct rendition is 01 02 01 7b.
> Yes. And now note that it's passed as 'c4 82 c5 bb' to openssl pkcs12
> as an argument or console input under a UTF-8 locale. Otherwise it
> would have been passed as 'c3 af'.

No. My locale is ISO8859-2 so the password "ĂŻ" *is* passed as 'c3 af'.

Old OpenSSL will ignore the fact that the locale is ISO8859-2, and
perform an ISO8859-1 to UTF-16BE conversion using the doubly-misnamed
OPENSSL_asc2uni() function. So it'll use '00 c3 00 af'.

New OpenSSL will ignore the fact that the locale is ISO8859-2, and
perform a UTF-8 to UTF-16BE conversion using the only singly-misnamed
OPENSSL_utf82uni() function. So it'll use '00 ef'.
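Those two byte strings are easy to double-check with any codec library;
in Python, for instance (a sketch, not OpenSSL's code, and without the
trailing BMPString NUL):

```python
# Password "ĂŻ" (U+0102 U+017B) as typed in an ISO8859-2 locale:
raw = '\u0102\u017b'.encode('iso8859-2')
assert raw == b'\xc3\xaf'

# Old OpenSSL: widen as if ISO8859-1 (OPENSSL_asc2uni-style):
old = raw.decode('latin-1').encode('utf-16-be')  # 00 c3 00 af
# New OpenSSL: reinterpret as UTF-8 (OPENSSL_utf82uni-style):
new = raw.decode('utf-8').encode('utf-16-be')    # 00 ef
```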

You had a bug because you made assumptions about the input data and
didn't properly convert from the locale charset to UTF16BE as you
should have done. Instead of fixing the bug, you just changed the
assumption you made to one that's slightly more likely to be valid in
today's world.

Leaving the rest of us with *two* buggy behaviours to try to work
around.

> > 
> > The "old buggy OpenSSL" rendition, in the ISO8859-2 locale, would be
> > to *convert* to ISO8859-2 (giving c3 af).
> No, no conversion from UTF-8 to ISO8859-x or any other conversion was
> *ever* performed. Nor is it performed now. It was and still is all about
> how the string is composed by *somebody else*. That's why I said that "we
> assume that you don't change locale when you upgrade OpenSSL".

I'm talking about how other libraries should attempt to read the
PKCS#12 files created by OpenSSL. In my code I have the string "ĂŻ"
which the user has provided as the password. It doesn't matter what
charset it's in, as long as I *know* what charset it's in, and haven't
wantonly thrown that information away and started making *assumptions*
about how to interpret the stream of bytes.

So in order to try to emulate the old OpenSSL bug and read the file, I
need to convert to ISO8859-2, then "forget" that it's ISO8859-2 and
treat it as ISO8859-1 for the purpose of converting to UTF16BE and
trying to decrypt. Which gives me the '00 c3 00 af' as above.

And in order to emulate the new OpenSSL bug and read the file, I need
to convert to ISO8859-2, then "forget" that it's ISO8859-2 and treat it
as UTF-8 for the purpose of converting to UTF16BE and trying to
decrypt. Which gives me the '00 ef' that current OpenSSL will use.

This latter buggy behaviour hasn't actually been released yet, has it?
Is it too late to fix it properly?

-- 
dwmw2





Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.

2016-08-29 Thread Andy Polyakov
> So let's try a better example. The password is "ĂŻ" (U+0102 U+017b),
> and the locale (not that it *should* matter) is ISO8859-2.

When it comes to locale in *this* case you should rather wonder what
your terminal emulator program does and how it interacts with your
shell. Because these two are responsible for composing the string
that OPENSSL_[asc|utf8]2uni gets.

> The correct rendition is 01 02 01 7b.

Yes. And now note that it's passed as 'c4 82 c5 bb' to openssl pkcs12
as an argument or console input under a UTF-8 locale. Otherwise it
would have been passed as 'c3 af'.

> The "old buggy OpenSSL" rendition, in the ISO8859-2 locale, would be
> to *convert* to ISO8859-2 (giving c3 af).

No, no conversion from UTF-8 to ISO8859-x or any other conversion was
*ever* performed. Nor is it performed now. It was and still is all about
how the string is composed by *somebody else*. That's why I said that "we
assume that you don't change locale when you upgrade OpenSSL".

> Then to interpret those bytes
> as ISO8859-1 (in which they would mean "Ã¯") and convert *that* to
> UTF-16BE to give 00 c3 00 af

Previously OpenSSL would convert 'c4 82 c5 bb' to '00 c4 00 82 00 c5 00
bb'. Now it converts it to '01 02 01 7b', but attempts even '00 c4 00 82
00 c5 00 bb' for old times' sake. And for 'c3 af' the question is
whether it's a valid UTF-8 encoding. If it is (it is in this example),
then it would attempt '00 ef' (in this example) and then '00 c3 00 af',
and if not, then it would go straight to '00 c3 00 af'.
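If I read that ordering right, it can be modelled like this (a Python
sketch; the function name is mine and the BMPString NUL terminator is
omitted):

```python
def attempts(raw):
    """Order in which the new code tries UTF-16BE passwords, as described."""
    out = []
    try:
        # Spec-compliant reading first, when the bytes are valid UTF-8.
        out.append(raw.decode('utf-8').encode('utf-16-be'))
    except UnicodeDecodeError:
        pass
    # Then the old ISO8859-1-style widening, for old times' sake.
    out.append(raw.decode('latin-1').encode('utf-16-be'))
    return out

# 'c4 82 c5 bb' -> '01 02 01 7b' first, then '00 c4 00 82 00 c5 00 bb'.
# 'c3 af'       -> '00 ef' first, then '00 c3 00 af'.
```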




Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.

2016-08-29 Thread David Woodhouse
On Mon, 2016-08-29 at 18:40 +0100, David Woodhouse wrote:
> 
> So... let's have a password 'nałve', in an ISO8859-2 environment where
> that is represented by the bytes 6e 61 b3 76 65.
> 
> First I should try the correct version according to the spec:
>  006e 0061 0142 0076 0065
> 
> Then we try the "OpenSSL assumed it was ISO8859-1" version:
>  006e 0061 00b3 0076 0065
> 
> And finally we try the "OpenSSL assumed it was UTF-8" version:
>  006e 0061 ... actually you *can't* convert that byte sequence "from"
> UTF-8 since it isn't valid UTF-8. So what will new OpenSSL do here,
> again?

In this specific example (the byte stream is not valid UTF-8) it looks
like OPENSSL_utf82uni() will assume it's actually ISO8859-1, and thus
the final case will be identical to the previous one.
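For what it's worth, the three renditions from the quoted example are
easy to reproduce (a Python sketch, without the BMPString NUL):

```python
raw = bytes.fromhex('6e61b37665')  # 'nałve' as typed in an ISO8859-2 locale

spec = raw.decode('iso8859-2').encode('utf-16-be')  # 006e 0061 0142 0076 0065
old = raw.decode('latin-1').encode('utf-16-be')     # 006e 0061 00b3 0076 0065
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False  # 0xb3 is a lone continuation byte, so this is what happens
```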

So let's try a better example. The password is "ĂŻ" (U+0102 U+017b),
and the locale (not that it *should* matter) is ISO8859-2.

The correct rendition is 01 02 01 7b.

The "old buggy OpenSSL" rendition, in the ISO8859-2 locale, would be
to *convert* to ISO8859-2 (giving c3 af). Then to interpret those bytes
as ISO8859-1 (in which they would mean "Ã¯") and convert *that* to
UTF-16BE to give 00 c3 00 af.

The "new buggy OpenSSL" rendition, again in the ISO8859-2 locale, would
again be to convert to ISO8859-2 (c3 af). Then to interpret those bytes
as UTF-8 (in which they would mean "ï") and convert *that* to UTF-16BE
to give 00 ef.

Right?

-- 
dwmw2






Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.

2016-08-29 Thread David Woodhouse
On Mon, 2016-08-29 at 15:25 +0200, Andy Polyakov wrote:
> First of all. *Everything* that is said below (and what modifications in
> question are about) applies to non-ASCII passwords. ASCII-only passwords
> are not affected by this and keep working as they were.
> 
> > 
> > > 
> > > commit 647ac8d3d7143e3721d55e1f57730b6f26e72fc9
> > > 
> > > OpenSSL versions before 1.1.0 didn't convert non-ASCII
> > > UTF8 PKCS#12 passwords to Unicode correctly.
> > > 
> > > To correctly decrypt older files, if MAC verification fails
> > > with the supplied password attempt to use the broken format
> > > which is compatible with earlier versions of OpenSSL.
> > > 
> > > Reviewed-by: Richard Levitte 
> > 
> > Hm, this sounds like something that other crypto libraries also ought
> > to try to work around. 
> > 
> > Please could you elaborate on the specific problem, and/or show a test
> > case?
> 
> Note that there is 80-test_pkcs12.t that refers to shibboleth.pfx.

Thanks.

> > I'm not quite sure how to interpret the patch itself. You pass the
> > password through OPENSSL_asc2uni() and then OPENSSL_uni2utf8() — which
> > is essentially converting ISO8859-1 to UTF-8.
> 
> You make it sound like these two are called one after another. But
> that's not what happens. It *tries* to access data with passwords
> converted with OPENSSL_asc2uni *or* OPENSSL_utf82uni, effectively
> independently of each other, in order to see which one is right. So that
> you can *transparently* access old broken data, as well as
> specification-compliant data.

Hm... at line 541 of pkcs12.c we call PKCS12_verify_mac(…mpass…) with
the original provided password. Which is in the locale-native character
set, is it not? No attempt to convert to anything known.

Then if that *fails* we do indeed convert it via OPENSSL_asc2uni() and
OPENSSL_uni2utf8() to make 'badpass' and try again:

    } else if (!PKCS12_verify_mac(p12, mpass, -1)) {
        /*
         * May be UTF8 from previous version of OpenSSL:
         * convert to a UTF8 form which will translate
         * to the same Unicode password.
         */
        unsigned char *utmp;
        int utmplen;
        utmp = OPENSSL_asc2uni(mpass, -1, NULL, &utmplen);
        if (utmp == NULL)
            goto end;
        badpass = OPENSSL_uni2utf8(utmp, utmplen);
        OPENSSL_free(utmp);
        if (!PKCS12_verify_mac(p12, badpass, -1)) {
            BIO_printf(bio_err, "Mac verify error: invalid password?\n");
            ERR_print_errors(bio_err);
            goto end;
        } else {
            BIO_printf(bio_err, "Warning: using broken algorithm\n");
            if (!twopass)
                cpass = badpass;
        }

The shibboleth.pfx test seems to pass on the *first* call to
PKCS12_verify_mac() — it *isn't* testing this fallback. If I add a
space to the end of the correct password and stick a breakpoint on
PKCS12_verify_mac(), I see the following calls:

#0  PKCS12_verify_mac (p12=0x956e40, 
pass=0x956a30 "σύνθημα γνώρισμα ", passlen=-1)
at crypto/pkcs12/p12_mutl.c:152
#1  0x00425567 in pkcs12_main (argc=0, argv=0x7fffdd90)
at apps/pkcs12.c:541


And then it converts that string from ISO8859-1 (which it wasn't) to
UTF-8, and calls PKCS12_verify_mac() again:

#0  PKCS12_verify_mac (p12=0x956e40, 
pass=0x9597e0 "Ï\302\203Ï\302\215νθημα
γνÏ\302\216Ï\302\201ιÏ\302\203μα ", passlen=-1) at
crypto/pkcs12/p12_mutl.c:152
#1  0x004255fc in pkcs12_main (argc=0, argv=0x7fffdd90)
at apps/pkcs12.c:554

> > 
> > So, say my password is "naïve". In UTF-8 that's 6e 61 c3 af 76 65, which
> > is the correct sequence of bytes to use for the password?
> 
> 00 6e 00 61 00 ef 00 76 00 65, big-endian UTF-16.

Is that conversion done inside PKCS12_verify_mac()? Because we
definitely aren't passing UTF-16-BE into PKCS12_verify_mac().

> > 
> > And you now convert that sequence of bytes to 6e 61 c3 83 c2 af 76 65
> > by assuming it's ISO8859-1 (which would be 'naïve') and converting to 
> > UTF-8?
> 
> I don't follow. Why would it have to be converted to this sequence?

That's what it's doing. Changing the example above and showing the same
breakpoints as they get hit again...

Starting program: /home/dwmw2/git/openssl/apps/openssl pkcs12 -noout -password 
pass:naïve -in test/shibboleth.pfx
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, PKCS12_verify_mac (p12=0x956e30, pass=0x956a30 "naïve", 
passlen=-1) at crypto/pkcs12/p12_mutl.c:152
152 if (p12->mac == NULL) {
(gdb) x/7bx pass
0x956a30:   0x6e    0x61    0xc3    0xaf    0x76    0x65    0x00
(gdb) c
Continuing.

Breakpoint 1, PKCS12_verify_mac (p12=0x956e30, pass=0x959960 "naïve", 
passlen=-1) at crypto/pkcs12/p12_mutl.c:152
152 if (p12->mac == NULL) {
(gdb) x/8bx pass
0x959960:   0x6e    0x61    0xc3    0x83 

Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.

2016-08-29 Thread Andy Polyakov
First of all. *Everything* that is said below (and what modifications in
question are about) applies to non-ASCII passwords. ASCII-only passwords
are not affected by this and keep working as they were.

>> commit 647ac8d3d7143e3721d55e1f57730b6f26e72fc9
>>
>> OpenSSL versions before 1.1.0 didn't convert non-ASCII
>> UTF8 PKCS#12 passwords to Unicode correctly.
>>
>> To correctly decrypt older files, if MAC verification fails
>> with the supplied password attempt to use the broken format
>> which is compatible with earlier versions of OpenSSL.
>>
>> Reviewed-by: Richard Levitte 
> 
> Hm, this sounds like something that other crypto libraries also ought
> to try to work around. 
> 
> Please could you elaborate on the specific problem, and/or show a test
> case?

Note that there is 80-test_pkcs12.t that refers to shibboleth.pfx.

> I'm not quite sure how to interpret the patch itself. You pass the
> password through OPENSSL_asc2uni() and then OPENSSL_uni2utf8() — which
> is essentially converting ISO8859-1 to UTF-8.

You make it sound like these two are called one after another. But
that's not what happens. It *tries* to access data with passwords
converted with OPENSSL_asc2uni *or* OPENSSL_utf82uni, effectively
independently of each other, in order to see which one is right. So that
you can *transparently* access old broken data, as well as
specification-compliant data.

> So, say my password is "naïve". In UTF-8 that's 6e 61 c3 af 76 65, which
> is the correct sequence of bytes to use for the password?

00 6e 00 61 00 ef 00 76 00 65, big-endian UTF-16.

> And you now convert that sequence of bytes to 6e 61 c3 83 c2 af 76 65
> by assuming it's ISO8859-1 (which would be 'naïve') and converting to UTF-8?

I don't follow. Why would it have to be converted to this sequence?

> So... what was the bug that was actually being worked around?

6e 61 c3 af 76 65 was expanded to 00 6e 00 61 00 c3 00 af 00 76 00 65,
in violation of the specification.
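In other words (a Python sketch of the expansion, not OpenSSL's actual
code, and without the trailing NUL):

```python
utf8 = bytes.fromhex('6e61c3af7665')  # "naïve" in UTF-8

# The old bug: widen every byte with a leading 0x00 (OPENSSL_asc2uni-style):
buggy = b''.join(b'\x00' + bytes([b]) for b in utf8)  # 00 6e 00 61 00 c3 00 af 00 76 00 65

# What the specification wants: decode the UTF-8, then encode UTF-16BE:
correct = utf8.decode('utf-8').encode('utf-16-be')    # 00 6e 00 61 00 ef 00 76 00 65
```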

> That
> older versions were converting from the local charset to UTF-8 twice in
> a row? So you've implemented a "convert ISO8859-1 to UTF-8" fallback
> which will cope with that in *some* locales but not all...?

Yeah, something like that. Note that if you have created a PKCS#12 file
with a non-UTF-8 locale, there's no guarantee that you can read it with
another locale, UTF-8 or not. This was *always* the case. And that's
*not* what we try to address here. We assume that you don't change
locale when you upgrade the OpenSSL version. The ultimate goal is to
make it compliant with the specification and therefore interoperable
with other compliant implementations. But we can't just switch, because
then it stops interoperating with older OpenSSL versions. This is the
reason why it looks messy. It's about having it both ways... Though
there is one ambiguity in this. Said interoperability assumes a UTF-8
locale (on *ix, or the OPENSSL_WIN32_UTF8 opt-in environment variable
on Windows). I.e. there's no guarantee that users who use a non-UTF-8
locale can actually interoperate with other implementations. It's
assumed that such users are not actually interested in doing so, and if
they express interest, they will be advised to switch to a UTF-8 locale
and convert their data to an interoperable format.

Is this helpful?



Re: [openssl-dev] [PATCH] Support broken PKCS#12 key generation.

2016-08-28 Thread David Woodhouse
On Wed, 2016-08-24 at 18:55 +0100, Dr. Stephen Henson wrote:
> commit 647ac8d3d7143e3721d55e1f57730b6f26e72fc9
>
> OpenSSL versions before 1.1.0 didn't convert non-ASCII
> UTF8 PKCS#12 passwords to Unicode correctly.
> 
> To correctly decrypt older files, if MAC verification fails
> with the supplied password attempt to use the broken format
> which is compatible with earlier versions of OpenSSL.
> 
> Reviewed-by: Richard Levitte 

Hm, this sounds like something that other crypto libraries also ought
to try to work around. 

Please could you elaborate on the specific problem, and/or show a test
case?

I'm not quite sure how to interpret the patch itself. You pass the
password through OPENSSL_asc2uni() and then OPENSSL_uni2utf8() — which
is essentially converting ISO8859-1 to UTF-8.

So, say my password is "naïve". In UTF-8 that's 6e 61 c3 af 76 65, which
is the correct sequence of bytes to use for the password?

And you now convert that sequence of bytes to 6e 61 c3 83 c2 af 76 65
by assuming it's ISO8859-1 (which would be 'naïve') and converting to
UTF-8?
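That arithmetic, at least, is easy to confirm: asc2uni followed by
uni2utf8 amounts to "decode as ISO8859-1, re-encode as UTF-8" (a Python
sketch, not the OpenSSL code itself):

```python
pw = 'na\u00efve'.encode('utf-8')  # 6e 61 c3 af 76 65

# OPENSSL_asc2uni treats each byte as an ISO8859-1 code point;
# OPENSSL_uni2utf8 then encodes those code points as UTF-8. Net effect:
mangled = pw.decode('latin-1').encode('utf-8')  # 6e 61 c3 83 c2 af 76 65
```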

So... what was the bug that was actually being worked around? That
older versions were converting from the local charset to UTF-8 twice in
a row? So you've implemented a "convert ISO8859-1 to UTF-8" fallback
which will cope with that in *some* locales but not all...? I don't
really understand.

Thanks for any light you can shed on it!

/me goes off to add non-ASCII passwords to the growing torture test
suite at
http://git.infradead.org/users/dwmw2/openconnect.git/blob/HEAD:/tests/Makefile.am

-- 
dwmw2



