Re: [openssl-dev] PBE_UNICODE

2015-11-23 Thread Dmitry Belyavsky
Dear Andy,

On Fri, Nov 20, 2015 at 10:45 PM, Andy Polyakov  wrote:

>
> > The test example was provided by the authors of the specification. There
> > are also examples in the document. Maybe they will be useful.
>
> We are apparently talking about slightly different things. Well, they
> are somewhat related, but not quite the same. It's just an additional
> angle to cover.
>

Ok. Let's try to collect all the things.

1. There is no Internet-wide specification for processing multibyte
characters in PKCS#12.
2. The OpenSSL code has two variants of such processing, switched by the
PBE_UNICODE define.
3. There is a "Russian" variant of the specification.

I am mostly interested in compatibility between the OpenSSL implementation
and the "Russian" one when the password contains ASCII or Cyrillic characters.

How can I help?

-- 
SY, Dmitry Belyavsky
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] PBE_UNICODE

2015-11-20 Thread Andy Polyakov
>> > I do not know whether the authors of the CSP have implemented their own
>> > mechanism of transforming the password or used whatever the Windows
>> > system provides by default.
>>
>> What are the chances that they too used the same formally incorrect approach?
> 
> I think they used the system method, if any, because it means less work
> for them :-)

But there are a couple of ways to do it with "system methods", and as far
as I can tell neither would be interoperable with what the OpenSSL pkcs12
command-line utility does...

>> ??? So the suggestion is to leave it as it is? Well, given the presented
>> evidence, doing the right thing would break things for you. But does that
>> mean one can/should be excused from getting things right?
> 
> https://tools.ietf.org/html/rfc7292#appendix-B.1 says:
> 
>In this specification, however, all passwords are created from
>BMPStrings with a NULL terminator.  This means that each character in
>the original BMPString is encoded in 2 bytes in big-endian format
>(most-significant byte first).  There are no Unicode byte order
>marks.  The 2 bytes produced from the last character in the BMPString
>are followed by 2 additional bytes with the value 0x00.
> 
> As I understand the text quoted above, there is no definitive specification.

Correct. At the same time it should be noted that there is an explicit
reference to 2-byte encoding and Unicode. Well, one can argue that when
they mention Unicode they refer to the [lack of a] byte order mark, and the
byte order mark only, and that it has nothing to do with what that 2nd byte
is used for. [Or shall we say "1st", as we are looking at big-endian?] But
at the same time they don't say that the additional byte has to be zero.
The only sensible and natural thing to do is to use these 2 bytes for
storing a UTF-16 character, not to mechanically inject zeros into a UTF-8
encoded string as now... Of course one can say that the latter, essentially
unnatural, way is the de-facto standard and we are stuck with it. But that
would have to mean it gets harmonized on Windows. I refer to how OpenSSL
pkcs12 works on Windows, not to what somebody else does. In other words
there is a dilemma. A. Choose the right thing to do and act accordingly.
B. Accept the status quo with Unix as the reference and harmonize Windows.
The alternative to the dilemma is to explicitly disclaim support for
non-ASCII characters in the pkcs12 utility.
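To make the difference concrete (my own illustration, not code from OpenSSL): the Cyrillic letter 'я' is U+044F, which is 0xD1 0x8F in UTF-8 but 0x04 0x4F in big-endian UTF-16. Mechanically injecting zeros into the UTF-8 bytes, the way the current expansion does, produces a different byte string and therefore a different derived key:

```c
#include <stddef.h>

/* Zero-inject every byte of the input, i.e. treat a (possibly UTF-8)
 * string as if each byte were a standalone character, the way the
 * current per-byte expansion does. Writes 2 * n bytes to out. */
static void zero_inject(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i;
    for (i = 0; i < n; i++) {
        out[2 * i]     = 0x00;   /* injected high byte           */
        out[2 * i + 1] = in[i];  /* original UTF-8 code unit (!) */
    }
}
```

For 'я' this yields 00 D1 00 8F, while a conforming big-endian UTF-16 encoding yields 04 4F, so the two sides cannot derive matching keys from the same password.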



Re: [openssl-dev] PBE_UNICODE

2015-11-20 Thread Andy Polyakov
On 11/20/15 10:20, Dmitry Belyavsky wrote:
> Dear Andy,
> 
> On Fri, Nov 20, 2015 at 12:08 PM, Andy Polyakov wrote:
> 
> > ... And on Windows it's even worse. As it stands now
> > even passing non-ASCII strings as command-line argument [and presumably
> > at prompt] is not an option.
> 
> This is not entirely true. Whether or not one can pass non-ASCII strings
> as command-line argument is language-specific. Or rather code
> page-specific in Windowish. With Asian languages you're really out of
> luck, while smaller alphabets can work out (but not mixed) if system
> locale matches expectations.
> 
> 
> I understand that there should be problems with Windows.

To eliminate any possibility of misunderstanding: the claim is not limited
to problems on Windows, but is that OpenSSL handles non-ASCII characters in
an apparently non-interoperable way. On all systems. I referred to Windows
as a complicating factor in the problem, not as the whole/a separate problem.

> So the test PKCS12 object was created on Windows using a GOST-providing
> CSP.
> 
> I do not know whether the authors of the CSP have implemented their own
> mechanism of transforming the password or used whatever the Windows
> system provides by default.

What are the chances that they too used the same formally incorrect approach?

> But in fact, openssl built without defining the PBE_UNICODE macro was
> able to parse the test PKCS12.

Right. Doing otherwise would put the burden of big-endian UTF-16 conversion
on the caller, and the chances that the caller gets it right are low.

> Thank you!

??? So the suggestion is to leave it as it is? Well, given the presented
evidence, doing the right thing would break things for you. But does that
mean one can/should be excused from getting things right?



Re: [openssl-dev] PBE_UNICODE

2015-11-20 Thread Dmitry Belyavsky
Dear Andy,

On Fri, Nov 20, 2015 at 1:48 PM, Andy Polyakov  wrote:
>
> > I understand that there should be problems with Windows.
>
> To eliminate any possibility of misunderstanding: the claim is not limited
> to problems on Windows, but is that OpenSSL handles non-ASCII characters in
> an apparently non-interoperable way. On all systems. I referred to Windows
> as a complicating factor in the problem, not as the whole/a separate problem.


Yes, I understand it.

>
>
> > So the test PKCS12 object was created on Windows using a GOST-providing
> > CSP.
> >
> > I do not know whether the authors of the CSP have implemented their own
> > mechanism of transforming the password or used whatever the Windows
> > system provides by default.
>
> What are the chances that they too used the same formally incorrect approach?


I think they used the system method, if any, because it means less work
for them :-)

>
> > But in fact, openssl built without defining the PBE_UNICODE macro was
> > able to parse the test PKCS12.
>
> Right. Doing otherwise would put the burden of big-endian UTF-16
> conversion on the caller, and the chances that the caller gets it right
> are low.


Ok.

>
>
> > Thank you!
>
> ??? So the suggestion is to leave it as it is? Well, given the presented
> evidence, doing the right thing would break things for you. But does that
> mean one can/should be excused from getting things right?

https://tools.ietf.org/html/rfc7292#appendix-B.1 says:

The underlying password-based encryption methods in PKCS #5 v2.1 view
   passwords (and salt) as being simple byte strings.
...
What's left unspecified in the above paragraph is precisely where the
   byte string representing a password comes from.  (This is not an
   issue with salt strings, since they are supplied as a password-based
   encryption (or authentication) parameter.)  PKCS #5 v2.1 says: "[...]
   a password is considered to be an octet string of arbitrary length
   whose interpretation as a text string is unspecified.  In the
   interest of interoperability, however, it is recommended that
   applications follow some common text encoding rules.  ASCII and UTF-8
   are two possibilities."

   In this specification, however, all passwords are created from
   BMPStrings with a NULL terminator.  This means that each character in
   the original BMPString is encoded in 2 bytes in big-endian format
   (most-significant byte first).  There are no Unicode byte order
   marks.  The 2 bytes produced from the last character in the BMPString
   are followed by 2 additional bytes with the value 0x00.

As I understand the text quoted above, there is no definitive specification.

So I would prefer a set of options to be specified by the caller, with a
reasonable default value.
But as I do not have enough PKCS#12 files from real-life sources, I cannot
predict this default value.

Currently openssl is in a "works for me" state. To propose changes, I
need a set of mismatching PKCS#12 files.
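For reference, the expansion that Appendix B.1 describes is easy to state in code for the pure-ASCII case (the helper name below is mine, not an OpenSSL API):

```c
#include <stddef.h>
#include <string.h>

/* Expand an ASCII password into the PKCS#12 password byte string of
 * RFC 7292, Appendix B.1: each character becomes two big-endian bytes
 * (most-significant byte first, zero for ASCII), and the last character
 * is followed by a two-byte 0x0000 terminator.
 * Returns the number of bytes written, i.e. 2 * (strlen(pass) + 1). */
static size_t pkcs12_ascii_pass_bytes(const char *pass, unsigned char *out)
{
    size_t i, n = strlen(pass);

    for (i = 0; i < n; i++) {
        out[2 * i]     = 0x00;                    /* most-significant byte */
        out[2 * i + 1] = (unsigned char)pass[i];  /* the character itself  */
    }
    out[2 * n]     = 0x00;                        /* trailing NULL char    */
    out[2 * n + 1] = 0x00;
    return 2 * (n + 1);
}
```

The whole debate in this thread is about what the high byte should hold once the password is no longer pure ASCII.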

--
SY, Dmitry Belyavsky


Re: [openssl-dev] PBE_UNICODE

2015-11-20 Thread Kurt Roeckx
On Thu, Nov 19, 2015 at 11:16:23PM +0100, Andy Polyakov wrote:
> 
> The way I read PKCS12 the string should be big-endian UTF-16 one.
[...]
> Correct procedure should be to convert it to wchar_t and
> then ensure correct endianness.

Please note that wchar_t itself might not have any relation to
UTF.  You should explicitly convert from the locale charmap to
UTF-16BE.

Depending on the OS, there are functions available to ask what the
current encoding is and convert between encodings.
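On POSIX systems that suggestion can be sketched with nl_langinfo(3) to query the current encoding and iconv(3) to convert (the helper name is mine; real code would also need to handle iconv's E2BIG/EILSEQ failure modes properly):

```c
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Convert a password from the current locale's charmap to big-endian
 * UTF-16, without assuming anything about wchar_t. Returns the number
 * of output bytes written, or (size_t)-1 on conversion failure. */
static size_t pass_to_utf16be(const char *pass, unsigned char *out,
                              size_t outcap)
{
    setlocale(LC_CTYPE, "");                 /* honour the environment */
    const char *from = nl_langinfo(CODESET); /* e.g. "UTF-8", "KOI8-R" */
    iconv_t cd = iconv_open("UTF-16BE", from);
    char *in = (char *)pass, *dst = (char *)out;
    size_t inleft = strlen(pass), outleft = outcap, rc;

    if (cd == (iconv_t)-1)
        return (size_t)-1;
    rc = iconv(cd, &in, &inleft, &dst, &outleft);
    iconv_close(cd);
    return rc == (size_t)-1 ? (size_t)-1 : outcap - outleft;
}
```

Note this depends on the process locale being set correctly, which is exactly the kind of caller-side responsibility discussed below.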


Kurt



Re: [openssl-dev] PBE_UNICODE

2015-11-20 Thread Dmitry Belyavsky
Dear Andy,

On Fri, Nov 20, 2015 at 4:51 PM, Andy Polyakov  wrote:



>
> >> ??? So the suggestion is to leave it as it is? Well, given the presented
> >> evidence, doing the right thing would break things for you. But does that
> >> mean one can/should be excused from getting things right?
> >
> > https://tools.ietf.org/html/rfc7292#appendix-B.1 says:
> >
> >In this specification, however, all passwords are created from
> >BMPStrings with a NULL terminator.  This means that each character in
> >the original BMPString is encoded in 2 bytes in big-endian format
> >(most-significant byte first).  There are no Unicode byte order
> >marks.  The 2 bytes produced from the last character in the BMPString
> >are followed by 2 additional bytes with the value 0x00.
> >
> > As I understand the text quoted above, there is no definitive
> > specification.
>
> Correct. At the same time it should be noted that there is an explicit
> reference to 2-byte encoding and Unicode. Well, one can argue that when
> they mention Unicode they refer to the [lack of a] byte order mark, and
> the byte order mark only, and that it has nothing to do with what that
> 2nd byte is used for. [Or shall we say "1st", as we are looking at
> big-endian?] But at the same time they don't say that the additional
> byte has to be zero. The only sensible and natural thing to do is to use
> these 2 bytes for storing a UTF-16 character, not to mechanically inject
> zeros into a UTF-8 encoded string as now... Of course one can say that
> the latter, essentially unnatural, way is the de-facto standard and we
> are stuck with it. But that would have to mean it gets harmonized on
> Windows. I refer to how OpenSSL pkcs12 works on Windows, not to what
> somebody else does. In other words there is a dilemma. A. Choose the
> right thing to do and act accordingly. B. Accept the status quo with
> Unix as the reference and harmonize Windows. The alternative to the
> dilemma is to explicitly disclaim support for non-ASCII characters in
> the pkcs12 utility.
>

There is a specification in Russian,
http://tk26.ru/methods/containers_v1/Addition_to_PKCS8_v1_0.pdf

It says:
"The password Psw should be represented in UTF-8 format without trailing
zero byte and passed as the P element of the PBKDF2 algorithm"

The test example was provided by the authors of the specification. There are
also examples in the document. Maybe they will be useful.


-- 
SY, Dmitry Belyavsky


Re: [openssl-dev] PBE_UNICODE

2015-11-20 Thread Andy Polyakov
>> The way I read PKCS12 the string should be big-endian UTF-16 one.
> [...]
>> Correct procedure should be to convert it to wchar_t and
>> then ensure correct endianness.
> 
> Please note that wchar_t itself might not have any relation to
> UTF.  You should explicitly convert from the locale charmap to
> UTF-16BE.

Right, it was kind of sloppy formulation.

> Depending on the OS, there are functions available to ask what the
> current encoding is and convert between encodings.

And the question also is who should be responsible for the conversion. For
example, in order to make it work you might have to mess with the locale,
and a library in general is hardly the right place for that; libcrypto in
particular is a terrible place to do it. Nor does one want to create a
dependency on a library like iconv, not in libcrypto. In other words, I'd
say it's not unreasonable to specify which input we expect in libcrypto and
to put the whole responsibility for ensuring that the input is right on the
caller. Formally one can say that nothing needs to be changed in libcrypto;
indeed there are _asc and _uni, i.e. ASCII and verbatim, the latter implying
big-endian UTF-16. Though this is not an excuse for not doing anything
about the pkcs12 command-line utility... At the same time one can still
argue that since there is a convenience _asc interface, there ought to be a
convenience utf8 one (and, as already implied, not dependent on the current
locale?)...



Re: [openssl-dev] PBE_UNICODE

2015-11-20 Thread Andy Polyakov
> There is a specification in
> Russian, 
> http://tk26.ru/methods/containers_v1/Addition_to_PKCS8_v1_0.pdf
> 
> It says:
> "The password Psw should be represented in UTF-8 format without trailing
> zero byte and passed as the P element of the PBKDF2 algorithm"

Yeah, but this describes a specific case, and has no "effect" on it being
not really specified in the general case.

> The test example was provided by the authors of the specification. There
> are also examples in the document. Maybe they will be useful.

We are apparently talking about slightly different things. Well, they
are somewhat related, but not quite the same. It's just an additional angle
to cover.

On a related note, I had no problem exporting a pkcs12 file protected with
a password containing non-ASCII letters from Windows and then failing to
read it on Linux.



Re: [openssl-dev] PBE_UNICODE

2015-11-20 Thread Andy Polyakov
> ... And on Windows it's even worse. As it stands now
> even passing non-ASCII strings as command-line argument [and presumably
> at prompt] is not an option.

This is not entirely true. Whether or not one can pass non-ASCII strings
as command-line argument is language-specific. Or rather code
page-specific in Windowish. With Asian languages you're really out of
luck, while smaller alphabets can work out (but not mixed) if system
locale matches expectations.



Re: [openssl-dev] PBE_UNICODE

2015-11-20 Thread Dmitry Belyavsky
Dear Andy,

On Fri, Nov 20, 2015 at 12:08 PM, Andy Polyakov  wrote:

> > ... And on Windows it's even worse. As it stands now
> > even passing non-ASCII strings as command-line argument [and presumably
> > at prompt] is not an option.
>
> This is not entirely true. Whether or not one can pass non-ASCII strings
> as command-line argument is language-specific. Or rather code
> page-specific in Windowish. With Asian languages you're really out of
> luck, while smaller alphabets can work out (but not mixed) if system
> locale matches expectations.
>

I understand that there should be problems with Windows.
So the test PKCS12 object was created on Windows using a GOST-providing
CSP.

I do not know whether the authors of the CSP have implemented their own
mechanism of transforming the password or used whatever the Windows system
provides by default.

But in fact, openssl built without defining the PBE_UNICODE macro was able
to parse the test PKCS12.

Thank you!

-- 
SY, Dmitry Belyavsky


Re: [openssl-dev] PBE_UNICODE

2015-11-19 Thread Andy Polyakov
Hi,

> I use openssl 1.0.2d.
> 
> There is a commented-out definition of the PBE_UNICODE define in the
> file pkcs12.h.
> I expected it to be necessary for correct processing of Cyrillic
> symbols in PKCS12 passwords, but my test shows that the password is
> correctly processed when PBE_UNICODE is undefined and the locale is set
> to ru_RU.utf8.
> 
> Am I missing something, or may this define and the corresponding #ifdef
> be eliminated?

What is "correctly"? PKCS12 is about interoperability, and just because
it's consistent with itself doesn't automatically mean it's actually
interoperable. The way I read PKCS12, the string should be a big-endian
UTF-16 one. But what happens now? The string simply gets expanded as if
it were plain ASCII. But is that right for the UTF-8 string that you're
surely passing? No, the correct procedure would be to convert it to
wchar_t and then ensure correct endianness. In other words, an attempt to
pass a non-ASCII string at the command line or prompt would not do the
right thing.

But it should also be recognized that deploying mbrtowc in _asc would
only be part of the solution, because interoperability is also about
multiple operating systems: we have to consider what happens on other
OSes, e.g. Windows. And on Windows it's even worse. As it stands now,
even passing non-ASCII strings as a command-line argument [and presumably
at a prompt] is not an option.

The bottom line is that one has to draw the conclusion that non-ASCII
characters are effectively not supported in the pkcs12 utility, regardless
of locale. An application programmer can get it right by sticking to the
_uni interface and performing the due conversion to big-endian UTF-16 in
their own application.
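The caller-side conversion can be sketched without any library dependency, at least for characters in the Basic Multilingual Plane (the helper below is my own illustration, not an OpenSSL API; a real implementation must also validate the input and handle 4-byte UTF-8 sequences via surrogate pairs):

```c
#include <stddef.h>

/* Minimal UTF-8 -> big-endian UTF-16 converter for BMP characters
 * (1- to 3-byte UTF-8 sequences only; no surrogates, no validation).
 * Returns the output byte count, or 0 on input it does not handle. */
static size_t utf8_to_utf16be(const unsigned char *in, size_t n,
                              unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        unsigned cp;
        if (in[i] < 0x80) {                                /* 1-byte */
            cp = in[i]; i += 1;
        } else if ((in[i] & 0xE0) == 0xC0 && i + 1 < n) {  /* 2-byte */
            cp = ((in[i] & 0x1Fu) << 6) | (in[i + 1] & 0x3Fu); i += 2;
        } else if ((in[i] & 0xF0) == 0xE0 && i + 2 < n) {  /* 3-byte */
            cp = ((in[i] & 0x0Fu) << 12) | ((in[i + 1] & 0x3Fu) << 6)
               | (in[i + 2] & 0x3Fu); i += 3;
        } else {
            return 0;                    /* outside BMP or malformed */
        }
        out[o++] = (unsigned char)(cp >> 8);    /* big-endian:       */
        out[o++] = (unsigned char)(cp & 0xFF);  /* high byte first   */
    }
    return o;
}
```

The result (plus the two-byte 0x0000 terminator) is what the _uni interface expects to receive verbatim.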



[openssl-dev] PBE_UNICODE

2015-11-18 Thread Dmitry Belyavsky
Hello OpenSSL Team,

I use openssl 1.0.2d.

There is a commented-out definition of the PBE_UNICODE define in the file
pkcs12.h.
I expected it to be necessary for correct processing of Cyrillic symbols
in PKCS12 passwords, but my test shows that the password is correctly
processed when PBE_UNICODE is undefined and the locale is set to
ru_RU.utf8.

Am I missing something, or may this define and the corresponding #ifdef
be eliminated?

Thank you!

-- 
SY, Dmitry Belyavsky