OMC Vote on deprecation of command line apps

2020-05-07 Thread Dr Paul Dale
PR 11575 has been blocked awaiting a decision for a while now.  Time for a vote:

topic: Merge #11575 for 3.0.
comment: This PR removes the notes indicating that a number of the command
 line utilities are deprecated.  Not merging it will leave them flagged
 as deprecated.
Proposed by: Paul Dale
Public: yes
opened: 2020-05-08

Ideally we’ll have a decision in time for the next 3.0 alpha release.


The crux of the matter is that a number of the command line utilities are
currently flagged as deprecated:
- dhparam
- dsa
- dsaparam
- ec
- ecparam
- gendsa
- rsa
These commands are not being removed in 3.0; they have instead been rewritten
to use the PKEY APIs rather than the low level APIs as far as possible.


The reasons for keeping them are:
- they are easier to use than the pkey replacements
- a web search will likely turn up these commands, not the pkey replacements.

The reason for removing them is one of maintenance: having duplicate commands 
means having to make changes in two places and this has been missed in the past 
and will be in the future.


Other random notes:
- Deprecation of these commands does not mandate that they are removed at the
  first opportunity.  It only indicates that we want to move away from them.
- Rewriting these commands so that they call the pkey replacements looks to be
  very difficult.  Reproducing the exact behaviours will be challenging,
  although the basic functionality would be straightforward.
- The rsautl command is deprecated and isn’t slated for being restored: pkeyutl
  is every bit as easy to use.
- The -dsaparam option to dhparam is deprecated: it cannot be supported without
  direct access to low level functionality we want to remove.
- Post quantum crypto will make the discussion obsolete: none of these
  algorithms are useful in a quantum computer world.

My personal opinion is that it is good for these commands to be deprecated but
that we should not remove them until their usefulness is at an end.  This will
likely mean not removing them after five years of deprecation.  It would mean 
removing them once quantum computers are shown to be effective.  Without 
deprecation now, we can’t remove them until a lot later.


Pauli
-- 
Dr Paul Dale | Distinguished Architect | Cryptographic Foundations 
Phone +61 7 3031 7217
Oracle Australia






Re: Unexpected EOF handling

2020-05-07 Thread Matt Caswell



On 07/05/2020 20:28, Dmitry Belyavsky wrote:
> From my point of view, if we don't revert the change for the sake of API
> clarity, we need to provide an option restoring old behaviour at least
> for test purposes.

Presumably nginx can already handle the situation where a close_notify
*is* received. So if we have an option to behave as if that had occurred
in the event of eof then nginx should be able to handle it without
requiring special codepaths?? If so then we don't necessarily have to
have an option to restore the old behaviour - just an option to treat
eof like a close_notify.

Matt


> 
> On Thu, May 7, 2020 at 2:52 PM Tomas Mraz wrote:
> 
> On Thu, 2020-05-07 at 13:22 +0200, Kurt Roeckx wrote:
> > Hi,
> >
> > We introduced a change in the 1.1.1e release that changed how we
> > handled an unexpected EOF. This broke various software which
> > resulted in the release of 1.1.1f that reverted that change.
> > In master we still have the 1.1.1e behaviour.
> >
> > The change itself is small, it just calls SSLfatal() with a new
> > error code. This has at least 2 effects:
> > - The error code was changed from SSL_ERROR_SYSCALL with errno 0
> >   to SSL_ERROR_SSL with reason code
> >   SSL_R_UNEXPECTED_EOF_WHILE_READING
> > - The session is marked as in error, and so can't be used to
> >   resume.
> >
> > There is code written that now checks for the SSL_ERROR_SYSCALL
> > with errno 0 case, while we never documented that behaviour. The
> > 1.1.1 manpage of SSL_get_error() currently reports this as a bug
> > and that it will change to SSL_ERROR_SSL /
> > SSL_R_UNEXPECTED_EOF_WHILE_READING.
> >
> > > Code that checks the SSL_ERROR_SYSCALL with errno 0 seems to fall
> > in at least 2 categories:
> > - Ignore the error
> > - Check for the error, for instance in a test suite
> >
> > Not sending the close notify alert is a violation of the TLS
> > > specifications, but there is software out there that doesn't send it,
> > so there is a need to be able to deal with peers that don't send
> > it.
> >
> > When the close notify alert wasn't received, but we did get an
> > EOF, there are 2 cases:
> > - It's a fatal error, the protocol needs the close notify alert to
> >   prevent a truncation attack
> > - The protocol running on top of SSL can detect a truncation
> >   attack itself, and does so. It doesn't need the close notify
> >   alert. It's not a fatal error. They just want to check that that
> >   is what happened.
> >
> > > The problem is that we can't make a distinction between the 2
> > > cases, so the only thing we can do currently is to look at
> > > it as a fatal error. So the documentation was changed to say
> > > that if you're in the 2nd case, and need to talk to a peer
> > that doesn't send the close notify alert, you should not wait
> > for the close notify alert, so that you don't get an error.
> >
> > The alternative is that we add an option that does tell us if we
> > should look at it as a fatal error or not.
> >
> > There is at least a request that we keep returning the old error code
> > (SSL_ERROR_SYSCALL with errno 0). I think that from an API point
> > of view this is very confusing. We'd then have SSL_ERROR_SYSCALL
> > as documented that it's fatal, except when errno is 0. If errno is
> > 0, it can either be fatal or not, and we're not telling you which
> > one it is. I think we also can't guarantee that SSL_ERROR_SYSCALL
> > with errno 0 is always the unexpected EOF, we returned that
> > combination because of a bug in OpenSSL.
> >
> > So I think we need at least to agree on:
> > - Do we want an option that makes the unexpected EOF either a fatal
> >   error or a non-fatal error?
> > - Which error should we return?
> 
> From application developer side of view for protocols that do not
> depend on SSL detecting the truncation I think inventing a different
> behavior for this condition than the existing one as of 1.1.1f would be
> clearly wrong. Switching on a SSL_OP if that newly provided OP is a
> trivial change to an application that needs to accommodate various
> versions of OpenSSL (and I am not talking about forks), however
> switching on a SSL_OP and also add another special error case is much
> more complicated change and has potential for bringing regressions in
> the applications that need to do it.
> 
> It is also an API break, however we would do it anyway unless we make
> the legacy behavior the default one, so that is not really relevant
> here.
> 
> Is there really another situation where SSL_ERROR_SYSCALL with errno 0
> could be returned apart from the unclean EOF condition?
> 
> I can't really think of any.
> 
> So I would be just for properly 

Re: Some more extra tests

2020-05-07 Thread Dmitry Belyavsky
Dear Nicola,

I lack the knowledge needed to prepare such a PR myself.
If I were able to submit it, I would.

On Thu, May 7, 2020 at 10:38 PM Nicola Tuveri wrote:

> I would be interested in seeing a PR to see what enabling these tests
> would require!
>
> I believe we do indeed need to test more thoroughly to ensure we are not
> breaking the engine API!
>
>
> Nicola
>
> On Thu, May 7, 2020, 21:08 Dmitry Belyavsky  wrote:
>
>> Dear colleagues,
>>
>> Let me draw your attention to a potentially reasonable set of extended
>> tests for the openssl engines.
>>
>> The gost-engine project (https://github.com/gost-engine/engine) has some
>> test scenarios robust enough for testing engine-provided algorithms and
>> some basic RSA regression tests. It contains a rather eclectic set of C,
>> Perl, and TCL(!) tests that are used by me on a regular basis.
>>
>> If these tests are included in the project extended test suite, they
>> could reduce some regression that sometimes occurs (see
>> https://github.com/gost-engine/engine/issues/232 as a current list of
>> known problems).
>>
>> I will be happy to assist in enabling these tests as a part of openssl
>> test suites.
>> Many thanks!
>>
>> --
>> SY, Dmitry Belyavsky
>>
>

-- 
SY, Dmitry Belyavsky


Re: Unexpected EOF handling

2020-05-07 Thread Dmitry Belyavsky
Hello,

I want to draw everybody's attention to the position (and argumentation) of
Nginx developers:
===

As already mentioned by Dmitry, we here at nginx don't think the change was
necessary. As Matt already said above in the comments to SSL_CONF_cmd.pod
change, the error was always reported. The only issue is that
SSL_ERROR_SYSCALL with a 0 errno is not properly documented. On the other
hand, the behaviour was present since ancient OpenSSL versions, and
actually tested in various software using OpenSSL library, including nginx.
A better solution, in our opinion, would be to document the error instead.

Right now the situation in OpenSSL 3.0 is that the error reporting
behaviour was changed, and, if we are going to support OpenSSL 3.0, we have
to introduce specific error testing for OpenSSL 3.0. And at the same time
we have to support previous error reporting, since we support OpenSSL
versions starting from OpenSSL 0.9.8, as well as other libraries such as
BoringSSL and LibreSSL, which still report connection close with
SSL_ERROR_SYSCALL with a 0 errno.

For obvious reasons we don't want to support multiple code paths to test
for the same error. Especially keeping in mind that due to BoringSSL and
LibreSSL we probably have to support these multiple code paths forever.

It would be really helpful if the change in question was reverted and the
existing behaviour (that is, SSL_ERROR_SYSCALL with a 0 errno) was
documented instead.
===

From my point of view, if we don't revert the change for the sake of API
clarity, we need to provide an option restoring old behaviour at least for
test purposes.
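
The "multiple code paths" concern raised above can be made concrete with a small sketch. It does not use the OpenSSL headers: the enum names below are hypothetical stand-ins for SSL_ERROR_SYSCALL, SSL_ERROR_SSL and SSL_R_UNEXPECTED_EOF_WHILE_READING, and the struct is an invented container for what an application would collect after a failed read. Under those assumptions, an application supporting both reporting styles would need a check roughly like this:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the values an application would get from
 * SSL_get_error() and the error queue; not the real OpenSSL constants. */
enum tls_error_class {
    TLS_ERR_SYSCALL,   /* models SSL_ERROR_SYSCALL */
    TLS_ERR_SSL,       /* models SSL_ERROR_SSL */
    TLS_ERR_OTHER
};

enum tls_reason {
    TLS_REASON_NONE,
    TLS_REASON_UNEXPECTED_EOF,  /* models SSL_R_UNEXPECTED_EOF_WHILE_READING */
    TLS_REASON_OTHER
};

/* Invented record of a failed read, filled in by the application. */
struct tls_read_failure {
    enum tls_error_class error_class;
    int saved_errno;            /* errno captured right after the failure */
    enum tls_reason reason;     /* reason code from the error queue, if any */
};

/* Returns true when the failure describes a peer that closed the
 * transport without sending close_notify, under either convention. */
bool is_unclean_eof(const struct tls_read_failure *f)
{
    /* Legacy report: SSL_ERROR_SYSCALL with errno 0. */
    if (f->error_class == TLS_ERR_SYSCALL && f->saved_errno == 0)
        return true;

    /* Report used in master at the time of this thread. */
    if (f->error_class == TLS_ERR_SSL &&
        f->reason == TLS_REASON_UNEXPECTED_EOF)
        return true;

    return false;
}
```

The two branches are exactly the duplication the nginx developers object to: the first has to stay for older releases and for libraries that keep the legacy report, while the second is only needed for the changed behaviour.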

On Thu, May 7, 2020 at 2:52 PM Tomas Mraz wrote:

> On Thu, 2020-05-07 at 13:22 +0200, Kurt Roeckx wrote:
> > Hi,
> >
> > We introduced a change in the 1.1.1e release that changed how we
> > handled an unexpected EOF. This broke various software which
> > resulted in the release of 1.1.1f that reverted that change.
> > In master we still have the 1.1.1e behaviour.
> >
> > The change itself is small, it just calls SSLfatal() with a new
> > error code. This has at least 2 effects:
> > - The error code was changed from SSL_ERROR_SYSCALL with errno 0
> >   to SSL_ERROR_SSL with reason code
> >   SSL_R_UNEXPECTED_EOF_WHILE_READING
> > - The session is marked as in error, and so can't be used to
> >   resume.
> >
> > There is code written that now checks for the SSL_ERROR_SYSCALL
> > with errno 0 case, while we never documented that behaviour. The
> > 1.1.1 manpage of SSL_get_error() currently reports this as a bug
> > and that it will change to SSL_ERROR_SSL /
> > SSL_R_UNEXPECTED_EOF_WHILE_READING.
> >
> > Code that checks the SSL_ERROR_SYSCALL with errno 0 seems to fall
> > in at least 2 categories:
> > - Ignore the error
> > - Check for the error, for instance in a test suite
> >
> > Not sending the close notify alert is a violation of the TLS
> > specifications, but there is software out there that doesn't send it,
> > so there is a need to be able to deal with peers that don't send
> > it.
> >
> > When the close notify alert wasn't received, but we did get an
> > EOF, there are 2 cases:
> > - It's a fatal error, the protocol needs the close notify alert to
> >   prevent a truncation attack
> > - The protocol running on top of SSL can detect a truncation
> >   attack itself, and does so. It doesn't need the close notify
> >   alert. It's not a fatal error. They just want to check that that
> >   is what happened.
> >
> > The problem is that we can't make a distinction between the 2
> > cases, so the only thing we can do currently is to look at
> > it as a fatal error. So the documentation was changed to say
> > that if you're in the 2nd case, and need to talk to a peer
> > that doesn't send the close notify alert, you should not wait
> > for the close notify alert, so that you don't get an error.
> >
> > The alternative is that we add an option that does tell us if we
> > should look at it as a fatal error or not.
> >
> > There is at least a request that we keep returning the old error code
> > (SSL_ERROR_SYSCALL with errno 0). I think that from an API point
> > of view this is very confusing. We'd then have SSL_ERROR_SYSCALL
> > as documented that it's fatal, except when errno is 0. If errno is
> > 0, it can either be fatal or not, and we're not telling you which
> > one it is. I think we also can't guarantee that SSL_ERROR_SYSCALL
> > with errno 0 is always the unexpected EOF, we returned that
> > combination because of a bug in OpenSSL.
> >
> > So I think we need at least to agree on:
> > - Do we want an option that makes the unexpected EOF either a fatal
> >   error or a non-fatal error?
> > - Which error should we return?
>
> From application developer side of view for protocols that do not
> depend on SSL detecting the truncation I think inventing a different
> behavior for this condition than the existing one as of 1.1.1f would be
> clearly wrong. Switching on a SSL_OP if that newly provided OP is 

Some more extra tests

2020-05-07 Thread Dmitry Belyavsky
Dear colleagues,

Let me draw your attention to a potentially reasonable set of extended
tests for the openssl engines.

The gost-engine project (https://github.com/gost-engine/engine) has some
test scenarios robust enough for testing engine-provided algorithms and
some basic RSA regression tests. It contains a rather eclectic set of C,
Perl, and TCL(!) tests that are used by me on a regular basis.

If these tests are included in the project extended test suite, they could
help reduce the regressions that sometimes occur (see
https://github.com/gost-engine/engine/issues/232 as a current list of known
problems).

I will be happy to assist in enabling these tests as a part of openssl test
suites.
Many thanks!

-- 
SY, Dmitry Belyavsky


Monthly Status Report (April)

2020-05-07 Thread Matt Caswell
As well as normal reviews, responding to user queries, wiki user
requests, OMC business, handling security reports, etc., key activities
this month:

- Ongoing review work on the CMP contribution
- Fixed some issues with the XTS documentation
- Updated WPACKET to be able to do "end first" writing to support
DER_w_* functions
- Made X509_STORE_CTX libctx aware
- Updated the CT code to be library context aware
- Enabled export_to functions to have access to the libctx
- Made PrivateKey loading libctx aware
- Enabled Ed25519/Ed448 signing/verifying to be libctx aware
- Investigated and created a POC for CVE-2020-1967
- Made X509_verify() libctx aware
- PR to run sslapitest with the FIPS module
- PR to run ssl_test_new with the FIPS module
- Investigated and fixed issue on website where the scripts failed if we
only had one tarball
- PR to run ssl_test_old with the FIPS module
- Ensured calls to EC_POINT_point2buf use a libctx
- Ensured import_to functions pass a libctx
- Fixed an issue in libssl which resulted in no alert being sent even
though a fatal error occurred
- Wrote a wiki page about 3.0
- Performed the 1.1.1g release
- Fixed no-des
- Fixed no-ec
- Fixed no-dh and no-dsa
- Fixed no-deprecated tests when the GOST engine is present
- Fixed no-err
- Performed the alpha1 release
- Fixed ssl_test_old when SSLv3 is enabled
- Fixed typo in the makefile templates meaning that fips.so and
legacy.so were not being installed
- Fixed the raw provider key implementation
- Performed the 1.0.2v release for premium support customers
- Updated the testsuite to centralise environment variable setting
and fix a problem with test_includes


Matt



Re: Technically an API break

2020-05-07 Thread Matt Caswell



On 07/05/2020 16:02, Brian Smith wrote:
> This kind of change might cause memory unsafety issues unless the
> application is recompiled. At least, it's worth investigating that.
> 
> On most platforms the ABI of a function that returns `void` and one that
> returns `int` is the same, from the perspective of a caller that doesn't
> expect or use the return value. I seem to vaguely remember in the past
> that there was at least one common platform where that isn't true
> though. Unfortunately I cannot remember which one it is. I also don't
> remember if it is problematic to change from "int" to "void" or "void"
> to "int" or both.
> 
> Anyway, my point is that you should consider this an ABI-breaking
> change, not just an API breaking one.

Yes - thanks for that Brian. Actually though this change is targeted
only at the master branch (which will become OpenSSL 3.0). That is a
major release and is already ABI breaking - so recompilation is already
a requirement. Actually as a major release we are allowed to be API
breaking too, but we are trying to keep that to a minimum.

Matt



Re: Technically an API break

2020-05-07 Thread Brian Smith
Matt Caswell wrote:

> PR11589 makes a change to the public API function
> `SSL_set_record_padding_callback` to change its return type from void to
> int:
>
> https://github.com/openssl/openssl/pull/11589
>
> This is technically an API break - but it doesn't seem too serious. It's
> possible, I suppose, that existing applications that use this will fail
> to spot the error return since this function can now fail. The function
> itself was only recently added (in 1.1.1), and I suspect real-world
> usage is very small (or possibly nil).
>
> Is this considered ok?
>

This kind of change might cause memory unsafety issues unless the
application is recompiled. At least, it's worth investigating that.

On most platforms the ABI of a function that returns `void` and one that
returns `int` is the same, from the perspective of a caller that doesn't
expect or use the return value. I seem to vaguely remember in the past that
there was at least one common platform where that isn't true though.
Unfortunately I cannot remember which one it is. I also don't remember if
it is problematic to change from "int" to "void" or "void" to "int" or both.

Anyway, my point is that you should consider this an ABI-breaking change,
not just an API breaking one.
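
The source-level half of this concern can be illustrated with a short sketch. Everything here is invented for illustration: the `conn` struct, the callback type, the `callbacks_locked` failure condition and the 1/0 return convention are assumptions, not the actual definition touched by the PR. The sketch only shows that a caller written against a void-returning setter keeps compiling once the setter returns int, and silently misses a failure; it says nothing about the binary-level question raised above.

```c
#include <stddef.h>

/* Hypothetical callback type and connection object, invented for this sketch. */
typedef size_t (*padding_cb)(size_t record_len, void *arg);

struct conn {
    padding_cb cb;
    int callbacks_locked;   /* invented failure condition */
};

/* After the change the setter returns 1 on success and 0 on failure.
 * Before the change the same function would have returned void. */
int conn_set_padding_cb(struct conn *c, padding_cb cb)
{
    if (c->callbacks_locked)
        return 0;
    c->cb = cb;
    return 1;
}

/* Written against the old void-returning signature: still compiles,
 * but a failure goes unnoticed. */
void legacy_setup(struct conn *c, padding_cb cb)
{
    conn_set_padding_cb(c, cb);
}

/* Written against the new signature: the failure is visible. */
int checked_setup(struct conn *c, padding_cb cb)
{
    return conn_set_padding_cb(c, cb) ? 0 : -1;
}

/* Trivial callback used to exercise the setter. */
size_t no_padding(size_t record_len, void *arg)
{
    (void)record_len;
    (void)arg;
    return 0;
}
```

With a locked connection, `legacy_setup()` returns normally and leaves `cb` unset, while `checked_setup()` reports -1.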

Cheers,
Brian


Re: Unexpected EOF handling

2020-05-07 Thread Tomas Mraz
On Thu, 2020-05-07 at 15:45 +0200, Kurt Roeckx wrote:
> On Thu, May 07, 2020 at 03:15:22PM +0200, Tomas Mraz wrote:
> > Actually the coincidence is that the errno is set to 0 on EOF. So
> > yes,
> > we should explicitly clear the errno on EOF so any leftover value
> > from
> > previous calls does not affect this.
> 
> On EOF, errno is normally not modified. Its value is not defined
> if no error is returned. It is not guaranteed to be 0 on success
> or EOF. It can be modified, because the implementation might have
> done other system calls that did return an error. But a simple test
> shows that it's not modified on my system.
> 

Yeah, that's what I actually meant, sorry for not being clear. I did
not mean that the errno is explicitly set to 0 on EOF by the read call
but that the errno is 0 because it is not modified and was 0 before
coincidentally. 

-- 
Tomáš Mráz
No matter how far down the wrong road you've gone, turn back.
  Turkish proverb
[You'll know whether the road is wrong if you carefully listen to your
conscience.]




Re: Unexpected EOF handling

2020-05-07 Thread Kurt Roeckx
On Thu, May 07, 2020 at 03:15:22PM +0200, Tomas Mraz wrote:
> 
> Actually the coincidence is that the errno is set to 0 on EOF. So yes,
> we should explicitly clear the errno on EOF so any leftover value from
> previous calls does not affect this.

On EOF, errno is normally not modified. Its value is not defined
if no error is returned. It is not guaranteed to be 0 on success
or EOF. It can be modified, because the implementation might have
done other system calls that did return an error. But a simple test
shows that it's not modified on my system.
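
A caller-side consequence of this is that errno should only be consulted when the call actually reported a failure. The following is a minimal sketch for a POSIX system; `checked_read` and `struct read_result` are names invented here, not part of any library.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Invented result type: pairs the read() return value with an error
 * code that is only meaningful when nread is -1. */
struct read_result {
    ssize_t nread;  /* > 0 data, 0 EOF, -1 failure */
    int error;      /* errno of the failure, 0 otherwise */
};

struct read_result checked_read(int fd, void *buf, size_t len)
{
    struct read_result r;

    r.nread = read(fd, buf, len);
    /* Capture errno only when read() reported failure. On success or
     * EOF its value may be stale from an earlier call, so it is ignored. */
    r.error = (r.nread == -1) ? errno : 0;
    return r;
}
```

Used on a pipe whose write end has been closed, this returns `nread == 0` and `error == 0` even if errno held a leftover value before the call.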


Kurt



Re: Unexpected EOF handling

2020-05-07 Thread Tomas Mraz
On Thu, 2020-05-07 at 14:53 +0200, Kurt Roeckx wrote:
> On Thu, May 07, 2020 at 01:46:05PM +0200, Tomas Mraz wrote:
> > From application developer side of view for protocols that do not
> > depend on SSL detecting the truncation I think inventing a
> > different
> > behavior for this condition than the existing one as of 1.1.1f
> > would be
> > clearly wrong. Switching on a SSL_OP if that newly provided OP is a
> > trivial change to an application that needs to accommodate various
> > versions of OpenSSL (and I am not talking about forks), however
> > switching on a SSL_OP and also add another special error case is
> > much
> > more complicated change and has potential for bringing regressions
> > in
> > the applications that need to do it.
> 
> Of course, just adding a call to get the old behaviour back is a
> very easy change. But I currently don't see how a different error
> is that much more complicated.
> 
> > Is there really another situation where SSL_ERROR_SYSCALL with
> > errno 0
> > could be returned apart from the unclean EOF condition?
> > 
> > I can't really think of any.
> 
> It's not because we can't think of any other case that there aren't
> any.
> 
> I also have a problem with checking errno being 0. We don't set
> errno. There is no reason to assume that errno is set to any
> value. errno can be modified on a successful call. That errno is 0
> is just a coincidence. If we're going to document that, we should
> actually make sure that that is the case.

Actually the coincidence is that the errno is set to 0 on EOF. So yes,
we should explicitly clear the errno on EOF so any leftover value from
previous calls does not affect this.
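
What "explicitly clear the errno on EOF" could look like at the transport layer is sketched below for a POSIX system. `transport_read` is a hypothetical wrapper invented for this illustration; it is not how OpenSSL's BIO layer is implemented.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical transport read: > 0 for data, 0 for EOF, -1 on error.
 * On EOF errno is forced to 0, so "EOF with errno 0" becomes a
 * guarantee rather than a coincidence. */
ssize_t transport_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n == 0)
        errno = 0;  /* discard any leftover value from earlier calls */
    return n;
}
```

Only with a step like this in place would it be safe to document the errno 0 case as part of the interface.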

> I think the other property of the old behaviour is that we don't
> add anything to the error stack.
> 
> > So I would be just for properly documenting the condition and
> > keeping
> > it as is if the SSL_OP to ignore unclean EOF is in effect.
> 
> And also don't add an option for applications that do want to get
> a fatal error?

Why another option? That would be the default behavior.

But anyway, I like Matt's proposal even more.

-- 
Tomáš Mráz
No matter how far down the wrong road you've gone, turn back.
  Turkish proverb
[You'll know whether the road is wrong if you carefully listen to your
conscience.]




Re: Unexpected EOF handling

2020-05-07 Thread Kurt Roeckx
On Thu, May 07, 2020 at 01:46:05PM +0200, Tomas Mraz wrote:
> From application developer side of view for protocols that do not
> depend on SSL detecting the truncation I think inventing a different
> behavior for this condition than the existing one as of 1.1.1f would be
> clearly wrong. Switching on a SSL_OP if that newly provided OP is a
> trivial change to an application that needs to accommodate various
> versions of OpenSSL (and I am not talking about forks), however
> switching on a SSL_OP and also add another special error case is much
> more complicated change and has potential for bringing regressions in
> the applications that need to do it.

Of course, just adding a call to get the old behaviour back is a
very easy change. But I currently don't see how a different error
is that much more complicated.

> Is there really another situation where SSL_ERROR_SYSCALL with errno 0
> could be returned apart from the unclean EOF condition?
> 
> I can't really think of any.

Just because we can't think of any other case doesn't mean there aren't
any.

I also have a problem with checking errno being 0. We don't set
errno. There is no reason to assume that errno is set to any
value. errno can be modified on a successful call. That errno is 0
is just a coincidence. If we're going to document that, we should
actually make sure that that is the case.

I think the other property of the old behaviour is that we don't
add anything to the error stack.

> So I would be just for properly documenting the condition and keeping
> it as is if the SSL_OP to ignore unclean EOF is in effect.

And also don't add an option for applications that do want to get
a fatal error?


Kurt



Re: Unexpected EOF handling

2020-05-07 Thread Tomas Mraz
On Thu, 2020-05-07 at 12:47 +0100, Matt Caswell wrote:
> 
> On 07/05/2020 12:22, Kurt Roeckx wrote:
> > So I think we need at least to agree on:
> > - Do we want an option that makes the unexpected EOF either a fatal
> >   error or a non-fatal error?
> > - Which error should we return?
> 
> This is an excellent summary of the current situation.
> 
> I am not keen on maintaining the SSL_ERROR_SYSCALL with 0 errno as a
> long term solution. It's a very confusing API for new applications to
> use. Effectively it means SSL_ERROR_SYSCALL is a fatal error - except
> > when it's not. SSL_ERROR_SYSCALL should mean fatal error.
> 
> That said I also recognise that it is apparently commonplace to shut
> down a TLS connection without sending close_notify - despite what the
> standards may say about it (and TBH we can hardly claim the moral
> high
> ground here since s_server does exactly this - or at least it does in
> 1.1.1 and did until very recently in master).
> 
> But we do have to consider usages beyond HTTPS. I have no idea if
> this
> occurs in other settings or not.
> 
> Perhaps what we need is an option for "strict shutdown". With strict
> shutdown "off" we could treat EOF as if we had received a
> close_notify
> gracefully (and don't invalidate the session). Presumably existing
> code
> would be able to cope with that???

Yes, existing code would be able to cope with that with one important
exception that I am going to talk about below.

> With strict shutdown "on" we treat it as SSL_ERROR_SSL (as now in
> master).
> 
> I'm not sure though what the default should be.

In case we go with this solution, which would be acceptable I think, we
MUST NOT EVER make treating EOF as a clean shutdown the default, because
existing applications might depend on the existing way the unclean EOF
condition is reported to detect the truncation attack.

The existing legacy way does not really prevent applications from
detecting the truncation attack because the condition is reported to
the application albeit in this legacy undocumented way. So changing the
default to mean - never report the unclean EOF condition at all, simply
return as if the shutdown was clean, would actually mean a security
issue for these existing applications that care about the unclean EOF.

So yes, perhaps it would be better for the SSL_OP to actually fully
ignore the unclean EOF instead of returning this undocumented error
condition, but it must not be a default (not even in SSL_OP_ALL).
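
The distinction between applications that can detect truncation themselves and those that rely on the close report can be sketched as follows. All names here are invented for illustration; the sketch assumes an application protocol that either announces a message length up front or has no framing of its own.

```c
#include <stdbool.h>
#include <stddef.h>

/* How the connection ended, as seen by the application. */
enum close_kind {
    CLOSE_CLEAN,    /* close_notify was received */
    CLOSE_UNCLEAN   /* bare EOF without close_notify */
};

enum verdict { VERDICT_COMPLETE, VERDICT_TRUNCATED };

/* Invented bookkeeping for one message of the application protocol. */
struct transfer {
    size_t declared_len;  /* length announced by the protocol, if any */
    size_t received_len;  /* bytes actually received */
    bool length_known;    /* whether the protocol has its own framing */
};

enum verdict judge_close(const struct transfer *t, enum close_kind kind)
{
    /* With its own framing the protocol can decide by itself; how the
     * connection was closed does not matter. */
    if (t->length_known)
        return t->received_len >= t->declared_len ? VERDICT_COMPLETE
                                                  : VERDICT_TRUNCATED;

    /* Without framing, only a clean close shows the data is complete. */
    return kind == CLOSE_CLEAN ? VERDICT_COMPLETE : VERDICT_TRUNCATED;
}
```

If the library reported every unclean EOF as `CLOSE_CLEAN` by default, the last branch would accept truncated data, which is the security issue described above for applications that care about the unclean EOF.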

-- 
Tomáš Mráz
No matter how far down the wrong road you've gone, turn back.
  Turkish proverb
[You'll know whether the road is wrong if you carefully listen to your
conscience.]




Re: Unexpected EOF handling

2020-05-07 Thread Tomas Mraz
On Thu, 2020-05-07 at 13:22 +0200, Kurt Roeckx wrote:
> Hi,
> 
> We introduced a change in the 1.1.1e release that changed how we
> handled an unexpected EOF. This broke various software which
> resulted in the release of 1.1.1f that reverted that change.
> In master we still have the 1.1.1e behaviour.
> 
> The change itself is small, it just calls SSLfatal() with a new
> error code. This has at least 2 effects:
> - The error code was changed from SSL_ERROR_SYSCALL with errno 0
>   to SSL_ERROR_SSL with reason code
>   SSL_R_UNEXPECTED_EOF_WHILE_READING
> - The session is marked as in error, and so can't be used to
>   resume.
> 
> There is code written that now checks for the SSL_ERROR_SYSCALL
> with errno 0 case, while we never documented that behaviour. The
> 1.1.1 manpage of SSL_get_error() currently reports this as a bug
> and that it will change to SSL_ERROR_SSL /
> SSL_R_UNEXPECTED_EOF_WHILE_READING.
> 
> Code that checks the SSL_ERROR_SYSCALL with errno 0 seems to fall
> in at least 2 categories:
> - Ignore the error
> - Check for the error, for instance in a test suite
> 
> Not sending the close notify alert is a violation of the TLS
> specifications, but there is software out there that doesn't send it,
> so there is a need to be able to deal with peers that don't send
> it.
> 
> When the close notify alert wasn't received, but we did get an
> EOF, there are 2 cases:
> - It's a fatal error, the protocol needs the close notify alert to
>   prevent a truncation attack
> - The protocol running on top of SSL can detect a truncation
>   attack itself, and does so. It doesn't need the close notify
>   alert. It's not a fatal error. They just want to check that that
>   is what happened.
> 
> The problem is that we can't make a distinction between the 2
> cases, so the only thing we can do currently is to look at
> it as a fatal error. So the documentation was changed to say
> that if you're in the 2nd case, and need to talk to a peer
> that doesn't send the close notify alert, you should not wait
> for the close notify alert, so that you don't get an error.
> 
> The alternative is that we add an option that does tell us if we
> should look at it as a fatal error or not.
> 
> There is at least a request that we keep returning the old error code
> (SSL_ERROR_SYSCALL with errno 0). I think that from an API point
> of view this is very confusing. We'd then have SSL_ERROR_SYSCALL
> as documented that it's fatal, except when errno is 0. If errno is
> 0, it can either be fatal or not, and we're not telling you which
> one it is. I think we also can't guarantee that SSL_ERROR_SYSCALL
> with errno 0 is always the unexpected EOF, we returned that
> combination because of a bug in OpenSSL.
> 
> So I think we need at least to agree on:
> - Do we want an option that makes the unexpected EOF either a fatal
>   error or a non-fatal error?
> - Which error should we return?

From an application developer's point of view, for protocols that do not
depend on SSL detecting the truncation, I think inventing a behavior for
this condition that differs from the existing one as of 1.1.1f would be
clearly wrong. Switching on a newly provided SSL_OP is a trivial change
for an application that needs to accommodate various versions of OpenSSL
(and I am not talking about forks); however, switching on an SSL_OP and
also adding another special error case is a much more complicated change
and has the potential to introduce regressions in the applications that
need to do it.

It is also an API break, however we would do it anyway unless we make
the legacy behavior the default one, so that is not really relevant
here.

Is there really another situation where SSL_ERROR_SYSCALL with errno 0
could be returned apart from the unclean EOF condition?

I can't really think of any.

So I would simply be in favour of properly documenting the condition and
keeping it as is if the SSL_OP to ignore unclean EOF is in effect.

-- 
Tomáš Mráz
No matter how far down the wrong road you've gone, turn back.
  Turkish proverb
[You'll know whether the road is wrong if you carefully listen to your
conscience.]




Re: Unexpected EOF handling

2020-05-07 Thread Matt Caswell



On 07/05/2020 12:22, Kurt Roeckx wrote:
> So I think we need at least to agree on:
> - Do we want an option that makes the unexpected EOF either a fatal
>   error or a non-fatal error?
> - Which error should we return?

This is an excellent summary of the current situation.

I am not keen on maintaining the SSL_ERROR_SYSCALL with 0 errno as a
long term solution. It's a very confusing API for new applications to
use. Effectively it means SSL_ERROR_SYSCALL is a fatal error - except
when it's not. SSL_ERROR_SYSCALL should mean fatal error.

That said I also recognise that it is apparently commonplace to shut
down a TLS connection without sending close_notify - despite what the
standards may say about it (and TBH we can hardly claim the moral high
ground here since s_server does exactly this - or at least it does in
1.1.1 and did until very recently in master).

But we do have to consider usages beyond HTTPS. I have no idea if this
occurs in other settings or not.

Perhaps what we need is an option for "strict shutdown". With strict
shutdown "off" we could treat EOF as if we had received a close_notify
gracefully (and don't invalidate the session). Presumably existing code
would be able to cope with that???

With strict shutdown "on" we treat it as SSL_ERROR_SSL (as now in master).
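
To make the proposal concrete, here is a minimal sketch of the two modes.
Everything in it is a hypothetical stand-in (the `demo_` names, the struct,
the function); it only models the decision described above and is not an
OpenSSL API. The "not resumable" outcome for the strict case follows the
current master behaviour described elsewhere in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for what the application would observe. */
enum demo_result {
    DEMO_ZERO_RETURN, /* as if close_notify had been received */
    DEMO_FATAL_SSL    /* reported like SSL_ERROR_SSL */
};

struct demo_outcome {
    enum demo_result result;
    bool session_resumable;
};

/* Models how an EOF without close_notify would be reported depending on
 * the proposed "strict shutdown" setting. */
struct demo_outcome on_unexpected_eof(bool strict_shutdown)
{
    struct demo_outcome out;

    if (strict_shutdown) {
        /* Strict: fatal error, session marked as in error. */
        out.result = DEMO_FATAL_SSL;
        out.session_resumable = false;
    } else {
        /* Lenient: treated as a graceful close, session stays valid. */
        out.result = DEMO_ZERO_RETURN;
        out.session_resumable = true;
    }
    return out;
}
```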

I'm not sure though what the default should be.

Matt


Unexpected EOF handling

2020-05-07 Thread Kurt Roeckx
Hi,

We introduced a change in the 1.1.1e release that changed how we
handled an unexpected EOF. This broke various software which
resulted in the release of 1.1.1f that reverted that change.
In master we still have the 1.1.1e behaviour.

The change itself is small, it just calls SSLfatal() with a new
error code. This has at least 2 effects:
- The error code was changed from SSL_ERROR_SYSCALL with errno 0
  to SSL_ERROR_SSL with reason code
  SSL_R_UNEXPECTED_EOF_WHILE_READING
- The session is marked as in error, and so can't be used to
  resume.

There is code written that now checks for the SSL_ERROR_SYSCALL
with errno 0 case, while we never documented that behaviour. The
1.1.1 manpage of SSL_get_error() currently reports this as a bug
and that it will change to SSL_ERROR_SSL /
SSL_R_UNEXPECTED_EOF_WHILE_READING.
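
As an illustration of what such application code has to cope with, the
two reporting styles could be folded into one check. This is only a
sketch: the enum values and the `is_unexpected_eof` helper are
hypothetical stand-ins, not OpenSSL definitions. In a real program the
inputs would come from SSL_get_error(), errno and the reason code of the
queued error.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the values an application would obtain from
 * SSL_get_error(), errno and the error queue; not the OpenSSL definitions. */
enum demo_ssl_error { DEMO_ERROR_NONE, DEMO_ERROR_SSL, DEMO_ERROR_SYSCALL };
enum demo_reason { DEMO_REASON_OTHER, DEMO_REASON_UNEXPECTED_EOF };

/* Returns true when a failed read looks like "the peer closed the
 * transport without sending close_notify", under either reporting style:
 *  - 1.1.1f and earlier: SSL_ERROR_SYSCALL with errno 0
 *  - 1.1.1e and master:  SSL_ERROR_SSL with the unexpected-EOF reason */
bool is_unexpected_eof(enum demo_ssl_error err, int sys_errno,
                       enum demo_reason reason)
{
    if (err == DEMO_ERROR_SYSCALL && sys_errno == 0)
        return true;
    if (err == DEMO_ERROR_SSL && reason == DEMO_REASON_UNEXPECTED_EOF)
        return true;
    return false;
}
```

Note that the first branch relies on the undocumented behaviour, so it
is a heuristic rather than a guarantee.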

Code that checks the SSL_ERROR_SYSCALL with errno 0 seems to fall
in at least 2 categories:
- Ignore the error
- Check for the error, for instance in a test suite

Not sending the close notify alert is a violation of the TLS
specifications, but there is software out there that doesn't send it,
so there is a need to be able to deal with peers that don't send
it.

When the close notify alert wasn't received, but we did get an
EOF, there are 2 cases:
- It's a fatal error, the protocol needs the close notify alert to
  prevent a truncation attack
- The protocol running on top of SSL can detect a truncation
  attack itself, and does so. It doesn't need the close notify
  alert. It's not a fatal error. They just want to check that that
  is what happened.
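
A rough sketch of the difference between the 2 cases, assuming a
hypothetical upper-layer protocol that announces the message length up
front (HTTP with Content-Length is one such protocol). The function name
and parameters are made up for the example and are not part of OpenSSL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Decides whether an EOF that arrived without a close notify alert has
 * to be treated as fatal by the application.
 *
 * self_delimiting: true if the protocol on top of SSL carries its own
 *                  length or end marker (2nd case), false otherwise
 *                  (1st case).
 * declared_len:    message length announced by the peer.
 * received_len:    number of message bytes actually received. */
bool eof_without_close_notify_is_fatal(bool self_delimiting,
                                       size_t declared_len,
                                       size_t received_len)
{
    /* 1st case: only the close notify alert proves nothing was cut off. */
    if (!self_delimiting)
        return true;

    /* 2nd case: the protocol itself shows whether data is missing. */
    return received_len != declared_len;
}
```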

The problem is that we can't make a distinction between the 2
cases, so the only thing we can do currently is to look at
it as a fatal error. So the documentation was changed to say
that if you're in the 2nd case, and need to talk to a peer
that doesn't send the close notify alert, you should not wait
for the close notify alert, so that you don't get an error.

The alternative is that we add an option that does tell us if we
should look at it as a fatal error or not.

There is at least one request that we keep returning the old error code
(SSL_ERROR_SYSCALL with errno 0). I think that from an API point
of view this is very confusing. We'd then have SSL_ERROR_SYSCALL
documented as fatal, except when errno is 0. If errno is
0, it can either be fatal or not, and we're not telling you which
one it is. I think we also can't guarantee that SSL_ERROR_SYSCALL
with errno 0 is always the unexpected EOF; we returned that
combination because of a bug in OpenSSL.

So I think we need at least to agree on:
- Do we want an option that makes the unexpected EOF either a fatal
  error or a non-fatal error?
- Which error should we return?


Kurt



Re: Technically an API break

2020-05-07 Thread Richard Levitte
On Thu, 07 May 2020 10:31:42 +0200,
Tomas Mraz wrote:
> 
> On Thu, 2020-05-07 at 09:24 +0100, Matt Caswell wrote:
> > PR11589 makes a change to the public API function
> > `SSL_set_record_padding_callback` to change its return type from void
> > to
> > int:
> > 
> > https://github.com/openssl/openssl/pull/11589
> > 
> > This is technically an API break - but it doesn't seem too serious.
> > It's
> > possible, I suppose, that existing applications that use this will
> > fail
> > to spot the error return since this function can now fail. The
> > function
> > itself was only recently added (in 1.1.1), and I suspect real-world
> > usage is very small (or possibly nil).
> > 
> > Is this considered ok?
> 
> I would say this is an acceptable thing if it is documented (which it
> is in the PR). Especially because the error return can happen only when
> the application sets up the SSL to use kernel TLS.

I agree with this assessment.

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Technically an API break

2020-05-07 Thread Tomas Mraz
On Thu, 2020-05-07 at 09:24 +0100, Matt Caswell wrote:
> PR11589 makes a change to the public API function
> `SSL_set_record_padding_callback` to change its return type from void
> to
> int:
> 
> https://github.com/openssl/openssl/pull/11589
> 
> This is technically an API break - but it doesn't seem too serious.
> It's
> possible, I suppose, that existing applications that use this will
> fail
> to spot the error return since this function can now fail. The
> function
> itself was only recently added (in 1.1.1), and I suspect real-world
> usage is very small (or possibly nil).
> 
> Is this considered ok?

I would say this is an acceptable thing if it is documented (which it
is in the PR). Especially because the error return can happen only when
the application sets up the SSL to use kernel TLS.

-- 
Tomáš Mráz
No matter how far down the wrong road you've gone, turn back.
  Turkish proverb
[You'll know whether the road is wrong if you carefully listen to your
conscience.]




Technically an API break

2020-05-07 Thread Matt Caswell
PR11589 makes a change to the public API function
`SSL_set_record_padding_callback` to change its return type from void to
int:

https://github.com/openssl/openssl/pull/11589

This is technically an API break - but it doesn't seem too serious. It's
possible, I suppose, that existing applications that use this will fail
to spot the error return since this function can now fail. The function
itself was only recently added (in 1.1.1), and I suspect real-world
usage is very small (or possibly nil).

Is this considered ok?

Matt
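
For illustration, a small model of what the changed return type means for
callers. The `demo_` types and functions below are hypothetical stand-ins
written for this sketch, not the OpenSSL declarations. Following the
replies in this thread, failure is modelled as happening only when kernel
TLS is in use, and the 1/0 return convention is an assumption of the
sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns the number of padding bytes to add. */
typedef size_t (*demo_padding_cb)(int type, size_t len, void *arg);

/* Hypothetical connection object for the sketch. */
struct demo_conn {
    bool kernel_tls;            /* records are handled by the kernel */
    demo_padding_cb padding_cb; /* currently installed callback */
};

/* Example callback: pad every record up to a multiple of 16 bytes. */
size_t demo_pad_to_16(int type, size_t len, void *arg)
{
    (void)type;
    (void)arg;
    return (16 - len % 16) % 16;
}

/* New-style setter: returns 1 on success and 0 on failure. With the old
 * void return type a caller had no way to notice the failure case. */
int demo_set_record_padding_callback(struct demo_conn *c, demo_padding_cb cb)
{
    if (c->kernel_tls)
        return 0; /* callback is not installed */
    c->padding_cb = cb;
    return 1;
}
```

An existing caller that ignores the return value still compiles, which is
why the break is only "technical"; it just never learns that the callback
was not installed.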