Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Viktor Dukhovni
On Fri, Jun 14, 2019 at 01:41:51PM +1000, Dr Paul Dale wrote:

> I’m behind ditching the function identifier #defines but not their text names.

Good to hear.

> #define ERR_raise_error ERR_raise_error_internal(__FILE__, __LINE__, __FUNC__)

Well, __FUNC__ is entirely non-standard, and __func__ is C99.  Are
we ready to abandon C89/C90?  If not, then __func__ (and variants)
becomes compiler-specific.

In test/testutil.h, we have some of the requisite gymnastics:

# if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
#  if defined(_MSC_VER)
#   define TEST_CASE_NAME __FUNCTION__
#  else
#   define testutil_stringify_helper(s) #s
#   define testutil_stringify(s) testutil_stringify_helper(s)
#   define TEST_CASE_NAME __FILE__ ":" testutil_stringify(__LINE__)
#  endif/* _MSC_VER */
# else
#  define TEST_CASE_NAME __func__
# endif /* __STDC_VERSION__ */

While the GCC manual: 
http://gcc.gnu.org/onlinedocs/gcc-4.8.1/gcc/Function-Names.html
suggests:

 #if __STDC_VERSION__ < 199901L
 # if __GNUC__ >= 2
 #  define __func__ __FUNCTION__
 # else
 #  define __func__ ""
 # endif
 #endif

we would also need similar for any other pre-C99 supported compilers.

That said, I'm still in favour of function strings, and just the
error and library codes as numeric.

-- 
Viktor.


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Dr Paul Dale
I’m behind ditching the function identifier #defines but not their text names.


The varargs based error function seems like a good idea too.  It would be nicer 
if the ERR_raise_error function included the file and line information without 
them needing to be specified.  As an exercise for the reader, I’ll skip any 
implementation details but will note that it is possible without using variadic 
macros.


Pauli


Hint:
#define ERR_raise_error ERR_raise_error_internal(__FILE__, __LINE__, __FUNC__)
int (*)(int, const char *, ...) ERR_raise_error_internal(const char *, int, 
const char *);
-- 
Dr Paul Dale | Cryptographer | Network Security & Encryption 
Phone +61 7 3031 7217
Oracle Australia



> On 14 Jun 2019, at 12:04 pm, Viktor Dukhovni  
> wrote:
> 
> On Wed, Jun 12, 2019 at 05:51:44AM +0200, Richard Levitte wrote:
> 
>> A discussion point in that PR is whether it's still interesting to
>> keep information on the system function/callback that was called when
>> we're reporting a system error, i.e. this type of error report:
>> 
>>SYSerr(SYS_F_FSTAT, errno);
>> 
>> (incidently, there's another PR, 9072, which changes those to
>> 'SYSerr("fstat", errno)')
>> 
>> So, the main points of discussion seem to be:
>> 
>> - should we remove function indicators in our error reporting?
>> - should we make an exception for system errors (SYSerr())?
> 
> Bottom line, my take is that function indicators should stay, but
> string names would be more convenient all around than numeric codes.
> 
> -- 
>   Viktor.



Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Richard Levitte
On Thu, 13 Jun 2019 22:04:17 +0200,
Roumen Petrov wrote:
> 
> Richard Levitte wrote:
> > On Thu, 13 Jun 2019 20:23:05 +0200,
> > Roumen Petrov wrote:
> >> Hello,
> >> 
> >> First I did not understand what is wrong to use function or
> >> reasons. All of them are subject to particular "library".
> > We didn't say anything against reason codes.  As a matter of fact, I
> > do find those useful.  Function codes, on the other hand, are unstable
> > and thereby quite useless, at least for figuring out errors.
> I'm not aware of OpenSSL documentation that describes reason codes as
> stable (fixed).

Perhaps, but that's not saying they aren't...  The reason codes,
specifically, have been fairly stable over time.  However, we can get
better.  Even better, we could get better at documenting them.

> If code is rewritten not only functions are changed. More or less
> reasons are changed as well.

Sometimes, that's true, and especially as new functionality has come
up.  We could certainly do a better job there by getting better at
giving consistent and more well defined reason codes.

> >> To parse openssl error code in an application code is bad practice.
> > Is it?  Would you say it's bad practice to look at errno codes too?
> No as already in one of my post I wrote that  errsrt utility is very useful.

I'm not talking about display for the moment.

> Bad practice is to check that reason code encoded into 'error' code
> has value of A or B.

Why?  Why would it be bad practice to check if the reason for an error
is ERR_R_MALLOC_FAILURE, for example?

You can make the exact same analogy with errno and checking its
values programatically.

git grep 'errno *[!=]='

Considering the result of the above command, would you say that we're
having bad practice in OpenSSL code?

> >> But developer could  format "extra message" for instance :
> >>      NSSerr(NSS_F_RSA_SIGN, NSS_R_UNSUPPORTED_NID);
> >>      {/*add extra error message data*/
> >>      char msgstr[10];
> >>      BIO_snprintf(msgstr, sizeof(msgstr), "%d", dtype);
> >>      ERR_add_error_data(2, "NID=", msgstr);
> >>      }
> > I'll counter with a (yet fictitious):
> > 
> >  ERR_raise_error(NSS_R_UNSUPPORED_NID, "NID=%d", dtype);
> Reason code could be shared between functions :

Err, yeah?  Of course they can!

> > or if you want to add information that helps debugging:
> > 
> >  ERR_raise_error_debug(NSS_R_UNSUPPORED_NID,
> >__FILE__, __LINE__, __FUNC__, "$Format:%H",
> >"NID=%d", dtype);
> 
> This look like complete different solution. __FILE__ exposes some
> 'private'  information (build related) and is some cases is not
> acceptable .

Maybe you should look at the WHATEVERerr macros, here's one:

# define SYSerr(f,r)  
ERR_PUT_error(ERR_LIB_SYS,(f),(r),OPENSSL_FILE,OPENSSL_LINE)

OPENSSL_FILE and OPENSSL_LINE are macros that are aliases for
__FILE__ and __LINE__, if those are available.

So we already do expose that "private information".  How other
applications call ERR_PUT_error() is none of our business.

> __FUNC__ does not look portable  - __func__ vs __FUNCTION__ vs "not
> defined".

It was just an example, to give you an idea...  I didn't test it or
check it for correctness.

> Also going into this direction question is why to use "reason code".

It's to allow calling applications to react differently depending on
the reason for an error to occur.  How else would you have them know?

> Functionality similar to errno, sys_errlist  is also outdated.

Ok, so what kind of functionality do you want to offer?  Remember,
we're talking about a language that doesn't have much error handling
per se.  More advanced languages have an exception system, but even
there, the applicatin can catch them and react differently depending
on what the error is, so apart from the out-of-band nature of an
exception system (and automated unwinding), the difference isn't
really that big.

> > (since this is just an idea so far, it's perfectly possible to throw
> > in a library code as well, but like I said already, dynamic library
> > codes have their own issue)
> > 
> > I know which I find easier to write.
> > 
> >> There is no reason OpenSSL code to use ERR_add_error_data as usually
> >> library packs whole functionality.
> > That's not really true, there are places where OpenSSL code does use
> > ERR_add_error_data(), and for good reason.
> Hmm, you cut my note for externals like IO CAPI ;)

Because I wasn't arguing that.  I was arguing that what you were
saying about OpenSSL code is incorrect.

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Start up entropy gathering

2019-06-13 Thread Kurt Roeckx
On Thu, Jun 13, 2019 at 05:06:16PM +1000, Dr Paul Dale wrote:
> 
> The second suggestion is broadly similar but requires a file containing 
> entropy that persists across reboots.  This alternative requires a more 
> management: the entropy file once read needs to be rewritten immediately (and 
> ideally on shutdown as well).  It also introduces a new attack vector against 
> the entropy storage.  It also isn’t possible to skip the entropy file 
> read/rewrite sequence because it is impossible to determine if /dev/urandom 
> has actually been seeded.  I’ve not attempted to code this, persistent files 
> containing seed material potentially introduce other problems.

This is what init systems have always done. I see no need to also
do it. They have a policy not to credit that the entropy from that
file, I see no reason why we should override that policy.


Kurt



Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Roumen Petrov

Richard Levitte wrote:

On Thu, 13 Jun 2019 20:23:05 +0200,
Roumen Petrov wrote:

Hello,

First I did not understand what is wrong to use function or
reasons. All of them are subject to particular "library".

We didn't say anything against reason codes.  As a matter of fact, I
do find those useful.  Function codes, on the other hand, are unstable
and thereby quite useless, at least for figuring out errors.
I'm not aware of OpenSSL documentation that describes reason codes as 
stable (fixed).
If code is rewritten not only functions are changed. More or less 
reasons are changed as well.




To parse openssl error code in an application code is bad practice.

Is it?  Would you say it's bad practice to look at errno codes too?

No as already in one of my post I wrote that  errsrt utility is very useful.
Bad practice is to check that reason code encoded into 'error' code has 
value of A or B.






Error *text* is a different matter.


Richard Levitte wrote:

On Thu, 13 Jun 2019 12:01:46 +0200,
Matt Caswell wrote:

   The
additional information you're talking about is something we currently
provide the ERR_add_error_data() function for, and that together with
the reason text (derived from the reason code) is the data the end
user can reasonably get.  It's up to whoever writes the error raising
code to provide enough useful information.

Yes, although in practice we don't currently do this (we very rarely add
additional explanatory text). Not sure if that is a problem with the API, our
coding standards, or our culture.

This is partly historical...  ERR_add_error_data() has been around
since the beginning of time, and it looks to me like it was designed
in a time where varargs hadn't universally caught on yet (yes, there
was a time before varargs, and it's appropriate to call that "the
stone age").

But developer could  format "extra message" for instance :
     NSSerr(NSS_F_RSA_SIGN, NSS_R_UNSUPPORTED_NID);
     {/*add extra error message data*/
     char msgstr[10];
     BIO_snprintf(msgstr, sizeof(msgstr), "%d", dtype);
     ERR_add_error_data(2, "NID=", msgstr);
     }

I'll counter with a (yet fictitious):

 ERR_raise_error(NSS_R_UNSUPPORED_NID, "NID=%d", dtype);

Reason code could be shared between functions :
$ grep CAPI_R_FILE_OPEN_ERROR *.c
e_capi.c:    CAPIerr(CAPI_F_CAPI_CTRL, CAPI_R_FILE_OPEN_ERROR);
e_capi.c:    CAPIerr(CAPI_F_CAPI_VTRACE, CAPI_R_FILE_OPEN_ERROR);
e_capi_err.c:    {ERR_PACK(0, 0, CAPI_R_FILE_OPEN_ERROR), "file open 
error"},


NSS engine has similar case:
$ grep NSS_R_MISSING_PVTKEY *.c
e_nss_dsa.c:    NSSerr(NSS_F_DSA_DO_SIGN, NSS_R_MISSING_PVTKEY);
e_nss_ec.c:    NSSerr(NSS_F_ECDSA_DO_SIGN, NSS_R_MISSING_PVTKEY);
e_nss_err.c:#define NSS_R_MISSING_PVTKEY 133
e_nss_err.c:    { ERR_REASON(NSS_R_MISSING_PVTKEY)  , 
"Missing private key" },

e_nss_key.c:    NSSerr(NSS_F_LOAD_KEY, NSS_R_MISSING_PVTKEY);
e_nss_rsa.c:    NSSerr(NSS_F_RSA_PRIV_DEC, NSS_R_MISSING_PVTKEY);
e_nss_rsa.c:    NSSerr(NSS_F_RSA_SIGN, NSS_R_MISSING_PVTKEY);



or if you want to add information that helps debugging:

 ERR_raise_error_debug(NSS_R_UNSUPPORED_NID,
   __FILE__, __LINE__, __FUNC__, "$Format:%H",
   "NID=%d", dtype);


This look like complete different solution. __FILE__ exposes some 
'private'  information (build related) and is some cases is not acceptable .
__FUNC__ does not look portable  - __func__ vs __FUNCTION__ vs "not 
defined".


Also going into this direction question is why to use "reason code".
Functionality similar to errno, sys_errlist  is also outdated.



(since this is just an idea so far, it's perfectly possible to throw
in a library code as well, but like I said already, dynamic library
codes have their own issue)

I know which I find easier to write.


There is no reason OpenSSL code to use ERR_add_error_data as usually
library packs whole functionality.

That's not really true, there are places where OpenSSL code does use
ERR_add_error_data(), and for good reason.

Hmm, you cut my note for externals like IO CAPI ;)

  BIO_lookup and friends add
the extra resolver error on error, the CONF code adds information on
exactly what value causes a configuration file parsing error, the DSO
code adds information on exactly what file it attempted to load (full
path if possible), etc etc etc...  those are things that can't be part
of the error code.

Cheers,
Richard



Roumen


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Matthew Lindner
Hi I'm a developer that writes code that interfaces with OpenSSL. One
perspective I think that is missing that I haven't seen is that error
codes that point to the exact point in the OpenSSL code that caused an
issue are often needed to debug code that interfaces with OpenSSL.
Importantly because the documentation is often lacking complete
clarity on exactly what any given OpenSSL function does. We (using
openssl 1.0.2) from time to time get errors from within OpenSSL that
need diving into the OpenSSL code to diagnose. If the errors don't
point to the location in the OpenSSL code the error was generated,
this makes it all the harder to diagnose issues.

Thanks,
Matthew


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Roumen Petrov

Hi Viktor,

Viktor Dukhovni wrote:

On Wed, Jun 12, 2019 at 10:02:25AM +0100, Matt Caswell wrote:


OTOH I do find them quite helpful from a debugging perspective, e.g. when people
send in questions along the lines of "I got this error what does it mean/how do
I fix it" - although what is actually useful is usually the function name rather
than the function code itself.

Indeed what's needed is the function name.  The numeric code is far
less important.  On the error consumer side, the idiom I'm familiar
with is:

 while ((err = ERR_get_error_line_data(, , , )) != 0) {
 ERR_error_string_n(err, buffer, sizeof(buffer));
 if (flags & ERR_TXT_STRING)
 printf("...: %s:%s:%d:%s", buffer, file, line, data);
 else
 printf("...: %s:%s:%d", buffer, file, line);
 }

this makes no explicit reference to function numbers, returning the
appropriate strings.  So any change is likely limited to error
producers.

Actually err encodes library, function and reason.



On the producer side, my ssl_dane library (used in Exim for example),
does depend on the function ordinal API:

 https://github.com/vdukhovni/ssl_dane/blob/master/danessl.c#L52-L118
 https://github.com/vdukhovni/ssl_dane/blob/master/danessl.c#L52-L118

Why you tables with functions and reasons does not use ERR_PACK ?



so that would need to change (or be longer supported) if the function
ordinals are replaced by strings, or otherwise change.





Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Richard Levitte
On Thu, 13 Jun 2019 18:26:41 +0200,
Salz, Rich wrote:
> 
> >The proper way to handle this, in my experience, is *DO NOT REUSE ERROR 
> > CODES.* Each error code appears in exactly one place, and we could 
> > eventually build up documentation explaining what they mean, the causes, 
> > and how to address this. This means, we don't use ERR_R_MALLOC when trying 
> > to create an RSA key, for example, but rather a handful of new errors for 
> > ERR_R_RSA_CANT_CREATE_D, ...CANT_CREATE_N, etc.  That is a big job, albeit 
> > mostly a tedious one.
>  
> I got some feedback on- and off-list about this. Most of it was
> "surely you can't be serious."  I am, and stop calling me
> Shirley. :) Let me add some details.  First, recall that OpenSSL has
> an error stack, and that as errors are "unwound" each function can
> add its own error code to that stack. This naturally leads to the
> point where the first entry has the most detailed error, "malloc
> failed" and the last entry has the most application-oriented error
> "Could not create RSA key"; along the way are "Could not create d"
> and "Could not create secure bignum" errors.  I hope that makes more
> sense.
> 
> HOWEVER, this point (which got the most comments) was a side-note to
> the main point of my email, which gave some reasons why I think
> including the function code is a bad idea.

So basically, what you're actually saying is that we should add
additional errors up the call stack...  so if some function called
OPENSSL_zalloc(), it should be perfectly OK to raise a
ERR_R_MALLOC_FAILURE, but functions above should *add* things like
WHATEVER_R_KEY_CREATE_FAILURE, etc etc etc, thereby creating that
backtrace that Tim talks about.

A point here is that the application may want to find out if there was
an allocation error -- at which point it might choose to simply bail --
or some other error that prevented the key to be created (insecure
amount of bits, say?) -- at which point it might choose to try and
mitigate.  The question is what's easier for the application, getting
the wanted information as a top error or having to walk to the bottom
of the error stack to figure things out.

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Richard Levitte
On Thu, 13 Jun 2019 12:01:46 +0200,
Matt Caswell wrote:
> >  The
> > additional information you're talking about is something we currently
> > provide the ERR_add_error_data() function for, and that together with
> > the reason text (derived from the reason code) is the data the end
> > user can reasonably get.  It's up to whoever writes the error raising
> > code to provide enough useful information.
> 
> Yes, although in practice we don't currently do this (we very rarely add
> additional explanatory text). Not sure if that is a problem with the API, our
> coding standards, or our culture.

This is partly historical...  ERR_add_error_data() has been around
since the beginning of time, and it looks to me like it was designed
in a time where varargs hadn't universally caught on yet (yes, there
was a time before varargs, and it's appropriate to call that "the
stone age").

In today's coding practices, I personally find ERR_add_error_data()
clumsy to deal with, so I seldom use it.  Also, being a separate
function, it's easy to forget that it's there and useful.  That's a
reason to think that having all integrated in one function call that
includes the flexibility of BIO_printf() probably would encourage
producing better information.

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Matt Caswell



On 13/06/2019 10:50, Richard Levitte wrote:
> 
> Good point, but note that there is no conflict with what I wrote.

Yes, I realise that.

>  The
> additional information you're talking about is something we currently
> provide the ERR_add_error_data() function for, and that together with
> the reason text (derived from the reason code) is the data the end
> user can reasonably get.  It's up to whoever writes the error raising
> code to provide enough useful information.

Yes, although in practice we don't currently do this (we very rarely add
additional explanatory text). Not sure if that is a problem with the API, our
coding standards, or our culture.

> 
> Incidently, I did mention thinking about a new (better, I hope) error
> raising function on github, and might as well repeat it here:
> 
> int ERR_raise_error(int reason, const char *fmt, ...);
> 
> To allow for debugging information (like I mentioned, some sensible
> combination of __FILE__, __LINE__, __FUNC__, and possibly git commit
> id), one might imagine this:
> 
> int ERR_add_debug_info(const char *file, size_t line,
>const char *func, const char *commitid);
> 
> or a combo of both:
> 
> int ERR_raise_error_debug(int reason,
>   const char *file, size_t line,
>   const char *func, const char *commitid,
>   const char *fmt, ...);
> 
> Note that this doesn't have to conflict with the current error raising
> code, all we need to do is to convert whatever input ERR_put_error()
> gets into the new form.

I think this is quite a good idea. One possible concern is an exponential growth
the amount of string data that we have. Not sure what impact that might have on
the size of the library.

Matt


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Richard Levitte
On Thu, 13 Jun 2019 11:16:38 +0200,
Matt Caswell wrote:
> 
> I agree with everything Richard just said. I just have some additional 
> thoughts
> inserted below.
> 
> On 13/06/2019 10:00, Richard Levitte wrote:
> > If we look at it from the perspective of the application author,
> > what's most often needed are reliable error/reason codes (to check and
> > to react appropriatly to) and associated reason strings (for display).
> > For the application author, it would normally be completely
> > uninteresting exactly which function the error happened in (think
> > about it, when's the last time you wanted to know exactly which
> > internal function in libc raised the EAGAIN you just got?).  For the
> > application author, the interesting bit is usually "what went wrong?"
> > (reason) and "what did I call?" (they already know that).
> > 
> > 
> > If we look at *our* needs as library writers, then the exact location
> > that raised the error is interesting, of course!  But then, is it
> > interesting in the form of codes, or are we rather interested in the
> > text form?  I dunno about you, but I don't give a damn about the
> > actual library and function codes, I look at the displayed library
> > names and function names.  (and yet, they aren't necessarily enough
> > without knowing the OpenSSL version).
> > 
> > 
> > Finally, if we look at it from a provider author's perspective,
> > dealing with all these different codes is quite hard.  Ideally, a
> > provider author should just need to deal with their own reason codes
> > and associated strings, and that's about it.
> 
> I think you missed another important perspective, i.e. that of the end user.
> 
> Note that this is slightly different to the needs of the application author. 
> An
> application author will typically do one of two things if an error occurs:
> 
> 1) Try and figure out the cause of the error and recover from it, e.g. if I 
> get
> reason code x then I'll retry the operation with slightly different parameters
> but if I get y then I'll just ignore this failure and carry on anyway.
> 
> or
> 
> 2) If its an error they don't know how to handle then give up and display
> whatever error details are available to the end user (or log them, or 
> whatever)
> 
> So the application author is going to want a small number of reason codes that
> they can handle. If its not in the small set they know about, then give up.
> 
> The end user OTOH doesn't care about reason codes at all and wants as detailed
> an explanation of the error (in text form) as is possible. Potentially that
> might mean different text for every point in the code where we raise an error.
> Ideally they might want information about the context that led up to the 
> error,
> i.e. what particular operation was being attempted at the time. I'm thinking
> that there is some relationship here between the information available from 
> the
> trace API, and the error API.

Good point, but note that there is no conflict with what I wrote.  The
additional information you're talking about is something we currently
provide the ERR_add_error_data() function for, and that together with
the reason text (derived from the reason code) is the data the end
user can reasonably get.  It's up to whoever writes the error raising
code to provide enough useful information.

Incidently, I did mention thinking about a new (better, I hope) error
raising function on github, and might as well repeat it here:

int ERR_raise_error(int reason, const char *fmt, ...);

To allow for debugging information (like I mentioned, some sensible
combination of __FILE__, __LINE__, __FUNC__, and possibly git commit
id), one might imagine this:

int ERR_add_debug_info(const char *file, size_t line,
   const char *func, const char *commitid);

or a combo of both:

int ERR_raise_error_debug(int reason,
  const char *file, size_t line,
  const char *func, const char *commitid,
  const char *fmt, ...);

Note that this doesn't have to conflict with the current error raising
code, all we need to do is to convert whatever input ERR_put_error()
gets into the new form.

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Tim Hudson
On Thu, Jun 13, 2019 at 6:40 PM Salz, Rich  wrote:

> The proper way to handle this, in my experience, is *DO NOT REUSE ERROR
> CODES.*


No. This is a path to a rather unacceptable outcome.
Taking your example and running forward with it, having an out-of-memory
separate error code for every possible context would like to *589 error
codes* for just handling out-of-memory in master at a single level.
And then on top of that you would need to cascade those errors up the call
chain potentially.

Each error condition that might require different treatment should have a
common code.
The concept of out-of-memory (as a simple example) is not context specific
(in terms of handling).
The concept of unable-to-open-file is also not context specific.

What the current error system does is to provide a context for the user
when error conditions occur to be able to have some idea as to what
happened.
It is very much like the java stack trace - and useful for a variety of
reasons.

The error system had two purposes:
1) allow for handling of an error by the calling function (or any function
up the call stack) in a different manner
2) provide the end user with context when things fail (as applications are
generally not very good at doing this)

Both of these are equally important - but aimed at different contexts
(developers, end-users).
Neither context should be ignored when considering changes IMHO.

Tim.


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Matt Caswell
I agree with everything Richard just said. I just have some additional thoughts
inserted below.

On 13/06/2019 10:00, Richard Levitte wrote:
> If we look at it from the perspective of the application author,
> what's most often needed are reliable error/reason codes (to check and
> to react appropriatly to) and associated reason strings (for display).
> For the application author, it would normally be completely
> uninteresting exactly which function the error happened in (think
> about it, when's the last time you wanted to know exactly which
> internal function in libc raised the EAGAIN you just got?).  For the
> application author, the interesting bit is usually "what went wrong?"
> (reason) and "what did I call?" (they already know that).
> 
> 
> If we look at *our* needs as library writers, then the exact location
> that raised the error is interesting, of course!  But then, is it
> interesting in the form of codes, or are we rather interested in the
> text form?  I dunno about you, but I don't give a damn about the
> actual library and function codes, I look at the displayed library
> names and function names.  (and yet, they aren't necessarily enough
> without knowing the OpenSSL version).
> 
> 
> Finally, if we look at it from a provider author's perspective,
> dealing with all these different codes is quite hard.  Ideally, a
> provider author should just need to deal with their own reason codes
> and associated strings, and that's about it.

I think you missed another important perspective, i.e. that of the end user.

Note that this is slightly different to the needs of the application author. An
application author will typically do one of two things if an error occurs:

1) Try and figure out the cause of the error and recover from it, e.g. if I get
reason code x then I'll retry the operation with slightly different parameters
but if I get y then I'll just ignore this failure and carry on anyway.

or

2) If its an error they don't know how to handle then give up and display
whatever error details are available to the end user (or log them, or whatever)

So the application author is going to want a small number of reason codes that
they can handle. If its not in the small set they know about, then give up.

The end user OTOH doesn't care about reason codes at all and wants as detailed
an explanation of the error (in text form) as is possible. Potentially that
might mean different text for every point in the code where we raise an error.
Ideally they might want information about the context that led up to the error,
i.e. what particular operation was being attempted at the time. I'm thinking
that there is some relationship here between the information available from the
trace API, and the error API.


> What I would like to see is a simplified error system that delivers
> reason codes, and attached extra text information.  That text
> information could have two parts, where one is debugging information
> (__FILE__, __LINE__, __FUNC__, and possibly exact commit hash
> generated with 'git archive', see "export-subst" in gitattributes(5)),
> and the other is the extra data added with the likes of
> ERR_add_error_data().
> We certainly already have the infrastructure to allow this, the
> modifications needed wouldn't be that big.

+1 to this.

Matt


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Richard Levitte
The discussion so far has touched on a number of items that aren't
necessarily the same thing:

- the instability of function codes.
- debugging information
- display of errors

and then, the point that no one talks about but that's the main reason
error codes in general exist at all:

- an application's need to figure out what went wrong and how to react


The current error codes have these items added together:

- library code (I'll touch on that further down)
- function code (debugging information)
- reason code


If we look at it from the perspective of the application author,
what's most often needed are reliable error/reason codes (to check and
to react appropriatly to) and associated reason strings (for display).
For the application author, it would normally be completely
uninteresting exactly which function the error happened in (think
about it, when's the last time you wanted to know exactly which
internal function in libc raised the EAGAIN you just got?).  For the
application author, the interesting bit is usually "what went wrong?"
(reason) and "what did I call?" (they already know that).


If we look at *our* needs as library writers, then the exact location
that raised the error is interesting, of course!  But then, is it
interesting in the form of codes, or are we rather interested in the
text form?  I dunno about you, but I don't give a damn about the
actual library and function codes, I look at the displayed library
names and function names.  (and yet, they aren't necessarily enough
without knowing the OpenSSL version).


Finally, if we look at it from a provider author's perspective,
dealing with all these different codes is quite hard.  Ideally, a
provider author should just need to deal with their own reason codes
and associated strings, and that's about it.


What I would like to see is a simplified error system that delivers
reason codes, and attached extra text information.  That text
information could have two parts, where one is debugging information
(__FILE__, __LINE__, __FUNC__, and possibly exact commit hash
generated with 'git archive', see "export-subst" in gitattributes(5)),
and the other is the extra data added with the likes of
ERR_add_error_data().
We certainly already have the infrastructure to allow this, the
modifications needed wouldn't be that big.


On the subject of 'openssl errstr', it's abssolutely fine for errors
raised within our libraries, but outside of that, it's useless.  This
is an issue with engines that no one has spoken about but that has
been there all along.  See, engines, if they use our error reporting
system at all, usually allocate a new library code dynamically, by
calling ERR_get_next_error_library().  Since the library code may
differ from one time to the next, the complete error codes raised will
differ, giving 'openssl errstr' an impossible job, even if the engine
is loaded.
This issue is exactly the same for provider modules.  This makes
library codes less than useful outside of our libraries.


I digress..  the main point is, however, that we need to keep the
application author's needs from our needs.  That affects how we look
at error codes.


Cheers,
Richard


On Wed, 12 Jun 2019 05:51:44 +0200,
Richard Levitte wrote:
> 
> Many of us, both past and present, have looked at the error reporting
> code and wante to change it somehow.  There's currently a PR, 9058,
> which covers one aspect, the function name indicator.
> 
> A -1 was raised early on for the purpose of starting a discussion, and
> eventually (possibly?) a vote.
> 
> A discussion point in that PR is whether it's still interesting to
> keep information on the system function/callback that was called when
> we're reporting a system error, i.e. this type of error report:
> 
> SYSerr(SYS_F_FSTAT, errno);
> 
> (incidently, there's another PR, 9072, which changes those to
> 'SYSerr("fstat", errno)')
> 
> So, the main points of discussion seem to be:
> 
> - should we remove function indicators in our error reporting?
> - should we make an exception for system errors (SYSerr())?
> 
> Cheers,
> Richard
> 
> P.S. I'm personally not entirely sure that we need an actual vote, if
> this discussion shows that we all agree.  Perhaps to confirm and make
> the decision formally solid?
> 
> -- 
> Richard Levitte levi...@openssl.org
> OpenSSL Project http://www.openssl.org/~levitte/
> 
-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Removing function names from errors (PR 9058)

2019-06-13 Thread Salz, Rich
I think exposing the function internals is a mistake for a couple of reasons: 
it encourages familiarity with, and dependence on, OpenSSL library internals, 
and it goes against the spirit of layering, and there is no guarantee that the 
function code does not change as internal code gets moved around (refactored, 
removed, etc).

We have the source filename and line number available, although this has the 
some of the same drawbacks as function codes. It's just a little less ugly 
because C provides that data and we don't wedge it into the error code space.

The proper way to handle this, in my experience, is *DO NOT REUSE ERROR CODES.* 
Each error code appears in exactly one place, and we could eventually build up 
documentation explaining what they mean, the causes, and how to address this. 
This means, we don't use ERR_R_MALLOC when trying to create an RSA key, for 
example, but rather a handful of new errors for ERR_R_RSA_CANT_CREATE_D, 
...CANT_CREATE_N, etc.  That is a big job, albeit mostly a tedious one.





Start up entropy gathering

2019-06-13 Thread Dr Paul Dale
Topic: The OpenSSL 3.0.0 release will include mitigation for the low entropy on 
boot and first boot problems.
Comment: PR#9084 removed such mitigation due to the negative side effects.

This vote has passed (two for, four abstain, none against):


The discussion now needs to turn to how this can be done.

The mitigation is for older Linux kernels only (and perhaps old Unix machines 
of other flavours).  So no Windows, getentropy(3), ….


I have a suggestion.  Two kind of.

Use a volatile mechanism for communicating between processes — a file in /tmp, 
a block of shared memory or whatever.
On system boot, the first instance of libcrypto that attempts to read 
/dev/urandom, first locks the shared resource and reads one byte from 
/dev/random.  It uses the shared resource to indicate that the read was 
successful and continues seeding from /dev/urandom.
Other instances of libcrypto start up, see that the shared resource is properly 
initialised and go straight to using /dev/urandom.
On system shutdown/reboot, the communication mechanism disappears and the 
entire process will repeat on the next boot.
There are some fine details but this is the gist of it.

This is more difficult to describe than code: see #9152 
.


Notes:

1. Reading the byte from /dev/random avoid problems on systems that cannot 
select(2) on /dev/random.

2. Early start up of /libcrypto will block until entropy is available (and 
consume eight bits of same).  Later starts have the overhead of the locking and 
resource access.  Importantly, no entropy will be drained from the system pool 
after start up.



The second suggestion is broadly similar but requires a file containing entropy 
that persists across reboots.  This alternative requires a more management: the 
entropy file once read needs to be rewritten immediately (and ideally on 
shutdown as well).  It also introduces a new attack vector against the entropy 
storage.  It also isn’t possible to skip the entropy file read/rewrite sequence 
because it is impossible to determine if /dev/urandom has actually been seeded. 
 I’ve not attempted to code this, persistent files containing seed material 
potentially introduce other problems.


Pauli
-- 
Dr Paul Dale | Cryptographer | Network Security & Encryption 
Phone +61 7 3031 7217
Oracle Australia