Re: Naming conventions

2020-07-06 Thread Matt Caswell



On 06/07/2020 07:41, Richard Levitte wrote:
> On Fri, 03 Jul 2020 11:25:37 +0200,
> Matt Caswell wrote:
>> On 19/06/2020 08:15, Tomas Mraz wrote:
>>> to something like:
>>>
>>> int EVP_MacInit(EVP_MAC_CTX *ctx);
>>>   
>>> int EVP_MacUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
>>> datalen);
>>>  
>>> int EVP_MacFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
>>> outsize);
>>>
>>> or at least
>>>
>>> int EVP_MACInit(EVP_MAC_CTX *ctx);
>>>   
>>> int EVP_MACUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
>>> datalen);
>>>  
>>> int EVP_MACFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
>>> outsize);
>>>
>>> Should we do that? I hope for the sheer ugliness of the supposedly
>>> consistent names that we do not.
>>
>> This pattern is not universally used (as you mention the EVP_PKEY
>> API does something different).
> 
> Er, you've choose what you wanted to read, I suppose...  For fairly
> high level EVP APIs, the CamelCase form is actually quite consistent.
> For EVP_PKEY, we have them covering most if not all higher level
> usages:

I'm actually ok with using the CamelCase versions as well. But in the
interests of choosing my battles I'm for the moment just focusing on
getting us back to where we started.


> 
> EVP_Sign{Init, Update, Final}
> EVP_Verify{Init, Update, Final}
> EVP_Open{Init, Update, Final}
> EVP_Seal{Init, Update, Final}
> 
>> I remain strongly opposed to this renaming as I really don't think it
>> helps to do this sort of thing now. It just introduces significant
>> confusion without a long term strategy. We are too close to 3.0 beta1 to
>> embark on this journey now. I'm also not convinced that strategically
>> this is the right pattern to use.
> 
> What I hear from this is that 3.0 is the wrong time to introduce a new
> (and hopefully better) public API, that we should post-pone that to
> something like 4.0.  I can get along with that line of thought.

Ok. Good.

Matt





Re: Naming conventions

2020-07-06 Thread Richard Levitte
On Fri, 03 Jul 2020 11:25:37 +0200,
Matt Caswell wrote:
> On 19/06/2020 08:15, Tomas Mraz wrote:
> > to something like:
> > 
> > int EVP_MacInit(EVP_MAC_CTX *ctx);
> >   
> > int EVP_MacUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
> > datalen);
> >  
> > int EVP_MacFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
> > outsize);
> > 
> > or at least
> > 
> > int EVP_MACInit(EVP_MAC_CTX *ctx);
> >   
> > int EVP_MACUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
> > datalen);
> >  
> > int EVP_MACFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
> > outsize);
> > 
> > Should we do that? I hope for the sheer ugliness of the supposedly
> > consistent names that we do not.
> 
> This pattern is not universally used (as you mention the EVP_PKEY
> API does something different).

Er, you've choose what you wanted to read, I suppose...  For fairly
high level EVP APIs, the CamelCase form is actually quite consistent.
For EVP_PKEY, we have them covering most if not all higher level
usages:

EVP_Sign{Init, Update, Final}
EVP_Verify{Init, Update, Final}
EVP_Open{Init, Update, Final}
EVP_Seal{Init, Update, Final}

> I remain strongly opposed to this renaming as I really don't think it
> helps to do this sort of thing now. It just introduces significant
> confusion without a long term strategy. We are too close to 3.0 beta1 to
> embark on this journey now. I'm also not convinced that strategically
> this is the right pattern to use.

What I hear from this is that 3.0 is the wrong time to introduce a new
(and hopefully better) public API, that we should post-pone that to
something like 4.0.  I can get along with that line of thought.

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Naming conventions

2020-07-03 Thread Matt Caswell



On 19/06/2020 08:15, Tomas Mraz wrote:
> to something like:
> 
> int EVP_MacInit(EVP_MAC_CTX *ctx);
>   
> int EVP_MacUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
> datalen);
>  
> int EVP_MacFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
> outsize);
> 
> or at least
> 
> int EVP_MACInit(EVP_MAC_CTX *ctx);
>   
> int EVP_MACUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
> datalen);
>  
> int EVP_MACFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
> outsize);
> 
> Should we do that? I hope for the sheer ugliness of the supposedly
> consistent names that we do not.

This pattern is not universally used (as you mention the EVP_PKEY API
does something different). This is one of the areas of existing
quirkiness, so I'd be ok with leaving these names as they are.

But the fact that there is some strangeness in this area that we may be
willing to accept does not mean we should accept a totally new pattern
in other function names, i.e.

EVP_MAC_CTX_new -> EVP_MAC_new_ctx
EVP_MAC_CTX_free -> EVP_MAC_free_ctx
EVP_MAC_CTX_dup -> EVP_MAC_dup_ctx
EVP_MAC_CTX_mac -> EVP_MAC_get_ctx_mac
EVP_MAC_CTX_get_params -> EVP_MAC_get_ctx_params
EVP_MAC_CTX_set_params -> EVP_MAC_set_ctx_params

I remain strongly opposed to this renaming as I really don't think it
helps to do this sort of thing now. It just introduces significant
confusion without a long term strategy. We are too close to 3.0 beta1 to
embark on this journey now. I'm also not convinced that strategically
this is the right pattern to use.

In the interests of getting to some kind of decision on this I think a
simple OTC vote is probably in order. I already have a PR to revert this
(12186) which has an OTC hold on it. Therefore a simple vote to lift the
hold would be sufficient to at least make a decision for now.

My proposed vote text is:

"We should lift the OTC hold on PR12186 and continue the normal review
process"


Matt


Re: Naming conventions

2020-06-29 Thread Richard Levitte
This discussion seems to have gone stale.

As far as I can read the thread, there are three lines of thought at
play (in no special order):

- the API put forth in #11996 and #11997
- the API exemplified with EVP_MAC and EVP_KDF before #11996 and #11997
- the API exemplified by functions in CamelCase

I'm uncertain if we mean to say that only new EVP features (sometimes
called sub-APIs) should be affected by whatever we decide, or if we
should make appropriate aliases for older EVP features as well (one
might argue that the CamelCase functions pave a way that avoids such
aliases).

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Naming conventions

2020-06-19 Thread Richard Levitte
On Fri, 19 Jun 2020 09:15:22 +0200,
Tomas Mraz wrote:
> The problem is that there is not really fully consistent convention
> followed currently (even in 1.1.1, not counting the API additions in
> 3.0).
> 
> For example if we would like to follow the convention _fully_ we would
> have to rename these new EVP_MAC functions:
> 
> int EVP_MAC_init(EVP_MAC_CTX *ctx);
>   
> int EVP_MAC_update(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
> datalen);
>  
> int EVP_MAC_final(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
> outsize);
> 
> to something like:
> 
> int EVP_MacInit(EVP_MAC_CTX *ctx);
>   
> int EVP_MacUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
> datalen);
>  
> int EVP_MacFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
> outsize);
> 
> or at least
> 
> int EVP_MACInit(EVP_MAC_CTX *ctx);
>   
> int EVP_MACUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
> datalen);
>  
> int EVP_MACFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
> outsize);

Someone's walking memory lane ;-)
I had to look back at OpenSSL_0_9_1c to remind myself, 'cause today,
we're quite mixed up, what with the slightly lower level functions
EVP_PKEY_sign_init, EVP_PKEY_sign, etc etc etc...

I would say, thought, that if you want to do the higher level function
thing (which are all CamelCase as far as I can see), there's another
naming convention going on, at least with EVP_PKEY, it seems to be
done according to the higher level operation you want to perform, not
the type of data used to do it...  what do you normally to with a MAC? 
So:

int EVP_AuthenticateInit(EVP_MAC_CTX *ctx);
int EVP_AUthenticateUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t 
datalen);
int EVP_AUthenticateFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, 
size_t outsize);

... or something like that.  Quite frankly, a naming convention that
is about the operation rather than the type of any type is something I
could play along with.

> And I actually hope we could add at least non-CamelCase aliases to the
> existing (non-deprecated) CamelCase functions because they were always
> the worst offender of the API consistency.

I agree with you...  but we have to recognise that would be a bigger
remake, and if we do look at just the CamelCase API, it's actually
fairly consistent (turning a blind eye at DigestSign quirkiness when I
say that).

(I suppose that CamelCase was inspired by the time, it was quite
popular in certain circles in the 90's, and got especially popularised
by Microsoft...  and well, there are quite a few Microsoft ideas that
have infiltrated OpenSSL...  need I mention '_ex'?)

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Naming conventions

2020-06-19 Thread Tomas Mraz
On Fri, 2020-06-19 at 01:48 +1000, Tim Hudson wrote:
> We have a convention that we mostly follow. Adding new stuff with
> variations in the convention offers no benefit without also adjusting
> the rest of the API. Inconsistencies really do not assist any
> developer.
> 
> Where APIs have been added that don't follow the conventions they
> should be changed. 
> 
> It really is that simple - each developer may have a different set of
> personal preferences and if we simply allow any two people to pick
> their own API pattern effectively at whim we end up with a real mess
> over time.
> 
> This example is a clear cut case where we should fix the unnecessary
> variation in the pattern. It serves no benefit whatsoever to have
> such a mix of API patterns. 
> 
> We do have some variations that we should adjust - and for APIs that
> have been in official releases dropping in backwards compatibility
> macros is appropriate.
> 
> The argument that we aren't completely consistent is specious - it is
> saying because we have a few mistakes that have slipped through the
> cracks we have open season on API patterns. 
> 
> It also would not hurt to have an automated check of API deviations
> on commits to catch such things in future. 

The problem is that there is not really fully consistent convention
followed currently (even in 1.1.1, not counting the API additions in
3.0).

For example if we would like to follow the convention _fully_ we would
have to rename these new EVP_MAC functions:

int EVP_MAC_init(EVP_MAC_CTX *ctx);
  
int EVP_MAC_update(EVP_MAC_CTX *ctx, const unsigned char *data, size_t datalen);
 
int EVP_MAC_final(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
outsize);

to something like:

int EVP_MacInit(EVP_MAC_CTX *ctx);
  
int EVP_MacUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t datalen);
 
int EVP_MacFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
outsize);

or at least

int EVP_MACInit(EVP_MAC_CTX *ctx);
  
int EVP_MACUpdate(EVP_MAC_CTX *ctx, const unsigned char *data, size_t datalen);
 
int EVP_MACFinal(EVP_MAC_CTX *ctx, unsigned char *out, size_t *outl, size_t 
outsize);

Should we do that? I hope for the sheer ugliness of the supposedly
consistent names that we do not.

Fortunately or unfortunately depending on personal opinons we have
already diverged from that pattern with EVP_PKEY API.

And I actually hope we could add at least non-CamelCase aliases to the
existing (non-deprecated) CamelCase functions because they were always
the worst offender of the API consistency.

-- 
Tomáš Mráz
No matter how far down the wrong road you've gone, turn back.
  Turkish proverb
[You'll know whether the road is wrong if you carefully listen to your
conscience.]




Re: Naming conventions

2020-06-18 Thread Richard Levitte
On Thu, 18 Jun 2020 16:36:30 +0200,
Matt Caswell wrote:
> On 18/06/2020 13:03, Richard Levitte wrote:
> > As for not doing something piecemeal, I actually disagree, I see
> > benefit in more than just one person getting their hands dirty (plus
> > changing everything in one go may be overwhelming, and would make for
> > another huge PR that takes overly much time to get through).  However,
> > as strategies go, and if we agree on making the change, we could very
> > well create an issue for each affected sub-API and have it point at a
> > common page that describes what we agreed upon...  this could be a
> > good use of the github wiki tab, for example.
> 
> I don't mean piecemeal in the sense of doing it spread over a number of
> PRs. I mean piecemeal in the sense of doing it spread over a number of
> releases. As far as I can tell #11996 and #11997 were one offs without
> any long term strategy in mind to convert the whole API in this way.

Ah, ok.  I agree with you there, if we're doing this, we should do it
consistently for the same release.

> > When do you see that time being, then?  3.1 (we've talked about it
> > being a "cleanup" release)?  4.0?
> 
> Perhaps never. But if we do it then either 3.1 or 4.0 could be
> considered. I am yet to be convinced that its worth it.

I actually have a different idea, but that's much more further in the
future: a consistent libcrypto API across the board, where all
libcrypto functions are in the "namespaces" (i.e. are prefixed with)
OSSL or OPENSSL.  No exception.
That idea would be a fairly complete API remake, and I do think it
would be worth the while.
So uhm, "never" isn't a line of thinking that I'm ready to accept.

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/


Re: Naming conventions

2020-06-18 Thread Tim Hudson
We have a convention that we mostly follow. Adding new stuff with
variations in the convention offers no benefit without also adjusting the
rest of the API. Inconsistencies really do not assist any developer.

Where APIs have been added that don't follow the conventions they should be
changed.

It really is that simple - each developer may have a different set of
personal preferences and if we simply allow any two people to pick their
own API pattern effectively at whim we end up with a real mess over time.

This example is a clear cut case where we should fix the unnecessary
variation in the pattern. It serves no benefit whatsoever to have such a
mix of API patterns.

We do have some variations that we should adjust - and for APIs that have
been in official releases dropping in backwards compatibility macros is
appropriate.

The argument that we aren't completely consistent is specious - it is
saying because we have a few mistakes that have slipped through the cracks
we have open season on API patterns.

It also would not hurt to have an automated check of API deviations on
commits to catch such things in future.

Tim.


Re: Naming conventions

2020-06-18 Thread Matt Caswell



On 18/06/2020 13:03, Richard Levitte wrote:
> 
> Okie, if we're going to start this discussion with taking a stand, I
> guess I'll declare that while I initially had the exact same concern,
> I now see this change in a positive light.  This comment from #11997
> got me to change my mind:
> 
> The already established patterns is a furphy. I'll reword it more
> precisely: we've done things badly in the past, thus we should do
> them badly moving forwards. We've always been at war with
> Eurasia.oh wrong context, it's Eastasia.
> 

While I can agree with the sentiment here I disagree with the conclusion
that the answer is to change the APIs piecemeal to introduce even more
quirkiness and inconsistency in the APIs.

I'm also not convinced that the currently slightly strange
inconsistencies are worth fixing. Yes, if we were to design it again we
would do it differently. But that does not mean that we have to change
everything to fit into our ideal picture of the way things should be.

To fix it necessarily involves changing the names of a lot of different
functions (how many depends on exactly what convention we eventually
decided upon). Yes, we can lessen the impact by providing backwards
compat macros - but that doesn't mean that a change of name is without
cost. Either applications must change all of the names in their code to
the new form OR they keep the old form and start using the new form for
new code. This pushes the complexity of the name change onto our
downstream application maintainers. They will have all the confusion of
different naming conventions within their application.

It adds further complexity to our documentation. Either we only document
the new forms (which is confusing when application maintainers and
authors see all the old forms being used "out there") - or we have to
ensure all the synonyms are documented everywhere - which is itself
confusing to our users. And our documentation will be at odds with all
the examples and samples and unofficial documentation that our users
undoubtedly use.

I think a wholesale name change can only really be considered if it
brings significant benefits. At the moment I'm struggling to see that
the benefits outweigh the costs.

Even if we decided that we would want to do this, we should also think
twice about doing it in the 3.0 timeframe. We are already pushing a lot
of change onto our users with the change to our architecture. Pushing
more change for dubious benefit doesn't seem like a great idea.

> As for not doing something piecemeal, I actually disagree, I see
> benefit in more than just one person getting their hands dirty (plus
> changing everything in one go may be overwhelming, and would make for
> another huge PR that takes overly much time to get through).  However,
> as strategies go, and if we agree on making the change, we could very
> well create an issue for each affected sub-API and have it point at a
> common page that describes what we agreed upon...  this could be a
> good use of the github wiki tab, for example.

I don't mean piecemeal in the sense of doing it spread over a number of
PRs. I mean piecemeal in the sense of doing it spread over a number of
releases. As far as I can tell #11996 and #11997 were one offs without
any long term strategy in mind to convert the whole API in this way. In
the absence of a strategy we shouldn't be making one off changes like this.

>> There *are* inconsistencies and quirks in our current naming
>> conventions. I am skeptical that now is the time to resolve them given
>> the number of other changes we are already introducing into 3.0.
> 
> When do you see that time being, then?  3.1 (we've talked about it
> being a "cleanup" release)?  4.0?

Perhaps never. But if we do it then either 3.1 or 4.0 could be
considered. I am yet to be convinced that its worth it.

Matt



Re: Naming conventions

2020-06-18 Thread Richard Levitte
On Thu, 18 Jun 2020 12:36:52 +0200,
Matt Caswell wrote:
> 
> EVP_MAC_CTX_new -> EVP_MAC_new_ctx
> EVP_MAC_CTX_free -> EVP_MAC_free_ctx
> EVP_MAC_CTX_dup -> EVP_MAC_dup_ctx
> EVP_MAC_CTX_mac -> EVP_MAC_get_ctx_mac
> EVP_MAC_CTX_get_params -> EVP_MAC_get_ctx_params
> EVP_MAC_CTX_set_params -> EVP_MAC_set_ctx_params
> 
> There are similar renamings for the KDF APIs.
> 
> I am opposed to this renaming on the basis that it is inconsistent with
> what we do elsewhere.

Okie, if we're going to start this discussion with taking a stand, I
guess I'll declare that while I initially had the exact same concern,
I now see this change in a positive light.  This comment from #11997
got me to change my mind:

The already established patterns is a furphy. I'll reword it more
precisely: we've done things badly in the past, thus we should do
them badly moving forwards. We've always been at war with
Eurasia.oh wrong context, it's Eastasia.

Ref: https://github.com/openssl/openssl/pull/11997#issuecomment-636658901

For historical background, changing the function name pattern isn't a
new discussion, it's even been brought up on this list more than a
year ago: 
https://mta.openssl.org/pipermail/openssl-project/2019-March/001280.html
Unfortunately, those discussion never got much traction.  I'm not at
all surprised that #11996 and #11997 came to be, it follows a fairly
human pattern that when talking leads nowhere and someone is still
engaged enough, action will happen.

When it comes to EVP_MAC and EVP_KDF, those sub-APIs already break
with existing patterns, albeit more subtly.  They have a _dup
function, where previous sub-APIs have a _copy function.  But that's
perhaps small enough to be acceptable.

> For example, except for the MAC/KDF APIs I see
> there are 26 functions of the form:
> 
> FOO_CTX_new
> 
> The case is similar for FOO_CTX_free. Aside from the newly introduced
> MAC/KDF names there are no other instances of FOO_new_ctx or
> FOO_free_ctx. This is totally inconsistent and, IMO, will cause
> significant confusion for our users.

Sure, but it counters the confusion with when to use FOO and when to
use FOO_CTX as a prefix.  You and I seem to be pretty clear on this,
but I've seen enough questions on the matter to see the benefit with
this change, even though it breaks the old pattern.

> "If we want to change the naming conventions then we should do so only
> after agreeing what those conventions should be, and agreeing a strategy
> for how we are going to apply them. It should not be done piecemeal (and
> hidden away in a PR that supposedly made things more consistent but in
> reality did the opposite)."

I agree that it would have been good to discuss it further first.
However, those discussion have been met with utter silence before, so
that wasn't very fruitful.  I guess that #11996 and #11997 was enough
of a slap to wake us up.

As for not doing something piecemeal, I actually disagree, I see
benefit in more than just one person getting their hands dirty (plus
changing everything in one go may be overwhelming, and would make for
another huge PR that takes overly much time to get through).  However,
as strategies go, and if we agree on making the change, we could very
well create an issue for each affected sub-API and have it point at a
common page that describes what we agreed upon...  this could be a
good use of the github wiki tab, for example.

> There *are* inconsistencies and quirks in our current naming
> conventions. I am skeptical that now is the time to resolve them given
> the number of other changes we are already introducing into 3.0.

When do you see that time being, then?  3.1 (we've talked about it
being a "cleanup" release)?  4.0?

Each choice has its pros and cons:

- Pros for 3.1: we get the added API variants out fairly early.
- Cons for 3.1: because of ABI compatibility concerns, the old
  functions will have to be preserved as is and the new will have to
  be implemented as macros or new functions, making for a bigger
  libcrypto.num.

- Pros for 4.0: ABI concerns aren't a factor, so the old functions can
  be renamed, and we can preserve the old names as macros.  No need
  for double entries in libcrypto.num.
- Cons for 4.0: By that's far away, which means that we solidify
  the old pattern even more before remaking it.


Side note: if we get through a rename fest, can we get rid of
CamelCase?

Cheers,
Richard

-- 
Richard Levitte levi...@openssl.org
OpenSSL Project http://www.openssl.org/~levitte/