On Fri, May 12, 2017 at 06:42:20PM +0200, Daniel Schneller wrote:
> > That said, given that we can already look up a cert based on a name,
> > maybe in fact we could load all of them and just try to find a more
> > recent one if the first one reported by the SNI is outdated. I don't
> > know if that solves everything there.
> 
> 
> It actually might. In the end it would be something like a map, with the
> key being the domain, and the value a list of pointers to the actual
> certificates, sorted by remaining validity, having shortest first.

That's already what is done in the SNI trees, except that the validity
date is not considered: the first matching one is retrieved.
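
To make the difference concrete, here is a rough Python sketch. It is not
haproxy code: the real SNI trees are C structures and are not modelled
here, and every name below (Cert, sni_map, add_cert, the two lookup
functions) is hypothetical. It only contrasts "first match wins" with the
"shortest remaining validity first" ordering proposed above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Cert:
    label: str            # hypothetical identifier, e.g. a file name
    not_before: datetime  # start of the validity period
    not_after: datetime   # end of the validity period


# Hypothetical stand-in for the SNI trees: name -> certs in load order.
sni_map = {}


def add_cert(name, cert):
    """Register a certificate under a server name, keeping load order."""
    sni_map.setdefault(name.lower(), []).append(cert)


def lookup_first_match(name):
    """First match wins; validity dates are ignored entirely."""
    certs = sni_map.get(name.lower())
    return certs[0] if certs else None


def lookup_shortest_validity(name, now):
    """Proposed variant: among currently valid certs, pick the one
    that expires soonest."""
    valid = [c for c in sni_map.get(name.lower(), [])
             if c.not_before <= now <= c.not_after]
    return min(valid, key=lambda c: c.not_after, default=None)
```

With two overlapping certificates loaded for the same name, the first
function returns whichever was loaded first, while the second keeps
returning the older one until it expires.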

> I think it would benefit Let's Encrypt and similar scenarios. I would
> still require reloads to pick up newly added certificates. But as renewed
> certificates overlap their predecessors' validity period, dropping them
> into a directory and just doing a reload maybe once a day would work.
> Clients would still get the older one, until it finally expired, but that
> should not matter, as we are not talking about revocations where
> switching to a new cert is wanted quickly.

Using the old one "until it expires" is what really causes me a problem
(and I understand that in your case that's what you need). There are
several reasons for preferring the latest one instead:
  - it might provide stronger algorithms

  - it might use a CA which is not being blacklisted (remember that people
    started to complain about haproxy.org causing them some warnings because
    the CA was considered unsafe)

  - it was issued in the past (minutes, hours, days ago) so it is likely
    already valid regardless of any small time shift. Using the old one
    even one minute past its expiry date will be a big problem.

  - the change will be effective at the moment of reload, meaning that any
    surprise like an incomplete chain, incorrect OCSP, or a key size
    incompatible with certain browsers will be identified at an expected
    moment, when it's not too late to fix it. By using the oldest one as
    long as possible, it could break at any time, in the middle of the
    night, and would do so once you can no longer roll back.

And that's the point. Users praise haproxy's reliability but in fact it's
not (just) the code's reliability (git log --grep BUG shames us), but the
fact that it has always been designed to be used by humans, who make
mistakes and who want to spot them very quickly and to fix them before
they become a big problem. Config warnings/errors, checks for suspicious
constructs and logs are directly involved here. And we do know that our
users occasionally fail and we must help them recover, and even possibly
cover their mistakes before the boss or the customer has any chance to
notice.

So creating something designed to fail by default behind their back without
prior notice and without the ability to quickly stop before anyone notices
is contrary to the philosophy here.

That doesn't mean that what you need must not be implemented, it means
that under no circumstances should it be the default nor happen to be
enabled by default. Thus I think that, at a minimum, if we ever go in that
direction, the default behaviour must be the expected one (i.e. use the
most recent valid cert), and maybe there could be an option to prefer
the old one instead and to apply a date margin (e.g. avoid using this
one if there's less than a day left).
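
As an illustration only, the policy described above could be sketched in
Python as follows. Nothing here exists in haproxy: pick_cert, the
prefer_oldest flag and the margin parameter are hypothetical names, and
"most recent" is assumed to mean the latest not_before date. Falling back
to the most recent cert when no old one has enough margin left is also an
assumption.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Cert(NamedTuple):
    label: str            # hypothetical identifier, e.g. a file name
    not_before: datetime  # start of the validity period
    not_after: datetime   # end of the validity period


def pick_cert(certs, now, prefer_oldest=False, margin=timedelta(days=1)):
    """Choose one certificate among those loaded for the same name."""
    # Only certificates valid right now are candidates.
    valid = [c for c in certs if c.not_before <= now <= c.not_after]
    if not valid:
        return None

    most_recent = max(valid, key=lambda c: c.not_before)
    if not prefer_oldest:
        # Default: the most recently issued valid certificate.
        return most_recent

    # Opt-in: soonest-expiring certificate, but skip any with less
    # than `margin` of validity left.
    safe = [c for c in valid if c.not_after - now >= margin]
    if not safe:
        return most_recent  # assumed fallback
    return min(safe, key=lambda c: c.not_after)
```

With the default arguments a reload switches to the renewed certificate
immediately, so any problem with it shows up at reload time. With
prefer_oldest=True the old certificate stays in use until it has less than
the margin left.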

(...)
> PS: This is an interesting discussion, and I am happy to continue
> it, if anyone feels the same.

I would not be surprised if we get some followups in either direction.
Over the mid term, more and more people will be affected by related
situations and the whole aspect of cert renewal will eventually become
hot. But I strongly doubt we'll do anything for this in 1.8, though
collecting views, ideas and constraints can be useful to try to serve
everyone best later.

> As I said, I will try to solve this via
> provisioning scripts in the meantime, so there is no time pressure.

That's perfect! Your feedback, and any trouble you run into while doing
this, will definitely help as well!

thanks,
Willy
