On Tue, 4 Dec 2018 14:55:47 +0100
Jakob Bohm via dev-security-policy
<[email protected]> wrote:

> Oh, so you meant "CA issuance systems and protocols with explicit
> automation features" (as opposed to e.g. web server systems or
> operating systems or site specific subscriber automation systems).
> That's why I asked.

Yes. These systems exist, have existed for some time, and indeed now
appear to make up a majority of all issuance.

> And note that this situation started with an OV certificate, not a DV
> certificate.  So more than domain ownership needs to be validated.

Fortunately it is neither necessary nor usual to insist upon fresh
validation of organisational details for each issuance. Cached
validations can be re-used for a period specified in the BRs, although
in some cases a CA might choose tighter constraints.

> You have shown that ONE system, which you happen to like, can avoid
> that weakness, IF you ignore some other issues.  You have not shown
> that requiring subscribers to do this for any and all combinations of
> validation systems and TLS server systems they encounter won't have
> this weakness.

Yes, an existence proof. Subscribers must of course choose trade-offs
that they're comfortable with. That might mean accepting that your web
site could become unavailable for a period of several days at short
notice, or that you can't safely keep running Microsoft IIS 6.0 even
though you'd prefer not to upgrade. What I want to make clear is that
offering automation without write access to the private key is not
merely theoretically conceivable; it's easy enough that a number of
third-party clients do it today because it was simpler than whatever
else they considered.
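
To make that concrete, here's a rough sketch of the shape such a client
takes - Python with the pyca/cryptography library, where the paths and
the obtain_certificate() helper are placeholders rather than any
particular client's API. The private key is opened read-only, used to
sign a CSR, and the only thing written back to disk is the new
certificate chain:

    from cryptography import x509
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.x509.oid import NameOID

    # The existing private key is read, never rewritten.
    with open("/etc/pki/example/privkey.pem", "rb") as f:
        key = serialization.load_pem_private_key(f.read(), password=None)

    # Build a CSR for that same key; only the CSR leaves this host.
    csr = (
        x509.CertificateSigningRequestBuilder()
        .subject_name(x509.Name(
            [x509.NameAttribute(NameOID.COMMON_NAME, "www.example.com")]))
        .add_extension(
            x509.SubjectAlternativeName([x509.DNSName("www.example.com")]),
            critical=False)
        .sign(key, hashes.SHA256())
    )
    der_csr = csr.public_bytes(serialization.Encoding.DER)

    # obtain_certificate() stands in for the ACME order/finalise dance.
    new_chain_pem = obtain_certificate(der_csr)

    # The only write is to the chain file, which contains no secrets.
    with open("/etc/pki/example/fullchain.pem", "wb") as f:
        f.write(new_chain_pem)

Nothing in that flow ever needs the key file to be writable, which is
the whole point.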

> I made no such claim.  I was saying that your hypothetical that
> all/most validation systems have the properties of ACME and that
> all/most TLS servers allow certificate replacement without access to
> the private key storage represents an idealized scenario different
> from practical reality.

Subscribers must choose for themselves; in particular, their choice
does not constitute an excuse for needing more time to react. Choices
have consequences: if you choose a process you know can't be completed
in a timely fashion, it won't be completed in a timely fashion and
you'll go off-line.

> And the paragraph I quoted says to not do that unless you are using a
> HSM, which very few subscribers do.

It says it only recommends doing this for a _renewal_ if you have an
HSM. But a scheduled _renewal_ already provides sufficient notice for
you to replace keys and make a fresh CSR at your leisure if you so
choose. Which is why you were talking about unscheduled events.

If you have a different reference which says what you originally
claimed, I await it.

> It is not a convenience of scheduling.  It is a security best
> practice, called out (as the first example found) in that particular
> NIST document.

If that were indeed the claimed security best practice, the NIST
document would say you must replace keys every time you replace
certificates, which would need some sort of justification, and there
isn't one. But it doesn't - it recommends that you _renew_ once per
year‡, and that you change keys when you _renew_, which is to say,
once per year.

‡ Technically this document is written to be copy-pasted into a
three-ring binder for an organisation, so you can just write in some
other amount of time instead of <one year or less>. As with other
documents of this sort, it will not achieve anything on its own.

> Which has absolutely no bearing on the rule that keys stored outside
> an HSM should (as a best practice) be changed on every reissue.  It
> would be contradictory if part B says not to reuse keys, and part C
> then prescribes an automation method violating that.

There is no such rule listed in that NIST document. The rule you've
cited talks about renewals, but a reissue is not a renewal: there was
nothing wrong with the certificate's expiry date, and that's not why
it was replaced.

There are, however, several recommendations that contradict the idea
that it's OK to have processes which take weeks to act, such as:

"System owners MUST maintain the ability to replace all certificates on
their systems within <2> days to respond to security incidents"

"Private keys, and the associated certificates, that have the
capability of being directly accessed by an administrator MUST be
replaced within <30> days of reassignment or <5> days of termination of
that administrator"


The NIST document also makes many other recommendations that - like
the one-year limit - won't be followed by most real organisations: a
requirement to add CAA records, a requirement to revoke all the old
certificates shortly after they're replaced, an insistence on
automation for adding keys to "SSL inspection" type capabilities, and
a prohibition on all wildcards.

> So it is real.

Oh yes, doing things that are a bad idea is very real. That is, after
all, why we're discussing this at all.

> - For systems that want the certificate as a PKCS#12 file only,
>   certificate import requires private key import and thus private key
>   write access.

Yup. This is a bad design. It's come up before. It's not our place to
tell programmers they can't do this, but it's certainly within our
remit (or indeed NIST's) to remind users that software designed this
way doesn't help them achieve their security goals. It can go on that
big heap of NIST recommendations actual users will ignore.
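
For the avoidance of doubt about why the PKCS#12 case is unavoidable,
a short illustration (Python with pyca/cryptography; the file name and
password are placeholders): the bundle format itself carries the
private key, so whatever channel delivers the .p12 necessarily handles
- and can replace - the key material.

    from cryptography.hazmat.primitives.serialization import pkcs12

    with open("bundle.p12", "rb") as f:
        data = f.read()

    # Parsing the bundle hands back the private key alongside the
    # certificates; there is no certificate-only update in this format.
    key, cert, extra_certs = pkcs12.load_key_and_certificates(data, b"changeit")
    print(type(key).__name__, cert.subject if cert else None, len(extra_certs))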

> - For systems that append the private key pem file to the certificate
>   chain PEM file, certificate import requires write access to the file
>   storing the private key.

This is also a bad design, but it's pretty trivial to "mask out" in a
wrapper around the software. I'm sure there are programs where this is
mandatory, but in the ones I've seen it's usually an option rather than
the only way to provide certificates.
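
The wrapper I have in mind is nothing clever - a deploy hook along
these lines (Python; all paths hypothetical) rebuilds the combined file
that the server insists on, reading the key but never exposing it to
the certificate-fetching automation, which only ever writes the chain
file:

    import os
    import shutil
    import tempfile

    KEY_PATH = "/etc/pki/example/privkey.pem"        # read-only to this hook
    CHAIN_PATH = "/etc/pki/example/fullchain.pem"    # written by the automation
    COMBINED_PATH = "/etc/pki/example/combined.pem"  # what the server loads

    def rebuild_combined():
        # Write to a temporary file in the same directory, then rename it
        # into place so the server never sees a half-written combined file.
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(COMBINED_PATH))
        try:
            with os.fdopen(fd, "wb") as out:
                for src in (KEY_PATH, CHAIN_PATH):
                    with open(src, "rb") as f:
                        shutil.copyfileobj(f, out)
            os.chmod(tmp_path, 0o600)
            os.replace(tmp_path, COMBINED_PATH)
        except BaseException:
            os.unlink(tmp_path)
            raise

    if __name__ == "__main__":
        rebuild_combined()

Run it from the automation's deploy hook after the chain file has been
replaced, and the software that wants a combined file is none the
wiser.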

> - For systems that put the key and certificate chain in parallel PEM
>   files (such as Apache HTTPD), granting write access to the
> certificate but not the private key is potentially possible, though
> not necessarily.  For example, typical Apache HTTPD configurations
> place the certificate and key files in the same POSIX disk directory,
> where write access would be granted to the operator installing new
>   certificates.

Directory permissions might be the POSIX feature people are most likely
to misunderstand (as distinct from the features they know they don't
understand). An operator writing to a certificate file does NOT need
write permission on the directory that certificate sits in; that
permission would let them change the directory itself, which isn't what
they need to do. The operator only needs permission to write to the
certificate file.
Moreover, in a truly automated system we should distinguish between
the permissions granted to the system itself and the fraction of those
permissions exposed to a human user of the system. It is entirely
possible for an automated system that is technically permitted to
write to a private key file to be designed never to do so, and never
to do so, so that its users cannot cause that to happen merely by
using the system.

> This assumes that granting a big global cloud provider easy access to
> your organization's private keys is considered an acceptable risk,
> which is not at all a given.

It may not be. Of course, whether using cloud services in fact gives
them "easy access to your organisation's private keys" is a matter of
some debate; you will certainly find representatives of the major cloud
service providers happy to explain why they think their systems offer
better safeguards against malfeasance than whatever home-brew system
your organisation runs itself.

One of the nice effects of automation at scale is that it becomes
easier to resist the temptation to do things manually, because manual
intervention is necessarily more work than automating it. This results
in a situation where, say, an AWS engineer isn't allowed to log into a
customer's virtual machine and tinker with their private keys NOT just
out of a sense of the importance of customer privacy, but because doing
so will never scale across the platform. Any engineer who wants to do
this is Bad at their job, even if they aren't in fact a
privacy-invading snoop, so there's no reason to make it possible and
every reason to detect them and fire them.

Symantec was never able to wean itself off the practice of manually
issuing certificates, even after years of problems caused by exactly
that approach. In contrast, as I understand it, ISRG / Let's Encrypt
obtained even their Mozilla-required test leaf certificates by...
actually requesting them through their automated issuance system, just
like an end user.

> And there you assume that automation is the norm.  Which I am arguing
> it is not.

Well, there's the thing: in terms of volume, it is. That sort of thing
sneaks up on you with automation.

The largest CA by far in volume terms is ISRG's Let's Encrypt, which of
course only issues via ACME. The second largest is probably Comodo /
Sectigo, which issues a huge volume for cPanel (an automation solution)
and for Cloudflare (also automated). Some fraction of the certificates
at second-tier CAs like DigiCert are automated, but I would not hazard
a guess at how many.

Nick.
_______________________________________________
dev-security-policy mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-security-policy
