Re: when do things really need to be revoked? who decides?

'Jeremy Rowley' via [email protected] Thu, 30 May 2024 07:52:44 -0700

From my perspective, it’s the third-party approval process some of these 
companies are required to go through to replace certs. Failure to go through 
that process can result in government fines. Financial and medical companies 
operating outside of the US seem especially handicapped by policy restrictions 
when replacing certificates.
________________________________
From: [email protected] <[email protected]> on 
behalf of Wayne <[email protected]>
Sent: Thursday, May 30, 2024 8:34:16 AM
To: [email protected] <[email protected]>
Subject: Re: when do things really need to be revoked? who decides?


In the delayed revocation incidents recently, the main barrier to replacing a 
certificate has been deployment. I've not heard of validation being an issue 
as yet, but it may just not have been mentioned.

On Thursday, May 30, 2024 at 6:49:04 AM UTC+1 Suchan Seo wrote:
I wonder what makes certificate replacement slow and unwelcome: is it the 
validation step, or deploying the new certificate everywhere the old 
certificate was? OV/EV-related validations are valid for 398 days per 
3.2.2.14.3, so most of the revalidation should be about validating domains.

To simplify the latter part, there could be an OCSP extension that points to 
another certificate (one issued for the same SKID/public key) and says that, 
while this certificate itself is revoked, here is a replacement that is likely 
to be valid. In effect this skips the certificate deployment process and 
reduces replacement to a single email to the webmaster to authorize the 
replacement certificate.
On Tuesday, May 21, 2024 at 9:46:00 AM UTC+9 Mike Shaver wrote:
DELAYED REVOCATION IS TOO COMMON

This is long enough, so I’ll spare readers dozens of links to 
delayed-revocation incidents collected in Bugzilla; we all know that pretty 
much any other incident that involves misissuance will come with a 
delayed-revocation chaser these days.

In *many* of those delrev (?) incidents, we see a phrase like “we requested that 
our subscribers revoke and reissue”. They are not informing their subscribers 
as to a fixed revocation timeline, but rather simply asking those subscribers 
if they would please do the revocation process when they’re able. 
In one case, I heard of a revocation request from a major CA that didn’t even 
have a timeline *suggested*. Of course, the subscriber gets no value out of 
replacing their certs: it’s pure overhead, and if WebPKI were operated 
perfectly, it would never be necessary. This is an externality of, most often, 
a CA’s failure to sufficiently invest in understanding, implementing, and 
verifying the processes that they use to twirl the keys to the entire web’s 
security.

Indeed in a number of cases the CAs didn’t even stop issuing once they realized 
that they were misissuing certs! Intentionally issuing certs that are known to 
be bad, what a world.

While CAs generally claim that they would be able to handle a mass revocation 
incident (such as due to leaked key material), the evidence we have for CAs 
aggressively revoking as called for by the BRs and the root programs is…scant. 
We’ve seen “it was a long weekend” as a reason for delaying revocation for 
certs—including some used by a different part of the CA’s company! One CA has 
proposed a “global fire drill” to stress-test revocation procedures, but we’re 
seeing revocation timelines reaching multiple months right now, so…a lot of 
stuff would end up burning in that fire.

CAs also tell us that they advocate and recommend for their subscribers to 
implement automation for cert management, but we never see any concrete targets 
or success criteria for those efforts, so they certainly seem to me to just be 
more “asking nicely”. (I’m not sure that all of the CAs claiming to be pushing 
for subscriber automation actually have robust ACME or similar support yet, in 
fact.)

(Some of the CAs made explicit promises years ago to not delay revocation, some 
of them issued even though they knew that zlint showed an error—there are lots 
of additional twists on simply “issuing bad certs and not cleaning them up as 
agreed”.)

Now, in the wake of these *many* delrev incidents, over years of history, the 
root programs have responded with pretty much no consequences whatsoever as far 
as I can tell. There’s one case open about Entrust’s overall behaviour, who are 
certainly over-achieving when it comes to ways to get location fields wrong, 
but they are definitely not the only ones who treat the BRs’ 1/5-day revocation 
instruction as instead meaning “when it’s convenient for the customer”.

THE QUESTION

So: what should be done to make revocations of misissued certificates—sometimes 
*intentionally* misissued certificates—as prompt as the BRs require?

The cost equation for CAs is obviously skewed against the health of the web PKI, 
if we are to believe that the BRs are important. Once a CA has violated the BRs 
and misissued, it is *in their commercial interest* to not revoke promptly: it 
causes embarrassment and subscriber frustration, or even disruption to 
subscriber services. At the limit it might even lead a subscriber to change CAs 
if the reissuance events are frequent and disruptive enough.

On the other hand, the more bad certs there are floating around, even if it’s 
“only” a matter of a case mismatch, the less interoperable the web PKI is, and 
the harder it is for a relying party to make effective use of WebPKI’s 
guarantees. Let’s please not end up with a “quirks mode” for TLS certificate 
handling!

SOME OPTIONS

One option: decide that there really are some BR violations that “don’t 
matter”, such that revocation can happen on a more relaxed, accommodating 
timeline—or perhaps not at all, just letting them expire as has been seen in 
some delrev incidents already. This would mean that we would still see incident 
reports that in theory help other CAs learn to put the postal code in the right 
field or similar, but subscribers and CAs and root programs would have to do 
less work.

Another option: have affected certificates added to OneCRL after 72 hours. It 
would benefit from some automation, but it’s probably feasible to make 
relatively smooth. It is sometimes the case, worryingly, that it takes CAs a 
fair bit of time and multiple attempts to find all the affected certificates, 
so this might require some linter running off CT logs or similar as a watchdog.
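
As a rough illustration of the watchdog idea, here is a minimal Python sketch 
that picks out affected certificates still unrevoked more than 72 hours after 
they were reported. Everything in it is an assumption made for illustration: 
the `AffectedCert` record, its field names, and the `onecrl_candidates` helper 
are hypothetical, and it does not talk to CT logs, a linter, or OneCRL itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Assumption: the 72-hour grace period is the one floated in the proposal above.
GRACE = timedelta(hours=72)


@dataclass(frozen=True)
class AffectedCert:
    """Hypothetical record for one certificate named in an incident."""
    serial: str                     # serial number, e.g. as found by a linter
    reported_at: datetime           # when it was identified as misissued
    revoked_at: Optional[datetime]  # None while the CA has not revoked it


def onecrl_candidates(certs: List[AffectedCert], now: datetime) -> List[str]:
    """Return serials still unrevoked more than GRACE after being reported."""
    return [
        c.serial
        for c in certs
        if c.revoked_at is None and now - c.reported_at > GRACE
    ]


if __name__ == "__main__":
    t0 = datetime(2024, 5, 1, tzinfo=timezone.utc)
    certs = [
        AffectedCert("01", t0, None),                       # never revoked
        AffectedCert("02", t0, t0 + timedelta(hours=10)),   # revoked promptly
        AffectedCert("03", t0 + timedelta(hours=48), None), # found later
    ]
    print(onecrl_candidates(certs, t0 + timedelta(hours=73)))
```

Tracking `reported_at` per certificate rather than per incident is a deliberate 
choice in this sketch: certificates found on a later pass get their own clock, 
which matters when a CA needs several attempts to find everything affected.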

Another another option: forbid CAs from selling WebPKI certificates into 
environments where a) revocation within a 5-day limit is operationally 
infeasible, and b) disruption of the related services would cause risk to human 
health and safety or similar. There are apparently many organizations out there 
which are critical to national economies or whatever, but need literal Earth 
months to replace a certificate. These are clearly cases where the requirements 
of WebPKI are incompatible with the operational constraints of the subscriber, 
so it’s not a good idea to mix them. (I’m sure some CAs could offer help with 
private PKI systems, probably with compelling margins.)

Yet another, this time somewhat more preventative: if a CA repeatedly 
demonstrates that they are unable or (always the case?) unwilling to honour 
their commitments to the BRs, impose validity length restrictions on certs that 
they issue. At least in that case future misissued certificates would not be in 
the wild for as long, and it would also show nicely that CAs’ advocacy for 
certificate automation was fruitful. Ignoring Entrust’s diatribe against 90-day 
validity periods in that weird blog post, I don’t think that any CA has made a 
credible case that their customers would not be able to handle rotating 
certificates every 90 days, even if they have to carve the new fingerprint into 
a mountain using a toothbrush or whatever. They’d even know it’s coming.

One more: make delayed revocation incidents, specifically, more visible to 
subscribers and potential subscribers, and see if business pressure does what 
merely “agreeing legally to follow the BRs” (and optionally making empty “it’ll 
never happen again” promises) has been unable to do in too many cases.

THANKS FOR READING

I think the WebPKI is being poorly served by the *realities* of certificate 
integrity and misissuance responses. If nothing else, it’s causing a ton of 
delrev incidents for Ben to have to shepherd, without even module peers to 
assist him.

Something needs to change.

--
You received this message because you are subscribed to the Google Groups 
"[email protected]" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/a/mozilla.org/d/msgid/dev-security-policy/79c8a805-c043-45d4-8a06-8946425a3cb5n%40mozilla.org.

