Re: Feasibility of a binding commitment to revoke before issuance

'Amir Omidi (aaomidi)' via [email protected] Wed, 24 Jul 2024 12:45:43 -0700

Hey Ben,

I think that the suggestion to increase the time frame for revocation from 
5 to 20 days is dangerous. Here are a couple of issues I have with this:


First: Security Impact Analysis is very difficult. It's arguably harder 
than root cause analysis. The majority of CAs (by count, not issuance) do 
an awful job at root cause analysis. I do not think they are (or, honestly 
speaking, will ever be) at the maturity level to do security impact 
analysis within 24 hours to determine if this is a 24-hour or 20-day 
revocation deadline.

Second: We're effectively going to be left with very few situations that 
necessitate 24-hour revocations. This proposal:

   1. Makes it harder to test out if mass revocations will actually work 
   when they're required.
   2. Discourages entities from adopting Certificate Lifecycle Management 
   (CLM).
   3. Makes it significantly more difficult to reduce certificate lifetimes 
   to a 90-day maximum in the future.
   4. Sacrifices Web PKI security because of a handful of enterprise 
   companies that have the money and talent to solve this problem internally, 
   but are choosing to invest in ${literally_anything_else} instead.

Third: Holidays, weekends, etc. are not really relevant here either, 
because any of these incidents can become a 24-hour revocation incident 
*anyway*, and if the 24-hour revocation incidents are not happening often 
enough, then CAs will not be ready to execute on a revocation like that. If 
it is too prohibitive for a CA to staff itself so it can handle revocation 
within 24 hours, it should consider not being a CA.

Fourth: Root Program enforcement of the existing policies is weak. Mozilla, 
Apple, and Microsoft still have not distrusted Entrust despite the clear 
negligence in their operations. So what happens if a CA doesn't revoke in 
20 days? Or misses a 24-hour revocation requirement? Any sort of rule 
change here without significantly upping the enforcement is not okay imo.

Fifth: We already have a way for CAs and Subscribers to avoid the need for 
revocation: *Short lived certificates.*

Sixth: The distribution of who benefits and who is hurt by this change is 
interesting. For example, on the CA and subscriber side:

   1. Top CAs (in terms of issuance load), are either fully automated, or 
   have automation integrated with part of their product. Some of these CAs 
   also provide CLM solutions to avoid outages due to CA issues. So they're 
   not really going to benefit from this.
   2. The majority of subscribers (in terms of numbers of certificates held) 
   have implemented, or are planning to implement, CLM in their products. So 
   they don't really get any benefit from this proposal either.

The folks that really benefit from this change are:

   - Boutique CAs that have barely adopted automation for their CA 
   issuances. (e.g. some small CAs, some government CAs, etc)
   - A handful of enterprise subscribers that are not investing in CLM 
   and are relying on manual work for certificate replacement.

The folks that are hurt, quite a bit, by this change are the end users (many 
of whom look to Mozilla to protect them when many other RPs do not). 
This change would make the web less safe for everyone by giving more 
allowances for the *bad* CAs and Subscribers to continue their bad behavior.

Anyway, this change encourages more hands-on and non-automated certificate 
lifecycle management. This would be a regression in the ecosystem.

*Alternative Proposal*

This is going to be pretty controversial too: I'd be in favor of removing 
the 5-day category altogether, and requiring a 24-hour revocation for all 
mis-issuances (probably as a step function, lowering the 120-hour time 
limit by 24 hours every 6 months or something until they're aligned?)
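
To make the step function concrete, here is a rough sketch of the schedule 
it would produce. The function name and parameters are made up for 
illustration, and the 24-hour step and 6-month interval are only the 
"or something" numbers floated above, not a worked-out policy.

```python
def step_down_schedule(start_hours=120, target_hours=24,
                       step_hours=24, interval_months=6):
    """Return (months_after_start, deadline_hours) pairs, one per step.

    All defaults are assumptions taken from the rough numbers in the
    proposal; they are not part of any existing requirement.
    """
    if step_hours <= 0 or interval_months <= 0:
        raise ValueError("step_hours and interval_months must be positive")
    schedule = []
    months = 0
    deadline = start_hours
    while deadline > target_hours:
        schedule.append((months, deadline))
        months += interval_months
        # Never step below the target deadline.
        deadline = max(deadline - step_hours, target_hours)
    # Final entry: the point at which both categories are aligned.
    schedule.append((months, deadline))
    return schedule


for months, hours in step_down_schedule():
    print(f"month {months:2d}: revoke within {hours} hours")
```

With those assumed numbers the two deadlines line up 24 months after the 
first step takes effect, passing through 96, 72, and 48 hours on the way.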

My justification for this is the inverse of the stuff I mentioned above. In 
other words, it forces companies to adopt automation, removes ambiguity 
from the side of CAs, and generally propels the ecosystem forward. This 
also means that we get more assurances that when a CrowdStrike situation 
hits the Web PKI, we actually can respond in a reasonable time frame. This 
proposal also significantly simplifies the communications CAs must have 
with their subscribers about why a certificate is being revoked.

Amir
On Wednesday, July 24, 2024 at 2:36:31 PM UTC-4 Ben Wilson wrote:

> Dear Tim and Matt,
>
> Thank you both for your insightful comments and contributions to the 
> ongoing discussion regarding timely certificate revocation. Your 
> perspectives are invaluable as we strive to find balanced and effective 
> solutions to this problem.
>
> Tim, your proposal to identify problematic certificates in advance and 
> make this information transparent not only addresses the core issue of 
> preparedness, but also encourages organizations to improve their crypto 
> agility. 
>
> Matt, your questions and alternative proposal for regular, randomized 
> revocation testing are equally thought-provoking. Regular testing would 
> ensure that processes are robust, and that organizations remain vigilant 
> about their revocation capabilities.
>
> Given the complexity and importance of this issue, I would like to keep 
> the discussion alive and invite additional comments from the Mozilla 
> community. 
>
> Personally, I currently favor extending the timeframe for the revocation 
> of certificates that have no security impact, e.g. to 20 days (exact 
> language TBD – e.g. by adding a new subsection to section 4.9.1.1 of the 
> Baseline Requirements 
> <https://cabforum.org/working-groups/server/baseline-requirements/requirements/>). 
> I understand that extending the timeframe from 5 days to 20 days for 
> some types of revocations might raise questions about the empirical basis 
> for my position, especially concerning our continued preparation for 
> 24-hour revocations when security compromises like we experienced with 
> Heartbleed happen, but here are some points to consider. My review of past 
> Bugzilla incidents shows that many delayed revocations are not related to 
> security issues, but to compliance details that do not pose immediate 
> security risks. We have also received consistent feedback from CAs and 
> subscribers that the 5-day window for these types of revocations is too 
> restrictive and does not reflect the operational realities of many 
> organizations. The current 5-day timeframe does not account for holidays, 
> weekends, and other operational delays. Extending the timeframe provides a 
> more realistic window for organizations to respond without compromising 
> their operational integrity. Some organizations face legal and regulatory 
> hurdles that make immediate revocation challenging, and extending the 
> timeframe can help them comply with both CA/B Forum requirements and local 
> laws. When adopting any security-related measure, such as revocation, a 
> cost-benefit-based risk analysis should be done. The analysis should 
> justify why a 5-day period is necessary when a 20-day period might be just 
> as effective without imposing undue burdens. Finally, extending the 
> timeframe for non-security-related revocations does not hinder preparation 
> for 24-hour revocation timelines for critical security incidents. In fact, 
> it allows organizations to better allocate resources and develop robust 
> processes that can be quickly mobilized in the event of a security 
> compromise.
>
> But whatever decision we reach as consensus is good for me--our 
> collective goal should be to find solutions that work best for the entire 
> community, and it would be great if we could come up with some solutions 
> and then recommend them to the Server Certificate Working Group of the 
> CA/Browser Forum. To facilitate this, I propose that we continue to gather 
> more input from the community, and try to understand the different 
> perspectives, which will help us refine suggestions and identify potential 
> challenges and solutions. Everyone’s continued engagement and support are 
> crucial as we work towards a consensus. I encourage everyone in the 
> community to share their thoughts and suggestions to help us develop a 
> robust and effective strategy to improve security while reducing the number 
> of CA incidents that are due to delayed revocation.
>
> Thank you once again for your contributions, and I look forward to our 
> continued collaboration on these important issues.
>
> Best regards,
> Ben
> On Monday, July 15, 2024 at 8:09:59 PM UTC-6 Matt Palmer wrote:
>
>> Hi Tim, 
>>
>>
>> On Mon, Jul 15, 2024 at 09:22:22PM +0000, 'Tim Hollebeek' via 
>> [email protected] wrote: 
>> > If a publicly-trusted certificate is difficult to replace, for various 
>> > regulatory or technical reasons, the real reasons do not magically appear 
>> > when rotation is necessary. But a host of fake reasons are likely to arise 
>> > ("we can't rotate certificates faster because it costs money we don't want 
>> > to spend"). Furthermore, making progress on this problem would be greatly 
>> > assisted by better information about exactly which certificates can't be 
>> > replaced, the timescale on which they CAN be replaced, and why. 
>> > 
>> > The world would be better if we all knew, IN ADVANCE, which certificates are 
>> > automatically replaceable, and which aren't. This would also greatly 
>> > streamline operations when replacements are necessary, as it removes the 
>> > burden on making the determinations with a ticking clock, which is a 
>> > situation that doesn't lend itself to careful and unbiased evaluations. 
>>
>> If I'm understanding your proposal correctly, it basically requires 
>> organisations to identify, in advance, certificates which cannot be 
>> replaced in line with the WebPKI requirements. 
>>
>> If so, while I agree with the motivations (to have more useful 
>> information), I have... questions: 
>>
>> 1. What is the motivation for an organisation to take the time and 
>> effort to identify all problematic certificates? These organisations 
>> apparently don't have the available resources to fix the current 
>> problems, what will their reaction be to being asked to do even more 
>> work? 
>>
>> 2. If an organisation does not proactively declare a problematic 
>> certificate as being problematic, what are the consequences at 
>> revocation time? I can't imagine that CAs will be willing to revoke 
>> those certificates even though the organisation has not declared them as 
>> problematic, for the same reasons that those CAs are not willing to 
>> currently revoke problematic certificates. 
>>
>> 3. If an organisation is capable of proactively identifying problematic 
>> certificates, why issue a WebPKI certificate at all? On its face, a 
>> declaration that a certificate is incapable of being rotated in line 
>> with the requirements of the WebPKI is an admission that the customer is 
>> (or at the very least expects to be) in breach of their subscriber 
>> agreement. 
>>
>> 4. For certificates that are problematic, why add an extension to a 
>> WebPKI certificate that says "this certificate is non-compliant", rather 
>> than just moving that usage to a private PKI? 
>>
>> 5. Do you have any reason to believe that CAs and their customers will 
>> even be *willing* to disclose this sort of information? In every 
>> previous incident that comes to mind, the prevailing attitude from CAs 
>> has been to refuse to disclose customer information in any meaningful 
>> fashion. I can understand their reticence there on one level, as a 
>> protection against "customer poaching"[1], and I'd be hesitant for Mozilla 
>> to make it a requirement for CAs to disclose this from an anti-trust 
>> action perspective. 
>>
>> > I realize this would be a major change to how we do things, but we've been 
>> > having this exact same conversation about certificate replacement for pretty 
>> > much the entire decade I've been involved at CABForum, and I think it's time 
>> > for radical change. If this isn't the right idea, it at least gives a sense 
>> > of the kind of change that is needed to make progress here, and I would love 
>> > to hear any other potential ideas for how we finally exit the traffic circle 
>> > and start moving forward again. 
>>
>> My proposal is that root programs require CAs to accept revocation 
>> requests from the root programs themselves for randomly-chosen 
>> certificates. At random intervals, a root program sends a (suitably 
>> authenticated) email to the CA's problem reporting address stating "this 
>> certificate should be considered compromised as of this moment, revoke 
>> in line with the BRs". Frequency and volume could be tuned to issuance 
>> volume, with upper and lower bounds as needed to ensure universal 
>> coverage without unduly burdening any particular CA with excessive 
>> administrivia. 
>>
>> I base this proposal on two factors: 
>>
>> 1. Regular testing of processes is important to be confident that those 
>> processes work. When I was running the Pwnedkeys Revokinator, I found 
>> plenty of problems with revocation practices at several CAs, resulting 
>> in multiple problem reports. I'd be more than willing to resurrect the 
>> Revokinator to once again analyse revocation processing compliance if I 
>> had confidence in support for it by root programs. 
>>
>> 2. It would put *everyone* in the ecosystem on notice that revocation is 
>> something that needs to be planned for. At the moment, organisations 
>> can deploy their infrastructure on the basis that "it'll never happen to 
>> us, we don't lose our keys / suffer from bugs / whatever", and they 
>> don't consider other causes of revocation. While the probability of any 
>> particular certificate getting chosen would be very low, that *definite* 
>> non-zero probability is likely to get more attention than any number of 
>> out-of-the-ordinary incidents that organisations can dismiss with "well, 
>> *that* would never happen to us!" 
>>
>> - Matt 
>>
>>

