Dear Tim and Matt,
Thank you both for your insightful comments and contributions to the ongoing discussion regarding timely certificate revocation. Your perspectives are invaluable as we look for balanced and effective solutions to this problem.

Tim, your proposal to identify problematic certificates in advance and to make that information transparent addresses the core issue of preparedness and also encourages organizations to improve their crypto agility. Matt, your questions and your alternative proposal for regular, randomized revocation testing are equally thought-provoking. Regular testing would help ensure that processes are robust and that organizations remain vigilant about their revocation capabilities.

Given the complexity and importance of this issue, I would like to keep the discussion alive and invite additional comments from the Mozilla community.

Personally, I currently favor extending the timeframe for revoking certificates where there is no security impact, e.g. to 20 days (exact language TBD, e.g. by adding a new subsection to section 4.9.1.1 of the Baseline Requirements <https://cabforum.org/working-groups/server/baseline-requirements/requirements/>). I understand that extending the timeframe from 5 days to 20 days for some types of revocations might raise questions about the empirical basis for my position, especially given that we must stay prepared for 24-hour revocations when security compromises such as Heartbleed occur, but here are some points to consider.

My review of past Bugzilla incidents shows that many delayed revocations are not related to security issues, but to compliance details that do not pose immediate security risks.

We have also received consistent feedback from CAs and subscribers that the 5-day window for these types of revocations is too restrictive and does not reflect the operational realities of many organizations.

The current 5-day timeframe does not account for holidays, weekends, and other operational delays. Extending it would give organizations a more realistic window in which to respond without compromising their operational integrity.

Some organizations face legal and regulatory hurdles that make immediate revocation challenging, and a longer timeframe can help them comply with both CA/B Forum requirements and local laws.

When adopting any security-related measure, such as revocation, a cost-benefit-based risk analysis should be done. That analysis should justify why a 5-day period is necessary when a 20-day period might be just as effective without imposing undue burdens.

Finally, extending the timeframe for non-security-related revocations does not hinder preparation for 24-hour revocation timelines for critical security incidents. In fact, it allows organizations to better allocate resources and develop robust processes that can be quickly mobilized in the event of a security compromise.

That said, I will be happy with whatever consensus we reach. Our collective goal should be to find solutions that work best for the entire community, and it would be great if we could develop some proposals and then recommend them to the Server Certificate Working Group of the CA/Browser Forum. To facilitate this, I propose that we continue to gather input from the community and try to understand the different perspectives, which will help us refine suggestions and identify potential challenges and solutions.

Everyone's continued engagement and support are crucial as we work towards a consensus. I encourage everyone in the community to share their thoughts and suggestions to help us develop a robust and effective strategy to improve security while reducing the number of CA incidents that are due to delayed revocation.

Thank you once again for your contributions, and I look forward to our continued collaboration on these important issues.

Best regards,
Ben

On Monday, July 15, 2024 at 8:09:59 PM UTC-6 Matt Palmer wrote:

> Hi Tim,
>
> On Mon, Jul 15, 2024 at 09:22:22PM +0000, 'Tim Hollebeek' via
> [email protected] wrote:
> > If a publicly-trusted certificate is difficult to replace, for various
> > regulatory or technical reasons, the real reasons do not magically appear
> > when rotation is necessary. But a host of fake reasons are likely to arise
> > ("we can't rotate certificates faster because it costs money we don't want
> > to spend"). Furthermore, making progress on this problem would be greatly
> > assisted by better information about exactly which certificates can't be
> > replaced, the timescale on which they CAN be replaced, and why.
> >
> > The world would be better if we all knew, IN ADVANCE, which certificates are
> > automatically replaceable, and which aren't. This would also greatly
> > streamline operations when replacements are necessary, as it removes the
> > burden on making the determinations with a ticking clock, which is a
> > situation that doesn't lend itself to careful and unbiased evaluations.
>
> If I'm understanding your proposal correctly, it basically requires
> organisations to identify, in advance, certificates which cannot be
> replaced in line with the WebPKI requirements.
>
> If so, while I agree with the motivations (to have more useful
> information), I have... questions:
>
> 1. What is the motivation for an organisation to take the time and
> effort to identify all problematic certificates? These organisations
> apparently don't have the available resources to fix the current
> problems, what will their reaction be to being asked to do even more
> work?
>
> 2. If an organisation does not proactively declare a problematic
> certificate as being problematic, what are the consequences at
> revocation time? I can't imagine that CAs will be willing to revoke
> those certificates even though the organisation has not declared them as
> problematic, for the same reasons that those CAs are not willing to
> currently revoke problematic certificates.
>
> 3. If an organisation is capable of proactively identifying problematic
> certificates, why issue a WebPKI certificate at all? On its face, a
> declaration that a certificate is incapable of being rotated in line
> with the requirements of the WebPKI is an admission that the customer is
> (or at the very least expects to be) in breach of their subscriber
> agreement.
>
> 4. For certificates that are problematic, why add an extension to a
> WebPKI certificate that says "this certificate is non-compliant", rather
> than just moving that usage to a private PKI.
>
> 5. Do you have any reason to believe that CAs and their customers will
> even be *willing* to disclose this sort of information? In every
> previous incident that comes to mind, the prevailing attitude from CAs
> has been to refuse to disclose customer information in any meaningful
> fashion. I can understand their reticence there on one level, as a
> protection against "customer poaching"[1], and I'd be hesitant for Mozilla
> to make it a requirement for CAs to disclose this from an anti-trust
> action perspective.
>
> > I realize this would be a major change to how we do things, but we've been
> > having this exact same conversation about certificate replacement for pretty
> > much the entire decade I've been involved at CABForum, and I think it's time
> > for radical change. If this isn't the right idea, it at least gives a sense
> > of the kind of change that is needed to make progress here, and I would love
> > to hear any other potential ideas for how we finally exit the traffic circle
> > and start moving forward again.
>
> My proposal is that root programs require CAs to accept revocation
> requests from the root programs themselves for randomly-chosen
> certificates. At random intervals, a root program sends a (suitably
> authenticated) email to the CA's problem reporting address stating "this
> certificate should be considered compromised as of this moment, revoke
> in line with the BRs". Frequency and volume could be tuned to issuance
> volume, with upper and lower bounds as needed to ensure universal
> coverage without unduly burdening any particular CA with excessive
> administrivia.
>
> I base this proposal on two factors:
>
> 1. Regular testing of processes is important to be confident that those
> processes work. When I was running the Pwnedkeys Revokinator, I found
> plenty of problems with revocation practices at several CAs, resulting
> in multiple problem reports. I'd be more than willing to resurrect the
> Revokinator to once again analyse revocation processing compliance if I
> had confidence in support for it by root programs.
>
> 2. It would put *everyone* in the ecosystem on notice that revocation is
> something that needs to be planned for. At the moment, organisations
> can deploy their infrastructure on the basis that "it'll never happen to
> us, we don't lose our keys / suffer from bugs / whatever", and they
> don't consider other causes of revocation. While the probability of any
> particular certificate getting chosen would be very low, that *definite*
> non-zero probability is likely to get more attention than any number of
> out-of-the-ordinary incidents that organisations can dismiss with "well,
> *that* would never happen to us!"
>
> - Matt

--
You received this message because you are subscribed to the Google Groups "[email protected]" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/mozilla.org/d/msgid/dev-security-policy/f1642406-9ac5-4644-8d78-3b6df3659f79n%40mozilla.org.
