For what it’s worth, I think you can separate the replacement part from the remediation part.
The concerns I raised on Sectigo's issue highlighted how replacement does not necessarily equate to remediation, with the goal of figuring out what exactly the path towards remediation can or should look like. This is similar to, say, the discussion with SECOM re: https://bugzil.la/1707229 - and how simply issuing a new certificate, without addressing the old one, doesn't remediate. That's not to say that there's reason to block this replacement in and of itself (either by GTS or SECOM), but to acknowledge that the issue would still remain unremediated and unresolved until some further action, either in Mozilla & BR policy or in CA practice, occurred. I captured further details about why this distinction matters in https://bugzilla.mozilla.org/show_bug.cgi?id=1741777#c5 , in examining the current policies and expectations.

To be clear and explicit: With respect to the digitalSignature bit itself, I think this likely represents something minor on a technical level but, as I tried to capture, it has some interesting dimensions from the policy and compliance angle. Sectigo's answer in that bug gave what they saw as the path forward, but didn't really capture what alternatives were considered, what their impact was, and how that impact was assessed. These are things that I think should reasonably be addressed as part of a remediation plan.

As I tried to call out in https://bugzilla.mozilla.org/show_bug.cgi?id=1741777, and as I'm sure bears reiterating, especially after the preceding paragraph: while I believe a very likely outcome is for someone to suggest "If you don't think this requirement is major, we should just change the requirement and declare things remediated", I also believe that would be taking the intellectually lazy path out, especially for long-term ecosystem health.
The question of how to remediate issues like this is, I think, of prime importance to this community, and would benefit from more detailed discussion, such as a more careful analysis of the pros and cons of different approaches, and data that can support the path chosen. The discussion on the issue captures some of these trade-offs, and why I think that discussion is important for long-term ecosystem health, by being more explicit about the downsides and risks of the proposed approach. This matters primarily if the proposed path (replacement) is being conflated with remediation but, as I said, we don't have to couple replacement and remediation, because they are different.

As an example of "What might a data-driven decision look like": hopefully the bug captures why "replacement != remediation" at present. For discussing options for remediation, there's an obvious path suggested (on both issues) of using delegated signers. Sectigo raised concerns about delegated signers, namely a desire for "smaller response size, less complex, no added certificate lifecycle management burden, no additional HSM-based keys needed". Some of these seem to be benefits that are subjective to the CA, or it isn't obvious why they would rank highly (e.g. HSM-based keys or certificate lifecycle management), some would benefit from more detail ("less complex"), and some would benefit from more data ("smaller response sizes").

To that last point, an example of "more data" could be similar to the work that other organizations Mozilla works with have published. For example, Cloudflare shared a data-driven approach to exploring TLS certificate sizes and the effect on a variety of core performance metrics, in https://blog.cloudflare.com/sizing-up-post-quantum-signatures/ . While the bug discussed how some systems behave re: revocation of CA certificates (e.g. the use of browser-based lists or, in non-browser cases, preferences for CRLs over OCSP), data showing how often the root responder is queried, and how variances in that response size can affect client performance, could be useful.

You could easily imagine an A/B test where a CA, such as Sectigo, issues two intermediates used for testing, one of which has direct-signed OCSP responses issued for queries about that intermediate, and the other of which has delegated-responder-signed responses. Issuing certificates for test sites under each intermediate could allow you to use A/B tests to explore both client metrics (e.g. as reported through existing browser performance metrics systems) and server metrics - as Cloudflare did. For context, the reason for two new intermediates is to avoid the confounding factor of using an existing intermediate (for which the client already has a primed cache). When designing such a test, you can also explore "Exactly how small can a responder be" and, similarly, whether there are compatibility issues with responder key sizes (e.g. an ECC responder will have a much smaller key than an RSA responder).

Of course, another alternative, which Sectigo only briefly alluded to in their response, is to look at doing a root transition - to work on fully sunsetting the existing roots, replacing them with new roots which are truly new (i.e. new subject and key), and not just "replacements." Sectigo noted they are already exploring this path for other reasons, presumably cross-signing with their existing roots, and this is already what is practiced when issues like this are detected during inclusion requests. I would imagine that the arguments against this remediation path might be similar: that is, that the longer chain means an increased response size (for older client compatibility, by including the cross-chain).
But that's exactly the sort of analysis that these incident reports are meant to cover, by not only covering the "Here's what we're doing", but also exploring the "Here's the alternatives we rejected, and why - showing both that we thought of them and our understanding of why these might not be best".

Why do I suggest all this work if it is, after all, potentially a "minor" issue, especially knowing how vocally I criticize CAs when they make such subjective distinctions, given that there is no such distinction for compliance purposes? Because the whole goal of incidents like this, and of incident reporting, is to build shared knowledge, to address the uncertainty and fear of change in a brittle system, and to make CAs more confident about making changes and about how to do them successfully.

It may very well be that the decision is to "not remediate" - to expect to see these as incidents listed in the audit reports going forward, as an acknowledgement of the non-compliance, but which root programs might decide (in this limited, specific incident, and for limited, specific CAs) is the best of a bad set of options. I'm not sure the data is there at present to support that conclusion, but I'm not foreclosing on this possibility.

To reiterate: I don't think you need to block the replacement over resolution of this. It's just that replacement is separate from, and neither here nor there for, remediation of the issue, objectively and technically speaking, and so that's still an outstanding issue for Sectigo to weigh.
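
P.S. To make the A/B test idea above slightly more concrete, here is a rough sketch of how the two arms might be compared once measurements exist. Everything in it is an assumption on my part for illustration: the function name `bootstrap_median_diff` is made up, the metric (handshake time in ms) is just one candidate, and the numbers are synthetic rather than measurements from any CA or browser. It uses a percentile bootstrap on the difference in medians, which is one reasonable choice among several and not the method Cloudflare or anyone else used.

```python
import random
import statistics


def bootstrap_median_diff(control, treatment, n_resamples=2000, alpha=0.05, seed=0):
    """Estimate median(treatment) - median(control) with a percentile bootstrap CI.

    control / treatment: lists of per-connection measurements (hypothetical
    here: handshake time in ms) collected under the two test intermediates.
    Returns (observed_diff, ci_low, ci_high).
    """
    if not control or not treatment:
        raise ValueError("both arms need at least one sample")
    if n_resamples < 1:
        raise ValueError("n_resamples must be at least 1")

    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = statistics.median(treatment) - statistics.median(control)

    diffs = []
    for _ in range(n_resamples):
        # Resample each arm with replacement, independently of the other.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treatment, k=len(treatment))
        diffs.append(statistics.median(t) - statistics.median(c))
    diffs.sort()

    # Percentile interval: cut alpha/2 off each tail of the resampled diffs.
    lo_idx = min(n_resamples - 1, int((alpha / 2) * n_resamples))
    hi_idx = max(lo_idx, int((1 - alpha / 2) * n_resamples) - 1)
    return observed, diffs[lo_idx], diffs[hi_idx]


if __name__ == "__main__":
    # Synthetic numbers for illustration only, NOT measurements from any CA.
    gen = random.Random(42)
    direct_signed = [gen.gauss(120.0, 15.0) for _ in range(500)]
    delegated = [gen.gauss(123.0, 15.0) for _ in range(500)]
    diff, low, high = bootstrap_median_diff(direct_signed, delegated)
    print(f"median difference: {diff:.1f} ms (95% CI {low:.1f} to {high:.1f})")
```

If the interval comfortably straddles zero, the data doesn't show a client-visible difference at that sample size, and that would itself be a useful input when weighing the "smaller response size" argument. The same comparison could be run on response sizes in bytes, or on server-side metrics, instead of latency.
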
On Wed, Dec 1, 2021 at 1:49 PM Kathleen Wilson <[email protected]> wrote:

> All,
>
> I need to finalize the December batch of root changes this week (Bug
> #1733003 <https://bugzilla.mozilla.org/show_bug.cgi?id=1733003>), which
> currently contains Bug #1735407
> <https://bugzilla.mozilla.org/show_bug.cgi?id=1735407>, "Replace Google
> Trust Services LLC (GTS) root certificates in NSS", which is this exact
> scenario of this discussion -- replacing a root CA certificate (missing the
> digitalSignature key usage bit) with another root CA certificate (same key
> pair) that has the digitalSignature key usage bit set.
>
> At this time I am inclined to remove Bug #1735407
> <https://bugzilla.mozilla.org/show_bug.cgi?id=1735407> from the December
> 2021 batch of root changes and put it as tentatively to be part of the
> March 2022 batch of root changes, so that we will have time for this
> discussion to come to full conclusion.
>
> Does anyone foresee any problems with me postponing Bug #1735407
> <https://bugzilla.mozilla.org/show_bug.cgi?id=1735407> to March?
>
> Thanks,
> Kathleen

--
You received this message because you are subscribed to the Google Groups "[email protected]" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/mozilla.org/d/msgid/dev-security-policy/CAErg%3DHHDKGiUUFUz5Gh_QR6hpV56a%2BGMa4u-XZ%2B4VMN6Kc_QUg%40mail.gmail.com.
