For what it’s worth, I think you can separate the replacement part from
the remediation part.

The concerns I raised on Sectigo’s issue highlighted how replacement
does not necessarily equate to remediation, with the goal of figuring out
what exactly the path towards remediation can or should look like. This is
similar to, say, the discussion with SECOM re: https://bugzil.la/1707229 -
and how simply issuing a new certificate, without addressing the old one,
doesn't remediate. That's not to say there's reason to block this
replacement in and of itself (either by GTS or SECOM), but to acknowledge
that the issue would still remain unremediated and unresolved until some
further action, either in Mozilla & BR policy or in CA practice, occurs.

I captured further details about why this distinction matters, in
https://bugzilla.mozilla.org/show_bug.cgi?id=1741777#c5 , in examining the
current policies and expectations.

To be clear and explicit: With respect to the digitalSignature bit itself,
I think this likely represents something minor on a technical level, but,
as I tried to capture, it has some interesting dimensions from the policy
and compliance angle. Sectigo's answer in that bug gave what they saw as
the path forward, but didn't really capture what alternatives were
considered, what their impact was, and how that impact was assessed. These
are things that I think should reasonably be addressed as part of a
remediation plan.

As I tried to call out in
https://bugzilla.mozilla.org/show_bug.cgi?id=1741777, and as I'm sure
bears reiterating, especially given the preceding paragraph: while I
believe a very likely outcome is for someone to suggest "If you don't
think this requirement is major, we should just change the requirement and
declare things remediated", I also believe that would be taking the
intellectually lazy path out, especially for long-term ecosystem health.
The question of how to remediate issues like this is, I think, of prime
importance to this community, and would benefit from more detailed
discussion, such as a more careful analysis of the pros and cons of
different approaches, and data that can support the path chosen. The
discussion on the issue captures some of these trade-offs, and why I think
being more explicit about the downsides and risks of the proposed approach
is important for long-term ecosystem health. This matters primarily if the
proposed path (replacement) is being conflated with remediation, but as I
said, we don't have to couple replacement and remediation, because they
are different.

As an example of "What might a data-driven decision look like": hopefully
the bug captures why "replacement != remediation" at present. For
discussing options for remediation, there's an obvious path suggested (on
both issues) of using delegated signers. Sectigo raised concerns about
delegated signers, namely a desire for "smaller response size, less
complex, no added certificate lifecycle management burden, no additional
HSM-based keys needed". Some of these seem to be benefits that are
subjective to the CA, or it isn't obvious why they would rank highly
(e.g. HSM-based keys or certificate lifecycle management); some would
benefit from more detail ("less complex"); and some would benefit from
more data ("smaller response sizes").

To that last point, an example of "more data" could be similar to the work
that other organizations Mozilla works with have published. For example,
Cloudflare shared a data-driven approach to exploring TLS certificate
sizes and the effect on a variety of core performance metrics, in
https://blog.cloudflare.com/sizing-up-post-quantum-signatures/ . While the
bug discusses how some systems behave re: revocation of CA certificates
(e.g. the use of browser-based lists or, in non-browser cases, preferences
for CRLs over OCSP), data showing how often the root responder is queried,
and how variances in that response size can affect client performance,
could be useful. You could easily imagine an A/B test where a CA, such as
Sectigo, issues two intermediates used for testing, one of which has
direct-signed OCSP responses issued for queries about that intermediate,
the other delegated-responder-signed responses. Issuing certificates for
test sites under each intermediate could allow you to use A/B tests to
explore both client metrics (e.g. as reported through existing browser
performance metrics systems) and server metrics - as Cloudflare did. For
context, the reason for two intermediates is to avoid the confounding
factor of using an existing intermediate (for which the client may already
have a primed cache). When designing such a test, you could also explore
"Exactly how small can a responder be" and, similarly, whether there are
compatibility issues with responder key sizes (e.g. an ECC responder will
have a much smaller key than an RSA responder).
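
To make the "what would the analysis look like" part a bit more concrete,
here is a minimal sketch of how the two arms of such an A/B test might be
compared once measurements exist. Everything in it is an assumption on my
part: the function names are hypothetical, the latency values are made-up
placeholders rather than real measurements, and a permutation test on the
difference of medians is just one reasonable choice of comparison, not
something either bug prescribes.

```python
import random
import statistics


def median_difference(arm_a, arm_b):
    """Median of arm_b minus median of arm_a (positive means arm_b is slower)."""
    return statistics.median(arm_b) - statistics.median(arm_a)


def permutation_p_value(arm_a, arm_b, rounds=2000, seed=0):
    """Two-sided permutation test on the difference of medians.

    Repeatedly shuffles the pooled samples into two groups of the original
    sizes and counts how often the shuffled difference is at least as large
    as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    observed = abs(median_difference(arm_a, arm_b))
    pooled = list(arm_a) + list(arm_b)
    n_a = len(arm_a)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        if abs(median_difference(pooled[:n_a], pooled[n_a:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible zero.
    return (hits + 1) / (rounds + 1)


# Placeholder page-load latencies in milliseconds. NOT real data.
direct_signed_ms = [41.0, 39.5, 44.2, 40.1, 38.7, 42.3, 40.9, 43.0]
delegated_signed_ms = [42.1, 40.2, 45.0, 41.3, 39.9, 43.8, 41.0, 44.4]

if __name__ == "__main__":
    diff = median_difference(direct_signed_ms, delegated_signed_ms)
    p = permutation_p_value(direct_signed_ms, delegated_signed_ms)
    print(f"median difference: {diff:.2f} ms, p-value: {p:.3f}")
```

The same comparison could be run on response sizes in bytes, or on any
other client or server metric collected per arm; the point is only that
the claim "smaller response sizes matter" can be checked rather than
asserted.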

Of course, another alternative, which Sectigo only briefly alluded to in
their response, is to look at doing a root transition - to work on fully
sunsetting the existing roots, replacing them with new roots which are
truly new (i.e. new subject and key), and not just "replacements." Sectigo
noted they are already exploring this path for other reasons, presumably
cross-signing with their existing roots, and this is already what is
practiced when issues like this are detected during inclusion requests. I
would imagine
that the arguments against this remediation path might be similar: that is,
that the increased chain means an increased response size (for older client
compatibility, by including the cross-chain). But that's exactly the sort
of analysis that these incident reports are meant to cover, by not only
covering the "Here's what we're doing", but also exploring the "Here's the
alternatives we rejected, and why - showing both we thought of them and
here's our understanding about why these might not be best".

Why do I suggest all this work if it is, after all, potentially a "minor"
issue, especially knowing how vocally I criticize CAs when they make such
subjective distinctions, as there is no such compliance distinction?
Because the whole goal of incidents like this, and of incident reporting,
is to build shared knowledge, to address the uncertainty and fear of
change in a brittle system, and to make CAs more confident in making
changes and in knowing how to do them successfully. It may very well be
that the decision is to "not remediate" - to expect to see these listed as
incidents in the audit reports going forward, as an acknowledgement of the
non-compliance, but one which root programs might decide (in this limited,
specific incident, and for limited, specific CAs) is the best of a bad set
of options. I'm not sure the data is there at present to support that
conclusion, but I'm not foreclosing on this possibility.

To reiterate: I don't think you need to block the replacement over
resolution of this. It's just that replacement is separate from, and
neither here nor there for, remediation of the issue, objectively and
technically speaking, and so that's still an outstanding issue for Sectigo
to weigh.

On Wed, Dec 1, 2021 at 1:49 PM Kathleen Wilson <[email protected]> wrote:

> All,
>
> I need to finalize the December batch of root changes this week (Bug
> #1733003 <https://bugzilla.mozilla.org/show_bug.cgi?id=1733003>), which
> currently contains Bug #1735407
> <https://bugzilla.mozilla.org/show_bug.cgi?id=1735407>, "Replace Google
> Trust Services LLC (GTS) root certificates in NSS", which is this exact
> scenario of this discussion -- replacing a root CA certificate (missing the
> digitalSignature key usage bit) with another root CA certificate (same key
> pair) that has the digitalSignature key usage bit set.
>
> At this time I am inclined to remove Bug #1735407
> <https://bugzilla.mozilla.org/show_bug.cgi?id=1735407> from the December
> 2021 batch of root changes and put it as tentatively to be part of the
> March 2022 batch of root changes, so that we will have time for this
> discussion to come to full conclusion.
>
> Does anyone foresee any problems with me postponing Bug #1735407
> <https://bugzilla.mozilla.org/show_bug.cgi?id=1735407> to March?
>
> Thanks,
> Kathleen
>
> --
> You received this message because you are subscribed to the Google Groups "
> [email protected]" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/a/mozilla.org/d/msgid/dev-security-policy/6d5313f6-390b-4523-8b05-2d7f97461d22n%40mozilla.org
> <https://groups.google.com/a/mozilla.org/d/msgid/dev-security-policy/6d5313f6-390b-4523-8b05-2d7f97461d22n%40mozilla.org?utm_medium=email&utm_source=footer>
> .
>
