On Thu, Dec 27, 2018 at 6:56 PM Jeremy Rowley <jeremy.row...@digicert.com>
wrote:

> The risk is primarily outages of major sites across the web, including
> certs used in Google wallet. We’re thinking that is a less than desirable
> result, but we weren’t sure how the Mozilla community would feel/react.
>

I don’t think that is a particularly helpful framing, to be honest. The
risk these organizations face here is self-inflicted; regardless of the
feeling of underscores, there is unquestionably an issue for organizations
that cannot respond in the BR timeframes, let alone extended ones that
stretch for months. That's a real ecosystem issue, and regardless of the CA
these customers partner with, an issue that needs both better understanding
and, to be honest, better prevention.

Matt has spoken at length to the risk to the community, which doesn’t
really seem to have been acknowledged, let alone met with a proposal for
how it will be mitigated. I have to ask again - what steps is DigiCert
taking to avoid these issues going forward?

> We’re still considering revoking all of the certs on Jan 15th based on
> these discussions.  I don’t think we’re asking for leniency (maybe we are
> if that’s a factor?), but I don’t know what happens if you’re faced with
> causing outages vs. compliance.
>

What happens is that you ask why there is risk of outage to begin with and
what can be done to improve going forward. Let’s assume you do revoke, and
it causes an outage - is DigiCert taking steps to ensure no customer of
theirs is ever faced with that risk? If so, what are those steps?

> I started the conversation because I feel like we should be good netizens
> and make people aware of what’s going on instead of just following policy.
> I’m actually surprised at least one other CA that has issued a large number
> of underscore character certs hasn’t run into the same timing issues.
>

This seems to suggest that perhaps other CAs have prepared their customers
for revocation. How does this surprise - that no other CA faces this - lead
to tangible changes in the business processes? How would this change, if
another CA did have the same issue? Surely you can see there are real and
fundamental issues that you’re uniquely qualified to help your customers
address in ways that we cannot.

Have you analyzed CT, for example, to see why DigiCert is unique?
Certainly, by sheer volume, it's heavily tilted towards the old Symantec
infrastructure - and the customers that came over to DigiCert. With those
sorts of details, how does this change how things were done, or how they
will be done?

I’m not trying to pick on y’all - I think it is legitimately good that you
provided concrete data. Even if you do revoke on Jan 15, this is still
useful to understand the challenges, but only if this leads to meaningful
changes. What might those look like?

> Normally, we would just revoke the certs, but there are a significant
> number of certs in the Alexa top 100. We’ve told most customers, “No
> exception”. I also thought it’s better to get the information out there so
> we can all make rational decisions (DigiCert included) if as many facts are
> known as possible.
>

And this is the framing that I think is incredibly helpful. Understanding
why customers can’t change, and what steps are being done to ensure they
can, is hugely useful. Wayne’s questions were to this point - as were mine,
which aimed at understanding the problem from the other side: the steps the
CA is taking. As I've repeatedly highlighted from
https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation , the goal
is not punishment - but understanding how these issues are being addressed.

>
> We are working with the partners to get the certs revoked before the
> deadline. Most will.
>

This seems like a significant improvement from “100% of customers can’t”.

> By January 15th, I hope there won’t be too many certs left. Unfortunately,
> by then it’s also too late to discuss what happens if the cert is not
> revoked. Ie – what are the benefits of revoking (strict compliance) vs
> revoking the larger impact certs as they are migrated (incident report).
> Unfortunately part 2, there’s no guidance on whether an incident report
> means total distrust v. something on your audit and a stern lecture.
>

I mean, it’s two-fold, right? Any incident can lead to total distrust, but
it’s also unlikely that a single incident leads to total distrust. The way
to balance those competing statements is to do what you’re doing - and to
be transparent. As Matt has highlighted, there’s a huge risk here that this
leads to a moral hazard - and the best way to mitigate that is to discuss
steps being taken to reduce that risk going forward, particularly about
what a core part of the problem statement is - difficulty in revocation.

> I’d happily suffer a lecture than take down a top site. Not so willing to
> gamble the whole company. This is why we wanted to have the discussion now,
> despite no violation so far. The response from the browsers is public  -
> that they cannot make that determination. Does that mean we have our
> answer? Revoke is the only acceptable response?
>

I mean, the answer has been to repeatedly highlight
https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation

In a number of ways, an unintentional violation is worse than an
intentional violation. Ignorance is not really an excuse when you hold keys
to the Internet, and being asleep at the wheel is hugely dangerous. So, if
I had to pick between an intentional violation and an unintentional (and
preventable) violation, I'd likely pick intentional. But there's also a
huge hazard with intentional violations - those reveal potentially systemic
issues and a lack of good faith, especially if they become common-place. We
definitely saw CAs perform intentional violations and notify
after-the-fact, and that's far, far worse than those that notify before
intentionally violating (I think every post-facto notification for an
intentional incident has, eventually, led to that CA's distrust).

So somewhere on the scale of things, we're in a better place than most
every alternative. But to ensure this is in that 'good faith' side of
things, understanding what the factors are that have been evaluated, and
what steps are being taken to prevent this, are significant. As I said, I
think the principles captured in
https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation and in the
discussion about how at least some of us see this (that it's related to
underscores incident response) suggest that it's not, in fact, the end of
the world, or the CA, provided that meaningful data behind the decision to
not revoke is given, meaningful plans and timelines for resolution are
given, and meaningful steps to prevent this from ever happening again are
given. It becomes an incident report, and the result is not a stern lecture
- but concrete and quantifiable steps as to how to improve.
_______________________________________________
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy
