I didn't want to hijack the thread so here's a new one.
On 29/11/2018 6:39 PM, Ryan Sleevi wrote:
On Thu, Nov 29, 2018 at 2:16 AM Dimitris Zacharopoulos
<[email protected]> wrote:
Mandating that CAs disclose revocation situations that exceed the 5-day
requirement with some risk analysis information, might be a good place
to start.
This was proposed several times by Google in the Forum, and
consistently rejected, unfortunately.
Times and circumstances change. When I brought this up at the Server
Certificate Working Group of the CA/B Forum
(https://cabforum.org/pipermail/servercert-wg/2018-September/000165.html),
there was no open disagreement from CAs. However, think about CAs that
decide to extend the 5 days (at their own risk) because of extenuating
circumstances. Doesn't this community want to know what these
circumstances are and evaluate the gravity (or not) of the situation?
The only way this could happen in a consistent way among CAs would be to
require it in some kind of policy.
This list has seen disclosures of revocation cases from CAs, mainly as
part of incident reports. What I understand as disclosure is the fact
that CAs shared that certain Subscribers (we know these subscribers
because their Certificates were disclosed as part of the incident
report) would be damaged if the mis-issued certificates were revoked
within 24 hours. Now, depending on the circumstances this might be
extended to 5 days.
I don't consider 5 days (they are not even working days) to be an adequate
warning period for a large organization with slow reflexes and long
procedures.
Phrased differently: You don't think large organizations are currently
capable, and believe the rest of the industry should accommodate that.
"Tolerate" would probably be the word I'd use instead of "accommodate".
Do you believe these organizations could respond within 5 days if
their internet connectivity was lost?
I think the impact is different. Losing network connectivity would
have a "real" and large impact (i.e. on all RPs), compared to installing a
certificate with, say, 65 characters in the OU field, which may cause
very few problems for some RPs that want to use a certain web site.
For example, if many CAs violate the 5-day rule for revocations related
to improper subject information encoding, out-of-range values, wrong
syntax and the like, Mozilla or the BRs might decide to have a separate
category with a different time frame and/or different actions.
Given the security risks in this, I think this is extremely harmful to
the ecosystem and to users.
It is not the first time we have talked about this and it might be worth
exploring further.
I don't think any of the facts have changed. We've discussed for
several years that CAs have the opportunity to provide this
information, and haven't, so I don't think it's at all proper to
suggest starting a conversation without structured data. CAs that are
passionate about this could have supported such efforts in the Forum
to provide this information, or could have demonstrated doing so on
their own. I don't think it would at all be productive to discuss
these situations in abstract hypotheticals, as some of the discussions
here try to do - without data, that would be an extremely unproductive
use of time.
There were voices during the SC6 ballot discussion that wanted to extend
the 5 days to something more. We continuously see CAs that either detect
or learn about having mis-issued Certificates, and that fail to revoke
within 24 hours or even 5 days because their Subscribers have problems
and the RPs would be left with no service until the certificates were
replaced. I don't think we are having a hypothetical discussion; we have
seen real cases being disclosed in m.d.s.p., but it would be important to
have a policy in place to require disclosure of more information.
Perhaps that would work as a deterrent against CAs revoking past the 5 days
if they don't have strong arguments to support their decisions in public.
As a general comment, IMHO when we talk about RP risk when a CA issues a
Certificate with, say, more than 64 characters in an OU field, that
would only pose risk to Relying Parties *that want to interact with that
particular Subscriber*, not the entire Internet.
No. This is demonstrably and factually wrong.
First, we already know that technical errors are a strong sign that
the policies and practices themselves are not being followed - both
the validation activities and the issuance activities result from the
CA following its practices and procedures. If a CA is not following
its practices and procedures, that's a security risk to the Internet,
full stop.
You describe it as a black/white issue. I understand your argument that
other control areas will likely have issues but it always comes down to
what impact and what damage these failed controls can produce. Layered
controls and compensating controls in critical areas usually lower the
risk of severe impact. The Internet is probably safe and will not break
if for example a certificate with 65-character OU is used on a public
web site. It's not the same as a CA issuing SHA1 Certificates with
collision risk.
Second, it presumes (incorrectly) that interoperability is not
something valuable. That is, if say the three existing, most popular
implementations all do not check whether or not it's longer than 64
characters (for example), and a fourth implementation would like to
come along, they cannot read the relevant standards and implement
something interoperable. This is because 'interoperability' is being
redefined as 'ignoring' the standard - which defeats the purposes of
standards to begin with. These choices - to permit deviations -
create risks for the entire ecosystem, because there's no longer
interoperability. This is equally captured in
https://tools.ietf.org/html/draft-iab-protocol-maintenance-01
The premise to all of this is that "CAs shouldn't have to follow
rules, browsers should just enforce them," which is shocking and
unfortunate. It's like saying "It's OK to lie about whatever you want,
as long as you don't get caught" - no, that line of thinking is just
as problematic for morality as it is for technical interoperability.
CAs that routinely violate the standards create risk, because they
have full trust on the Internet. If the argument is that the CA's
actions (of accidentally or deliberately introducing risk) are the
problem, but that we shouldn't worry about correcting the individual
certificate, that entirely misses the point that without correcting
the certificate, there's zero incentive to actually follow the
standards, and as a result, that creates risk for everyone.
Revocation, if you will, is the "less worse" alternative to complete
distrust - it only affects that single certificate, rather than every
one of the certificates the CA has issued. The alternative - not
revoking - simply says that it's better to look at distrust options,
and that's more risk for everyone.
I absolutely agree that interoperability is something valuable that
should be pursued by the ecosystem. Browsers and the majority of CAs
work in that direction. It's just the fact that if a browser strictly
enforces a requirement from a standard (e.g. rejects a certificate that
has an OU field with more than 64 characters), it makes a huge
difference towards the goal of interoperability compared to a CA that
just issues certificates with a maximum of 64 characters in the OU. If browsers
enforced these rules, the difference would be so big that the
problematic certificate would be immediately discovered by the
Subscriber, who would complain to the CA and the Certificate would most
likely be revoked immediately since it wouldn't be usable.
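To make the kind of strict check discussed here concrete, below is a minimal sketch of an OU length lint. The 64-character limit is the `ub-organizational-unit-name` upper bound from RFC 5280, Appendix A. Everything else is an assumption for illustration: the helper name `oversized_ou_values` and the pre-parsed `subject` list are hypothetical, and a real linter or browser would parse the DER-encoded certificate rather than take tuples.

```python
# Upper bound for organizationalUnitName from RFC 5280, Appendix A.
UB_ORGANIZATIONAL_UNIT_NAME = 64


def oversized_ou_values(subject):
    """Return the OU values that exceed the RFC 5280 upper bound.

    `subject` is a hypothetical, already-parsed list of
    (attribute, value) pairs; it is not the API of any real library.
    """
    return [
        value
        for attr, value in subject
        if attr == "OU" and len(value) > UB_ORGANIZATIONAL_UNIT_NAME
    ]


# Illustrative subject with a 65-character OU, as in the example above.
subject = [("O", "Example Org"), ("OU", "X" * 65), ("CN", "www.example.com")]
violations = oversized_ou_values(subject)
if violations:
    print(f"reject: {len(violations)} OU value(s) exceed "
          f"{UB_ORGANIZATIONAL_UNIT_NAME} characters")
```

The same check could run at the CA before issuance or in the client at validation time; the argument in this thread is about which side should be expected to enforce it.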
What I meant to say in my original argument is that the "damage" created
by a certificate that fails to strictly comply with RFC5280 and the rest
of the X.* standards, as long as popular browsers "allow it", is
primarily an issue between a Subscriber (that maintains a web site), and
the particular Relying Parties that want to establish a secure
connection to that web site. That's not the entire Internet. This is why
I compared it with "a situation where a site operator forgets to send
the intermediate CA Certificate in the chain. These particular RPs will
fail to get TLS working when they visit the Subscriber's web site".
Perhaps I have misunderstood your argument, but when we are discussing
revocation timelines, it looks a little extreme to say that a CA
claiming "some important reasons" (I'm not saying whether they are valid
reasons or not) for delaying a certificate revocation has
zero incentive to follow the standards.
Finally, CAs are terrible at assessing the risk to RPs. For example,
negative serial numbers were prolific prior to the linters, and those
have issues in as much as they are, for some systems, irrevocable.
This is because those systems implemented the standards correctly -
serials are positive INTEGERs - yet had to account for the fact that
CAs are improperly encoding them, such as by "making" them positive
(adding the leading zero). This leading zero then doesn't get stripped
off when looking up by Issuer & Serial Number, because they're using
the "spec-correct" serial rather than the "issuer-broken" serial.
That's an example where the certificate "works", no report is filed,
but the security and ecosystem properties are fatally compromised. The
alternatives for such implementation are:
1) Reject such certificates (but see above about market forces and
interoperability)
2) Correct both the certificate and the CRL/OCSP serial number (which
then creates risk because you're not actually checking _any_
certificate's true serial)
3) Allow negative serial numbers (which then makes it harder for
others to do #1)
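The lookup mismatch described above can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular verifier works: the serial bytes, the issuer name, and the helpers `force_positive` and `is_revoked` are all hypothetical. The only fact relied on is that DER INTEGER content octets are two's complement, so a leading byte with the high bit set encodes a negative value.

```python
def der_integer_value(content: bytes) -> int:
    """Decode DER INTEGER content octets (two's complement, big-endian)."""
    return int.from_bytes(content, byteorder="big", signed=True)


def force_positive(content: bytes) -> bytes:
    """Hypothetical workaround: prepend 0x00 when the high bit is set,
    turning an improperly encoded serial into a positive INTEGER."""
    if content and content[0] & 0x80:
        return b"\x00" + content
    return content


def is_revoked(issuer: str, serial_content: bytes, revoked: set) -> bool:
    """Look up a certificate by (issuer, serial) in a set of revoked entries."""
    return (issuer, serial_content) in revoked


# Serial as the CA (improperly) encoded it in both certificate and CRL.
issuer_broken = bytes.fromhex("8f01")
assert der_integer_value(issuer_broken) < 0  # negative under DER rules

# The verifier "corrects" the certificate's serial to a positive value.
spec_correct = force_positive(issuer_broken)  # 00 8f 01

# The CRL still carries the serial exactly as the CA wrote it.
crl_revoked = {("CN=Example CA", issuer_broken)}

print(is_revoked("CN=Example CA", spec_correct, crl_revoked))   # False
print(is_revoked("CN=Example CA", issuer_broken, crl_revoked))  # True
```

The first lookup misses even though the certificate is on the CRL, which is the sense in which such a certificate "works" yet cannot be revoked for that system.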
As I said, CAs have been terrible at assessing risk to the ecosystem
for their decisions. The page at
https://wiki.mozilla.org/SecurityEngineering/mozpkix-testing#Things_for_CAs_to_Fix
shows how badly such interoperability issues harm improvements - for example,
all of these hacks that Mozilla had to add in order to ship a more
secure, more efficient certificate verifier.
As I said earlier, times change. The bar is raised, this industry
matures day-after-day, things are hopefully improving (security-wise).
There is certainly more security awareness today for this ecosystem than
there was 5 or 10 years ago. Specifically for these "past sins", we have
seen browsers using telemetry to see how many certificates fail to
follow specific requirements, and we should normally see these numbers
decrease over time. Once these numbers reach an acceptably low level, we
usually see code changes that enforce these requirements and remove the
"hacks". Of course, this is a different topic for discussion.
In conclusion, after repeatedly seeing CAs requesting or effectively
taking more time to revoke certificates than the existing requirements allow,
I believe that a policy rule that would require CAs to disclose
revocation cases requiring more than 5 days to complete (i.e. revoke the
certificate), provided that the CA submits risk analysis information
after working with the affected Subscriber(s), is a reasonable way forward.
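As a rough sketch of how such a rule could be checked mechanically, the snippet below flags revocation cases that have gone past 5 calendar days. It assumes the clock starts when the CA becomes aware of the problem. The function name `needs_disclosure` and the timestamps are hypothetical; nothing here is taken from the BRs or from Mozilla policy text.

```python
from datetime import datetime, timedelta, timezone

# The 5-day window discussed in this thread (calendar days, not working days).
FIVE_DAYS = timedelta(days=5)


def needs_disclosure(problem_known_at, revoked_at, now):
    """Return True if revocation took, or is still taking, more than 5 days.

    `revoked_at` is None while the certificate has not been revoked yet,
    in which case the elapsed time is measured up to `now`.
    """
    end = revoked_at if revoked_at is not None else now
    return end - problem_known_at > FIVE_DAYS


# Illustrative timestamps only.
known = datetime(2018, 11, 20, 9, 0, tzinfo=timezone.utc)
now = datetime(2018, 11, 29, 9, 0, tzinfo=timezone.utc)

print(needs_disclosure(known, known + timedelta(days=3), now))  # False
print(needs_disclosure(known, None, now))                       # True
```

A case flagged this way would be one where the CA is expected to publish its risk analysis.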
Dimitris.