I didn't want to hijack the thread so here's a new one.

On 29/11/2018 6:39 PM, Ryan Sleevi wrote:


On Thu, Nov 29, 2018 at 2:16 AM Dimitris Zacharopoulos <[email protected]> wrote:

    Mandating that CAs disclose revocation situations that exceed the
    5-day requirement with some risk analysis information, might be a
    good place to start.

This was proposed several times by Google in the Forum, and consistently rejected, unfortunately.

Times and circumstances change. When I brought this up at the Server Certificate Working Group of the CA/B Forum (https://cabforum.org/pipermail/servercert-wg/2018-September/000165.html), there was no open disagreement from CAs. However, think about CAs that decide to extend the 5 days (at their own risk) because of extenuating circumstances. Doesn't this community want to know what these circumstances are and evaluate the gravity (or not) of the situation? The only way this could happen in a consistent way among CAs would be to require it in some kind of policy.

This list has seen disclosures of revocation cases from CAs, mainly as part of incident reports. What I understand as disclosure is the fact that CAs shared that certain Subscribers (we know these subscribers because their Certificates were disclosed as part of the incident report) would be damaged if the mis-issued certificates were revoked within 24 hours. Now, depending on the circumstances this might be extended to 5 days.

    I don't consider 5 days (they are not even working days) to be
    adequate warning period to a large organization with slow reflexes
    and long procedures.

Phrased differently: You don't think large organizations are currently capable, and believe the rest of the industry should accommodate that.

"Tolerate" would probably be the word I'd use instead of "accommodate".


Do you believe these organizations could respond within 5 days if their internet connectivity was lost?

I think the impact is different. Losing network connectivity would have a "real" and large impact (i.e. on all RPs), compared to installing a certificate with, say, 65 characters in the OU field, which may cause very few problems to some RPs that want to use a certain web site.


    For example, if many CAs violate the 5-day rule for revocations
    related to improper subject information encoding, out of range,
    wrong syntax and that sort, Mozilla or the BRs might decide to have
    a separate category with a different time frame and/or different
    actions.


Given the security risks in this, I think this is extremely harmful to the ecosystem and to users.

    It is not the first time we talk about this and it might be worth
    exploring further.


I don't think any of the facts have changed. We've discussed for several years that CAs have the opportunity to provide this information, and haven't, so I don't think it's at all proper to suggest starting a conversation without structured data. CAs that are passionate about this could have supported such efforts in the Forum to provide this information, or could have demonstrated doing so on their own. I don't think it would at all be productive to discuss these situations in abstract hypotheticals, as some of the discussions here try to do - without data, that would be an extremely unproductive use of time.

There were voices during the SC6 ballot discussion that wanted to extend the 5 days to something more. We continuously see CAs that either detect or learn about having mis-issued Certificates, and that fail to revoke within 24 hours or even 5 days because their Subscribers have problems and the RPs would be left with no service until the certificates were replaced. I don't think we are having a hypothetical discussion; we have seen real cases being disclosed in m.d.s.p., but it would be important to have a policy in place to require disclosure of more information. Perhaps that would work as a deterrent against CAs revoking past the 5 days if they don't have strong arguments to support their decisions in public.

    As a general comment, IMHO when we talk about RP risk when a CA
    issues a Certificate with -say- longer than 64 characters in an OU
    field, that would only pose risk to Relying Parties *that want to
    interact with that particular Subscriber*, not the entire Internet.

No. This is demonstrably and factually wrong.

First, we already know that technical errors are a strong sign that the policies and practices themselves are not being followed - both the validation activities and the issuance activities result from the CA following its practices and procedures. If a CA is not following its practices and procedures, that's a security risk to the Internet, full stop.

You describe it as a black/white issue. I understand your argument that other control areas will likely have issues but it always comes down to what impact and what damage these failed controls can produce. Layered controls and compensating controls in critical areas usually lower the risk of severe impact. The Internet is probably safe and will not break if for example a certificate with 65-character OU is used on a public web site. It's not the same as a CA issuing SHA1 Certificates with collision risk.


Second, it presumes (incorrectly) that interoperability is not something valuable. That is, if say the three existing, most popular implementations all do not check whether or not it's longer than 64 characters (for example), and a fourth implementation would like to come along, they cannot read the relevant standards and implement something interoperable. This is because 'interoperability' is being redefined as 'ignoring' the standard - which defeats the purpose of standards to begin with. These choices - to permit deviations - create risks for the entire ecosystem, because there's no longer interoperability. This is equally captured in https://tools.ietf.org/html/draft-iab-protocol-maintenance-01

The premise to all of this is that "CAs shouldn't have to follow rules, browsers should just enforce them," which is shocking and unfortunate. It's like saying "It's OK to lie about whatever you want, as long as you don't get caught" - no, that line of thinking is just as problematic for morality as it is for technical interoperability. CAs that routinely violate the standards create risk, because they have full trust on the Internet. If the argument is that the CA's actions (of accidentally or deliberately introducing risk) are the problem, but that we shouldn't worry about correcting the individual certificate, that entirely misses the point that without correcting the certificate, there's zero incentive to actually follow the standards, and as a result, that creates risk for everyone. Revocation, if you will, is the "less worse" alternative to complete distrust - it only affects that single certificate, rather than every one of the certificates the CA has issued. The alternative - not revoking - simply says that it's better to look at distrust options, and that's more risk for everyone.


I absolutely agree that interoperability is something valuable that should be pursued by the ecosystem. Browsers and the majority of CAs work in that direction. It's just that if a browser strictly enforces a requirement from a standard (e.g. rejects a certificate that has an OU field with more than 64 characters), it makes a huge difference towards the goal of interoperability compared to a CA that just issues certificates with a maximum of 64 characters in the OU. If browsers enforced these rules, the difference would be so big that the problematic certificate would be immediately discovered by the Subscriber, who would complain to the CA, and the Certificate would most likely be revoked immediately since it wouldn't be usable.
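
To make the "strict enforcement" case concrete, here is a minimal sketch in Python of the kind of check a relying party or linter could apply. The 64-character bound is the ub-organizational-unit-name upper bound from RFC 5280; the function names and the idea of passing OU values as plain strings are assumptions made for illustration only, not any browser's actual code.

```python
from typing import List

# Upper bound for organizationalUnitName from RFC 5280 (ub-organizational-unit-name).
UB_ORGANIZATIONAL_UNIT_NAME = 64


def ou_violations(ou_values: List[str]) -> List[str]:
    """Return the OU values that exceed the upper bound (hypothetical helper)."""
    return [ou for ou in ou_values if len(ou) > UB_ORGANIZATIONAL_UNIT_NAME]


def strict_client_accepts(ou_values: List[str]) -> bool:
    """A strict client rejects the certificate if any OU value is too long."""
    return not ou_violations(ou_values)


# A 64-character OU passes; a 65-character OU is rejected by the strict client.
ok_ou = ["A" * 64]
bad_ou = ["A" * 65]
print(strict_client_accepts(ok_ou), strict_client_accepts(bad_ou))
```

With a check like this in the client, the 65-character certificate would fail on first use, and the Subscriber would notice right away.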

What I meant to say in my original argument is that the "damage" created by a certificate that fails to strictly comply with RFC5280 and the rest of the X.* standards, as long as popular browsers "allow it", is primarily an issue between a Subscriber (that maintains a web site), and the particular Relying Parties that want to establish a secure connection to that web site. That's not the entire Internet. This is why I compared it with "a situation where a site operator forgets to send the intermediate CA Certificate in the chain. These particular RPs will fail to get TLS working when they visit the Subscriber's web site".

Perhaps I have misunderstood your argument, but when we are discussing revocation timelines, it looks a little extreme to say that a CA claiming "some important reasons" (I'm not saying if they are valid reasons or not) for delaying a certificate revocation has zero incentive to follow the standards.


Finally, CAs are terrible at assessing the risk to RPs. For example, negative serial numbers were prolific prior to the linters, and those have issues in as much as they are, for some systems, irrevocable. This is because those systems implemented the standards correctly - serials are positive INTEGERs - yet had to account for the fact that CAs are improperly encoding them, such as by "making" them positive (adding the leading zero). This leading zero then doesn't get stripped off when looking up by Issuer & Serial Number, because they're using the "spec-correct" serial rather than the "issuer-broken" serial. That's an example where the certificate "works", no report is filed, but the security and ecosystem properties are fatally compromised. The alternatives for such implementations are:

1) Reject such certificates (but see above about market forces and interoperability)
2) Correct both the certificate and the CRL/OCSP serial number (which then creates risk because you're not actually checking _any_ certificate's true serial)
3) Allow negative serial numbers (which then makes it harder for others to do #1)
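
The lookup mismatch described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the issuer name, the serial bytes, the set-based revocation store, and the helper names are all hypothetical and do not model any particular implementation. It relies on one fact about DER: an INTEGER whose first content octet has the high bit set is negative in two's complement, unless a leading 0x00 octet is present.

```python
def is_negative_der_integer(content: bytes) -> bool:
    """DER INTEGER content octets are two's complement: high bit set means negative."""
    return len(content) > 0 and (content[0] & 0x80) != 0


def make_positive(content: bytes) -> bytes:
    """Hypothetical client workaround: prepend 0x00 so the value reads as positive."""
    return b"\x00" + content if is_negative_der_integer(content) else content


# Serial as (mis)encoded by the CA: high bit set, no leading zero octet.
issuer_broken = bytes.fromhex("8f3a01")
# Serial as the client stores it after "fixing" the certificate.
spec_correct = make_positive(issuer_broken)

# Revocation data published by the issuer still carries the broken encoding.
revoked = {("Example CA", issuer_broken)}


def is_revoked(issuer: str, serial_octets: bytes) -> bool:
    """Look up revocation status by exact (issuer, serial octets) match."""
    return (issuer, serial_octets) in revoked


# The broken encoding is negative; the corrected one is positive.
print(int.from_bytes(issuer_broken, "big", signed=True) < 0)
print(int.from_bytes(spec_correct, "big", signed=True) > 0)
# The lookup with the corrected serial misses the revocation entry.
print(is_revoked("Example CA", spec_correct), is_revoked("Example CA", issuer_broken))
```

The certificate still "works" for the client, yet the revocation entry is never found, which is the sense in which such a certificate becomes irrevocable for that system.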

As I said, CAs have been terrible at assessing risk to the ecosystem for their decisions. The page at https://wiki.mozilla.org/SecurityEngineering/mozpkix-testing#Things_for_CAs_to_Fix shows how badly such interoperability problems harm improvements - for example, all of these hacks that Mozilla had to add in order to ship a more secure, more efficient certificate verifier.

As I said earlier, times change. The bar is raised, this industry matures day after day, and things are hopefully improving (security-wise). There is certainly more security awareness today for this ecosystem than there was 5 or 10 years ago. Specifically for these "past sins", we have seen browsers using telemetry to see how many certificates fail to follow specific requirements, and we should normally see these numbers decrease over time. Once these numbers reach an acceptably low level, we usually see code changes that enforce these requirements and remove the "hacks". Of course, this is a different topic for discussion.

In conclusion, after repeatedly seeing CAs requesting or effectively taking more time to revoke certificates than the existing requirements allow, I believe that a policy rule that would require CAs to disclose revocation cases requiring more than 5 days to complete (i.e. revoke the certificate), provided that the CA submits risk analysis information after working with the affected Subscriber(s), is a reasonable way forward.


Dimitris.

