On Mon, Sep 23, 2019 at 11:53 PM Andy Warner via dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:
> The practice of revoking non-issued certificates would therefore lead to
> CRL growth which would further make reliable revocation checking on
> bandwidth constrained clients more difficult.

As others have pointed out, it sounds like GTS is confused. This only applies if you need to revoke them. I’m not sure how many times it bears repeating, but the suggestion that you need to revoke if you issued a precert, but not the cert, is patently absurd. Among other things, as you point out, it causes both CRL and OCSP growth. Luckily, every browser participating has seemingly tried to make it clear that’s not expected. So one objection handled :)

> 2. There seem to be a number of assumptions that precertificate issuance
> and certificate issuance is roughly atomic. In reality, a quorum of SCTs is
> required prior to final certificate issuance, so that is not the case.

Could you point to an example? All of the conversation I’ve seen has highlighted that they are sequential, and can suffer a variety of delays or errors. This is why the conversation has been about the least error-prone approach, in that it leads to the most consistent externally observable results. I admit, I’m honestly not sure what part of the conversation is being referred to here.

> As a result of this, the existence of a precertificate is possible without
> a final certificate having been issued.

Yup. And it’s been repeatedly acknowledged that this is perfectly fine. The proposed language further considers that, but emphasizes that, having produced and logged the precertificate, the CA should be prepared to provision services for it for the duration, regardless of what goes wrong afterwards. If you find yourself continually generating precertificates without issuing the corresponding certificates, that suggests an operational/design issue, which you can remediate based on whichever is cheaper for you: fixing the pipeline to be reliable (as many reliability issues seen, to date, have been on the CA side) or continuing to provision services when things go bad. Either works; you can choose. The important part is that you need to treat (pre-cert || cert) as in scope for all your activities. You must be capable of revoking. You must be capable of searching your databases. You must be capable of validating.

> 3. This raises the question of how much time a CA has from the time they
> issue a precertificate to when the final certificate must be issued.

It doesn’t, because it’s a flawed understanding that’s been repeatedly addressed: you don’t have to issue the final certificate. But you MUST be prepared to provision services as if you had. In general, this means you provision and distribute those services ahead of time. My reply to Dimitris earlier today provided a road map to an error-prone design, as well as two different ways of accomplishing a compliant design. Given that GTS is responding to that thread, I’m surprised to see it come up again so quickly, as it seems like GTS may not have understood?

> Likewise, there is the question of how soon the revocation information must
> be produced and reachable by an interested party (e.g. someone who has
> never seen the certificate in question but still wants to know the status
> of that certificate). [Aside, Wayne, you specifically said relying parties
> earlier, did you intend to say interested party or relying party? We have
> some additional questions if relying party was actually intended, as using
> it in that context seems to redefine what a relying party is.]

I cannot see how it redefines what a relying party is: anyone who decides to trust GTS becomes a relying party of GTS, and using the term in this context does not change anything. The question of how soon has been raised earlier, but again is addressed by earlier replies. We’ve seen the problems with CAs arguing CDN distribution. There is no reasonable way that the relying party community can or should accept phased rollout delays as compliant, particularly with a 24-hour revocation timeline (for example).

A common approach to this is to pregenerate responses for distribution, with edge caches (RFC 5019-style) that can talk to an authoritative origin (RFC 6960-style) under the hood. If a client queries for the status, the edge cache serves it if it’s a cache hit, and otherwise communicates back to the origin and pulls the response into the cache. This is perhaps unsurprising, as it’s the model many active CDNs use, functioning, as it were, as a reverse proxy with caching (and the ability to globally evict from the cache).
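For what it’s worth, a rough sketch of that pattern, in Go, might look like the following. This is purely illustrative, not any CA’s actual implementation: the type names, the origin URL, and the fixed TTL are placeholders, and a real RFC 5019 deployment would also serve GET requests and honor each response’s own thisUpdate/nextUpdate rather than a local TTL.

```go
// Sketch of an RFC 5019-style edge cache in front of an RFC 6960-style
// authoritative OCSP origin. Names, URL, and TTL are placeholders.
package main

import (
	"bytes"
	"crypto/sha256"
	"io"
	"log"
	"net/http"
	"sync"
	"time"
)

type cachedResponse struct {
	body    []byte
	expires time.Time
}

type edgeCache struct {
	mu     sync.RWMutex
	byReq  map[[32]byte]cachedResponse
	origin string        // authoritative responder behind the edge
	ttl    time.Duration // edge lifetime; a real deployment would track nextUpdate
}

// ServeHTTP answers POSTed OCSP requests from the cache when possible and
// otherwise pulls a pre-generated response from the origin into the cache.
func (c *edgeCache) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	der, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 4096))
	if err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	key := sha256.Sum256(der)

	c.mu.RLock()
	cr, ok := c.byReq[key]
	c.mu.RUnlock()
	if ok && time.Now().Before(cr.expires) {
		w.Header().Set("Content-Type", "application/ocsp-response")
		w.Write(cr.body)
		return
	}

	resp, err := http.Post(c.origin, "application/ocsp-request", bytes.NewReader(der))
	if err != nil {
		http.Error(w, "origin unavailable", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		http.Error(w, "origin unavailable", http.StatusBadGateway)
		return
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		http.Error(w, "origin unavailable", http.StatusBadGateway)
		return
	}

	c.mu.Lock()
	c.byReq[key] = cachedResponse{body: body, expires: time.Now().Add(c.ttl)}
	c.mu.Unlock()

	w.Header().Set("Content-Type", "application/ocsp-response")
	w.Write(body)
}

// Evict drops every cached entry, e.g. when revocation status changes, so the
// next query for any certificate is answered with the origin's fresh response.
func (c *edgeCache) Evict() {
	c.mu.Lock()
	c.byReq = make(map[[32]byte]cachedResponse)
	c.mu.Unlock()
}

func main() {
	cache := &edgeCache{
		byReq:  make(map[[32]byte]cachedResponse),
		origin: "http://ocsp-origin.internal/", // hypothetical origin URL
		ttl:    time.Hour,
	}
	log.Fatal(http.ListenAndServe(":8080", cache))
}
```

The relevant property is that the origin holds the single definitive answer and the edges only ever serve copies of it, with a global eviction (or simply a short TTL) bounding how stale any edge can be.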
Since the CA already needs to ensure that they can have a globally consistent response distributed within 24 hours, and since any time spent synchronizing is time that the CA itself cannot use to investigate or respond, this design discourages CAs from multi-hour rollouts (and that’s a good thing). If you can’t meet those timelines, then you’re setting yourself up for a CA incident that other CAs will have designed around.

If you think through the logical consequences for relying parties, it’s clear that there are approaches CAs can use that are harmful, and there are approaches they can use that are helpful. As publicly trusted CAs, they are expected to be beyond reproach, and to make every decision with the relying parties’ interests at heart: not the Subscriber’s, not the Applicant’s, not the CA’s. Something about putting the user first, and the user here is everyone that will trust a certificate from that CA.

> This “reachable” part is particularly meaningful in that when using a CDN
> there are often phased roll outs that can take hours to complete. Today,
> the BRs leave this ambiguous, the only statement in this area is that new
> information must be published every four days:
>
> "The CA SHALL update information provided via an Online Certificate Status
> Protocol at least every four days. OCSP responses from this service MUST
> have a maximum expiration time of ten days."

It’s not ambiguous. Read 4.9.1.1 and 7.1.2.3(c). You aren’t providing a responder if it can’t answer for four days, and you aren’t meeting the revocation timeline if you aren’t publishing revocation information within 24 hours.

The normal timeline:
- Upon issuance, the definitive response is available
- That definitive response is refreshed at least every four days
- While the BRs’ maximum is ten days, a reminder that Microsoft sets a minimum of 8 hours, requires the maximum be 7 days, and requires new information be available at half that - e.g. 3.5 days
- The responder should maintain global consistency (e.g. if using RFC 5019, this is easier)

When revoking:
- That response should be globally available and published within 24 hours or five days, depending on the reason for revocation.
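Sketched as date arithmetic (with made-up timestamps; the constants are just the BR and Microsoft figures above), that timeline looks like this:

```go
// Worked example of the OCSP freshness deadlines described above.
// Constants mirror the BR and Microsoft figures quoted in this thread;
// the timestamps are purely illustrative.
package main

import (
	"fmt"
	"time"
)

func main() {
	thisUpdate := time.Date(2019, time.September, 23, 0, 0, 0, 0, time.UTC)

	// BRs (quoted above): update at least every four days, expire within ten.
	brRefreshBy := thisUpdate.Add(4 * 24 * time.Hour)
	brMaxNextUpdate := thisUpdate.Add(10 * 24 * time.Hour)

	// Microsoft: validity between 8 hours and 7 days, with new information
	// available at half the validity period.
	msMaxNextUpdate := thisUpdate.Add(7 * 24 * time.Hour)
	msRefreshBy := thisUpdate.Add(7 * 24 * time.Hour / 2) // 3.5 days

	// The binding refresh deadline is the earliest applicable requirement.
	refreshBy := brRefreshBy
	if msRefreshBy.Before(refreshBy) {
		refreshBy = msRefreshBy
	}

	fmt.Println("latest nextUpdate (BRs):      ", brMaxNextUpdate)
	fmt.Println("latest nextUpdate (Microsoft):", msMaxNextUpdate)
	fmt.Println("fresh response needed by:     ", refreshBy)

	// When revoking, the revoked status must be globally served within
	// 24 hours (or 5 days, depending on the reason) of the decision.
	revokedAt := time.Date(2019, time.September, 24, 12, 0, 0, 0, time.UTC)
	fmt.Println("revocation visible by (24h):  ", revokedAt.Add(24*time.Hour))
	fmt.Println("revocation visible by (5d):   ", revokedAt.Add(5*24*time.Hour))
}
```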
> With this change, it would seem there needs to be a lower bound defined for
> how quickly the information needs to be available if it is to be an
> effective monitoring tool.

Again, it sounds like GTS hasn’t been following the thread or the updates, which have clarified why the presumed gap (between precert and cert) is irrelevant, and thus why a lower bound is not needed here. This only becomes an issue if GTS is responding “unknown” for several hours after issuing certs - but by that logic, GTS is not providing responders for several hours after issuance, which is a BR violation today.

> * Clarifications
>
> This in turn raises the question if CAs can re-use authorization data such
> as CAA records or domain authorizations from the precertificate?

It doesn’t, because the BRs answer this, if GTS reads them. Specifically, 3.2.2.8 answers this for CAA.

> If a final certificate has not been issued due to a persistent quorum
> failure, and that failure persists longer than the validity of the used
> authorization data, can the authorizations that were done prior to the
> precertificate issuance be re-used?

It seems a responsible CA would answer “No”, and ensure that the validity period of any information they use covers (pre-cert issuance time + the time they’re willing to wait for SCTs). They would avoid this whole issue by avoiding trying to do the “least possible” and recognizing that they have the flexibility to unambiguously avoid any compliance issues here.
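As a sketch, that check is a one-liner; the helper name and the pipeline values below are hypothetical, and the 8-hour figure is the CAA reuse limit from BR 3.2.2.8:

```go
// Sketch: confirm previously gathered authorization data still covers the
// worst-case final issuance time (precert issuance + maximum SCT wait).
package main

import (
	"fmt"
	"time"
)

// validationCoversIssuance reports whether authorization data gathered at
// validatedAt, reusable for reusePeriod, still covers the worst-case final
// issuance time: precertificate issuance plus the maximum SCT wait.
func validationCoversIssuance(validatedAt time.Time, reusePeriod time.Duration,
	precertIssuedAt time.Time, maxSCTWait time.Duration) bool {
	expires := validatedAt.Add(reusePeriod)
	worstCase := precertIssuedAt.Add(maxSCTWait)
	return !worstCase.After(expires)
}

func main() {
	validatedAt := time.Date(2019, time.September, 20, 0, 0, 0, 0, time.UTC)
	caaReuse := 8 * time.Hour // BR 3.2.2.8: CAA checked within 8 hours of issuance (or the record TTL, if greater)
	precertIssuedAt := time.Date(2019, time.September, 20, 7, 0, 0, 0, time.UTC)
	maxSCTWait := 2 * time.Hour // hypothetical pipeline policy

	if validationCoversIssuance(validatedAt, caaReuse, precertIssuedAt, maxSCTWait) {
		fmt.Println("prior CAA check still covers the worst-case issuance time")
	} else {
		fmt.Println("re-check CAA (or abandon issuance); do not reuse the stale check")
	}
}
```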
> As such, in our opinion, a roll out period to enable software and
> deployment changes to be made would be appropriate. Had this conversation
> taken place within the CA/Browser forum, the implementation date would have
> been discussed before becoming a formal requirement. We leave it to
> Browsers to determine reasonable timelines and we're not seeking to delay,
> simply recognition that many changes take time to implement and it is tough
> to effectively respond to changes that become new requirements in an
> instant.

This is entirely unproductive and unhelpful, because it talks around the issue. This is the behaviour we, the community, largely see from problematic CAs that don’t put users’ security first. If you think there’s an issue with the date, a productive, useful contribution would be to:
- Highlight when
- Highlight why

However, none of the discussion “should” be a functional change for any CA following the rules. Even as a clarification of expectations, it’s trivial to resolve and get into compliance, judging by the responses we’ve seen from CAs to date.

I’m most encouraged, and most discouraged, that it seems even still today, GTS is having trouble understanding what’s proposed, and is seeing things that simply aren’t there, rather than the things that are. Hopefully, the clarifications to this thread, showing GTS has not followed the conversation, do much to assuage the concerns that GTS is being asked to implement major changes. The only major change I can see is that it sounds like GTS may have had other compliance issues with its responder services, likely similarly based on misunderstanding the requirements as a publicly trusted CA. As I said, that’s encouraging and discouraging.

I know that’s far more direct than Wayne would be, but any publicly trusted CA that’s been following this Forum should recognize that GTS is following a playbook used by CAs to push back on security improvements unconstructively, as a stalling tactic that usually exists to hide or paper over non-compliance, and then to argue that the non-compliance was because of something ambiguous, rather than thinking through the logical consequences of their decisions. This isn’t to say pushing back gets you branded as a poor CA; however, this specific approach, still lacking in actionable data and misunderstanding both Root Policy and the CA/B Forum, absolutely causes a crisis of confidence in the CAs that do this, and for good reason, as time has borne out.

> Browsers should set whatever requirements they believe are in the best
> interest of their users, but the more requirements are split across
> multiple root programs' requirements, the CA/Browser Forum and IETF, the
> harder it becomes to reason about what a correct behavior is. Given the
> historical precedent of rule making in CA/Browser forum and the fact that
> it covers all participants, it seems like the ideal body to ensure
> consistency within the ecosystem.

This is ahistorical. The historic precedent is and has been root program requirements, especially with respect to matters like the topic at hand, eventually flowing into the BRs. That a CA would suggest that the CA/B Forum is a better place for discussing this than here deeply saddens me, precisely because of the intentional exclusion from the Forum of a number of people who have made incredibly valuable and worthwhile contributions. Honestly, I suppose I had expected GTS to value the openness and transparency this list provides over the Forum.

It is true that CAs bear the responsibility of following the rules of the root programs they are in, and that can be complex if those requirements aren’t centrally codified. A trivial solution exists for this, but one which CAs rarely avail themselves of: they can draft text (if they aren’t voting members) or prepare and sponsor ballots (if they are) to incorporate the root program requirements into the BRs. For example, I continue to remain shocked that no CA has thought to do so with Microsoft’s OCSP requirements, or with both Microsoft and Mozilla’s EKU requirements. However, since this neither changes the expectations in the BRs nor requires anything new of CAs, and merely explains the logical consequences of RFC 5019/6960 to those who may struggle with them, it does not seem to rise at all to the level suggested here.