Honestly, I like this approach. I think that describing the whole affected population in the Impact section, while only providing the full certificate data in the Appendix for the still-valid certificates, is a good place to land. Specifically, I think that it is most useful for the Appendix to list exactly the set of certificates that will be (or should be) revoked as a result of the incident. This makes it easy for community members to verify that all listed certificates have in fact been revoked as promised, and to determine whether specific certificates identified by third parties are included in the list.
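As a rough illustration of the second check, here is a minimal sketch of how a community member might test whether specific certificates appear in an incident report's Appendix. It assumes the Appendix is a plain-text list of crt.sh links of the form `https://crt.sh/?sha256=<fingerprint>`, one per line; the function names and the placeholder fingerprints are hypothetical, and a real list may need different parsing.

```python
from urllib.parse import urlparse, parse_qs


def fingerprints_from_appendix(lines):
    """Collect SHA-256 fingerprints from crt.sh links, one link per line."""
    found = set()
    for line in lines:
        query = parse_qs(urlparse(line.strip()).query)
        # Lines without a sha256 query parameter contribute nothing.
        for value in query.get("sha256", []):
            found.add(value.lower())
    return found


def missing_from_appendix(reported, appendix_lines):
    """Return the reported fingerprints that the Appendix does not list."""
    listed = fingerprints_from_appendix(appendix_lines)
    return sorted(fp.lower() for fp in reported if fp.lower() not in listed)


# Hypothetical data: placeholder fingerprints, not real certificates.
appendix = [
    "https://crt.sh/?sha256=" + "AA" * 32,
    "https://crt.sh/?sha256=" + "bb" * 32,
]
third_party_reports = ["aa" * 32, "cc" * 32]

# Prints the fingerprints a third party reported that the Appendix omits.
print(missing_from_appendix(third_party_reports, appendix))
```

This only covers membership in the list. Confirming that each listed certificate has actually been revoked would additionally require checking the CA's CRLs or OCSP responses, which the sketch does not attempt.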
I think your illustrative language is aimed in exactly the right direction, and I especially like the idea of the "incident heuristic". My two suggestions for refinement would be:

1) In the Impact section, don't bother distinguishing between precertificates and final certificates by default (since these numbers are *usually* nearly identical), but make a note that the CA definitely should list them separately if the incident affected precerts and final certs differently.

2) In the Appendix, maybe add to the "preferred format" that we prefer crt.sh sha256 links to precertificates specifically. This removes any ambiguity, helps ensure that CAs don't forget to list certs for which precert issuance succeeded but final issuance failed, and means that CAs don't have to try to submit all the affected final certs to CT and then wait for crt.sh to ingest them as part of incident response.

Thanks,
Aaron

On Thu, Apr 11, 2024 at 1:11 PM Ryan Dickson <[email protected]> wrote:

> Hi Aaron,
>
> You raise some excellent points. Thanks for your feedback. It seems like
> (1) we generally agree on a goal of providing the most complete set of data
> possible, and (2) there’s an opportunity to balance our desires for the
> completeness, usefulness, and practicality of the certificate data in
> question.
>
> With this community’s help (especially yours), the CCADB Steering
> Committee launched an updated Incident Report Template
> <https://www.ccadb.org/cas/incident-report> in October 2023. Though this
> template has only been in use for a short while, we believe there are
> opportunities to further promote consistency and transparency in Incident
> Reporting.
>
> One enhancement idea was to include a bulleted list of specific questions
> that would better guide responses for all given sections rather than the
> current free-form response approach based on some of the section
> descriptions on CCADB.org <http://ccadb.org> (i.e., the Impact section).
> The Impact and Appendix sections are similar in that they intend to
> describe the size and nature of the incident, and it's possible an
> enhancement to one can benefit the other.
>
> For example, the “Impact” section could be transformed…
>
> From (current): “The Impact section should contain a short description of
> the size and nature of the incident. For example: how many certificates,
> OCSP responses, or CRLs were affected; whether the affected objects share
> features (such as issuance time, signature algorithm, or validation type);
> and whether the CA Owner had to cease issuance during the incident.”
>
> To (illustrative): something like the following (intends to describe
> fields and expected responses)…
>
> - Total number of pre-certificates: [if applicable, the total count of
>   pre-certificates affected by the issue(s) described in this incident
>   report, including expired and revoked pre-certificates]
>
> - Total number of certificates: [if applicable, the total count of
>   "final" certificates affected by the issue(s) described in this incident
>   report, including expired and revoked certificates]
>
> - Total number of "remaining valid" certificates: [if applicable, the
>   total count of "final" certificates affected by the issue(s) described in
>   this incident report, minus expired and revoked certificates. Minimally,
>   this set of certificates MUST be disclosed in the Appendix section of this
>   report.]
>
> - Incident heuristic: [if applicable, EITHER: (a) describe a heuristic
>   that would allow a third party to assemble the full corpus of affected
>   certificates, if not provided in the Appendix (e.g., "Any certificate
>   containing policy OID 1.2.3.4.5.6 and issued between 11/13/2023 and
>   4/11/2024 is affected by this incident.
>   Certificates that have been revoked
>   or are expired are omitted from the certificate list disclosed to the
>   Appendix.") --- (b) clearly explain why this isn't possible (e.g., "This
>   incident affected every certificate issued between 5/25/2023 and 6/15/2024
>   that relied upon BR Validation Method 3.2.2.4.19. Because the relied upon
>   validation method is not described in a certificate, this heuristic cannot
>   be used by a third party to assemble the full corpus of affected
>   certificates. Certificates that have been revoked or expired have been
>   omitted from the certificate list disclosed to the Appendix."), --- or (c)
>   the full corpus of affected certificates is disclosed in the Appendix.]
>
> - Was issuance stopped in response to this incident, and why or why not?:
>   [yes/no - explanation (e.g., "Yes. As described in the incident timeline,
>   we stopped issuance after learning of this issue to correct the
>   corresponding certificate profile.")]
>
>
> This is just an example, and we might need to more thoughtfully consider
> incidents that don’t involve certificates before it can be considered for
> adoption. What’s helpful to us, though, is that this proposed approach can
> more consistently describe the impact of an incident - while also possibly
> offering a balance between our desire for completeness (i.e., satisfied by
> counts and a clear description of an incident heuristic) and practicality
> (only requiring disclosure of the “remaining valid” certificates in the
> Appendix). There might be unexpected benefits from this approach; for
> example, the heuristic may make it easier for other CA Owners to evaluate
> whether they share the same issue being reported.
>
> We’re interested in your feedback, and that of other community members, in
> how this might help better define community expectations - and further
> improve the incident reporting process.
>
> Thanks,
>
> Ryan
>
>
> On Thu, Apr 11, 2024 at 1:33 PM 'Aaron Gable' via CCADB Public <
> [email protected]> wrote:
>
>> In general, I agree that producing the most complete set of data possible
>> is the most desirable course of action. However, I wonder how this desire
>> interacts with the full scale of the WebPKI.
>>
>> Two years ago, Let's Encrypt had an incident
>> <https://bugzilla.mozilla.org/show_bug.cgi?id=1751984> which affected
>> 100% of our validations conducted via the TLS-ALPN-01 method. The end
>> result was 10 zstd-compressed files, 10 megabytes each (due to Bugzilla's
>> attachment size limits), together containing 2.7 million crt.sh URIs. Those
>> files represented only one version of each certificate: usually the final
>> certificate, but sometimes the precertificate for those issuances where
>> production of the final certificate failed. The files also represented only
>> the certificates which were unexpired at the time that the incident was
>> discovered, only 18% of the total incident period.
>>
>> If the list of affected certificates had included both the pre- and final
>> certificates, and had covered the full incident period, it would have been
>> a full gigabyte of compressed URLs. (10 MB per file x 10 files x 2 for both
>> kinds of certs x 5 to go from 18% to 100% of incident period.) And this
>> incident affected only a small fraction of Let's Encrypt's total issuance
>> volume -- if the issue had been with our HTTP-01 method over the same
>> incident period, the resulting list of URLs would have been nearly 20
>> gigabytes. Would this larger set of certificate data actually have been
>> useful to the community, given that they were already untrusted?
>>
>> Acquiring this fuller list would have significantly increased the time
>> taken to conduct the investigation.
>> Let's Encrypt prunes data about
>> already-expired certificates from our easily-queriable database to prevent
>> it from growing without bound, so the investigation would have had to start
>> pulling in log data, which is a much slower process for both writing and
>> executing the relevant queries. Would this additional investigation time,
>> and correspondingly slower incident response and remediation, have been
>> worthwhile?
>>
>> It is possible that the incident period could have exceeded the audit log
>> retention period (currently 2 years) required by the BRs. In that case,
>> producing a full list of certificates would have been impossible. The
>> certificates themselves contain no indication of what validation method was
>> used for each identifier, so reconstruction from CT doesn't work. What
>> would be the appropriate action if producing the full list of
>> historically-affected certificates is not possible?
>>
>> Thanks,
>> Aaron
>>
>> On Thu, Apr 11, 2024 at 10:03 AM 'Chris Clements' via CCADB Public <
>> [email protected]> wrote:
>>
>>> Hi Rob,
>>>
>>> Thank you for the comprehensive survey and for clearly communicating
>>> your findings.
>>>
>>> In response to your questions, and from the perspective of the Chrome
>>> Root Program:
>>>
>>>
>>>> 1. Is a CA's incident report expected to disclose the affected
>>>> certificates that have already expired prior to the CA's response to the
>>>> incident?
>>>>
>>>
>>> We see disclosing the full set of affected certificates, regardless of
>>> whether they have expired or have been revoked, as presenting the community
>>> with the most complete perspective of an incident’s impact. This is our
>>> preferred approach.
>>>
>>>
>>>> 2. Is a CA's incident report expected to disclose the affected
>>>> certificates that have already been revoked prior to the CA's response to
>>>> the incident?
>>>
>>>
>>> Yes, similar to the previous question, our preference is to collect the
>>> most complete perspective possible.
>>>
>>>> 3. Is a CA's incident report expected to disclose both an affected
>>>> precertificate and its corresponding certificate? Or just one of the pair?
>>>
>>>
>>> You raise an opportunity for improvement. Historically, a list of
>>> precertificates was considered acceptable. However, having both
>>> precertificates and final certificates provides a more comprehensive
>>> perspective, which we consider favorable.
>>>
>>> We appreciate other thoughts and perspectives.
>>>
>>> Additionally, we’ll plan to sync on these opinions with the other
>>> members of the CCADB Steering Committee, which could ultimately lead to an
>>> update of https://www.ccadb.org/cas/incident-report.
>>>
>>> Thanks again!
>>>
>>> -Chris
>>>
>>>
>>> On Mon, Apr 8, 2024 at 12:45 PM 'Rob Stradling' via CCADB Public <
>>> [email protected]> wrote:
>>>
>>>> In recent weeks, a number of CAs have filed incident reports relating
>>>> to mistakes made when setting critical flags in Subscriber certificate
>>>> extensions since the TLSBRv2 profiles came into force. We thought it would
>>>> be worth performing a comprehensive survey ourselves in order to discover
>>>> if any similar incidents at other CAs had not yet been detected.
>>>>
>>>> I've run [1] against the primary crt.sh DB, which caused it to trawl
>>>> through the crt.sh ID space starting around the time TLSBRv2 went into
>>>> force to identify any Subscriber certificate containing any common
>>>> extension with its critical flag set incorrectly per §7.1.2.7.6. I've
>>>> posted a report of the results at [2], which was generated using [3].
>>>>
>>>> Seven further incidents were identified. I sent Certificate Problem
>>>> Reports to the two CAs whose affected PKI hierarchies are trusted by root
>>>> programs whose representatives are active in monitoring Bugzilla. Both of
>>>> those CAs responded promptly and filed incident reports: [4] and [5].
>>>>
>>>> Having gathered this data, today I've used it to cross-check the lists
>>>> of affected certificates that CAs have provided with their incident
>>>> reports. I was surprised to find two bugs ([6] and [7]) without any
>>>> attached list of affected certificates. I also observed some patterns of
>>>> "omissions" in the disclosed lists of affected certificates, for which I
>>>> would like to call upon the root program owners to clarify their
>>>> expectations; noting that the CCADB incident reporting requirements [8] say
>>>> that each incident report's *"Appendix must include a listing of the
>>>> complete certificate details of all affected certificates"*:
>>>>
>>>> 1. Is a CA's incident report expected to disclose the affected
>>>>    certificates that have already expired prior to the CA's response to
>>>>    the incident?
>>>> 2. Is a CA's incident report expected to disclose the affected
>>>>    certificates that have already been revoked prior to the CA's response
>>>>    to the incident?
>>>> 3. Is a CA's incident report expected to disclose both an affected
>>>>    precertificate and its corresponding certificate? Or just one of the
>>>>    pair?
>>>>
>>>>
>>>> [1] https://gist.github.com/robstradling/6a5ecca872cf28232d90638fc2c44ed5#file-check_extension_criticality-go
>>>> [2] https://gist.github.com/robstradling/6a5ecca872cf28232d90638fc2c44ed5#file-report-csv
>>>> [3] https://gist.github.com/robstradling/6a5ecca872cf28232d90638fc2c44ed5#file-generate_report-sh
>>>> [4] https://bugzilla.mozilla.org/show_bug.cgi?id=1888060
>>>> [5] https://bugzilla.mozilla.org/show_bug.cgi?id=1888104
>>>> [6] https://bugzilla.mozilla.org/show_bug.cgi?id=1887096
>>>> [7] https://bugzilla.mozilla.org/show_bug.cgi?id=1883416
>>>> [8] https://www.ccadb.org/cas/incident-report
>>>>
>>>> --
>>>> Rob Stradling
>>>> Senior Research & Development Scientist
>>>> Sectigo Limited
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "CCADB Public" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/a/ccadb.org/d/msgid/public/MW4PR17MB47290848C0FE089BD12FA77AAA002%40MW4PR17MB4729.namprd17.prod.outlook.com
>>>> <https://groups.google.com/a/ccadb.org/d/msgid/public/MW4PR17MB47290848C0FE089BD12FA77AAA002%40MW4PR17MB4729.namprd17.prod.outlook.com?utm_medium=email&utm_source=footer>
>>>> .
