Re: Google Trust Services - Minor SCT issue disclosure
The code at issue evolved as CT requirements changed. What started off as a very simple conditional grew into a more complex if / else if block with somewhat complicated logic and inline checks. As part of the fix, we simplified the conditionals and refactored the inline checks to make use of clear IsExternallyOperated() and IsGoogleOperated() functions. The end result is a much more readable set of logic that is easier to test, and we expanded test coverage.

I think the big lesson for the community is that it would have been better to have refactored earlier rather than evolving the code to the point where it became more complicated than it needed to be.

On Thu, Aug 23, 2018 at 9:40 AM Ryan Sleevi wrote:
> On Thu, Aug 23, 2018 at 8:50 AM, Andy Warner via dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:
>>
>> * NOTE: The bug was due to an 'if/else' chain fall through. The code in
>> question has been refactored to be simpler and more readable.
>
> Andy,
>
> It might be good for the community if you could describe the processes
> before and after the change, so that other CAs can help prevent similar
> issues with their own embedding systems.

___
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy
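To illustrate the kind of refactoring described in this message, here is a minimal Python sketch of SCT selection built on small predicate functions. It is not the actual Google Trust Services code, which has not been published. The `Sct` type, the snake_case names `is_google_operated` and `is_externally_operated` (adapted from the function names mentioned above), the log names, and the decision to raise an error instead of falling through to a default are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sct:
    """Hypothetical stand-in for an SCT: just the log's name and operator."""
    log_name: str
    operator: str  # e.g. "Google" or some other log operator


def is_google_operated(sct: Sct) -> bool:
    return sct.operator == "Google"


def is_externally_operated(sct: Sct) -> bool:
    return not is_google_operated(sct)


def select_scts_to_embed(candidates: List[Sct]) -> List[Sct]:
    """Pick one Google-operated and one externally operated SCT.

    Raises ValueError when either kind is missing, so there is no
    branch that can silently return two SCTs of the same kind.
    """
    google = next((s for s in candidates if is_google_operated(s)), None)
    external = next((s for s in candidates if is_externally_operated(s)), None)
    if google is None or external is None:
        raise ValueError("need at least one Google and one non-Google SCT")
    return [google, external]


# Four SCTs obtained at issuance (hypothetical log names), two embedded.
candidates = [
    Sct("google-a", "Google"),
    Sct("google-b", "Google"),
    Sct("external-a", "OperatorX"),
    Sct("external-b", "OperatorY"),
]
chosen = select_scts_to_embed(candidates)
print([s.log_name for s in chosen])  # ['google-a', 'external-a']
```

Keeping each rule in its own named predicate means a test can exercise the predicates and the selection function separately, which is harder to do with checks written inline in a long conditional chain.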
Re: Google Trust Services - Minor SCT issue disclosure
On Thu, Aug 23, 2018 at 8:50 AM, Andy Warner via dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:
>
> * NOTE: The bug was due to an 'if/else' chain fall through. The code in
> question has been refactored to be simpler and more readable.

Andy,

It might be good for the community if you could describe the processes before and after the change, so that other CAs can help prevent similar issues with their own embedding systems.
Re: Google Trust Services - Minor SCT issue disclosure
Google provides SCTs via embedding and during SSL handshaking, depending on the certificate and how it is served. In this case, all of the affected certs used embedded SCTs, and the issue was the selection of which SCTs to include: we submit to more CT logs than required, but only embed the required number of SCTs to keep cert sizes as small as possible. These certs were submitted to 4 CT logs, 2 Google and 2 non-Google, but the embedded SCTs were only from the 2 Google logs, not one Google and one non-Google. The CA signed 4 correct SCTs and all 4 were submitted to CT logs; the problem was the embedding logic for the SCTs.

In response to Q1, the logic involved was specific to selection and embedding of SCTs, not part of validation logic, so a related error would not affect validation. An unrelated error in validation logic could of course affect validation, but all CAs have that risk and, like other CAs, we have multiple layers of safeguards on validation logic.

For Q2, we sample certs regularly and make use of proven external linting libraries and our own linting and audit logic. In this case, because the issue was not something normally checked by external tools and the behavior was perfectly fine until the Chrome deadline in April, I can only posit that we would have discovered it fairly quickly via other means. We now have specific checks for this issue and other similar problems we could foresee.

For Q3, we could account for the initial submission fully and understand exactly what happened. Google has rigorous version control and enforcement systems to ensure only properly reviewed and built code can enter production and to reconcile running code against the cut point for an approved release. Our CA systems have additional safeguards on top of those standard tools to further ensure that we have strong knowledge of the pedigree of all code and how it was built and deployed.
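A post-issuance check of the kind mentioned in the answer to Q2 might look roughly like the Python sketch below. It only illustrates the rule discussed in this thread (enough embedded SCTs, with at least one from a Google log and one from a non-Google log). The function name `check_embedded_scts`, the operator labels, and the default `min_count` of 2 are hypothetical; the real required count depends on the certificate and on the browser's current CT policy.

```python
from typing import Iterable, List


def check_embedded_scts(operators: Iterable[str], min_count: int = 2) -> List[str]:
    """Return a list of problems; an empty list means this sketch's checks pass.

    `operators` holds one label per embedded SCT naming the log operator.
    """
    ops = list(operators)
    problems = []
    if len(ops) < min_count:
        problems.append(f"expected at least {min_count} SCTs, found {len(ops)}")
    if not any(op == "Google" for op in ops):
        problems.append("no SCT from a Google-operated log")
    if not any(op != "Google" for op in ops):
        problems.append("no SCT from a non-Google log")
    return problems


# The situation described in this thread: two embedded SCTs, both from Google logs.
print(check_embedded_scts(["Google", "Google"]))
# ['no SCT from a non-Google log']

# The intended outcome: one Google and one non-Google SCT.
print(check_embedded_scts(["Google", "OperatorX"]))
# []
```

Checking both the count and the operator mix matters here because the affected certificates had the right number of SCTs and would have passed a count-only test.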

On Thu, Aug 23, 2018 at 10:55 AM Nick Lamb wrote:
> On Thu, 23 Aug 2018 05:50:05 -0700 (PDT)
> Andy Warner via dev-security-policy wrote:
>
> > May 21st 2018, a new tool for issuing certificates within Google was
> > made available to internal customers. Within hours we started to
> > receive reports that Chrome Canary (v67) with Certificate
> > Transparency checks enabled was showing warnings. A coding error led
> > to the new tool providing Signed Certificate Timestamps (SCTs) from 2
> > Google CT logs instead of one Google and one non-Google log.
>
> Feel free to jump in anywhere I've made a mistake, this might totally
> invalidate some of my questions.
>
> Presumably, since you eventually "fixed" this by asking Subscribers to
> re-issue, the SCTs are baked into a signed certificate, rather than
> provided separately so that the Subscriber can use them with e.g.
> Stapling technologies?
>
> Which means that this "new tool" also involved a Google controlled
> subCA signing these certificates with, as it turns out, the wrong SCTs
> in them. It's not clear to me if the tool and CA are operationally one
> and the same.
>
> Q1: Could a more significant "coding error" in this tool have resulted
> in certificates being mis-issued (for example with SANs that don't
> belong to Google, or lacking mandatory X.509 fields, or without being
> CT logged)? If not please explain why the tool couldn't cause this.
>
> Q2: If this error hadn't caused a negative end-user experience, what
> mechanisms if any do you believe would have brought it to your
> attention and how soon? e.g. does a team sample resulting certificates
> from this tool at some interval? If it samples pre-certificates that
> would not have detected this error, but is worth mentioning.
>
> Q3: Such mistakes are of course inevitable in software development. But
> they could also be introduced maliciously. Were you able to confidently
> identify which specific individual(s) made the relevant change? (I don't
> want names). Are you confident you'd be able to do this even if somehow
> the production tool turned out not to match your revision control
> systems?
>
> Thanks as always for satisfying my curiosity
>
> Nick.
Re: Google Trust Services - Minor SCT issue disclosure
On Thu, 23 Aug 2018 05:50:05 -0700 (PDT)
Andy Warner via dev-security-policy wrote:

> May 21st 2018, a new tool for issuing certificates within Google was
> made available to internal customers. Within hours we started to
> receive reports that Chrome Canary (v67) with Certificate
> Transparency checks enabled was showing warnings. A coding error led
> to the new tool providing Signed Certificate Timestamps (SCTs) from 2
> Google CT logs instead of one Google and one non-Google log.

Feel free to jump in anywhere I've made a mistake, this might totally invalidate some of my questions.

Presumably, since you eventually "fixed" this by asking Subscribers to re-issue, the SCTs are baked into a signed certificate, rather than provided separately so that the Subscriber can use them with e.g. Stapling technologies?

Which means that this "new tool" also involved a Google controlled subCA signing these certificates with, as it turns out, the wrong SCTs in them. It's not clear to me if the tool and CA are operationally one and the same.

Q1: Could a more significant "coding error" in this tool have resulted in certificates being mis-issued (for example with SANs that don't belong to Google, or lacking mandatory X.509 fields, or without being CT logged)? If not please explain why the tool couldn't cause this.

Q2: If this error hadn't caused a negative end-user experience, what mechanisms if any do you believe would have brought it to your attention and how soon? e.g. does a team sample resulting certificates from this tool at some interval? If it samples pre-certificates that would not have detected this error, but is worth mentioning.

Q3: Such mistakes are of course inevitable in software development. But they could also be introduced maliciously. Were you able to confidently identify which specific individual(s) made the relevant change? (I don't want names). Are you confident you'd be able to do this even if somehow the production tool turned out not to match your revision control systems?

Thanks as always for satisfying my curiosity

Nick.
Re: Google Trust Services - Minor SCT issue disclosure
Correct, we do not believe there was a policy violation; we're proactively sharing in the interest of transparency and knowledge sharing.

I believe there is additional information we could share about how we've modified testing to ensure compliance with Chrome and Safari's SCT inclusion rules and have more flexible tests. I want to discuss this with the engineer who implemented the changes to ensure they agree with how I would summarize the changes. Update to follow.

On Thu, Aug 23, 2018 at 8:57 AM Alex Gaynor wrote:
> Hi Andy,
>
> Just so I follow, this is something you're proactively sharing, right? As
> far as I can tell, there's no violation of any Mozilla Root Program rules
> here, just an issue that caused interstitials in Chrome.
>
> Either way, I appreciate your sharing.
>
> You mentioned the issue was due to some overly complex control flow. In
> order to help other CAs out, do you think there are testing methodologies
> that could have helped catch this earlier?
>
> Alex
Re: Google Trust Services - Minor SCT issue disclosure
Hi Andy,

Just so I follow, this is something you're proactively sharing, right? As far as I can tell, there's no violation of any Mozilla Root Program rules here, just an issue that caused interstitials in Chrome.

Either way, I appreciate your sharing.

You mentioned the issue was due to some overly complex control flow. In order to help other CAs out, do you think there are testing methodologies that could have helped catch this earlier?

Alex
Google Trust Services - Minor SCT issue disclosure
Please note, Google wrote this report for internal use immediately after the issue. We intended to post it to m.d.s.p at that time, but securing internal approvals took a while and the posting ended up on the back burner for a bit. It was a minor issue, but we want the community to be aware of it.

Summary:

May 21st 2018, a new tool for issuing certificates within Google was made available to internal customers. Within hours we started to receive reports that Chrome Canary (v67) with Certificate Transparency checks enabled was showing warnings. A coding error led to the new tool providing Signed Certificate Timestamps (SCTs) from 2 Google CT logs instead of one Google and one non-Google log.

* NOTE: Affected certs were logged at issuance to at least 2 Google CT logs and 2 non-Google CT logs. The embedded SCTs for affected certs only provided proofs from Google logs instead of Google and non-Google logs as required by Chrome.

* NOTE: The bug was due to an 'if/else' chain fall through. The code in question has been refactored to be simpler and more readable.

The issue was fully resolved ~14 hours after initial notification. The issue was mitigated within 4 hours. Triage and code fixes happened within 11 hours and it took ~3 hours to deploy the fixed code and confirm the fixed behavior in production. The new code was running in relatively few locations, so deployment was quick compared to some changes in our infrastructure.

Most affected customers responded quickly to communications that they should replace their certificates and revoke the old ones before a given deadline. All certificates that were issued with an SCT set that was not fully compliant were revoked on 2018-06-19 if they had not already been revoked by the customer previously. Most users replaced certificates shortly after notification.

Timeline:

2018-03-22 Bug introduced to codebase.
2018-05-21 Push including bug became available to clients.
2018-05-22 08:05 UTC First user reports that Chrome Canary presents a CT warning for a certificate.
2018-05-22 09:25 UTC Bug filed with initial assessment.
2018-05-22 12:01 UTC Frontend jobs with the bug are taken offline following standard CA procedures.
2018-05-22 15:59 UTC Issue conclusively identified.
2018-05-22 19:07 UTC Fix is submitted.
2018-05-22 21:48 UTC Fix starts to be rolled out.
2018-05-22 22:16 UTC Fix fully deployed and tested on test instances followed by deployment to production. Access to frontends restored.
2018-05-24 Customer communication sent to affected users to ask them to renew their certificates and revoke the old ones.
2018-06-19 The final handful of certificates that had not already been revoked and replaced by users were revoked by the CA.

Findings:

* The operational plan to halt issuance worked as expected and was implemented quickly.
* The problem was quickly found, fully understood and easy to remedy.
* Tests existed, but did not cover this failure case.

Remediation Plan
* Completed
** Message of the Day (MOTD) functionality was added or improved for all issuance systems to make it easier to communicate issues to users when issuance is intentionally paused.
** Test coverage was expanded to ensure that both the quantity and type of SCTs are checked.
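
The approximate durations in the summary can be cross-checked against the timeline. The short Python sketch below uses only timestamps from the timeline above; the event labels and the mapping of each summary phrase to a pair of events are assumptions made for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# Timestamps (UTC) copied from the timeline; the keys are made-up labels.
events = {
    "first_report": "2018-05-22 08:05",
    "frontends_offline": "2018-05-22 12:01",
    "fix_submitted": "2018-05-22 19:07",
    "fix_deployed": "2018-05-22 22:16",
}
times = {name: datetime.strptime(ts, FMT) for name, ts in events.items()}


def elapsed(start: str, end: str) -> tuple:
    """Return (hours, minutes) between two labelled events."""
    minutes = int((times[end] - times[start]).total_seconds() // 60)
    return divmod(minutes, 60)


print(elapsed("first_report", "frontends_offline"))  # (3, 56)  "mitigated within 4 hours"
print(elapsed("first_report", "fix_submitted"))      # (11, 2)  "within 11 hours"
print(elapsed("fix_submitted", "fix_deployed"))      # (3, 9)   "~3 hours to deploy"
print(elapsed("first_report", "fix_deployed"))       # (14, 11) "~14 hours"
```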