Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
Please also put this certificate on that list: https://crt.sh/?id=181538497&opt=cablint,x509lint

Best Regards,
Jozsef

___ dev-security-policy mailing list dev-security-policy@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-security-policy
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Thu, Mar 15, 2018 at 12:22 PM, Tom via dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:
> Should another bug be opened for the certificate issued by IdenTrust with apparently the same encoding problem?

Yes - this is bug 1446121 (https://bugzilla.mozilla.org/show_bug.cgi?id=1446121) https://crt.sh/?id=8373036&opt=cablint,x509lint

> Does Mozilla expect the revocation of such certificates?

Yes, within 24 hours per BR 4.9.1.1 (9): "The CA is made aware that the Certificate was not issued in accordance with these Requirements or the CA's Certificate Policy or Certification Practice Statement;" Mozilla requires adherence to the BRs, and the BRs require CAs to comply with RFC 5280. https://groups.google.com/d/msg/mozilla.dev.security.policy/wqySoetqUFM/l46gmX0hAwAJ

- Wayne
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On 15/03/2018 at 20:04, Wayne Thayer wrote:
> This incident, and the resulting action to "integrate GlobalSign's certlint and/or zlint into our existing cert-checker pipeline", has been documented in bug 1446080 [1]. This is further proof that pre-issuance TBS certificate linting (either by incorporating existing tools or using a comprehensive set of rules) is a best practice that prevents misissuance. I don't understand why all CAs aren't doing this.
>
> - Wayne
>
> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1446080

Should another bug be opened for the certificate issued by IdenTrust with apparently the same encoding problem? https://crt.sh/?id=8373036&opt=cablint,x509lint

Does Mozilla expect the revocation of such certificates? https://groups.google.com/d/msg/mozilla.dev.security.policy/wqySoetqUFM/l46gmX0hAwAJ
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
This incident, and the resulting action to "integrate GlobalSign's certlint and/or zlint into our existing cert-checker pipeline", has been documented in bug 1446080 [1].

This is further proof that pre-issuance TBS certificate linting (either by incorporating existing tools or using a comprehensive set of rules) is a best practice that prevents misissuance. I don't understand why all CAs aren't doing this.

- Wayne

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1446080
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Tuesday, March 13, 2018 at 4:27:23 PM UTC-6, Matthew Hardeman wrote:
> I thought I recalled a recent case in which a new root/key was declined with the sole unresolved (and unresolvable, save for new key generation, etc.) matter precluding the inclusion being a prior mis-issuance of test certificates, already revoked and disclosed. Perhaps I am mistaken.

I haven't seen this directly addressed. I'm not sure what incident you are referring to, but I'm fairly sure that the mis-issuance that needed new keys was for certificates that were issued for domains that weren't properly validated. In the case under discussion in this thread, all the mis-issued certificates are only mis-issued due to encoding issues.

The certificates are for sub-domains of randomly generated subdomains of aws.radiantlock.org (which, according to whois, is controlled by Let's Encrypt). I presume these domains are created specifically for testing certificate issuance in the production environment in a way that complies with the BRs.

To put it succinctly, the issue you are referring to is about issuing certificates for domains that aren't authorized (whether for testing or not), rather than creating test certificates.

-- Tom Prince
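The test-domain setup described above - random labels under an operator-controlled zone - can be sketched as follows. This is a minimal illustration, not Boulder's actual code; the helper name `testDomain` and its structure are hypothetical.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// testDomain returns a randomly generated subdomain under an
// operator-controlled base zone (e.g. aws.radiantlock.org, per the
// thread). A wildcard test certificate would then cover
// *.<random-label>.<base>, so issuance can be exercised in production
// without touching any third party's namespace.
func testDomain(base string) string {
	b := make([]byte, 8) // 16 hex characters, well under the 63-octet DNS label limit
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b) + "." + base
}

func main() {
	fmt.Println(testDomain("aws.radiantlock.org"))
}
```

Because the zone is controlled by the CA itself, domain-control validation for these names can still follow the BRs - which is the distinction Tom draws above.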
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Tue, Mar 13, 2018 at 6:27 PM Matthew Hardeman wrote:

>>> Another question this incident raised in my mind pertains to the parallel staging and production environment paradigm: If one truly has the 'courage of conviction' of the equivalence of the two environments, why would one not perform all tests in ONLY the staging environment, with no tests and nothing other than production transactions on the production environment? That tests continue to be executed in the production environment while holding to the notion that a fully parallel staging environment is the place for tests seems to signal that confidence in the staging environment is -- in some measure, however small -- limited.
>>
>> That's ... just a bad conclusion, especially for a publicly-trusted CA :)
>
> I certainly agree it's possible that I've reached a bad conclusion there, but I would like to better understand how specifically? Assuming the same input data set and software manipulating said data, two systems should in general execute identically. To the extent that they do not, my initial position would be that a significant failing of change management of operating environment or data set or system level matters has occurred. I would think all of those would be issues of great concern to a CA, if for no other reason than that they should be very very rare.

I get the impression you may not have run complex production systems, especially distributed systems, or spent much time with testing methodology, given statements such as "courage of conviction." No testing system is going to be perfect, and there's a difference between designed redundancy and unnecessary testing. Even if you had 100% code coverage through tests, there are still things that are possible to get wrong - you could test every line of your codebase and still fail to properly handle IDNs, for example - or, as other CAs have shown, ampersands.
It's foolish to think that a staging environment will cover every possible permutation - even if you solved the halting problem, you would still have issues with, say, solar radiation induced bitflips, or RAM heat, or any number of other issues. And yes, these are issues still affecting real systems today, not scary stories we tell our SREs to keep them up at night.

Look at any complex system - avionics, military command-and-control, certificate authorities, modern scalable websites - and you will find systems designed with redundancy throughout, to ensure proper functioning. It is the madness of inexperience to suggest that somehow this redundancy is unnecessary or somehow a black mark - the Sean Hannity approach of "F' it, we'll do it live" is the antithesis of modern and secure design.

The suggestion that this is somehow a sign of insufficient testing or design is, at best, naive, and at worst, detrimental towards discussions of how to improve the ecosystem.
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
> So to clarify I understand this: The same problem was in the staging environment and there were also certificates with illegal encoding issued in staging, but you didn't notice them because no one manually validated them with the crt.sh lint?

That's correct.

> Or are there differences between staging and production?

Yep, there are differences, though of course we try to keep them to a minimum. The most notable is that we don't use trusted keys in staging. That means staging can only submit to test CT logs, and is therefore not picked up by crt.sh.
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Tuesday, March 13, 2018 at 23:51:01 UTC+1 js...@letsencrypt.org wrote:
> Clearly we should have caught this earlier in the process. The changes we have in the pipeline (integrating certlint and/or zlint) would have automatically caught the encoding issue at each stage of the pipeline: in development, in staging, and in production.

So to clarify I understand this: The same problem was in the staging environment and there were also certificates with illegal encoding issued in staging, but you didn't notice them because no one manually validated them with the crt.sh lint? Or are there differences between staging and production?
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Tuesday, March 13, 2018 at 2:02:45 PM UTC-7, Ryan Sleevi wrote:
> I'm hoping that LE can provide more details about the change management process and how, in light of this incident, it may change - both in terms of automated testing and in certificate policy review.

Forgot to reply to this specific part. Our change management process starts with our SDLC, which mandates code review (typically dual code review), unit tests, and where appropriate, integration tests. All unit tests and integration tests are run automatically with every change, and before every deploy. Our operations team checks the automated test status and will not deploy if the tests are broken. Any configuration changes that we plan to apply in staging and production are first added to our automated tests.

Each deploy then spends a period of time in our staging environment, where it is subject to further automated tests: periodic issuance testing, plus performance, availability, and correctness monitoring equivalent to our production environment. This includes running the cert-checker software I mentioned earlier. Typically our deploys spend two days in our staging environment before going live, though that depends on our risk evaluation, and hotfix deploys may spend less time in staging if we have high confidence in their safety. Similarly, any configuration changes are applied to the staging environment before going to production.

For significant changes we do additional manual testing in the staging environment. Generally this testing means checking that the new change was applied as expected, and that no errors were produced. We don't rely on manual testing as a primary way of catching bugs; we automate everything we can.

If the staging deployment or configuration change doesn't show any problems, we continue to production. Production has the same suite of automated live tests as staging. And similar to staging, for significant changes we do additional manual testing.
It was this step that caught the encoding issue, when one of our staff used crt.sh's lint tool to double check the test certificate they issued.

Clearly we should have caught this earlier in the process. The changes we have in the pipeline (integrating certlint and/or zlint) would have automatically caught the encoding issue at each stage of the pipeline: in development, in staging, and in production.
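The pipeline gate described here - run every candidate or test certificate through a battery of lints before the deploy proceeds - can be sketched in a few lines. This is an illustrative outline only: the `lintFunc` type, `runLints`, and `mustParse` are hypothetical names, not Boulder's or zlint's actual API.

```go
package main

import (
	"crypto/x509"
	"fmt"
)

// lintFunc is one check over a DER-encoded certificate; a real pipeline
// would register many of these (e.g. wrappers around certlint/zlint).
type lintFunc func(der []byte) error

// runLints applies every registered lint and collects all failures, so
// a single run reports everything wrong rather than the first problem.
func runLints(der []byte, lints []lintFunc) []error {
	var errs []error
	for _, lint := range lints {
		if err := lint(der); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

// mustParse is the most basic check: the bytes must be a well-formed
// X.509 certificate at all.
func mustParse(der []byte) error {
	if _, err := x509.ParseCertificate(der); err != nil {
		return fmt.Errorf("certificate does not parse: %w", err)
	}
	return nil
}

func main() {
	// An empty SEQUENCE is not a certificate, so the gate reports an
	// error and issuance/deployment would be blocked.
	errs := runLints([]byte{0x30, 0x00}, []lintFunc{mustParse})
	fmt.Println(len(errs))
}
```

Running the same gate in development, staging, and production is what makes "caught at each stage of the pipeline" possible: the checks are code, not a manual step someone must remember.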
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Tue, Mar 13, 2018 at 4:02 PM, Ryan Sleevi wrote:
> On Tue, Mar 13, 2018 at 4:13 PM, Matthew Hardeman via dev-security-policy wrote:
>> I am not at all suggesting consequences for Let's Encrypt, but rather raising a question as to whether that position on new inclusions / renewals is appropriate. If these things can happen in a celebrated best-practices environment, can they really in isolation be cause to reject a new application or a new root from an existing CA?
>
> While I certainly appreciate the comparison, I think it's apples and oranges when we consider both the nature and degree, nor do I think it's fair to suggest "in isolation" is a comparison.

I thought I recalled a recent case in which a new root/key was declined with the sole unresolved (and unresolvable, save for new key generation, etc.) matter precluding the inclusion being a prior mis-issuance of test certificates, already revoked and disclosed. Perhaps I am mistaken.

> I'm sure you can agree that incident response is defined by both the nature and severity of the incident itself, the surrounding ecosystem factors (i.e. was this a well-understood problem), and the detection, response, and disclosure practices that follow. A system that does not implement any checks whatsoever is, I hope, something we can agree is worse than a system that relies on human checks (and virtually indistinguishable from no checks), and that both are worse than a system with incomplete technical checks.

I certainly concur with all of that, which is part of the basis on which I form my own opinion that Let's Encrypt should not suffer any consequence of significance beyond advice along the lines of "make your testing environment and procedures better".
> I do agree with you that I find it challenging with how the staging environment was tested - failure to have robust profile tests in staging, for example, is what ultimately resulted in Turktrust's notable misissuance of unconstrained CA certificates. Similarly, given the wide availability of certificate linting tools - such as ZLint, x509Lint, (AWS's) certlint, and (GlobalSign's) certlint - there's no dearth of availability of open tools and checks. Given the industry push towards integration of these automated tools, it's not entirely clear why LE would invent yet another, but it's also not reasonable to require that LE use something 'off the shelf'.

I'm very interested in how the testing occurs in terms of procedures. I would assume, for example, that no test transaction of any kind would ever be "played" against a production environment unless that same exact test transaction had already been "played" against the staging environment.

With respect to this case, were these wildcard certificates requested and issued against the staging system with materially the same test transaction data, and if so was the encoding incorrect?

If these were not performed against staging, what was the rational basis for executing a new and novel test transaction against the production system first?

If they were performed AND if they did not encode incorrectly, then what was the disparity between the environments which led to this? (The implication being that some sort of change management process needs to be revised to keep the operating environments of staging and production better synchronized.)

If they were performed and were improperly encoded on the staging environment, then one would presume that the erroneous result was missed by the various automated and manual examinations of the results of the tests.
As you note, it's unreasonable to require use of any particular implementation of any particular tool. But insofar as the other tools achieve certain results while the LE-developed tools clearly did not catch this issue, it would appear that LE needs to better test their testing mechanisms. While it may not be necessary for them to incorporate the competing tools in the live issuance pipeline, it would seem advisable that Let's Encrypt pass the output results (the certificates) of tests within their staging environment through these various other testing tools as part of a post-staging-deployment testing phase. It would seem logical to take the best-of-breed tools and stack them up, whether automatically or manually, and waterfall the final output results of a full suite of test scenarios against the post-deployment state of the staging environment, with a view to identifying discrepancies between the LE tool's opinion and the external tools' opinions and reconciling those, rejecting invalid determinations as appropriate.

> I'm hoping that LE can provide more details about the change management process and how, in light of this incident, it may change - both in terms of automated testing and in certificate policy review.
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Tuesday, March 13, 2018 at 2:02:45 PM UTC-7, Ryan Sleevi wrote:
> availability of certificate linting tools - such as ZLint, x509Lint, (AWS's) certlint, and (GlobalSign's) certlint - there's no dearth of availability of open tools and checks. Given the industry push towards integration of these automated tools, it's not entirely clear why LE would invent yet another, but it's also not reasonable to require that LE use something 'off the shelf'.

We are indeed planning to integrate GlobalSign's certlint and/or zlint into our existing cert-checker pipeline rather than build something new. We've already started submitting issues and PRs, in order to give back to the ecosystem:

https://github.com/zmap/zlint/issues/212
https://github.com/zmap/zlint/issues/211
https://github.com/zmap/zlint/issues/210
https://github.com/globalsign/certlint/pull/5

If your question is why we wrote cert-checker rather than use something off-the-shelf: cablint / x509lint weren't available at the time we wrote cert-checker. When they became available we evaluated them for production and/or CI use, but concluded that the complex dependencies and difficulty of productionizing them in our environment outweighed the extra confidence we expected to gain, especially given that our certificate profile at the time was very static.

A system improvement we could have made here would have been to set "deploy cablint or its equivalent" as a blocker for future certificate profile changes. I'll add that to our list of items for remediation.
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Tue, Mar 13, 2018 at 4:13 PM, Matthew Hardeman via dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:
> I am not at all suggesting consequences for Let's Encrypt, but rather raising a question as to whether that position on new inclusions / renewals is appropriate. If these things can happen in a celebrated best-practices environment, can they really in isolation be cause to reject a new application or a new root from an existing CA?

While I certainly appreciate the comparison, I think it's apples and oranges when we consider both the nature and degree, nor do I think it's fair to suggest "in isolation" is a comparison.

I'm sure you can agree that incident response is defined by both the nature and severity of the incident itself, the surrounding ecosystem factors (i.e. was this a well-understood problem), and the detection, response, and disclosure practices that follow. A system that does not implement any checks whatsoever is, I hope, something we can agree is worse than a system that relies on human checks (and virtually indistinguishable from no checks), and that both are worse than a system with incomplete technical checks.

I do agree with you that I find it challenging with how the staging environment was tested - failure to have robust profile tests in staging, for example, is what ultimately resulted in Turktrust's notable misissuance of unconstrained CA certificates. Similarly, given the wide availability of certificate linting tools - such as ZLint, x509Lint, (AWS's) certlint, and (GlobalSign's) certlint - there's no dearth of availability of open tools and checks. Given the industry push towards integration of these automated tools, it's not entirely clear why LE would invent yet another, but it's also not reasonable to require that LE use something 'off the shelf'.
I'm hoping that LE can provide more details about the change management process and how, in light of this incident, it may change - both in terms of automated testing and in certificate policy review.

> Another question this incident raised in my mind pertains to the parallel staging and production environment paradigm: If one truly has the 'courage of conviction' of the equivalence of the two environments, why would one not perform all tests in ONLY the staging environment, with no tests and nothing other than production transactions on the production environment? That tests continue to be executed in the production environment while holding to the notion that a fully parallel staging environment is the place for tests seems to signal that confidence in the staging environment is -- in some measure, however small -- limited.

That's ... just a bad conclusion, especially for a publicly-trusted CA :)
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
The fact that this mis-issuance occurred does raise a question for the community.

For quite some time, it has been repeatedly emphasized that maintaining a non-trusted but otherwise identical staging environment, and practicing all permutations of tests and issuances -- especially involving new functionality -- on that parallel staging infrastructure, is the mechanism by which mis-issuances such as those mentioned in this thread may be avoided within production environments.

Let's Encrypt has been a shining example of best practices up to this point and has enjoyed the attendant minimization of production issues (presumably as a result of exercising said best practices). Despite that, however, either the test cases which resulted in these mis-issuances were not first executed on the staging platform, or they did not result in the mis-issuance there. A reference was made to a Go library error / non-conformance being implicated. Were the builds for staging and production compiled on different releases of Go?

Certainly, I think these particular mis-issuances do not significantly affect the level of trust which should be accorded to ISRG / Let's Encrypt. Having said that, however, it is worth noting that in a fully new and novel PKI infrastructure, it seems likely -- based on recent inclusion / renewal requests -- that such a mis-issuance would recently have resulted in a disqualification of a given root / key with guidance to cut a new root PKI and start the process over.

I am not at all suggesting consequences for Let's Encrypt, but rather raising a question as to whether that position on new inclusions / renewals is appropriate. If these things can happen in a celebrated best-practices environment, can they really in isolation be cause to reject a new application or a new root from an existing CA?
Another question this incident raised in my mind pertains to the parallel staging and production environment paradigm: If one truly has the 'courage of conviction' of the equivalence of the two environments, why would one not perform all tests in ONLY the staging environment, with no tests and nothing other than production transactions on the production environment? That tests continue to be executed in the production environment while holding to the notion that a fully parallel staging environment is the place for tests seems to signal that confidence in the staging environment is -- in some measure, however small -- limited.

On Tue, Mar 13, 2018 at 8:46 AM, josh--- via dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:
> On Tuesday, March 13, 2018 at 3:33:50 AM UTC-5, Tom wrote:
>>> During final tests for the general availability of wildcard certificate support, the Let's Encrypt operations team issued six test wildcard certificates under our publicly trusted root:
>>>
>>> https://crt.sh/?id=353759994
>>> https://crt.sh/?id=353758875
>>> https://crt.sh/?id=353757861
>>> https://crt.sh/?id=353756805
>>> https://crt.sh/?id=353755984
>>> https://crt.sh/?id=353754255
>>
>> Somebody noticed there https://community.letsencrypt.org/t/acmev2-and-wildcard-launch-delay/53654/62 that the certificate of *.api.letsencrypt.org (apparently currently in use), issued by "TrustID Server CA A52" (IdenTrust), seems to have the same problem: https://crt.sh/?id=8373036&opt=cablint,x509lint
>
> I think it's just a coincidence that we got a wildcard cert from IdenTrust a long time ago and it happens to have the same encoding issue that we ran into. I notified IdenTrust in case they haven't fixed the problem since then.
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Tuesday, March 13, 2018 at 3:33:50 AM UTC-5, Tom wrote:
>> During final tests for the general availability of wildcard certificate support, the Let's Encrypt operations team issued six test wildcard certificates under our publicly trusted root:
>>
>> https://crt.sh/?id=353759994
>> https://crt.sh/?id=353758875
>> https://crt.sh/?id=353757861
>> https://crt.sh/?id=353756805
>> https://crt.sh/?id=353755984
>> https://crt.sh/?id=353754255
>
> Somebody noticed there https://community.letsencrypt.org/t/acmev2-and-wildcard-launch-delay/53654/62 that the certificate of *.api.letsencrypt.org (apparently currently in use), issued by "TrustID Server CA A52" (IdenTrust), seems to have the same problem: https://crt.sh/?id=8373036&opt=cablint,x509lint

I think it's just a coincidence that we got a wildcard cert from IdenTrust a long time ago and it happens to have the same encoding issue that we ran into. I notified IdenTrust in case they haven't fixed the problem since then.
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
> During final tests for the general availability of wildcard certificate support, the Let's Encrypt operations team issued six test wildcard certificates under our publicly trusted root:
>
> https://crt.sh/?id=353759994
> https://crt.sh/?id=353758875
> https://crt.sh/?id=353757861
> https://crt.sh/?id=353756805
> https://crt.sh/?id=353755984
> https://crt.sh/?id=353754255

Somebody noticed there https://community.letsencrypt.org/t/acmev2-and-wildcard-launch-delay/53654/62 that the certificate of *.api.letsencrypt.org (apparently currently in use), issued by "TrustID Server CA A52" (IdenTrust), seems to have the same problem: https://crt.sh/?id=8373036&opt=cablint,x509lint
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Mon, Mar 12, 2018 at 11:38 PM jacob.hoffmanandrews--- via dev-security-policy wrote:
> On Monday, March 12, 2018 at 8:22:46 PM UTC-7, Ryan Sleevi wrote:
>> Given that Let's Encrypt has been operating a Staging Endpoint (https://letsencrypt.org/docs/staging-environment/) for issuing wildcards, what controls, if any, existed to examine the certificate profiles prior to being deployed in production? Normally, that would flush these out - through both manual and automated testing, preferably.
>
> We continuously run our cert-checker tool (https://github.com/letsencrypt/boulder/blob/master/cmd/cert-checker/main.go#L196-L261) in both staging and production. Unfortunately, it tests mainly the higher level semantic aspects of certificates rather than the lower level encoding aspects. Clearly we need better coverage on encoding issues. We expect to get that from integrating more and better linters into both our CI testing framework and our staging and production environments. We will also review the existing controls in our cert-checker tool.
>
>> Golang's ASN.1 library is somewhat lax, largely in part to both public and enterprise CAs' storied history of misencodings.
>
> Agreed that Go's asn1 package is lax on parsing, but I don't think that it aims to be lax on encoding; for instance, the mis-encoding of asterisks in PrintableStrings was considered a bug worth fixing.
>
>> What examinations, if any, will Let's Encrypt be doing for other classes of potential encoding issues? Has this caused any changes in how Let's Encrypt will construct TBSCertificates, or review of that code, beyond the introduction of additional linting?
>
> We will re-review the code we use to generate TBSCertificates with an eye towards encoding issues, thanks for suggesting it. If there are any broad classes of encoding issues you think are particularly worth watching out for, that could help guide our analysis.
Well, you've already run into one of the common ones I'd seen in the past - more commonly with older OpenSSL-based bespoke/enterprise CAs (due to long-since fixed defaults, but nobody upgrading).

Encoding of INTEGERs is another frequent source of pain - minimum length encoding, ensuring positive numbers - but given the Go ASN.1 package author's hatred of that, I would be surprised.

Reordering of SEQUENCEs has been an issue for at least two wholly independent CA software stacks when integrating CT support; at least one I suspect is due to using a HashMap that has non-guaranteed ordering semantics / iteration order changing between runs and releases. This seems relevant to Go, given its map designs.

SET encodings not being sorted according to their values when encoding. This would manifest in DNs, although I don't believe Boulder supports equivalent RDNs/AVAs.

Explicit encoding of DEFAULT values, most commonly basicConstraints. This issue most commonly crops up when folks convert ASN.1 schemas to internal templates by hand, rather than using compilers - which is something applicable to Go.

Not enforcing size constraints - on strings or sequences. Similar to the above, many folks forget to convert the restrictions when converting by hand.

Improper encoding of parameter attributes for signature and SPKI algorithms - especially RSA. This is due to the "ANY DEFINED BY" approach and folks hand-rolling, or not closely reading the specs. This is more high-level, but derived from the schema flexibility.

Variable encoding of string types between Subject/Issuer or Issuer/NameConstraints. This is more of a quasi-gray area - there are defined semantics for this, but few get it right. This is more high-level, but derived from the schema flexibility.

Not realizing DNSName, URI, and rfc822Name nameConstraints have different semantic rules - this is more high-level than encoding, but within that set.
certlint/cablint catches many of these, in large part through using an ASN.1 schema compiler (asn1c) rather than hand-rolling. Yet even it has had some encoding issues in the past.

>> Also, is this the correct timestamp? For example, examining https://crt.sh/?id=353754255&opt=ocsp shows an issuance time of Not Before: Mar 12 22:18:30 2018 GMT and a revocation time of 2018-03-12 23:58:10 UTC, but you stated your alerting time was 2018-03-13 00:43:00 UTC. I'm curious if that's a bug in the display of crt.sh, a typo in your timezone computation (considering the recent daylight saving adjustments in the US), a deliberate choice to put revocation somewhere between those dates (which is semantically valid, but curious), or perhaps something else.
>
> I believe this was a timezone computation error. By my reading of the logs, our alerting time was 2018-03-12 23:43:00 UTC, which agrees with your hypothesis about the recent timezone change (DST) leading to a mistake in calculating UTC times.
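The INTEGER rules listed above - minimal length and, for serial numbers, positivity - are mechanical enough to check directly on DER content octets. The following is an illustrative validator, not taken from any CA codebase; the function name is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// checkDERIntegerContents validates the content octets of a DER INTEGER
// against two of the pitfalls discussed above: minimal-length encoding
// (no redundant leading octet) and, as required for certificate serial
// numbers, positivity.
func checkDERIntegerContents(b []byte) error {
	if len(b) == 0 {
		return errors.New("INTEGER must have at least one content octet")
	}
	if b[0]&0x80 != 0 {
		return errors.New("negative INTEGER: serial numbers must be positive")
	}
	if len(b) > 1 && b[0] == 0x00 && b[1]&0x80 == 0 {
		// A leading 0x00 is only allowed when it is needed to keep the
		// next octet's high bit from flipping the sign.
		return errors.New("non-minimal encoding: redundant leading 0x00")
	}
	return nil
}

func main() {
	fmt.Println(checkDERIntegerContents([]byte{0x00, 0x7F})) // redundant leading zero: rejected
	fmt.Println(checkDERIntegerContents([]byte{0x00, 0xFF})) // leading zero required here: accepted
}
```

Go's encoding/asn1 marshaller produces minimal encodings for big.Int values; a check like this only matters when templates are assembled by hand, which is exactly the failure mode Ryan describes.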
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Monday, March 12, 2018 at 8:27:06 PM UTC-7, Ryan Sleevi wrote:

> Also, is this the correct timestamp? For example, examining
> https://crt.sh/?id=353754255&opt=ocsp
> shows an issuance time of Not Before: Mar 12 22:18:30 2018 GMT and a
> revocation time of 2018-03-12 23:58:10 UTC, but you stated your alerting
> time was 2018-03-13 00:43:00 UTC. I'm curious if that's a bug in the
> display of crt.sh, a typo in your timezone computation (considering the
> recent daylight saving adjustments in the US), a deliberate choice to put
> revocation somewhere between those dates (which is semantically valid, but
> curious), or perhaps something else.

Adding a little more detail and precision here: Let's Encrypt backdates certificates by one hour, so "Not Before: Mar 12 22:18:30 2018 GMT" indicates an issuance time of 23:18:30 UTC.

Also, you may notice that one of the certificates was actually revoked at 23:30:33, before we became aware of the problem. This was done as part of our regular deployment testing, to ensure that revocation was working properly.

___ dev-security-policy mailing list dev-security-policy@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-security-policy
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Monday, March 12, 2018 at 8:22:46 PM UTC-7, Ryan Sleevi wrote:

> Given that Let's Encrypt has been operating a Staging Endpoint (
> https://letsencrypt.org/docs/staging-environment/ ) for issuing wildcards,
> what controls, if any, existed to examine the certificate profiles prior to
> being deployed in production? Normally, that would flush these out -
> through both manual and automated testing, preferably.

We continuously run our cert-checker tool (https://github.com/letsencrypt/boulder/blob/master/cmd/cert-checker/main.go#L196-L261) in both staging and production. Unfortunately, it tests mainly the higher-level semantic aspects of certificates rather than the lower-level encoding aspects. Clearly we need better coverage of encoding issues. We expect to get that from integrating more and better linters into both our CI testing framework and our staging and production environments. We will also review the existing controls in our cert-checker tool.

> Golang's ASN.1 library is somewhat lax, largely in part to both public and
> enterprise CAs' storied history of misencodings.

Agreed that Go's asn1 package is lax on parsing, but I don't think that it aims to be lax on encoding; for instance, the mis-encoding of asterisks in PrintableStrings was considered a bug worth fixing.

> What examinations, if any, will Let's Encrypt be doing for other classes
> of potential encoding issues? Has this caused any changes in how Let's
> Encrypt will construct TBSCertificates, or review of that code, beyond
> the introduction of additional linting?

We will re-review the code we use to generate TBSCertificates with an eye towards encoding issues - thanks for suggesting it. If there are any broad classes of encoding issues you think are particularly worth watching out for, that could help guide our analysis.

> Also, is this the correct timestamp? For example, examining
> https://crt.sh/?id=353754255&opt=ocsp
> shows an issuance time of Not Before: Mar 12 22:18:30 2018 GMT and a
> revocation time of 2018-03-12 23:58:10 UTC, but you stated your alerting
> time was 2018-03-13 00:43:00 UTC. I'm curious if that's a bug in the
> display of crt.sh, a typo in your timezone computation (considering the
> recent daylight saving adjustments in the US), a deliberate choice to put
> revocation somewhere between those dates (which is semantically valid, but
> curious), or perhaps something else.

I believe this was a timezone computation error. By my reading of the logs, our alerting time was 2018-03-13 23:43:00 UTC, which agrees with your hypothesis about the recent timezone change (DST) leading to a mistake in calculating UTC times.
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Mon, Mar 12, 2018 at 11:22 PM, Ryan Sleevi wrote:

> On Mon, Mar 12, 2018 at 10:35 PM, josh--- via dev-security-policy <
> dev-security-policy@lists.mozilla.org> wrote:
>
>> During final tests for the general availability of wildcard certificate
>> support, the Let's Encrypt operations team issued six test wildcard
>> certificates under our publicly trusted root:
>>
>> https://crt.sh/?id=353759994
>> https://crt.sh/?id=353758875
>> https://crt.sh/?id=353757861
>> https://crt.sh/?id=353756805
>> https://crt.sh/?id=353755984
>> https://crt.sh/?id=353754255
>>
>> These certificates contain a subject common name that includes a “*.”
>> label encoded as an ASN.1 PrintableString, which does not allow the
>> asterisk character, violating RFC 5280.
>>
>> We became aware of the problem on 2018-03-13 at 00:43 UTC via the linter
>> flagging in crt.sh [1].
>
> Also, is this the correct timestamp? For example, examining
> https://crt.sh/?id=353754255&opt=ocsp shows an issuance time of Not
> Before: Mar 12 22:18:30 2018 GMT and a revocation time of 2018-03-12
> 23:58:10 UTC, but you stated your alerting time was 2018-03-13 00:43:00
> UTC. I'm curious if that's a bug in the display of crt.sh, a typo in your
> timezone computation (considering the recent daylight saving adjustments
> in the US), a deliberate choice to put revocation somewhere between those
> dates (which is semantically valid, but curious), or perhaps something
> else.
>
>> All six certificates have been revoked.
>>
>> The root cause of the problem is a Go language bug [2] which has been
>> resolved in Go v1.10 [3], which we were already planning to deploy soon.
>> We will resolve the issue by upgrading to Go v1.10 before proceeding
>> with our wildcard certificate launch plans.
>>
>> We employ a robust testing infrastructure but there is always room for
>> improvement, and sometimes bugs slip through our pre-production tests.
>> We’re fortunate that the PKI community has produced some great testing
>> tools that sometimes catch things we don’t. In response to this incident
>> we are planning to integrate additional tools into our testing
>> infrastructure and improve our test coverage of multiple Go versions.
>>
>> [1] https://crt.sh/
>> [2] https://github.com/golang/go/commit/3b186db7b4a5cc510e71f90682732eba3df72fd3
>> [3] https://golang.org/doc/go1.10#encoding/asn1
>
> Given that Let's Encrypt has been operating a Staging Endpoint (
> https://letsencrypt.org/docs/staging-environment/ ) for issuing
> wildcards, what controls, if any, existed to examine the certificate
> profiles prior to being deployed in production? Normally, that would
> flush these out - through both manual and automated testing, preferably.
>
> Given that Let's Encrypt is running on an open-source CA (Boulder), this
> offers a unique opportunity to highlight where the controls and checks
> are in place, particularly for commonNames. RFC 5280 has other
> restrictions that have tripped CAs up, such as exclusively using
> PrintableString/UTF8String for DirectoryString types (except for
> backwards compatibility, which would not apply here), or length
> restrictions (such as 64 characters, per the ASN.1 schema); it would be
> useful to comprehensively review these controls.
>
> Golang's ASN.1 library is somewhat lax, due in large part to both public
> and enterprise CAs' storied history of misencodings. What examinations,
> if any, will Let's Encrypt be doing for other classes of potential
> encoding issues? Has this caused any changes in how Let's Encrypt will
> construct TBSCertificates, or review of that code, beyond the
> introduction of additional linting?
Re: 2018.03.12 Let's Encrypt Wildcard Certificate Encoding Issue
On Mon, Mar 12, 2018 at 10:35 PM, josh--- via dev-security-policy < dev-security-policy@lists.mozilla.org> wrote:

> During final tests for the general availability of wildcard certificate
> support, the Let's Encrypt operations team issued six test wildcard
> certificates under our publicly trusted root:
>
> https://crt.sh/?id=353759994
> https://crt.sh/?id=353758875
> https://crt.sh/?id=353757861
> https://crt.sh/?id=353756805
> https://crt.sh/?id=353755984
> https://crt.sh/?id=353754255
>
> These certificates contain a subject common name that includes a “*.”
> label encoded as an ASN.1 PrintableString, which does not allow the
> asterisk character, violating RFC 5280.
>
> We became aware of the problem on 2018-03-13 at 00:43 UTC via the linter
> flagging in crt.sh [1]. All six certificates have been revoked.
>
> The root cause of the problem is a Go language bug [2] which has been
> resolved in Go v1.10 [3], which we were already planning to deploy soon.
> We will resolve the issue by upgrading to Go v1.10 before proceeding with
> our wildcard certificate launch plans.
>
> We employ a robust testing infrastructure but there is always room for
> improvement, and sometimes bugs slip through our pre-production tests.
> We’re fortunate that the PKI community has produced some great testing
> tools that sometimes catch things we don’t. In response to this incident
> we are planning to integrate additional tools into our testing
> infrastructure and improve our test coverage of multiple Go versions.
>
> [1] https://crt.sh/
> [2] https://github.com/golang/go/commit/3b186db7b4a5cc510e71f90682732eba3df72fd3
> [3] https://golang.org/doc/go1.10#encoding/asn1

Given that Let's Encrypt has been operating a Staging Endpoint ( https://letsencrypt.org/docs/staging-environment/ ) for issuing wildcards, what controls, if any, existed to examine the certificate profiles prior to being deployed in production? Normally, that would flush these out - through both manual and automated testing, preferably.

Given that Let's Encrypt is running on an open-source CA (Boulder), this offers a unique opportunity to highlight where the controls and checks are in place, particularly for commonNames. RFC 5280 has other restrictions that have tripped CAs up, such as exclusively using PrintableString/UTF8String for DirectoryString types (except for backwards compatibility, which would not apply here), or length restrictions (such as 64 characters, per the ASN.1 schema); it would be useful to comprehensively review these controls.

Golang's ASN.1 library is somewhat lax, due in large part to both public and enterprise CAs' storied history of misencodings. What examinations, if any, will Let's Encrypt be doing for other classes of potential encoding issues? Has this caused any changes in how Let's Encrypt will construct TBSCertificates, or review of that code, beyond the introduction of additional linting?