On Thursday, January 11, 2018 at 4:29:09 PM UTC-6, [email protected] wrote:
> On Thursday, January 11, 2018 at 3:36:50 PM UTC-6, Ryan Sleevi wrote:
> > On Wed, Jan 10, 2018 at 4:33 AM, josh--- via dev-security-policy <
> > [email protected]> wrote:
> >
> > > At approximately 5 p.m. Pacific time on January 9, 2018, we received a
> > > report from Frans Rosén of Detectify outlining a method of exploiting some
> > > shared hosting infrastructures to obtain certificates for domains he did
> > > not control, by making use of the ACME TLS-SNI-01 challenge type. We
> > > quickly confirmed the issue and mitigated it by entirely disabling
> > > TLS-SNI-01 validation in Let’s Encrypt. We’re grateful to Frans for
> > > finding this issue and reporting it to us.
> > >
> > > We’d like to describe the issue and our plans for possibly re-enabling
> > > TLS-SNI-01 support.
> > >
> > > Problem Summary
> > >
> > > In the ACME protocol’s TLS-SNI-01 challenge, the ACME server (the CA)
> > > validates a domain name by generating a random token and communicating it
> > > to the ACME client. The ACME client uses that token to create a
> > > self-signed certificate with a specific, invalid hostname (for example,
> > > 773c7d.13445a.acme.invalid), and configures the web server on the domain
> > > name being validated to serve that certificate. The ACME server then looks
> > > up the domain name’s IP address, initiates a TLS connection, and sends the
> > > specific .acme.invalid hostname in the SNI extension. If the response is a
> > > self-signed certificate containing that hostname, the ACME client is
> > > considered to be in control of the domain name, and will be allowed to
> > > issue certificates for it.
> > >
> > > However, Frans noticed that at least two large hosting providers combine
> > > two properties that together violate the assumptions behind TLS-SNI:
> > >
> > > * Many users are hosted on the same IP address, and
> > > * Users have the ability to upload certificates for arbitrary names
> > >   without proving domain control.
> > >
> > > When both are true of a hosting provider, an attack is possible. Suppose
> > > example.com’s DNS is pointed at the same shared hosting IP address as a
> > > site controlled by the attacker. The attacker can run an ACME client to
> > > get a TLS-SNI-01 challenge, then install their .acme.invalid certificate
> > > on the hosting provider. When the ACME server looks up example.com, it will
> > > connect to the hosting provider’s IP address and use SNI to request the
> > > .acme.invalid hostname. The hosting provider will serve the certificate
> > > uploaded by the attacker. The ACME server will then consider the
> > > attacker’s ACME client authorized to issue certificates for example.com,
> > > and be willing to issue a certificate for example.com even though the
> > > attacker doesn’t actually control it.
> > >
> > > This issue only affects domain names that use hosting providers with the
> > > above combination of properties. It is independent of whether the hosting
> > > provider itself acts as an ACME client.
> > >
> > > Our Plans
> > >
> > > Shortly after the issue was reported, we disabled TLS-SNI-01 in Let’s
> > > Encrypt. However, a large number of people and organizations use the
> > > TLS-SNI-01 challenge type to get certificates. It’s important that we
> > > restore service if possible, though we will only do so if we’re confident
> > > that the TLS-SNI-01 challenge type is sufficiently secure.
> > >
> > > At this time, we believe that the issue can be addressed by having certain
> > > services providers implement stronger controls for domains hosted on their
> > > infrastructure.
> > > We have been in touch with the providers we know to be
> > > affected, and mitigations will start being deployed for their systems
> > > shortly.
> > >
> > > Over the next 48 hours we will be building a list of vulnerable providers
> > > and their associated IP addresses. Our tentative plan, once the list is
> > > completed, is to re-enable the TLS-SNI-01 challenge type with vulnerable
> > > providers blocked from using it.
> > >
> > > We’re also going to be soliciting feedback on our plans from our
> > > community, partners and other PKI stakeholders prior to re-enabling the
> > > TLS-SNI-01 challenge. There is a lot to consider here and we’re looking
> > > forward to feedback.
> > >
> > > We will post more information and details as our plans progress.
> > >
> >
> > (Wearing a Google Chrome hat on behalf of our root store policy)
> >
> > Josh,
> >
> > Thanks for bringing this rapidly to the attention of the broader community
> > and proactively reaching out to root programs.
> >
> > As framing to the discussion, we still believe TLS-SNI is fully permitted
> > by the Baseline Requirements, which, while not ideal, still permits
> > issuance using this method. As such, the 'root' cause is that the Baseline
> > Requirements permit methods that are less secure than desired, and the
> > discussion that follows is now around what steps to take - as CAs, as Root
> > Programs, for site operators, and for the CA/Browser Forum.
> >
> > When faced with a vulnerable validation method that is permitted, it's
> > always a challenge to balance the need for security - for sites and users -
> > with the risk of compatibility and breakage from the removal of such a
> > method. Fundamentally, the issues you raise call into question the level of
> > assurance of 3.2.2.4.9 and 3.2.2.4.10 in the Baseline Requirements, and are
> > not limited to TLS-SNI, and potentially affects every CA using these
> > methods.
> >
> > When evaluating these methods, and their risks, compared to, say, the
> > also-weak 3.2.2.4.1 and 3.2.2.4.5 discussions ongoing with the CA/Browser
> > Forum, a few key distinctions, although non-exhaustive, apply and are
> > factored in to our response and proposal here:
> >
> > - The average lifetime of certificates using these methods, across CAs,
> > compared to 3.2.2.4.1/3.2.2.4.5, is significantly shorter - very close to
> > the 90 days that Let's Encrypt uses, based on the available information we
> > have. The fact that so many of these certificates are short lived creates a
> > situation in where there's simultaneously more risk to the ecosystem to
> > rapidly removing these methods as acceptable (due to the need to
> > obtain/renew certificate), while there's also much less risk in allowing
> > this method to continue to be used for a limited time, due to the fact that
> > certificates that could be obtained by exploiting this will expire much
> > sooner than the 2-3 years that many other certificates are issued with.
> > That is, the security risk of a bad validation that lives for 3 years is
> > much greater than the risk of a bad validation that lives for 90 days, and
> > the fact that the badness is only valid for 90 days means that it's easier
> > to allow it to more gracefully shut down than potentially accepting that
> > implied risk for years.
> >
> > - The ease of which alternative methods exist. Methods that are manual are
> > substantially easier to remove quickly, as alternative manual processes can
> > also be used during the human-to-human interaction, while methods that are
> > highly automated conversely create greater challenges, due to the need to
> > update client software to whatever new automated methods may be used.
> > While 3.2.2.4.1 and 3.2.2.4.5 are highly human-driven methods, methods like
> > 3.2.2.4.9 and .10 are designed for automation - and why we were supportive
> > of their addition - but also mean that any mitigations will necessarily
> > face ecosystem challenges, much like deploying new versions of TLS or
> > deprecating old ones.
> >
> > - The ease of which alternative automated methods can be used. As automated
> > methods are generally designed around integrated systems and certain design
> > constraints, it's not always possible to move to an equivalently
> > automatable method (as it is with manual methods), and it may be that no
> > equivalent automated method exists to fill the design niche. If that design
> > niche is a substantial one for clients, and enables otherwise unautomatable
> > systems, it can pose greater risk in prematurely removing it. Specific
> > applications of the .9 and .10 methods, such as ACME's TLS-SNI, occupy an
> > important niche, similar to the 3.2.2.4.6 method and ACME's HTTP-01 method,
> > provide a level of automation for systems not directly integrated with DNS,
> > and while that means they must be particularly attentive to the security
> > risks that come from that, done correctly, they can provide a greater path
> > towards security.
> >
> > - Compared to 3.2.2.4.1 and 3.2.2.4.5, specific applications of 3.2.2.4.9
> > and 3.2.2.4.10 can be evaluated against possible mitigations for the risk,
> > both short- and long-term, and steps in which site operators can take to
> > affirmatively protect themselves offer better assurances than those that
> > rely entirely on the CA's good behaviour. As you call out, the specific
> > risks of TLS-SNI are limited to shared providers (not individual users)
> > that meet certain conditions, and these shared providers can already take
> > existing steps to minimize the immediate risk, such as blocking the use of
> > certificates or SNI negotiations that contain the '.invalid' TLD.
> > While this is not an ideal long-term solution, by any means, it allows us to
> > frame both the immediate and specific risks and the ways to reduce that.
> >
> > For the sake of brevity, I'll end my comparisons there, but hopefully
> > highlights some of the factors we've considered in our response to your
> > proposal.
> >
> > Given the risks identified to 3.2.2.4.9 and 3.2.2.4.10, we think it would
> > be best for CAs using these Baseline Requirements-acceptable methods of
> > validation to begin immediately transitioning away from them, with the goal
> > of either removing them entirely from the Baseline Requirements, or
> > identifying ways in which .9 and .10 can be better specified to mitigate
> > such risks. That said, given the potential risks to the ecosystem,
> > particularly those with pre-existing short-lived certificates, we think
> > that, provided that the new certificates are valid for 90 days or less,
> > we're open to allowing the specific TLS-SNI methods identified by the ACME
> > specification to continue to be used for a limited time, while the broader
> > community works to identify potential mitigations (if possible) or
> > transition away from these methods.
> >
> > While we don't think the current status quo represents a viable long-term
> > solution, given that the ACME TLS-SNI methods have been broadly reviewed
> > within the IETF, that the risks apply to a limited subset of specific
> > infrastructures, that mitigations are possible for these infrastructures to
> > deploy, that Let's Encrypt is actively working with the community to
> > identify, and ideally, share, those that haven't or cannot deploy such
> > mitigations, and all of the other items previously mentioned, we think this
> > represents an appropriate short-term balance.
> >
> > If and as new facts become available, it may be necessary to revisit this.
> > We may have overlooked additional risks, or failed to consider mitigating
> > factors.
> > Further, this response is contextualized in the application of
> > ACME's TLS-SNI methods for validation, and such a response may not be
> > appropriate for other forms of validations within the framework of
> > 3.2.2.4.9 and 3.2.2.4.10. Similarly, this response doesn't apply to
> > certificates that may be valid for longer periods, as they may present
> > substantially greater risk to making effective improvements to or an
> > orderly transition away from these methods.
> >
> > We look forward to working with other browser vendors, site operators, and
> > the relying community to work out ways to provide an orderly and effective
> > transition to more secure methods - whether that means away from the
> > 3.2.2.4.9/.10 series of domain validations, or to more restrictive forms
> > that are more clearly "opt-in" rather than the explicit "opt-out" proposed
> > (of 'blacklisting .invalid').
> >
> > We're also curious if we've overlooked salient details in our response, and
> > thus welcome feedback from Let's Encrypt, other CAs utilizing these
> > validation methods (both TLS-SNI and 3.2.2.4.9 and 3.2.2.4.10), and the
> > broader community as to our proposed next steps. Please consider this a
> > draft response, and we look forward to future updates regarding proposed
> > next steps.
>
> We have published an update on our plans for TLS-SNI:
>
> https://community.letsencrypt.org/t/2018-01-11-update-regarding-acme-tls-sni-and-shared-hosting-infrastructure/50188
>
> The short summary is that we do not plan to generally re-enable TLS-SNI
> validation, but we will introduce various forms of whitelists to limit impact
> during our transition away from TLS-SNI.
>
> Thanks to everyone for the feedback on this thread already. Let us know if
> you have any questions or concerns.

Another update. The main news is that we have deployed patches to our CA that
allow TLS-SNI for both renewals and whitelisted accounts, as we said we would
in our previous update:

https://community.letsencrypt.org/t/tls-sni-challenges-disabled-for-most-new-issuance/50316
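
For anyone who wants the shared-hosting problem from the quoted report in
concrete terms, here is a rough Python sketch. It is a toy model only, not
Let's Encrypt's code and not any hosting provider's. No real TLS is performed,
and SharedHost, upload_cert, serve, tls_sni_01_validates and the
block_invalid_tld flag are all names made up for illustration. The flag models
the short-term mitigation mentioned in the thread (refusing certificates and
SNI negotiations under the '.invalid' TLD).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


def normalize(hostname: str) -> str:
    """Lower-case a hostname and drop any trailing dot."""
    return hostname.lower().rstrip(".")


def is_invalid_tld(hostname: str) -> bool:
    """True if the name is under the reserved '.invalid' TLD."""
    name = normalize(hostname)
    return name == "invalid" or name.endswith(".invalid")


@dataclass
class SharedHost:
    """Toy model of one shared-hosting IP address serving many tenants."""

    # Hypothetical switch modelling the '.invalid' blocking mitigation.
    block_invalid_tld: bool = False
    # SNI hostname -> tenant that uploaded a certificate for that name.
    certs: Dict[str, str] = field(default_factory=dict)

    def upload_cert(self, tenant: str, hostname: str) -> bool:
        """Accept a certificate for any name, with no domain-control check."""
        if self.block_invalid_tld and is_invalid_tld(hostname):
            return False
        self.certs[normalize(hostname)] = tenant
        return True

    def serve(self, sni: str) -> Optional[str]:
        """Return the hostname in the certificate served for this SNI, if any."""
        if self.block_invalid_tld and is_invalid_tld(sni):
            return None
        name = normalize(sni)
        return name if name in self.certs else None


def tls_sni_01_validates(host: SharedHost, challenge_name: str) -> bool:
    """CA side: send challenge_name in SNI to the IP that the target domain
    resolves to, and accept if the served certificate contains that name."""
    return host.serve(challenge_name) == normalize(challenge_name)


# example.com resolves to the shared IP, so the CA's probe lands on this host
# no matter which tenant uploaded the challenge certificate.
challenge = "773c7d.13445a.acme.invalid"

vulnerable = SharedHost()
vulnerable.upload_cert("attacker", challenge)
print(tls_sni_01_validates(vulnerable, challenge))  # True: attacker passes

mitigated = SharedHost(block_invalid_tld=True)
mitigated.upload_cert("attacker", challenge)
print(tls_sni_01_validates(mitigated, challenge))  # False: upload refused
```

Note that the model never checks which tenant owns example.com. That is the
whole point: the validation only learns that somebody on that IP address can
serve the challenge certificate.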

