On Wed, Nov 18, 2020 at 8:19 AM Nils Amiet via dev-security-policy
<dev-security-policy@lists.mozilla.org> wrote:
> We have carefully read your email, and believe we’ve identified the
> following important points:
>
> 1. Potential feasibility issue due to lack of path building support
>
> 2. Non-robust CRL implementations may lead to security issues
>
> 3. Avoiding financial strain is incompatible with user security.

I wasn't trying to suggest that they are fundamentally incompatible (i.e.
that user security benefits by imposing financial strain), but rather that
the goal is user security, and avoiding financial strain is a "nice to
have", but not an essential, along that path.

> #1 Potential feasibility issue due to lack of path building support
>
> If a relying party does not implement proper path building, the EE
> certificate may not be validated up to the root certificate. It is true
> that the new solution adds some complexity to the certificate hierarchy.
> More specifically, it would turn a linear hierarchy into a non-linear
> one. However, we consider that complexity as still being manageable,
> especially since there are usually few levels in the hierarchy in
> practice. If a CA hierarchy has utilized the Authority Information
> Access extension, the chance that any PKI library can build the path
> seems to be very high.

I'm not sure which PKI libraries you're using, but as with the presumption
of "robust CRL" implementations, "robust AIA" implementations are
unquestionably the minority. While it's true that Chrome, macOS, and
Windows implement AIA checking, a wide variety of popular browsers
(Firefox), operating systems (Android), and libraries (OpenSSL, cURL,
wolfSSL/yaSSL, etc.) do not.

> Even if the path could not be built, this would lead to a fail-secure
> situation where the EE certificate is considered invalid, and it would
> not raise a security issue. That would be different if the EE
> certificate were erroneously considered valid.

This perspective focuses too narrowly on the outcome, without considering
the practical deployment expectations.
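To make the path-building concern concrete, here's a toy Python sketch (hypothetical names; no real X.509 parsing or signature checking): certificates are modeled as (subject, issuer) pairs. A naive "linear" verifier, which takes the first matching issuer at each step, dead-ends on a cross-signed hierarchy, while a backtracking path-discovery verifier finds the alternate path. This is the behavior gap between the non-robust clients described above and the few implementations that do path discovery.

```python
# Toy model (not a real X.509 implementation): a certificate is a
# (subject, issuer) pair; names like "ICA", "OldRoot", "NewRoot" are
# hypothetical and only illustrate a cross-signed hierarchy.

def build_path_linear(leaf, pool, roots):
    """Naive chain walk: take the first matching issuer at each step."""
    chain = [leaf]
    current = leaf
    while current[1] not in roots:
        candidates = [c for c in pool if c[0] == current[1]]
        if not candidates:
            return None  # dead end: fails even though another path exists
        current = candidates[0]
        chain.append(current)
    return chain

def build_path_discovery(leaf, pool, roots):
    """Depth-first path discovery: backtrack over all candidate issuers."""
    def search(chain):
        current = chain[-1]
        if current[1] in roots:
            return chain
        for cand in pool:
            if cand[0] == current[1] and cand not in chain:
                found = search(chain + [cand])
                if found:
                    return found
        return None
    return search([leaf])

# Cross-signed (non-linear) hierarchy: "ICA" was signed by both the
# distrusted "OldRoot" and the trusted "NewRoot".
pool = [("ICA", "OldRoot"), ("ICA", "NewRoot")]
leaf = ("leaf", "ICA")
roots = {"NewRoot"}

print(build_path_linear(leaf, pool, roots))     # None: stuck on OldRoot
print(build_path_discovery(leaf, pool, roots))  # finds the NewRoot path
```

A client with only the linear behavior fails here unless the server operator reorders the served chain by hand, which is exactly the support burden discussed below.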
Failing secure is a desirable property, surely, but there's no question
that desirable properties for CAs also include "does not widely cause
outages" and "does not widely cause our support costs to grow".

Consider, for example, that despite RFC 5280 taking a "fail closed"
approach for nameConstraints (by MUST'ing that the extension be critical),
the CA/B Forum chose to deviate and make it a MAY. The reason for this MAY
was that the lack of support for nameConstraints on legacy macOS (and
OpenSSL) versions meant that if a CA used nameConstraints, it would fail
closed on those systems, and such a failure made it impractical for the CA
to deploy. The choice to make such an extension non-critical was precisely
because "everyone who implements the extension is secure, and everyone who
doesn't implement the extension works, but is not secure". It is
unfortunate-but-seemingly-necessary to take this mindset into
consideration: the choice between hard failure for a number of clients
(but guaranteed security) and less-than-ideal security but practical
usability is often decided in favour of the latter, due to the
socioeconomic considerations of CAs.

Now, as it relates to the above point, the path building complexity would
inevitably cause a serious spike in support overhead, because avoiding
issues would require every server operator to reconfigure their
certificate chains to ensure the "correct" paths were built for such
non-robust clients (for example, of the OpenSSL forks, only LibreSSL
supports path discovery, and only within the past several weeks). As a
practical matter, that is the same "cost" as replacing a certificate, but
with less determinism of behaviour: instead of being able to test "am I
using the wrong cert or the right one", a server operator would need to
consider the entire certification path, and we know they'll get that
wrong.
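The criticality tradeoff above can be sketched in a few lines of toy Python (hypothetical extension and client names; not real X.509 processing). Per RFC 5280, a client rejects a certificate carrying an unrecognized *critical* extension, but silently ignores an unrecognized non-critical one, which is exactly the "works, but is not secure" outcome:

```python
# Toy sketch of RFC 5280 criticality processing (hypothetical names).
# An unrecognized critical extension forces rejection (fail closed);
# an unrecognized non-critical extension is simply skipped.

def validate(cert, recognized_extensions):
    for name, critical in cert["extensions"]:
        if name not in recognized_extensions and critical:
            return "reject"  # fail closed: secure, but an outage
        # unrecognized non-critical extensions are ignored
    return "accept"

constrained_ca = {"extensions": [("nameConstraints", True)]}
constrained_ca_noncrit = {"extensions": [("nameConstraints", False)]}

legacy_client = {"keyUsage", "basicConstraints"}   # no nameConstraints support
modern_client = legacy_client | {"nameConstraints"}

# Critical (RFC 5280's MUST): legacy clients break outright.
print(validate(constrained_ca, legacy_client))          # reject
# Non-critical (the CA/B Forum's MAY): legacy clients keep working,
# but never apply the constraint.
print(validate(constrained_ca_noncrit, legacy_client))  # accept
print(validate(constrained_ca_noncrit, modern_client))  # accept
```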
This is part of why I'm dismissive of the solution; not because it isn't
technically workable, but because it's practically non-viable when
considering the overall set of ecosystem concerns. These sorts of
considerations all factor into the decision making and recommendations for
potential remediation for CAs: trying to consider all of the interests,
but also all of the limitations.

> #2 Non-robust CRL implementations may lead to security issues
>
> Relying parties using applications that don’t do a proper revocation
> check do have a security risk. This security risk is not introduced by
> our proposal but inherent to not implementing core functionality of
> public key infrastructures. The new solution rewards relying parties
> that properly follow the standards. Indeed, a relying party that has a
> robust CRL (or OCSP check) implementation already would not have to
> bear any additional security risk. They would also benefit from the
> point that our solution can be implemented very quickly, and so they
> would quickly mitigate the security issue. On the other hand, relying
> parties with a non-functional revocation mechanism will have to take
> corrective measures. And we can consider it a fair balancing of work
> and risk when the workload and risk is on the actors with poor current
> implementations.
>
> It can be noted that the new solution gives some responsibility to the
> relying parties, whereas the original solution gives responsibility to
> the subscribers, who did nothing wrong about this in the first place.
> The new solution can be considered fairer in this regard.

I think this greatly misunderstands the priority of constituencies here,
in terms of what represents a "fair" solution. Shuffling responsibility
onto relying parties is, ostensibly, an anti-goal we try to minimize.
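To make the revocation-checking asymmetry concrete, here's a toy Python sketch (hypothetical serial numbers; no real CRL parsing). The proposal's security rests entirely on the client actually fetching and honoring the CRL, and most deployed clients either skip the check or soft-fail when the CRL is unreachable:

```python
# Toy sketch of revocation-checking behavior (hypothetical serials).

REVOKED = {0x1001, 0x1002}  # serials listed on the issuer's CRL

def fetch_crl_ok():
    return REVOKED

def fetch_crl_down():
    raise OSError("CRL distribution point unreachable")

def check(serial, fetch_crl, hard_fail):
    """Return 'accept' or 'reject' for a certificate with this serial."""
    try:
        crl = fetch_crl()
    except OSError:
        # Soft-fail (the common deployed behavior) accepts when the CRL
        # cannot be fetched; hard-fail trades availability for security.
        return "reject" if hard_fail else "accept"
    return "reject" if serial in crl else "accept"

print(check(0x1001, fetch_crl_ok, hard_fail=True))     # reject: revoked
print(check(0x1001, fetch_crl_down, hard_fail=False))  # accept: risk lands here
print(check(0x9999, fetch_crl_ok, hard_fail=True))     # accept: not revoked
```

The second call is the case at issue: a revoked certificate accepted by a soft-failing (or non-checking) client, which is the population the proposal leaves exposed.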
As I explained above, the practical impact for server operators is not
actually decreased, but increased, and so the overall "risk" here is now
greater than with the "current" solution.

With respect to revocation and threat models, I'm not sure this thread is
the best medium to get into it, but "core functionality of public key
infrastructure" would arguably be a misstatement. The revocation
functionality reflected in X.509/RFC 5280 is tied to a number of inherent
assumptions about particular PKI designs and capabilities (e.g. the
connectivity of clients to an X.500/LDAP directory, the failure tolerances
of clients relative to the security goals, the number and type of
certificates issued), and it is not, in fact, a generalized solution. PKI
is, at best, a series of tools to be adjusted for different situations,
and those tools are not inherently right, nor is their absence inherently
wrong.

As it relates to the set of clients you're considering, again, this is the
difference between the "real world" of implementations and the idealized
world of the 90s that presumed a directory would co-exist with the PKI
(and serve an inherent function). The Directory doesn't exist, nor do PKI
clients broadly implement HTTP in the first place, let alone LDAP, and so
presumptions about the availability of external data sources to handle
this don't actually hold up. This is where the current approach tries to
be both practical and pragmatic, by working in a manner that complements
the existing design constraints, rather than conflicting with them.

> #3 Avoiding financial strain is incompatible with user security
>
> User security is fundamental for the PKI ecosystem. Our proposal
> increases the risk exposure level only for those relying parties with a
> fundamentally insecure application. Taking into account economic
> constraints in real life, we believe that the increase in security
> should be balanced in terms of cost and effort.
> Such a tradeoff may provide sufficient levels of security while
> maintaining affordability. This also leads to a quicker time to
> implement, which limits the exposure and increases security.

The problem here, again, is that this misunderstands the total system
costs. The preference for a "short-term" solution underestimates the
long-term costs of the approach to the overall ecosystem, and does not
tangibly improve security as a practical matter. That's not to say that,
on paper, it's not appealing, but rather that the practice and
implementation show the paper solution doesn't actually achieve the goal.

You argue that it only increases the risk for "fundamentally insecure"
applications, which bears a resemblance to my remarks around the decision
on nameConstraints and criticality. I appreciate the symmetry, but I think
where this argument breaks down is that, by the definition you've offered
of what constitutes a fundamentally insecure application, virtually all
deployed applications are fundamentally insecure. As such, the solution
doesn't really improve security. Our goal in finding solutions is to find
an appropriate balance, such that security works for the greatest majority
possible. Within that constraint, the current solution works, although
imperfectly, to address more users "where they are now", and to protect
them. This is the benefit of discussing such solutions, because we can
evaluate whether or not they remain practically viable.

> One last thing we wanted to get your feedback on is whether you agree
> that the new solution we proposed ensures that, if SICA and ICA are
> considered revoked, there is no security risk for SICA2.

Because of the framing I gave above regarding the practical realities, I
don't feel comfortable stating that there would be no security risk for
SICA2.
Given the practical costs to migrate to SICA2, and the limited (if any)
benefit it would provide, it seems that investment is better spent on a
rotation that unambiguously avoids the issue.

_______________________________________________
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy