On Wed, Nov 18, 2020 at 8:19 AM Nils Amiet via dev-security-policy <
dev-security-policy@lists.mozilla.org> wrote:

> We have carefully read your email, and believe we’ve identified the
> following
> important points:
>
> 1. Potential feasibility issue due to lack of path building support
>
> 2. Non-robust CRL implementations may lead to security issues
>
> 3. Avoiding financial strain is incompatible with user security.
>

I wasn't trying to suggest that they are fundamentally incompatible (i.e.
user security benefits by imposing financial strain), but rather, the goal
is user security, and avoiding financial strain is a "nice to have", but
not an essential, along that path.


> #1 Potential feasibility issue due to lack of path building support
>
> If a relying party does not implement proper path building, the EE
> certificate
> may not be validated up to the root certificate. It is true that the new
> solution adds some complexity to the certificate hierarchy. More
> specifically
> it would turn a linear hierarchy into a non-linear one. However, we
> consider
> that complexity as still being manageable, especially since there are
> usually
> few levels in the hierarchy in practice. If a CA hierarchy has utilized the
> Authority Information Access extension the chance that any PKI library can
> build the path seems to be very high.
>

I'm not sure which PKI libraries you're using, but as with the presumption
of "robust CRL" implementations, "robust AIA" implementations are
unquestionably the minority. While it's true that Chrome, macOS, and
Windows implement AIA checking, a wide variety of popular browsers
(Firefox), operating systems (Android), and libraries (OpenSSL, cURL,
wolfSSL/yaSSL, etc.) do not.
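
To make the distinction concrete, here is a minimal sketch (using the
Python "cryptography" package; the CA name and URL are purely
illustrative) of the caIssuers pointer that AIA chasing depends on. A
client like Chrome would dereference this URL to fetch the missing
intermediate and continue path building; the clients listed above never
look at it.

```python
# Sketch: how an AIA-chasing client discovers the issuer certificate.
# Uses the Python "cryptography" package; names and the URL are hypothetical.
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import AuthorityInformationAccessOID, ExtensionOID, NameOID

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "example.test")])
now = datetime.datetime.now(datetime.timezone.utc)

cert = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)  # self-signed here purely for illustration
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=1))
    # caIssuers pointer: where a client that implements AIA chasing would
    # fetch the issuing CA certificate to continue building the path.
    .add_extension(
        x509.AuthorityInformationAccess([
            x509.AccessDescription(
                AuthorityInformationAccessOID.CA_ISSUERS,
                x509.UniformResourceIdentifier("http://ca.example.test/issuer.crt"),
            )
        ]),
        critical=False,
    )
    .sign(key, hashes.SHA256())
)

aia = cert.extensions.get_extension_for_oid(
    ExtensionOID.AUTHORITY_INFORMATION_ACCESS
).value
urls = [
    d.access_location.value
    for d in aia
    if d.access_method == AuthorityInformationAccessOID.CA_ISSUERS
]
print(urls)  # a non-AIA-chasing client simply never dereferences these
```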

> Even if the path could not be built, this would lead to a fail-secure
> situation
> where the EE certificate is considered invalid and it would not raise a
> security issue. That would be different if the EE certificate would
> erroneously
> be considered valid.
>

This perspective too narrowly focuses on the outcome, without considering
the practical deployment expectations. Failing secure is a desirable
property, surely, but there's no question that a desirable property for CAs
is "Does not widely cause outages" and "Does not widely cause our support
costs to grow".

Consider, for example, that despite RFC 5280 having a "fail closed" approach
for nameConstraints (by MUST'ing the extension be critical), the CA/B Forum
chose to deviate and make it a MAY. The reason for this MAY was that the
lack of support for nameConstraints on legacy macOS (and OpenSSL) versions
meant that if a CA used nameConstraints, it would fail closed on those
systems, and such a failure made it impractical for the CA to deploy. The
choice to make such an extension non-critical was precisely because
"Everyone who implements the extension is secure, and everyone who doesn't
implement the extension works, but is not secure".
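
For concreteness, the trade-off looks like this in practice (a sketch
using the Python "cryptography" package; the CA name and constrained
domain are purely illustrative, and the certificate is self-signed only
to keep the example short):

```python
# Sketch: a name-constrained intermediate whose nameConstraints extension
# is marked non-critical (the CA/B Forum MAY) instead of critical (the
# RFC 5280 MUST). Clients that understand the extension enforce it; legacy
# clients that don't simply ignore it and keep working - insecurely -
# rather than rejecting the chain outright.
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.x509.oid import ExtensionOID, NameOID

key = ec.generate_private_key(ec.SECP256R1())
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Constrained ICA")])
now = datetime.datetime.now(datetime.timezone.utc)

ica = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)  # self-signed purely for illustration
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=365))
    .add_extension(x509.BasicConstraints(ca=True, path_length=0), critical=True)
    .add_extension(
        x509.NameConstraints(
            permitted_subtrees=[x509.DNSName("example.test")],
            excluded_subtrees=None,
        ),
        critical=False,  # RFC 5280 says MUST be critical; the BRs allow False
    )
    .sign(key, hashes.SHA256())
)

nc = ica.extensions.get_extension_for_oid(ExtensionOID.NAME_CONSTRAINTS)
print(nc.critical)  # False: old clients ignore it instead of failing closed
```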

It is unfortunate-but-seemingly-necessary to take this mindset into
consideration: the choice between hard failure for a number of clients (but
guaranteed security) and less-than-ideal security with practical usability
is often resolved in favor of the latter, due to the socioeconomic
considerations of CAs.

Now, as it relates to the above point, the path-building complexity would
inevitably cause a serious spike in support overhead, because, to avoid
issues, every server operator would need to reconfigure their certificate
chains to ensure the "correct" paths were built to support such non-robust
clients (for example, of the OpenSSL forks, only LibreSSL supports path
discovery, and only within the past several weeks). As a practical matter,
that is the same "cost" as replacing a certificate, but with less
determinism of behaviour: instead of being able to test "am I using the
wrong cert or the right one", a server operator would need to consider the
entire certification path, and we know they'll get that wrong.

This is part of why I'm dismissive of the solution; not because it isn't
technically workable, but because it's practically non-viable when
considering the overall set of ecosystem concerns. These sorts of
considerations all factor into the decision making and recommendation for
potential remediation for CAs: trying to consider all of the interests, but
also all of the limitations.


> #2 Non-robust CRL implementations may lead to security issues
>
> Relying parties using applications that don’t do a proper revocation check
> do
> have a security risk. This security risk is not introduced by our proposal
> but
> inherent to not implementing core functionality of public key
> infrastructures.
> The new solution rewards relying parties that properly follow the
> standards.
> Indeed, a relying party that has a robust CRL (or OCSP check)
> implementation
> already would not have to bear any additional security risk. They also
> would
> benefit from the point that our solution can be implemented very quickly
> and so
> they would quickly mitigate the security issue. On the other hand, relying
> parties with a non-functional revocation mechanism will have to take
> corrective
> measures. And we can consider it a fair balancing of work and risk when the
> workload and risk is on the actors with poor current implementations.
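
For concreteness, the core of the "robust CRL check" being presumed here
looks roughly like the following sketch (Python "cryptography" package;
the CA name and serial numbers are illustrative, and a genuinely robust
client must additionally enforce CRL freshness and fail appropriately
when the CRL cannot be fetched at all):

```python
# Sketch: the heart of a CRL-based revocation check - is the certificate's
# serial number on the issuer's signed CRL?
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.x509.oid import NameOID

ca_key = ec.generate_private_key(ec.SECP256R1())
ca_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Demo CA")])
now = datetime.datetime.now(datetime.timezone.utc)

# A CRL listing one revoked serial number (0x1234), signed by the CA.
revoked_entry = (
    x509.RevokedCertificateBuilder()
    .serial_number(0x1234)
    .revocation_date(now)
    .build()
)
crl = (
    x509.CertificateRevocationListBuilder()
    .issuer_name(ca_name)
    .last_update(now)
    .next_update(now + datetime.timedelta(days=7))
    .add_revoked_certificate(revoked_entry)
    .sign(ca_key, hashes.SHA256())
)


def is_revoked(crl, serial):
    """True if the serial appears on the CRL. Real clients must also check
    the CRL's signature and freshness before trusting this answer."""
    return crl.get_revoked_certificate_by_serial_number(serial) is not None


print(crl.is_signature_valid(ca_key.public_key()))  # True: CRL is authentic
print(is_revoked(crl, 0x1234))  # True: this certificate is revoked
print(is_revoked(crl, 0x9999))  # False: this one is not
```

Clients that never perform this lookup are exactly the population for
which the proposal's security argument does not hold.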


> It can be noted that the new solution gives some responsibility to the
> relying
> parties. Whereas the original solution gives responsibility to the
> subscribers
> who did nothing wrong about this in the first place. The new solution can
> be
> considered fairer with this regard.
>

I think this greatly misunderstands the priority of constituencies here, in
terms of what represents a "fair" solution. The shuffling of responsibility
to relying parties is, ostensibly, an anti-goal we try to minimize. As I
explained above, the practical impact for server operators is not actually
decreased, but increased, and so the overall "risk" here is now greater
than the "current" solution.

With respect to revocation and threat models, I'm not sure this thread is
the best medium to get into it, but "core functionality of public key
infrastructure" would arguably be a misstatement. Revocation functionality
reflected in X.509/RFC 5280 is tied to a number of inherent assumptions
about particular PKI designs and capabilities (e.g. the connectivity of
clients to an X.500/LDAP directory, the failure tolerances of clients
respective to the security goals, the number and type of certificates
issued), and it's not, in fact, a generalized solution. PKI is, at best, a
series of tools to be adjusted for different situations, and those tools
are not inherently right, or their absence not inherently wrong.

As it relates to the set of clients you're considering, again, this is the
difference between the "real world" of implementations and the idealized
world of the 90s that presumed a directory would co-exist with the PKI (and
serve an inherent function). The Directory doesn't exist, nor do PKI
clients broadly implement HTTP in the first place, let alone LDAP, and so
presumptions about the availability of external data sources to handle this
don't actually hold up. This is where the current approach tries to be
both practical and pragmatic, by working in a manner that complements the
existing design constraints, rather than conflicts with them.

> #3 Avoiding financial strain is incompatible with user security
>
> User security is fundamental for the PKI ecosystem. Our proposal increases
> the
> risk exposure level only for those relying parties with a fundamentally
> insecure application. Taking into account economic constraints in real
> life, we
> believe that the increase in security should be balanced in terms of cost
> and
> effort. Such a tradeoff may provide sufficient levels of security while
> maintaining affordability. This also leads to a quicker time to implement
> which
> limits the exposure and increases security.
>

The problem here again is that it misunderstands the total system costs.
The preference for a "short-term" solution here underestimates the
long-term costs of the approach to the overall ecosystem, and does not
tangibly improve security as a practical matter. That's not to say that, on
paper, it's not appealing, but rather, that the practice and implementation
show that the paper solution doesn't actually work to achieve the goal.

You argue that it only increases the risk for "fundamentally insecure"
applications, which bears a resemblance to my remarks around the decision
with nameConstraints and criticality. I appreciate the prescient symmetry,
but I think where this argument breaks down is that, by the definition
you've offered of what constitutes a fundamentally insecure application,
virtually all deployed applications are fundamentally insecure. As such,
the solution doesn't really improve security. Our goal
in finding solutions is to try to find an appropriate balance here, such
that security works for the greatest majority possible. With that
constraint, the current solution works, although imperfectly, to address
more users "where they are now", and to protect them.

This is the benefit of discussing such solutions, because we can evaluate
whether or not they remain practically viable.


> One last thing we wanted to get your feedback on, is whether you agree
> that the
> new solution we proposed ensures that if SICA and ICA are considered
> revoked,
> there is no security risk for SICA2.
>

Because of the framing I gave above regarding the practical realities, I
don't feel comfortable stating that there would be no security risk for
SICA2. Given the practical costs to migrate to SICA2, and the limited (if
any) benefit it would provide, it seems that investment is better spent
through a rotation that unambiguously avoids the issue.
_______________________________________________
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy