Dear Max,

I do want to add some nuance.

On Tue, Feb 25, 2020 at 09:33:45PM +0200, Max Tulyev wrote:
> To be clear, I mean nobody really uses this RPKI, so 3 days downtime
> was even not noticed by anyone.

"Nobody uses RPKI" ... I don't think that statement holds true from any
angle. Keep in mind that almost 33% of RIPE prefixes are covered by RPKI
ROAs, and globally hundreds of autonomous systems (including the world's
largest IP carriers and IXPs) use RPKI data in some shape or form to
make better BGP best path selection decisions. RPKI is already here and
widely deployed; now we have to deal with it! :-)

The root of the problem was perhaps there for 3 days, but the
operational issue for relying parties lasted ~1.5 days because of how
RPKI data is distributed, cached, and expired. ROA provisioning was
broken for the full 3 days.

Additionally, the problem was somewhat obfuscated because some widely
used RPKI cache validator implementations didn't consider the broken
repository broken, which kept appearances up. This may seem like a good
thing, but I consider it problematic because it showcased some potential
for security issues in RPKI validation implementations. This is now
actively being discussed in the IETF and I expect that this discussion
will result in positive changes in implementations.
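
To make "considering a broken repository broken" a bit more concrete,
here is a minimal sketch of a strict completeness check. It is purely
illustrative and not taken from any real validator: the function name
check_publication_point and the two dictionaries are hypothetical, and
it assumes only that a manifest lists each file together with its
SHA-256 hash. It says nothing about what actually went wrong in this
incident.

```python
import hashlib


def check_publication_point(manifest, fetched):
    """Strictly compare fetched objects against a manifest listing.

    manifest: dict of filename -> expected SHA-256 hex digest (hypothetical shape)
    fetched:  dict of filename -> raw bytes actually retrieved
    Returns (ok, problems); ok is True only if every listed file is
    present and its hash matches.
    """
    problems = []
    for name, expected in manifest.items():
        data = fetched.get(name)
        if data is None:
            problems.append(f"{name}: listed on manifest but not fetched")
        elif hashlib.sha256(data).hexdigest() != expected:
            problems.append(f"{name}: hash mismatch")
    return (not problems, problems)


# Example: one listed object is missing from the fetched set.
good = b"roa-bytes"
manifest = {
    "a.roa": hashlib.sha256(good).hexdigest(),
    "b.roa": hashlib.sha256(b"other").hexdigest(),
}
fetched = {"a.roa": good}

ok, problems = check_publication_point(manifest, fetched)
print(ok, problems)
```

A validator following this strict approach would flag the publication
point as failed instead of silently carrying on with whatever it
managed to fetch. Whether that is the right trade-off is part of the
discussion mentioned above.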

I am happy the sky didn't fall on top of us, but that doesn't take away
from the seriousness of the situation and our duty to learn as much as
we can from this to improve our processes.

Kind regards,

Job
