On Fri, 11 Jan 2013, SM wrote:
draft-ietf-sidr-bgpsec-threats-03 discusses the threat model. It
could be used as a starting point to identify the points of failure.
Some of the failure scenarios I have heard of from the DNS TLD world:
Zone file was truncated due to lack of disk space. Insufficient checks were
in place to notice the truncated zone file before it was replicated out to
the public-facing DNS infrastructure, thus causing an outage.
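
The kind of pre-publication check that was missing could look roughly like
the sketch below. It is only an illustration under my own assumptions: the
function names, the 10% shrink threshold and the "must end with a newline"
rule are hypothetical, not taken from any TLD operator's actual tooling.

```python
def count_records(zone_text: str) -> int:
    """Count non-empty, non-comment lines as a rough proxy for record count."""
    count = 0
    for line in zone_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(";"):
            count += 1
    return count


def safe_to_publish(new_zone: str, previous_zone: str,
                    max_shrink: float = 0.10) -> bool:
    """Return False if the new zone looks truncated next to the previous one.

    max_shrink is an assumed threshold: the largest fraction of records
    that may disappear between two versions before publication is blocked.
    """
    # A file cut off mid-write often lacks its final newline.
    if not new_zone.endswith("\n"):
        return False
    old_count = count_records(previous_zone)
    new_count = count_records(new_zone)
    if new_count == 0:
        return False
    if old_count == 0:
        # Nothing to compare against, so only the checks above apply.
        return True
    shrink = (old_count - new_count) / old_count
    return shrink <= max_shrink
```

A check like this would run on the master before the zone is handed to the
replication step, and a False result would stop the push and alert a human.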
DNSSEC failure (I don't remember the exact details), i.e. not everything
was correctly signed, thus causing a failure. DNSSEC is an example of a
protocol which is quite brittle operationally. You make one mistake, and
your entire domain is toast.
So when implementing RPKI at the RIR level, there should be operational
advice on how everything should be done. Preferably even at the protocol
level, there should be a disaster "off" button which the publisher can use
to just switch off all validation for all records they handle, and this
should be propagated very quickly.
Frankly, I'm scared when I read the documentation for this (=SIDR) thing.
I feel it fails the "KISS" test with all the CA, signing, propagation
infrastructure needed. It's also quite different from anything else that
the normal network engineer is exposed to historically in their day to day
work life. It's my opinion that a lot of careful thought needs to be put
into how this is actually implemented in real life, making sure there are
operational tools in place to aid deployment (I still have not signed my
private domain with DNSSEC because the operational tools are too hard to
use).
The re-signing needed over time to keep DNSSEC (and SSL certificates) up
to date is something I still see people fail at routinely. This needs to be
automated.
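
As a minimal sketch of what such automation has to decide, the check below
flags a signature for re-signing once it gets close to expiry. It assumes
the expiration is given in the YYYYMMDDHHmmSS UTC form used for RRSIG
records in zone files; the function names and the 7-day margin are my own
hypothetical choices.

```python
from datetime import datetime, timedelta, timezone


def parse_rrsig_time(value: str) -> datetime:
    """Parse a YYYYMMDDHHmmSS timestamp as a timezone-aware UTC datetime."""
    return datetime.strptime(value, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def needs_resigning(expiration: str, now: datetime,
                    margin: timedelta = timedelta(days=7)) -> bool:
    """Return True if the signature expires within `margin` of `now`.

    Already-expired signatures also return True. The 7-day default is an
    assumed safety margin, not a recommendation from any specification.
    """
    return parse_rrsig_time(expiration) - now <= margin
```

Run from cron against every signed zone (or certificate), something like
this turns "someone forgot" into an alert or an automatic re-sign.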
Is it too early in the protocol development process to start looking into
the operational aspects?
--
Mikael Abrahamsson email: [email protected]
_______________________________________________
sidr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/sidr