On Dec 18, 2012, at 2:48 PM, Randy Bush wrote:

> I am trying to understand why our fellow engineers at Verisign are
> obsessed with global propagation of RPKI data on the order of a few
> minutes.  Then a friend hit me with the clue by four.  It's about third
> party DDoS (and other attack) mitigation.

It's not just about this, though that's certainly AN example.  Operators can envision 
lots of reasons why it'd be a good idea for the routing system to converge 
rapidly.  How many do you need?

> When an NTT (just an example) customer is attacked, the customer can
> request that their upstream sink and scrub the traffic as it naturally
> passes through NTT.  NTT passes the scrubbed traffic on to the customer,
> and that's that.
> 
> But, when the customer's upstream does not provide such services, or the
> customer has other reasons for not using the upstream service, they can
> buy the scrubbing service from a third party, e.g. Verisign.  Verisign
> will announce the customer's prefix(es) from a Verisign AS, receive the
> traffic, scrub it clean, and send the cleaned traffic to the customer,
> often via a tunnel.

Thanks for the explanation, I'm glad you took the time to _listen_ to operators 
about some of the ways of doing this.

> Well and good, except Verisign is announcing someone else's prefix, a
> basic violation of BGP origination validation.  

Actually, it's not a violation of anything - it's their intent - which is what 
this all boils down to.  It's only a violation if your system doesn't model 
intent and assumes it knows something is a violation... It's the same policy 
that's reflected in your ROA, I'm just not captive to that today.

It's also by default how routing works today, as you well know...

> Naturally, the customer
> who contracts to Verisign for this service issues a ROA for the
> prefix(es) to the Verisign AS, and it is not a violation, all is serene.
> 
> But what if Verisign wants to take on a new customer in panic "Help, I
> am being attacked and will pay you a lot of money to fix this?"  The
> time for the fix is dependent on how quickly a new ROA for the victim's
> prefix(es) can propagate.

Actually, it's not.  We don't know or care about ROAs today on the Internet.

> Alternatively, the victim could pull their normal ROA(s) for their
> prefix(es) and the world would get NotFound and believe Verisign's
> announcement(s).  But this too relies on quick propagation of RPKI data.

And opens the floodgates for downgrade attacks in this very heavy system you've 
been envisioning for over a decade, downgrade attacks that all the folks 
building these systems don't really seem to care about.
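
For anyone following along, the three outcomes being argued about here (Valid, 
Invalid, NotFound) can be sketched roughly as below.  This is a minimal sketch 
of the origin validation procedure described in RFC 6811, not any router's 
actual implementation.  The `Roa` class, the `validate` function, the AS 
numbers and the prefix are all hypothetical names and documentation values 
picked for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import Iterable


@dataclass(frozen=True)
class Roa:
    """Hypothetical, simplified view of a ROA: prefix, max length, origin AS."""
    prefix: str
    max_length: int
    asn: int


def validate(route_prefix: str, origin_asn: int, roas: Iterable[Roa]) -> str:
    """Classify a route as Valid, Invalid or NotFound (RFC 6811 style)."""
    route = ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # A ROA only matters if its prefix covers the announced route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if route.prefixlen <= roa.max_length and roa.asn == origin_asn:
            return "Valid"
    # Covered by some ROA but matched by none: Invalid.  Covered by nothing: NotFound.
    return "Invalid" if covered else "NotFound"


# Assumed documentation values, not real networks.
VICTIM_AS, SCRUBBER_AS, OTHER_AS = 64496, 64511, 64500
PREFIX = "192.0.2.0/24"

if __name__ == "__main__":
    victim_only = [Roa(PREFIX, 24, VICTIM_AS)]
    with_scrubber = victim_only + [Roa(PREFIX, 24, SCRUBBER_AS)]

    # Scrubber announces before any new ROA has propagated.
    print(validate(PREFIX, SCRUBBER_AS, victim_only))    # Invalid
    # Scrubber announces after the customer's new ROA is visible.
    print(validate(PREFIX, SCRUBBER_AS, with_scrubber))  # Valid
    # Victim pulls all ROAs: every origin is NotFound, wanted or not.
    print(validate(PREFIX, SCRUBBER_AS, []))             # NotFound
    print(validate(PREFIX, OTHER_AS, []))                # NotFound
```

The last two calls are the downgrade concern in miniature: once no ROA covers 
the prefix, this check no longer distinguishes the contracted scrubber from 
any other origin.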

"Just fail open, you're no worse off than you are today" -- except that I have 
a bunch of new dependencies with 10x spend on routers and network engineers to 
debug RPKI and BGPSEC in order to route things - all for naught.  

Tell a bank that - you've upgraded your security system and it costs 10x, but if 
there's a problem, just fail open, folks can still get money.

> Observe that this is a problem in origin validation, i.e.  what is being
> deployed today.  The RFCs are published, the code is in the routers, ...
> the horse has left the barn.

Hah, nice..  Glad you're considering the requirements now that your gerbil has 
left the barn!

> Yet another item to go into the display case of security vs. utility.

Right, you've designed something that doesn't even fix leaks - which many 
operators believe are an operational security problem and they happen by 
default (e.g., the last big routing security snafu we've seen, and go read GROW 
if you disagree), slows routing convergence on the Internet to hours or days at 
the least (from double digit seconds today), introduces lots of new external 
system dependencies (e.g., RIRs, whom YOU say have no operational clue), 
removes all the scaling properties of BGP over the past 2 decades (adds 
periodic updates, breaks packing, order of magnitude larger updates, etc..), and 
you say if there's problems, just fail open (i.e., downgrade attacks).  You 
just admitted that operators can't change keys on two ends of a transport 
connection, and in the next breath you tell me how an Internet-scale system 
that requires global coherency for verification objects for every prefix in the 
entire system across every BGP router and RPKI is going to work, all on the 
backbone of rsync?
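
To put a rough shape on the "breaks packing" point: plain BGP can carry many 
prefixes in one UPDATE when they share the same path attributes, whereas the 
sketch below assumes a signed update carries exactly one prefix.  That 
one-prefix-per-update rule is an assumption about the BGPSEC design under 
discussion, the function names are hypothetical, and the 4096-byte message 
size limit is ignored to keep it short.

```python
from collections import defaultdict


def packed_update_count(routes):
    """UPDATEs needed when prefixes with identical attributes share a message.

    `routes` is a list of (prefix, attributes) pairs; attributes must be hashable.
    """
    groups = defaultdict(list)
    for prefix, attrs in routes:
        groups[attrs].append(prefix)
    return len(groups)


def per_prefix_update_count(routes):
    """UPDATEs needed when every prefix must travel in its own message."""
    return len({prefix for prefix, _ in routes})


if __name__ == "__main__":
    # Hypothetical table: four prefixes, two distinct attribute sets.
    attrs_a = ("AS_PATH 64496 64500", "NEXT_HOP 203.0.113.1")
    attrs_b = ("AS_PATH 64496 64501", "NEXT_HOP 203.0.113.1")
    routes = [
        ("192.0.2.0/25", attrs_a),
        ("192.0.2.128/25", attrs_a),
        ("198.51.100.0/25", attrs_a),
        ("198.51.100.128/25", attrs_b),
    ]
    print(packed_update_count(routes))      # 2
    print(per_prefix_update_count(routes))  # 4
```

The gap between the two counts grows with the number of prefixes that share 
attributes, which is the scaling property the paragraph above says is lost.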

> Ah well.  But I sympathize with Verisign's business case and have no
> desire to whack it.  And I assume others think similarly.

How kind of you...

> There is this little problem.  Today, the Internet has no technology for
> reliable global fast distribution of a database.  No, DNS is not the
> answer.  But please have that argument on a different thread.  The only
> thing of which I am aware is Google's Spanner [0], which is a brilliant
> piece of work.  But it is not lightweight (e.g. True Time), is
> proprietary, is likely considered very secret sauce, and even Google
> lists it as research as opposed to beta :)

Right, so we should bolt PKI into your heap of rsync cruft onto the routing 
system to solve what problem?  Yet another item to go into the "and Randy Bush 
came down from the mountain, with two tablets in hand, to graciously bestow 
again upon all the operational monkeys clue from a superior galaxy far away, 
under the auspices of operator but funded by the USG, with a new system aimed 
to control what's routed on the Internet, how operators route and configure 
their networks and what external entities they're captive to - who cares if 
it's routed securely!, he proclaimed, I WILL EMPLOY RSYNC AND BOLT PKI ON BGP, 
it will be my legacy, youngsters, I WON, so take your requirements to the 
university, this is the iRBtf!"  The Internet was sad.

> IMIHO, this is a *really* interesting area for research, though I
> suspect sidr is not the place to conduct it.  But it's research, not
> something we can paint on today.  So we are left with propagation on the
> order of a very few hours, AKA 'human time', and our discussions of how
> to reduce load on CAs so we can query them more frequently.

Requirements require research, indeed...  It would have been good if we started 
here, rather than with SBGP-renamed and slamming RPKI directly into routers, 
it's just plain silly.  You should visit the work in GROW on why IRRs are 
challenged, or why leaks ARE operational AND security problems.

I think if you built general purpose Internet number resource certification and 
considered why things like the IRRs have had challenges, and why there should 
be autonomy in today's Internet operations, you'd be a lot better off...  
Instead, you built routed resource certification through an RPKI introducing 
new control points and made RIPv1 scale look like a panacea compared to 
RPKI-enabled BGPSEC, and muddied it all with the likes of random subjects, and 
then fixed with ghostbusters records, and grandparenting for more control 
points, and snail-paced convergence where relying parties have to discover 
everything new in a global system with potentially tens of thousands of 
publication points, and increased attack surfaces and inter-domain churn 
orders of magnitude - and you couldn't even fix leaks, which happen by default, 
and which you're still going to need to fix with IRRs...

Randy, you've done a fine job socially engineering this work through the IETF, 
I'll give you that!

-danny

EVIL_OPERATOR: Just waiting for a call from one of the RIRs to engage us in 
mitigating DDoS attacks to their repositories; or for the opportunity to 
operationalize an overly complex system that folks are captive to - wohoo!


_______________________________________________
sidr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/sidr
