On 10/Apr/19 10:16, Ben Maddison wrote:

> We didn't see any *unexpected* impact. Everything covered by a ROA
> issued under the afrinic TAL fell back to Not Found as designed, and
> we had no reported issues.
> I had been concerned that we would see some transient loops form as
> things moved from Valid to Not Found (this is an issue because of
> ios-xe's behaviour of preferring Valid to Not Found regardless of
> configured policy - please go shout at your Cisco SE about this!).

This is, actually, a real implementation issue that will bite many a
network, as they deploy RPKI and ROV.

We first hit this particular problem with Cisco back in 2014 on our
first ROV attempt. We had to pull out of the deployment because of
Cisco's automated policy application when RPKI is set up (but also
because global deployment was non-existent at the time).

On this second attempt, we have limited deployment to just our peering
and transit routers, as we plan for the same on our customer edge in 3
months. But even then, we came across a corner case... we are learning a
route from a customer in 2 locations, call them Router A and Router B.

On each router, hot-potato routing works as one would expect. However,
Router A is a Junos device (customer edge) while Router B is an IOS XE
device (peering box specifically for CDN mapping). By default, Junos
does not apply any kind of policy against the RPKI database, which we
like. However, Cisco (IOS, IOS XE and IOS XR) does this automatically
and, in the process, violates the RFC, which we don't like.

All Cisco devices will do the below by default:

    1. Prefer Valid routes.
    2. Then prefer Not-Found routes.
    3. Drop all Invalid routes.

You can tell the Cisco to accept Invalid routes and calculate the BGP
best path against them, with an extra command.
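As a rough illustration of the default behaviour listed above, here is a
minimal Python sketch. It is only a model of the three rules, not Cisco's
implementation. The names (State, apply_default_policy, the via-peer-N
labels) are made up for the example, and ranking Invalid last when it is
allowed is my assumption.

```python
from enum import IntEnum


class State(IntEnum):
    # Lower value = more preferred in this sketch.
    VALID = 0
    NOT_FOUND = 1
    INVALID = 2


def apply_default_policy(paths, allow_invalid=False):
    """Model the default policy on a list of (name, State) tuples.

    Invalid paths are dropped unless allow_invalid is set; the rest are
    ordered Valid first, then Not-Found (then Invalid, as an assumption).
    """
    candidates = [p for p in paths if allow_invalid or p[1] is not State.INVALID]
    return sorted(candidates, key=lambda p: p[1])


# Hypothetical paths for one prefix.
paths = [
    ("via-peer-1", State.NOT_FOUND),
    ("via-peer-2", State.INVALID),
    ("via-peer-3", State.VALID),
]

default_order = apply_default_policy(paths)
with_invalid = apply_default_policy(paths, allow_invalid=True)
```

With the defaults, via-peer-2 never reaches best path selection at all;
with allow_invalid set, it is kept as a candidate.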

In my case, the issue was that Router B was receiving the customer's
route from Router A via iBGP. Because of the default ROV policy on all
Cisco devices, the iBGP path was preferred because all iBGP routes on a
Cisco are assigned an RPKI state of Valid (and as you know, RPKI state
is now the first rule used in the BGP best path calculation on all
Cisco devices - no other BGP attribute will influence that). However,
the better path was directly toward the customer, who is connected to
Router B via eBGP, rather than the path via iBGP to Router A.
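To make the Router B case concrete, here is a small Python sketch of the
comparison. It is a toy model under stated assumptions, not the real best
path algorithm: it only looks at local preference, AS path length and
eBGP-over-iBGP, it assumes both paths are otherwise equal, and it assumes
the customer's route evaluates as Not-Found on the eBGP session. The
Path, effective_state and best_path names are hypothetical.

```python
from dataclasses import dataclass

VALID, NOT_FOUND = 0, 1  # lower is preferred


@dataclass(frozen=True)
class Path:
    name: str
    is_ibgp: bool
    rov_state: int        # state computed for an eBGP-learned route (assumed)
    local_pref: int = 100
    as_path_len: int = 1


def effective_state(path):
    # As described above: iBGP-learned routes are treated as Valid.
    return VALID if path.is_ibgp else path.rov_state


def best_path(paths, rov_first):
    def key(p):
        # Simplified tie-breakers: higher local-pref, shorter AS path,
        # then eBGP (False) ahead of iBGP (True).
        usual = (-p.local_pref, p.as_path_len, p.is_ibgp)
        return ((effective_state(p),) + usual) if rov_first else usual
    return min(paths, key=key)


direct = Path("eBGP direct to customer", is_ibgp=False, rov_state=NOT_FOUND)
via_a = Path("iBGP via Router A", is_ibgp=True, rov_state=NOT_FOUND)

cisco_default = best_path([direct, via_a], rov_first=True)
without_rov_rule = best_path([direct, via_a], rov_first=False)
```

In this model, cisco_default is the iBGP path via Router A, while
without_rov_rule is the direct eBGP path, which is the one we wanted.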

The only solution was to disable ROV on Router B, which is
less-than-ideal, but not a train smash since Router B does not talk to
the rest of the world. So not running ROV on it does not break our RPKI
policy.

I recall raising this issue together with Randy and Philip back in 2015.
I had hoped Cisco's position would have shifted per the RFC in the time
since then, but alas. At any rate, if you are an all-Junos network, you
will see what you expect. If you are a part- or all-Cisco network, please
test your deployments in the lab before you proceed, as nastiness is
almost guaranteed.

Mark.
