On 10/Apr/19 10:16, Ben Maddison wrote:
> We didn't see any *unexpected* impact. Everything covered by a ROA
> issued under the afrinic TAL fell back to Not Found as designed, and
> we had no reported issues.
> I had been concerned that we would see some transient loops form as
> things moved from Valid to Not Found (this is an issue because of
> ios-xe's behaviour of preferring Valid to Not Found regardless of
> configured policy - please go shout at your Cisco SE about this!).

This is, actually, a real implementation issue that will bite many a
network as they deploy RPKI and ROV.

We first hit this particular problem with Cisco back in 2014 on our
first ROV attempt. We had to pull out of the deployment because of
Cisco's automated policy application when RPKI is set up (but also
because global deployment was non-existent at the time).

On this second attempt, we have limited deployment to just our peering
and transit routers, as we plan for the same on our customer edge in 3
months. But even then, we came across a corner case... we are learning
a route from a customer in 2 locations, call them Router A and Router
B. On each router, hot-potato routing works as one would expect.
However, Router A is a Junos device (customer edge) while Router B is
an IOS XE device (peering box specifically for CDN mapping).

By default, Junos does not apply any kind of policy against the RPKI
database, which we like. However, Cisco (IOS, IOS XE and IOS XR) does
this automatically and, in the process, violates the RFC, which we
don't like.

All Cisco devices will do the below by default:

  1. Prefer Valid routes.
  2. Then prefer Not-Found routes.
  3. Drop all Invalid routes.

You can tell the Cisco to accept Invalid routes and calculate the BGP
best path against them, with an extra command.

In my case, the issue was that Router B was receiving the customer's
route from Router A via iBGP. Because of the default ROV policy on all
Cisco devices, the iBGP path was preferred because all iBGP routes on a
Cisco are assigned an RPKI state of Valid (and as you know, RPKI state
is now the first rule used in the BGP best path calculation on all
Cisco devices - no other BGP attribute will influence that). However,
the better path was directly toward the customer, who is connected to
Router B via eBGP, rather than the path via iBGP to Router A.

The only solution was to disable ROV on Router B, which is
less-than-ideal, but not a train smash since Router B does not talk to
the rest of the world. So not running ROV on it does not break our RPKI
policy.

I recall raising this issue together with Randy and Philip back in
2015. I had hoped Cisco's position would have shifted per the RFC in
the time since then, but alas.

At any rate, if you are an all-Junos network, you will see what you
expect. If you are a part- or all-Cisco network, please test your
deployments in the lab before you proceed, as nastiness is almost
guaranteed.

Mark.
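
P.S. For anyone who wants to reason about the Router B corner case
without a lab, here is a minimal Python sketch of it. It is a toy model,
not Cisco's implementation: the Path class, the field names and the
best_path() helper are all hypothetical, the tie-breakers are a small
subset of the real best path algorithm, and I am assuming the direct
eBGP path evaluates as Not Found while the iBGP copy is treated as
Valid, as described above.

```python
from dataclasses import dataclass

# Toy ranking of validation states: lower is more preferred.
# Invalid is absent because this sketch drops Invalid paths up front.
RPKI_RANK = {"valid": 0, "not-found": 1}


@dataclass(frozen=True)
class Path:
    name: str          # hypothetical label, for this sketch only
    learned_via: str   # "ebgp" or "ibgp"
    rpki_state: str    # "valid", "not-found" or "invalid"
    local_pref: int = 100
    as_path_len: int = 1


def classic_key(p):
    # Simplified subset of the usual tie-breakers:
    # higher local-pref, then shorter AS path, then eBGP over iBGP.
    return (-p.local_pref, p.as_path_len, 0 if p.learned_via == "ebgp" else 1)


def best_path(paths, rov_enabled):
    if not rov_enabled:
        # No validation at all: only the classic attributes matter.
        return min(paths, key=classic_key)
    # Assumed default behaviour: drop Invalid, then compare the
    # validation state before any other attribute.
    candidates = [p for p in paths if p.rpki_state != "invalid"]
    return min(candidates, key=lambda p: (RPKI_RANK[p.rpki_state],) + classic_key(p))


# Router B's two copies of the customer route (assumed states).
direct = Path("direct-ebgp", "ebgp", "not-found")
via_a = Path("via-router-a", "ibgp", "valid")

print(best_path([direct, via_a], rov_enabled=True).name)   # via-router-a
print(best_path([direct, via_a], rov_enabled=False).name)  # direct-ebgp
```

With the validation state compared first, the iBGP copy wins even though
every other attribute is equal and the eBGP path would normally be
preferred. Disabling ROV on Router B restores the expected result.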
