Re: [sidr] WGLC: draft-ietf-sidr-origin-ops

Shane Amante Sun, 30 Oct 2011 22:20:51 -0700

Hi Randy,

On Oct 30, 2011, at 4:57 AM, Randy Bush wrote:
[--snip--]
>> 1)  From Section 3:
>> ---snip---
>>   A local valid cache containing all RPKI data may be gathered from the
>>   global distributed database using the rsync protocol, [RFC5781], and
>>   a validation tool such as rcynic [rcynic].
>> ---snip---
>> 
>> Would it be possible to mention and/or point to how the above process is 
>> supposed to be bootstrapped?  IOW, is it expected that, eventually?, the 
>> RIR's are going to publish to their end-users and maintain URI's of RPKI 
>> publication points?  Since this is an Ops guidelines document, some guidance 
>> and/or pointers are likely to save [lots of] questions down the road.  I'm 
>> not expecting this to be a tutorial document, but some idea on the theory of 
>> how a new SP bootstraps their cache(s) would be helpful.
> 
> uh, i am not clear on what you actually want here.  in the minimal case,
> the op should just run rcynic or some equivalent relying party tool, as
> it says.  in the more complex/large case, good quality RP cache code
> should be able to feed from other RP caches.


Let me try again.  :-)  Assume that I'm Joe Random Operator and I'm new to this 
SIDR thing.  I've just installed my first RPKI cache and would like to 
bootstrap it with a full set of RPKI data from "the world".  To what RIR's or 
RP's (or, both) do I go to bootstrap my RPKI cache for the very first time?


>> 2)  Given that, to my knowledge, the RPKI is [very] loosely synchronized in 
>> a "pull-only" fashion, shouldn't there be some text added below to that 
>> effect that:
>>    a)  It may not be best to go more than, say, 2 levels of RPKI caches deep 
>> inside a single organization/ASN to avoid RPKI caches from being out of sync 
>> with each other?  IOW, there are likely a small set of 1st/top-level RPKI 
>> caches that speak externally to fetch RPKI cache information, (similar to 
>> 'hidden' authoritative DNS servers), then a second tier of RPKI caches that 
>> synchronize (only) from the top-level RPKI caches, (similar to external, 
>> anycast authoritative DNS servers). 
>>    b)  Operators should look at running more aggressive synchronization 
>> intervals _internally_ within their organization/ASN, from "children" 
>> (2nd-level) RPKI caches to the 'parent' (top-level) RPKI cache in their 
>> organization/ASN, compared to more "relaxed" synchronization intervals to 
>> RPKI caches external to their organization (top-level RPKI caches in their 
>> ASN to RIR's)?
>> ---snip---
>>   Validated caches may also be created and maintained from other
>>   validated caches.  Network operators SHOULD take maximum advantage of
>>   this feature to minimize load on the global distributed RPKI
>>   database.  Of course, the recipient SHOULD re-validate the data.
>> ---snip---
> 
> does b not address a, for those who want very tight synch.

I don't believe so: (a) attempts to address the concern in the draft that you 
don't want, potentially, dozens of RPKI caches inside a single operator 
hammering the same set of _external_ RPKI caches "constantly"; (b) suggests 
that since you've got a hierarchy of RPKI caches in your own ASN, that you most 
likely _can_ and _should_ keep the 2nd-level of RPKI caches in sync to each 
other, as best as possible, in order that when RPKI information is 
pushed/pulled into the routers, via RPKI-RTR, that you're hopefully affecting 
routing policy in your control plane and, hence, forwarding of traffic through 
your whole AS in a consistent way at nearly the same time.  I view them as two, 
separate, but related matters.

I would be fine if you suggested that it is envisioned that a tiered 
approach to RPKI caches makes sense for "large backbones" and that for "small 
stub/enterprise/edge networks", (to quote the classification of networks 
already in Section 1), they may be fine with a single-tier of a handful of RPKI 
caches.  


> note that the RIRs were talking 24 hour publication cycles, last i heard
> (long ago, i admit).  [ i thought this was nutso ]  so a lot of this has
> yet to play out.

Re: RIR's + 24 hours … IMO, that mainly affects the interval at which it is 
reasonable for an operator's "top-level" set of RPKI caches to go after a "fresh" 
set of RPKI data.  Again, once the 'top-level' RPKI caches get that data, then 
I would strongly prefer the "window of time" (IOW, the 'duration') that it 
takes to cascade that fresh set of data out to my 2nd-level of RPKI caches is 
as narrow as possible in order that when RPKI-RTR pushes or routers pull that 
data from their RPKI caches, they're doing so in a very short window of time.  
That way, if/when BGP policy is affected by new information in the RPKI, it's 
not leading to [very] long-lived improper/inconsistent forwarding in the 
network.
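
As a rough illustration of the timing argument above, the sketch below adds up worst-case polling delays in a pull-only, two-tier cache hierarchy. The function names and example intervals are hypothetical, it assumes a single top-level cache, and it ignores transfer and validation time.

```python
def max_staleness(external_s, internal_s, router_s):
    """Worst-case age (seconds) of RPKI data on a router.

    In a pull-only design each tier can lag its upstream by up to one
    full polling interval, so the per-tier bounds simply add up.
    """
    return external_s + internal_s + router_s


def max_internal_skew(internal_s, router_s):
    """Worst-case gap (seconds) between two routers in the same AS.

    Assumes one top-level cache: one router may see new data almost
    immediately, while another waits a full 2nd-level poll plus a full
    router refresh. The external interval does not contribute.
    """
    return internal_s + router_s


# Hypothetical values: 24 h external fetch, 5 min internal sync,
# 1 min router refresh from its cache.
staleness = max_staleness(86400, 300, 60)
skew = max_internal_skew(300, 60)
```

Under these assumptions the external interval dominates how old the data can be, but only the internal intervals determine how long routers can disagree with each other.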


>> While I'm here, I don't think the text in Section 6, "Notes", addresses the 
>> above concerns, at all.  In fact, I find it extremely unhelpful to just 
>> dismiss this concern, out of hand, with the text: "There is no 'fix' for 
>> this, it is the nature of distributed data with distributed caches".  We 
>> know what the answer is here: you tune the synchronization intervals to 
>> strike the appropriate balance between [very] tight synchronization vs. 
>> increased load on the systems being synchronized.  I find it hard to believe 
>> a simple suggestion such as this is not proposed in the text, even including 
>> the phrase "the suggested values for such synchronization are outside the 
>> scope of this document, but will likely be subject to further studies to 
>> determine optimal values based on field experience".
> 
> sorry, dns taught us that the answer is not in just running it more
> frequently.  you can narrow the windows, but you can not eliminate them.
> i wish we could, but the protocols which could provide a globally
> synchronized database would be extremely complex and just do not seem
> worth the effort in this case.

To be clear, I recognize that perfect synchronization is hard; however, I'm 
asking you to acknowledge that one needs to attain much better than 
lackadaisical synchronization so that there isn't, potentially, massive 
sloshing of traffic around the network due to new RPKI data showing up in 
routers at vastly different times.  See just below where I hope you can find 
additional text that might satisfy my concern.


> your suggested text seems useful, and i will steal and modify if you do
> not mind.  but i suspect we would find tuning has topological and delay
> sensitivities which will prevent optimal recipes.
> 
>    <t>Timing of inter-cache synchronization is outside the scope of
>      this document, but depends on things such as how often routers
>      feed from the caches, how often the operator feels the global RPKI
>      changes significantly, etc.</t>

I would appreciate it if you could also acknowledge that cache synchronization 
intervals also have a very important dependency on ensuring that all routers in 
an operator's ASN get updated in as short a time-interval (duration) as 
possible to ensure there is consistent application of BGP policy and, hence, 
forwarding of traffic.  


>> 3)  Granted, the following text is only a "SHOULD", but the text offers no 
>> reasoning as to why caches should be placed close to routers, i.e.: are 
>> there latency concerns (for the RPKI <-> cache protocol), or is it that a 
>> geographically distributed system is one way to avoid a 
>> single-point-of-failure, or something else entirely?  As a start, just 
>> defining "close" would help, e.g.: same POP, same (U.S.) state, same 
>> country, same timezone … but, then a statement as to any latency or 
>> resiliency requirement for geographic deployment of RPKI caches would be 
>> useful.
> 
> we tried to go down this path and found it just got more and more
> complex with no real improvement.  you probably want them in some
> diameter of transport trust.  you probably want them in some diameter of
> routing bootstrap reach.  you probably want them with reasonable latency
> characteristics.  and there are probably more concerns.  that's why you
> get the big bucks. :)

I'm more concerned for the 100's or 1000's of engineers that were not 
participating in this initial development effort and will be left wondering 
what, at a high-level, the thinking/background/concerns were that went into 
these guidelines.  With that information, future engineers can then weigh those 
criteria for themselves to decide if they are applicable to their environment, 
how they are applicable to their environment, etc. 

Remember, ultimately what you're laying out here is going to be read by 
engineers who are going to look at buying actual server HW, and network ports 
to attach them to, to set-up an RPKI in their network.  The more background 
material you give them the more confident they will be on putting together a 
budget estimate to start such a project … 


>    <t>As RPKI-based origin validation relies on the availability of
>      RPKI data, operators SHOULD locate caches close to routers that
>      require these data and services.  'Close' is, of course, complex.
>      One should consider trust boundaries, routing bootstrap
>      reachability, latency, etc.</t>

While I appreciate the attempt, I don't think the above satisfies my concern.  
What 'trust boundaries' are you referring to, e.g.: within your ASN, not within 
your ASN but within your organization or neither/other?  With respect to 
latency, is what you're attempting to say that it's recommended to engineer for 
a high BW x low delay product, in order to achieve fast xfers of data between 
RPKI caches and also data between the RPKI cache to the routers?  If so, then 
aren't you conceding that low latency is necessary in order to _strive_ to 
attain "good [enough]" synchronization between all RPKI caches in the network 
so that 'new' RPKI data that affects BGP policy gets uniformly applied across 
all routers in the network at roughly the same time? 

Also, you mention 'routing bootstrap reachability'.  That's actually a very 
good point and I don't recall seeing anything about that in this document 
anywhere.  (If it already is, I apologize).  Assuming there's nothing about 
that in here, then wouldn't it be good to state some guidelines around this in 
Section 7, Security Considerations, like:
====
It is recommended that operators SHOULD log _and_ exclude *new* RPKI data that 
downgrades the previous state of ROA's, (e.g.: from Valid -> Invalid or Valid 
-> Not Found), associated with external RPKI caches, root DNS servers and ccTLD 
DNS servers so as not to cause a DoS that would lead to an inability to gather 
fresh, accurate RPKI data.  This information should be evaluated by a human and 
manually pushed out to RPKI caches and routers in the network after it has been 
validated as correct.
====
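
One way the "log and exclude" suggestion above might look in practice is sketched below. Everything here is an assumption for illustration: the function name, the use of a plain prefix string as the key (real validity state applies to a prefix and origin AS pair), and the operator-maintained set of critical prefixes.

```python
import logging
from enum import Enum

log = logging.getLogger("rpki-downgrade-guard")  # hypothetical logger name


class Validity(Enum):
    VALID = "valid"
    NOT_FOUND = "not-found"
    INVALID = "invalid"


def hold_back_downgrades(previous, candidate, critical_prefixes):
    """Split a candidate update into accepted states and held downgrades.

    A critical prefix that would go from Valid to Invalid or Not Found
    keeps its previous state and is logged for human review. A prefix
    missing from a mapping is treated as Not Found.
    """
    accepted, held = {}, {}
    for prefix in previous.keys() | candidate.keys():
        old = previous.get(prefix, Validity.NOT_FOUND)
        new = candidate.get(prefix, Validity.NOT_FOUND)
        downgrade = old is Validity.VALID and new is not Validity.VALID
        if prefix in critical_prefixes and downgrade:
            log.warning("held %s: %s -> %s", prefix, old.value, new.value)
            held[prefix] = (old, new)
            accepted[prefix] = old  # keep the old state until approved
        else:
            accepted[prefix] = new
    return accepted, held


# Documentation prefixes stand in for critical infrastructure here.
prev = {"192.0.2.0/24": Validity.VALID, "198.51.100.0/24": Validity.VALID}
cand = {"192.0.2.0/24": Validity.INVALID, "198.51.100.0/24": Validity.INVALID}
accepted, held = hold_back_downgrades(prev, cand, {"192.0.2.0/24"})
```

Only the prefix listed as critical is held back; the other downgrade passes through unchanged, so the guard stays limited to the circular-dependency cases.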

Speaking of which, is it possible to use anything in ghostbusters to 'aid' in 
the recognition of those critical infrastructure ROA's?

Anyway, the point is we are supposed to be avoiding (automated) circular 
dependencies here and it would be worth pointing those out to operators, when 
they read this, so they don't forget to take them into account.



>>    Furthermore, given the [very] loosely synchronized nature of the RPKI, 
>> should the text point out that the number of RPKI caches (internal to the 
>> organization) be balanced against the potential need of an organization to 
>> maintain a more tightly synchronized view, across their entire network, of 
>> validated routing information?  A concern might be that if routers in 
>> Continent A pull information from their RPKI caches that tell them that ROA 
>> is now "Invalid", but other routers in Continent B are still using 'older' 
>> information in RPKI caches in Continent B that says the same ROA is either 
>> "Not Found" or "Valid", then the result might be that BGP Path Selection 
>> swings all traffic from Continent A to Continent B.  At a minimum, this 
>> could lead to substantially increased latency or, at worst, congestion, 
>> packet-loss or an unintended DoS.  
>> ---snip---
>>   As RPKI-based origin validation relies on the availability of RPKI
>>   data, operators SHOULD locate caches close to routers that require
>>   these data and services.  A router can peer with one or more nearby
>>   caches.
>> ---snip---
> 
> see above

This didn't address my concern, which is primarily about consistent policy 
across the network at any given moment in time.


>> In Section 5, "Routing Policy":
>> 4)  From a practical standpoint, LOCAL_PREF is already widely used to 
>> influence Traffic Engineering, both by an SP as well as by the SP's 
>> customers (through the use of "TE communities" sent by a downstream customer 
>> to the SP) -- the latter of which is done so that the customer can 
>> influence traffic from the SP toward themselves, (e.g.: one example where a 
>> customer prefers a circuit be 'backup' for another circuit only if their 
>> other SP is not announcing that same prefix).  In reality, I think that 
>> there will have to be significant re-work of an SP's existing BGP policies 
>> to encode dual-meanings inside a single LOCAL_PREF attribute, (route 
>> validity + TE preference).  It may be good to acknowledge this by 
>> recommending that in the text, above, something like:
>> ====
>>    In the short-term, the LOCAL_PREF Attribute may be used to carry both the 
>> validity state of a prefix along with its Traffic Engineering 
>> characteristic(s).  It is likely that the SP will have to change their BGP 
>> policies such that they can encode these two, separate characteristics in 
>> the same BGP attribute without negatively impacting their existing use or 
>> leading to accidental privilege escalation attacks. 
>> ====
>> ---snip---
>> Some may choose to use the large Local-Preference hammer.
>> ---snip---
> 
> i would hesitate to tell you *how* to deal with local policy matters.
> the whole point of pfx-validate and this document is that you are free
> to do whatever is appropriate to your needs.  we definitely do not want
> to tell you if or how you should complicate your use of local-pref.
> we did our best to avoid assuming you will affect local-pref at all.

I understand, but since the document has waved its hands that LOCAL_PREF is a 
potentially viable method that has a very low-bar to deployment, you could at 
least acknowledge that fact with a little more detail that there are existing 
uses of LOCAL_PREF and how to accommodate those + this new validity data.
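
To make the dual-meaning concern concrete, here is one possible encoding, offered only as a sketch and not as anything the draft specifies. It assumes the operator wants validity to dominate, so each validity state gets its own band of LOCAL_PREF values and the TE preference is clamped inside the band. The band values and the function name are hypothetical.

```python
# Hypothetical bands: validity picks the band, TE picks the offset.
BAND_WIDTH = 1000
BAND_BASE = {"invalid": 0, "not-found": 1000, "valid": 2000}


def encode_local_pref(validity, te_pref):
    """Combine validity state and TE preference into one LOCAL_PREF.

    The TE value (e.g. derived from a customer's TE community) is
    clamped to [0, BAND_WIDTH - 1], so no TE request can lift a route
    out of its validity band. That clamp is what guards against the
    accidental privilege escalation mentioned above.
    """
    te = max(0, min(te_pref, BAND_WIDTH - 1))
    return BAND_BASE[validity] + te


primary = encode_local_pref("valid", 200)
backup = encode_local_pref("valid", 50)
abusive = encode_local_pref("invalid", 5000)  # clamped to 999
```

An operator who prefers TE to dominate, or who does not want validity in LOCAL_PREF at all, would pick a different scheme; the point is only that the two meanings need non-overlapping ranges.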


>> 5)  I have three comments on the below:
>>    a)  It's not clear, to me, what is meant by "internal metric" below.  Do 
>> you mean MED or IGP metric or something else?  I don't see IGP metric as 
>> being practical, so I'm assuming you mean additively altering MED (up|down) 
>> based on validity state.  Regardless, I would recommend you state more 
>> precisely which BGP Path Attribute you're referring to below.
> 
> we meant MED.  jay caught this the other day, and it is fixed in the
> draft in my edit buffer.
> 
>    <t>Some providers may choose to set Local-Preference based on the
>      RPKI validation result.  Other providers may not want the RPKI
>      validation result to be more important than AS-path length --
>      these providers would need to map RPKI validation result to some
>      BGP attribute that is evaluated in BGP's path selection process
>      after AS-path is evaluated.  Routers implementing RPKI-based
>      origin validation MUST provide such options to operators.</t>

That looks good.


>>    b)  Since MED is passed from one ASN to (only) a second, downstream ASN 
>> to influence ingress TE policy, is it "OK" from a security PoV that MED is a 
>> *trusted* means to convey ROA validity information from one ASN to a second? 
>>  Presumably, the answer should be "heck, no", right?  If that's the case, 
>> then wouldn't it be wise to state that:
>>        i)  MED's, encoded with any ROA validity information, should get 
>> reset on egress from an ASN to remove said validity information and only 
>> carry TE information, as appropriate; and,
>>        ii) MED's should not be trusted on ingress to convey any meaning with 
>> respect to validity information?
>>    c)  What is meant by the statement, "might choose to let AS-Path rule"?  
>> Is your intent to state that an SP may choose to just use MED, which follows 
>> after LOCAL_PREF & AS_PATH in the BGP Path Selection Algorithm, as a means 
>> to determining validity of a particular prefix?  If so, then it would be 
>> much more clear if you just stated that, e.g.:
>> ====
>>    If LOCAL_PREF is not used to convey validity information, then MED is 
>> likely the next best candidate BGP Attribute that can be used to influence 
>> path selection based on the validity of a particular prefix.  As with 
>> LOCAL_PREF, care must be taken to avoid changing the MED attribute and 
>> creating privilege escalation attacks.
>> ====
>> ---snip---
>>   […]  Others
>>   might choose to let AS-Path rule and set their internal metric, which
>>   comes after AS-Path in the BGP decision process.
>> ---snip---
> 
> if you trust MEDs from a neighbor you are either a fool or have a,
> likely rather complex, contractual and technical agreement.  far be it
> from us to get into such matters.  we abjure general inter-provider
> hygienic practices.  this is not an inter-operator best practices
> document, we're just trying to inform you of where origin-validation
> may affect your design.

I believe you've acknowledged this concern with the text just below re: the 
more general statement about passing validity information on to third-parties 
via BGP attributes.


>> Other Comments:
>> 6)  Related to #5, above, BGP Communities are another transitive attribute 
>> that /might/ be used to convey validity information of a prefix, or lack 
>> thereof, from one ASN to a second ASN (or, more).  However, as we know, 
>> there is no means to authenticate BGP Attributes, from one ASN to the next.  
>> So, from a security hygiene perspective, would it be best to say something 
>> along the lines of:
>> ====
>> The validity state of routes MUST NOT be transmitted beyond the borders of 
>> an SP's ASN, since: a) there is no authenticity of BGP Attributes; and, b) 
>> this would place hidden dependencies on the ability of the upstream ASN to 
>> validate routes and pass them along to others, which would increase the 
>> fragility of the overall system.  Finally, ASN's MUST NOT rely on BGP 
>> Attributes received on an eBGP session, to convey any meaning with respect 
>> to validity of a particular prefix for the reasons just stated.
>> ====
> 
> ok, since you keep banging your head against this wall, it is clear that
> something saying "do not listen to validity information from another AS"
> is needed.
> 
>    <t>Validity state signaling SHOULD NOT be accepted from a neighbor
>      AS.  The validity state of a received announcement has only local
>      scope due to issues such as scope of trust, RPKI synchrony and
>      <xref target="I-D.ietf-sidr-ltamgmt"/>.</t>

That looks good, but I would suggest a minor change to the latter part of the 
first sentence:
s/from a neighbor AS/from a neighbor AS that is not under your organization's 
direct control/


>> 7)  Is this document only intended (scoped?) to cover PE's that can (or, 
>> eventually, will) speak the RPKI-RTR protocol for validation?  Or is this 
>> document intended to also cover PE's that do not speak RPKI-RTR, but those 
>> PE's would obviously need some other mechanism, (e.g.: periodically pushing 
>> an updated config to them based on RPKI validated data), in order that they 
>> could influence the policy applied to valid routes in such a way that is 
>> consistent with other more modern routers that do run RPKI-RTR protocol?  If 
>> so, wouldn't it be good to suggest this, even if only as a means to increase 
>> the deployment speed?  Or, to at least let readers know that this needs to 
>> be considered during their deployment so that they can factor in the load on 
>> their [existing] systems that might do this work as well as the effects of 
>> the 'loosely synchronized' aspects of the RPKI?
> 
> the former

OK.  So, then in Section 1, would it be prudent to say something to the effect 
of:
====
The scope of this document is intended to discuss application of RPKI cache 
data to routers that speak the RPKI-RTR protocol.  Other uses, such as pushing 
RPKI data into routers through a Service Provider's existing management systems 
or software are outside the scope of this document.
====

Thanks,

-shane