On Fri, Dec 7, 2012 at 12:35 PM, Montgomery, Douglas <do...@nist.gov> wrote:
> suggesting/discussing loading a RIB from DNS queries. I thought we
> were discussing information systems that might allow me to validate the
> origin of a router's RIB. That problem is O(500K) at time zero.

Backing up a bit in the thread, and I hope/think setting some things up
a bit better for the conversation... or attempting to :)

If we look at the whole system (or a bunch of it) in SIDR/BGPSEC/RPKI,
there are likely these moving parts:

  o RPKI repositories (some number, let's say 1/ASN for simple numbers)
  o RPKI TAL/TA/'Root' bits (say 5 today, hopefully 1 tomorrow, which
    then lets you walk down the tree to find all the actors)
  o network operators running networks (again 1/ASN)
  o gathering hosts/systems at each of the above, which would talk to
    all repositories and gather the content for local use/distribution
    (again 1/ASN at least, probably safe to assume 2/ASN at least)
  o cache systems inside each ASN; more than 1, less than 1/router
    seems sane?

In the end, the last item is completely up to the AS operator in
question. They may choose to run 1 cache/router, or one for their ASN;
they are responsible (according to the docs) for keeping their caches
semi-coherent, or as close to coherent as they can.

So, looking at timing information, there's a base time for "make a
ROA/EE/etc change to the local repository". That timing is almost
entirely up to the local operator; things beyond that are more or less
automatic: first the gatherers get the data, then the local caches get
sync'd and distribute the required updates to the routers.

It's important to note that a smart solution would only pass updates to
the caches, or rather the caches would update from some point-in-time
that the gatherers kept. (Again, this is likely dependent upon the local
operator's timing requirements/design.)

One point that's been brought up a bunch on the thread so far is 'cold start'.
There are many forms of this:

  o for a router
  o for a cache
  o for a gatherer
  o for an ASN

I think for some the answer is 'easy', and the timings are 'fast'. For
others, though, the timing is longer. For instance, a router cold-start
(presuming no special knob request) is:

  1) boot
  2) load-os/config
  3) prefer internal/igp
  4) load cache-data
  5) bring up bgp (e and i) sessions

(It's probably harder to determine if 3 and/or 5 happen before 4 today,
but that seems like a vendor tweak to me.)

Loading the cache data should be essentially lan-speed limited... or at
least limited by the cache deployment that the operator picks.

A gatherer cold-start is "fetch all objects from all remote
repositories", and is likely bounded by the times calculated in Eric's
paper... or similar: 1/asn links with X average time to
connect/download/digest...

An ASN cold-start is ... actually pretty simple, except that they rely
upon everyone else finding them, so they are likely bounded on start by
the time it takes all remote ASNs to walk the system and find them (call
it 4hrs based on the timings in the ops-docs?).

I think the discussion so far has centered around 'all' of the system,
but has variously talked about only 1-2 parts (really) when it comes to
timings. Could we think about the problem-space and timings in the above
framework? Or alter the above to something we can all agree upon?

-Chris
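
As a rough illustration of the gatherer cold-start bound described above (about one repository per ASN, each taking X on average to connect/download/digest), here is a minimal Python sketch. The function name, the `parallel_fetches` parameter, and the example numbers are all assumptions made for illustration; they are not measurements, and they are not figures from this thread or from any paper.

```python
import math


def gatherer_cold_start_seconds(num_repos, avg_fetch_seconds, parallel_fetches=1):
    """Estimate how long one gatherer needs to fetch every repository once.

    Assumes every repository takes the same average time and that fetches
    run in fixed-size batches; both are simplifications.
    """
    if num_repos < 0 or avg_fetch_seconds < 0:
        raise ValueError("num_repos and avg_fetch_seconds must be non-negative")
    if parallel_fetches < 1:
        raise ValueError("parallel_fetches must be at least 1")
    # Number of sequential batches needed to cover all repositories.
    batches = math.ceil(num_repos / parallel_fetches)
    return batches * avg_fetch_seconds


if __name__ == "__main__":
    # Hypothetical inputs: 40,000 repositories (1/ASN), X = 2 seconds each.
    for workers in (1, 10, 100):
        seconds = gatherer_cold_start_seconds(40_000, 2.0, workers)
        print(f"{workers:>3} parallel fetches: {seconds / 3600:.2f} hours")
```

With serial fetching the estimate grows linearly with the number of repositories, so the per-repository X and the amount of parallelism the gatherer can sustain are the two knobs that matter in this simplified model.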