Hi Curtis,

See inline.
On 2/9/14 12:38 PM, "Curtis Villamizar" <[email protected]> wrote:

>
>Les,
>
>Perhaps you should read the abstract of the document you are
>commenting about:
>
>   OSPFv3 is a candidate for deployments in environments where
>   auto-configuration is a requirement.  One such environment is the
>   IPv6 home network where users expect to simply plug in a router and
>   have it automatically use OSPFv3 for intra-domain routing.  This
>   document describes the necessary mechanisms for OSPFv3 to be
>   self-configuring.
>
>Home network!
>
>Or the introduction:
>
>   OSPFv3 [OSPFV3] is a candidate for deployments in environments
>   where auto-configuration is a requirement.
>
>   [...]
>
>   1.2.  Acknowledgments
>
>   This specification was inspired by the work presented in the
>   Homenet working group meeting in October 2011 in Philadelphia,
>   Pennsylvania.
>
>The Homenet WG works on what?  Home networks!
>
>So please keep that in mind when commenting.
>
>Unless a provider were to be so stupid or lazy to use this on a SP
>network then most of the comments from both of us don't apply,
>*except* the few comments below about "in a home network".
>
>Perhaps the draft should add text explicitly stating that the last
>router-id used successfully should be used on a reboot rather than a
>new random number.  I notice that only the Router-Hardware-Fingerprint
>TLV is persistent across reboots.  This is insufficient if we want to
>minimize disruption.
>
>The only case then (if router-id is remembered across reboots) would
>be a new router.  In that case your uptime rule would help.  So
>perhaps two things could be recommended:
>
>  1.  In section 4, include a "SHOULD remember the most recent
>      successfully used router-id across reboots and reuse that".
>      Reword the rest so if that information is not available, then
>      pick a random number.

I will do this.

>
>  2.  a.  In section 6, mention the uptime rule.  Modify the Router
>          Uptime TLV as suggested.
>
>      b.  Alternately add a flag to the Router-Hardware-Fingerprint
>          TLV that indicates that since last reboot this router-id has
>          been used and achieved a "full state".  A router just
>          rebooting would not have ever reached the full state before
>          noticing a conflict as long as the conflict check is run
>          before considering itself in the full state.
>
>          Note: A second flag bit indicating that this router-id had
>          been used successfully in a past reboot might also help but
>          would only matter among two routers both rebooting and
>          neither having reached the full state.
>
>I think #1 above is sufficient and does more to prevent surprises.

I agree and appreciate your arguments in previous messages in this thread.

>I
>think #2 above helps only in the new router case but #2a requires
>adding a TLV and so isn't worth it IMHO.  Case #2b accomplished the
>same thing with only a flag.  I would not object to #2b above if #1
>above is also added.

I agree that this would be a better mechanism and would only represent a
single modification to the hardware fingerprint TLV. However, I really
don't think even this is necessary.

Thanks,
Acee

>
>See inline anyway.
>
>In message
><[email protected]>
>"Les Ginsberg (ginsberg)" writes:
>>
>> Curtis -
>>
>> > -----Original Message-----
>> > From: Curtis Villamizar [mailto:[email protected]]
>> > Sent: Saturday, February 08, 2014 7:30 AM
>> > To: Les Ginsberg (ginsberg)
>> > Cc: [email protected]; Acee Lindem; OSPF List
>> > Subject: Re: [OSPF] OSPFv3 Autoconfiguration - draft-ietf-ospf-ospfv3-autoconfig-05.txt
>> >
>> >
>> > In message <[email protected]>
>> > "Les Ginsberg (ginsberg)" writes:
>> >
>> > > Curtis -
>> > >
>> > > Your reply below is talking about things which I think do not directly
>> > > bear on the value add of what I have proposed.
>> > >
>> > > You mention various ways to insure that a given device assigns the
>> > > same router-id each time it starts up and ways to insure it picks the
>> > > same sequence of second/third... choices in the event it has to change
>> > > its router-id. All good suggestions, but what I am talking about is
>> > > what we do in the event a conflict occurs despite our best efforts to
>> > > avoid it. With the current draft content preference is based solely on
>> > > a fixed identifier (fingerprint) without regard to which choice would
>> > > minimize disruption to the network. When preference is given to the
>> > > "old router" to retain its existing router-id this shortcoming is
>> > > addressed.
>> >
>> > In the lifetime of a router it only gets added once.  In the lifetime
>> > of a router we would hope it only reboots zero times but experience so
>> > far has been that reboots over a router's lifetime tend to be > 0 and
>> > in some cases >> 0.
>> >
>> > So you are optimizing for a 1 in 4 billion occurrence that can happen
>> > only once in the lifetime of a router.
>>
>> The entire duplicate router-id resolution logic is addressing the
>> improbable case. My proposal adds - literally - one line of code to
>> the logic used to decide whether "I" should change my router-id or
>> whether "you" should change your router-id.
>>
>> >
>> > We also need to look at the consequences of this very improbable
>> > occurrence.  Today's routers accomplish IGP convergence in large
>> > networks in subsecond times, in some cases << 1 second.
>> >
>> > Note that if flooding is completed (both withdraw old and install new)
>> > in less than the SPF delay which is commonly implemented (some delay
>> > after receiving the first flooded IGP change), then there is no impact
>> > on routing.
>>
>> Your analysis does not apply to this scenario. The router which
>> changes its router-id is effectively doing a cold start. All
>> adjacencies will go down.
>> All LSAs originated by this router become
>> invalid. All routes will be removed from the forwarding plane. If
>> you are running BGP all the BGP nexthops will be gone on the router
>> which is changing its identity. Restoration of the adjacencies and
>> reacquisition of the LSDB will take multiple seconds. The best you
>> can hope for is several seconds of disruption - it could easily be
>> much longer.
>>
>> For the new node which has usurped the old node's identity it will
>> have to purge/replace all of the LSAs generated by the old
>> node. While normal operation of the update process will insure that
>> this happens in a reliable way the amount of flooding network-wide
>> required to bring up a new node has now been roughly doubled
>> i.e. the old node must reissue all of its LSAs using a new identity
>> and the new node must purge/replace the old node's LSAs with its
>> own versions. This will result in multiple SPFs on all nodes in the
>> network and likely cause loops/blackholes during the transition
>> since some of the SPFs will be run on versions of the LSDB which
>> are inaccurate (part old node's old LSAs and part new node's
>> LSAs). Suggesting that this could be handled in the same way/time
>> as we typically handle a single link failure isn't credible.
>
>All routers are supposed to keep a fixed router-id across reboots.  If
>interfaces are changed when down, the last used router-id should be on
>flash.  If flash is removed and replaced (rather than a new image
>installed), then with the same set of interfaces, the same decision
>should be made.  We are down to a very special case where both flash
>and interfaces are removed and replaced yielding no history and a
>different set of MACs to pick from.
>
>> > > Your statement that what I propose is only relevant when two routers
>> > > go down does not match the scenarios I envision.
>> > > If I want to add a
>> > > new device to my network or if I need to replace an existing device in
>> > > my network I am only affecting one device - but as I am introducing a
>> > > device with a new fingerprint it is possible that it will introduce a
>> > > conflict with an existing router-id.
>> >
>> > In provider networks routers are generally added during maintenance
>> > windows so should anything unexpected happen, impact is minimized.
>> >
>> > In home nets, the home user isn't going to notice the convergence time
>> > if there is any.  A 10 msec SPF delay is likely to be plenty.
>>
>> As I stated above, disruption will be orders of magnitude longer than
>> 10 ms.
>
>In a home net?  With perhaps a half dozen routers and a default route?
>Someone has a very bad OSPF implementation.  :-)  Or did you miss the
>"In home nets" at the front of the paragraph.
>
>For example, in a 10 node network with average degree 4, perhaps 40
>links in 10 router LSA exist.  A few RTT (less than 1 msec for a
>homenet) for each neighbor adjacency (which happen in parallel) and
>ten packets from 4 sources is needed to reach the full state followed
>by one SPF to be fully up and running.  Other routers get one
>additional router LSA plus four new links in existing router LSA and
>have to run an SPF.  Even on a software based homenet router using an
>ARM, 10 msec is likely to be enough time and if it is "orders of
>magnitude" longer, something is wrong with one of the implementations.
>This would be a more complicated than usual home net or even soho,
>more likely a small business.
>
>> > > In a subsequent reply you liked the idea of the new device delaying
>> > > advertising reachability until it has determined that its router-id
>> > > choice is not in conflict.
>> > > The old/new router paradigm supports this
>> > > strategy by assuring that the old router will not consider changing
>> > > its router-id until enough time has elapsed for the new router to
>> > > transition to being an old router.
>> >
>> > If it wins the coin toss, the router would advertise at least one LSA
>> > to indicate its existence and could hold back on any additional
>> > advertisements until the other router has withdrawn routes.
>> >
>>
>> This suggestion does not alter the fact that if the old node changes
>> its router-id the network has to respond to three events:
>>
>> 1) Loss of the old node
>> 2) Introduction of the old-node with a new identity
>> 3) Introduction of the new node with the identity of the old-node
>
>Again, the old node should remember the last router-id used and try to
>reuse it.
>
>> If however we insure that the old-node does not change its identity
>> then the network only has to respond to a single event - the
>> introduction of the new-node.
>
>Yes and if it were up and won the resolution last time, it will have
>saved that router-id and will reuse it.  If it came up previously and
>lost the resolution, then it will remember the router-id it used,
>whether second or third pick, and use that.
>
>> > > Finally, what I propose is extremely simple to implement. I think it
>> > > isn't much of an exaggeration to say that any one of us could have
>> > > implemented the enhancement in the time it has taken to discuss its
>> > > merits. So we aren't overengineering for a case which is admittedly
>> > > very unlikely to occur - we are adding a modest extension to make our
>> > > solution less disruptive.
>> >
>> > Yes but it is *bad* for the more common case where routers go down
>> > occasionally.
>>
>> You are going to have to clarify exactly what "bad side effects" you
>> see for what I propose - because I don't see any - whereas I do
>> see benefits as described above.
>
>If router-id is not remembered between reboots, then there is the one
>in 4 billion times number of routers (less than 10 for a home net
>today, but maybe more in the future).
>
>If router-id is remembered between reboots, then no matter how long a
>router has been down, if nothing else in the network changed, there is
>zero chance of having a collision.
>
>With either method, if router-id is remembered between reboots, then
>there is zero chance of collision.
>
>IMO should this ever be used on a managed network (including a home
>net / soho / small business net that happens to be managed) then
>having routers come back from a reboot with the same router-ids would
>be a big plus.  For example, after a power outage NMS discovery would
>not have to be repeated.
>
>> Les
>>
>>
>> >
>> > > Les
>> >
>> > Curtis
>> >
>> >
>> > > > -----Original Message-----
>> > > > From: Curtis Villamizar [mailto:[email protected]]
>> > > > Sent: Friday, February 07, 2014 9:22 AM
>> > > > To: Les Ginsberg (ginsberg)
>> > > > Cc: Acee Lindem; Curtis Villamizar; OSPF List
>> > > > Subject: Re: [OSPF] OSPFv3 Autoconfiguration - draft-ietf-ospf-ospfv3-autoconfig-05.txt
>> > > >
>> > > >
>> > > > In message <F3ADE4747C9E124B89F0ED2180CC814F23C619A9@xmb-aln-x02.cisco.com>
>> > > > "Les Ginsberg (ginsberg)" writes:
>> > > > >
>> > > > > So, I am one person who raised this concern to Acee - but the proposal
>> > > > > outlined by Acee is not what I had in mind. There is no need to use
>> > > > > "uptime" or to invent some unusual exchange of LSAs prior to Exchange
>> > > > > state.
>> > > > >
>> > > > > Also, in regards to Curtis's comment - it is not DOS attacks that I am
>> > > > > trying to mitigate here. As he says if an attacker is in your network
>> > > > > and able to originate credible packets no strategy is safe.
>> > > > >
>> > > > > The motivating use case is to minimize disruption of a stable network
>> > > > > when a new router is added or an existing router is
>> > > > > replaced/rebooted. In other words non-disruptive handling of the
>> > > > > common maintenance/upgrade scenarios.
>> > > > >
>> > > > > What I have in mind is this:
>> > > > >
>> > > > > 1) A router needs a way to advertise that it has been up and running
>> > > > >    for a minimum length of time - for the sake of discussion let's say
>> > > > >    20 minutes. Routers then fall into two categories:
>> > > > >
>> > > > >    o Old routers (up >= minimum time)
>> > > > >    o New routers (up < minimum time)
>> > > > >
>> > > > > 2) When a duplicate router-id is detected, the first tie breaker is
>> > > > >    between old routers and new routers. The old router gets to keep
>> > > > >    its router-id and the new router picks a new router-id. If both
>> > > > >    routers are "new" or both routers are "old" then we revert to the
>> > > > >    existing tie breakers defined in the document (link local address
>> > > > >    for directly connected routers and fingerprint info for
>> > > > >    non-neighbors).
>> > > > >
>> > > > > 3) Advertisement of the "old/new" state requires a single bit - but it
>> > > > >    has to be available both in hellos and the new AC-LSA. Adding it to
>> > > > >    the AC-LSA is easy to do. For hellos, there are two possibilities:
>> > > > >
>> > > > >    o Use one of the Options Bits
>> > > > >    o Use LLS
>> > > > >
>> > > > > Be interested in how folks feel about this.
>> > > > >
>> > > > > Les
>> > > >
>> > > >
>> > > > Les,
>> > > >
>> > > > Excluding DoS attack, we are talking about a one in 4 billion case
>> > > > (for any two routers, so with 400 routers, still well under one in 1M)
>> > > > where two routers hash a MAC address or pick a one time random number
>> > > > from out of nowhere and end up with the same number.
>> > > >
>> > > > If that does happen (and one in 1M is certainly possible), then it
>> > > > would be nice if the routers always ended up with the same router-id.
>> > > > This could be accomplished by some fixed method such as hashing a
>> > > > constant with the first choice of router-id or using the router-id as
>> > > > a seed for the random number generator (which will pick the same
>> > > > sequence of random numbers each time).  If this is done, then a
>> > > > conflict would always produce the same set of next picks.  The set of
>> > > > routers in a given network would always end up with the same
>> > > > router-ids once they all came up and if only one went down at a time
>> > > > then it would always end up with the same router-id when it came up.
>> > > >
>> > > > Zero conf was mainly intended for unmanaged networks (motivated by
>> > > > work in the homenet WG).  In these small unmanaged networks it doesn't
>> > > > matter which router gets what router-id as long as they end up unique
>> > > > and convergence is in a reasonable time relative to keeping eyeballs
>> > > > happy.  It could be applied to enterprise or providers but in either
>> > > > case having the routers end up with the same router-ids would make for
>> > > > easier management.
>> > > >
>> > > > For your scenario to matter at all with current rules, both routers in
>> > > > the conflict would have to go down.  If only the one that is preferred
>> > > > goes down, the other is not going to change its router-id as a result
>> > > > so when it comes up it gets its first pick with no conflict.  If the
>> > > > one that was not preferred goes down, it comes up, sees a conflict and
>> > > > takes its second pick (loses the conflict every time).  It is only if
>> > > > they both go down and the one that normally loses the conflict comes
>> > > > up first that there is a change in router-id.
>> > > > That too can be solved
>> > > > with a rule that you always come up with the last router-id used.
>> > > >
>> > > > Curtis
>> > > >
>> > > >
>> > > > > > -----Original Message-----
>> > > > > > From: OSPF [mailto:[email protected]] On Behalf Of Acee Lindem
>> > > > > > Sent: Thursday, February 06, 2014 5:12 PM
>> > > > > > To: Curtis Villamizar
>> > > > > > Cc: OSPF List
>> > > > > > Subject: Re: [OSPF] OSPFv3 Autoconfiguration - draft-ietf-ospf-ospfv3-autoconfig-05.txt
>> > > > > >
>> > > > > > Hi Curtis,
>> > > > > > I agree and believe the significance of this use case where a new router is
>> > > > > > inserted into an auto-configured domain has been greatly exaggerated.
>> > > > > > Thanks,
>> > > > > > Acee
>> > > > > > On Feb 5, 2014, at 3:58 PM, Curtis Villamizar <[email protected]> wrote:
>> > > > > >
>> > > > > > >
>> > > > > > > In message <cf17dd4e.2696b%[email protected]>
>> > > > > > > Acee Lindem writes:
>> > > > > > >
>> > > > > > >> The OSPFv3 autoconfiguration draft was cloned and presented in the
>> > > > > > >> ISIS WG (http://www.ietf.org/id/draft-liu-isis-auto-conf-00.txt). In
>> > > > > > >> the ISIS WG, there was a concern that the resolution of a duplicate
>> > > > > > >> system ID did not include the amount of time the router was
>> > > > > > >> operational when determining which router would need to choose a new
>> > > > > > >> router ID. With additional complexity, we could incorporate router
>> > > > > > >> uptime into the resolution process. One way to do this would be to:
>> > > > > > >>
>> > > > > > >>   1. Add a Router Uptime TLV to the OSPFv3 AC-LSA. It would include
>> > > > > > >>      the uptime in seconds.
>> > > > > > >>
>> > > > > > >>   2. Use the Router Uptime TLV as the primary determinant in
>> > > > > > >>      deciding which router must choose a new OSPFv3 Router
>> > > > > > >>      ID.
>> > > > > > >>      Router uptimes less than 3600 (MaxAge) seconds apart are
>> > > > > > >>      considered equal.
>> > > > > > >>
>> > > > > > >>   3. When an OSPFv3 Hello is received with a different link-local
>> > > > > > >>      source address but the same router-id, unicast the OSPFv3
>> > > > > > >>      AC-LSA to the neighbor so that OSPFv3 duplicate router
>> > > > > > >>      resolution can proceed as in the case where it is received
>> > > > > > >>      through the normal flooding process. This is somewhat of a
>> > > > > > >>      hack as we'd also need to accept OSPF Link State Updates
>> > > > > > >>      from a neighbor that is not in Exchange State or greater.
>> > > > > > >>
>> > > > > > >> An alternative to #3 would be to use Link-Local Signaling (LLS) for
>> > > > > > >> signaling the contents of the OSPFv3 AC-LSA. However, you'd only want
>> > > > > > >> to send the Router-Uptime and Router Hardware Fingerprint when a
>> > > > > > >> duplicate Router-ID is detected. This requires implementing the
>> > > > > > >> resolution two ways but may be preferable since it doesn't require
>> > > > > > >> violating the flooding rules.
>> > > > > > >>
>> > > > > > >> In any case, I'd like to get other opinions as to whether this problem
>> > > > > > >> is worth solving.
>> > > > > > >>
>> > > > > > >> Thanks,
>> > > > > > >> Acee
>> > > > > > >
>> > > > > > >
>> > > > > > > Acee,
>> > > > > > >
>> > > > > > > If the basis for router-id on boot up results in a fixed value, and if
>> > > > > > > a duplicate will occur on a given network, then which of two duplicate
>> > > > > > > routers gets that value may change after one of them reboots.  If
>> > > > > > > uptime is not considered, it will never change as long as one router
>> > > > > > > stays up at any given time.
>> > > > > > >
>> > > > > > > We are talking about a very low probability event (a duplicate) except
>> > > > > > > if this is a DoS attack and then either using or not using uptime
>> > > > > > > won't matter since the attacker will claim an impossibly long uptime.
>> > > > > > >
>> > > > > > > Curtis

_______________________________________________
OSPF mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/ospf
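
For readers following the old/new router proposal quoted in this thread, the tie-break order Les describes can be sketched roughly as below. This is only an illustration, not text from the draft. The `Router` type, its field names, and the function names are hypothetical. The thread does not say which side loses the fallback tie breakers, so the rule used here (the numerically smaller link-local address or fingerprint gives up the router-id) is an assumption. The proposal advertises a single old/new bit; the sketch derives that bit from a locally known uptime for simplicity.

```python
from dataclasses import dataclass

# "For the sake of discussion" threshold mentioned in the thread: 20 minutes.
MIN_UPTIME_SECONDS = 20 * 60


@dataclass(frozen=True)
class Router:
    """Hypothetical view of one of the two routers sharing a router-id."""
    name: str
    uptime_seconds: int
    link_local: int      # link-local address as an integer (illustrative encoding)
    fingerprint: bytes   # Router-Hardware-Fingerprint contents


def is_old(router: Router) -> bool:
    """An 'old' router has been up for at least the minimum time."""
    return router.uptime_seconds >= MIN_UPTIME_SECONDS


def must_change_router_id(a: Router, b: Router, neighbors: bool) -> Router:
    """Return the router that gives up the duplicated router-id."""
    # First tie breaker: an old router keeps its router-id over a new one.
    if is_old(a) != is_old(b):
        return b if is_old(a) else a
    # Both old or both new: fall back to the existing tie breakers.
    # ASSUMPTION: the numerically smaller value loses; on an exact tie
    # this sketch arbitrarily picks b.
    if neighbors:
        return a if a.link_local < b.link_local else b
    return a if a.fingerprint < b.fingerprint else b
```

With an old router and a freshly booted one, the new router is always the one returned, whatever the addresses or fingerprints are. The fallback comparisons only matter when both routers fall in the same category.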

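The "one in 4 billion" and "well under one in 1M" figures quoted in the thread can be checked with a short calculation. The sketch below assumes router-ids are drawn uniformly at random from the 32-bit space, which is a simplification; the function names are illustrative only. It computes two different quantities: the chance that one newly added router collides with any of the existing routers, and the chance that any pair among all routers collides.

```python
ROUTER_ID_SPACE = 2 ** 32  # 32-bit router-id, roughly 4 billion values


def p_new_router_collides(existing: int) -> float:
    """Chance that one new random router-id matches any of `existing` ids."""
    return 1.0 - (1.0 - 1.0 / ROUTER_ID_SPACE) ** existing


def p_any_pair_collides(n: int) -> float:
    """Chance that at least two of `n` random router-ids are equal."""
    p_all_unique = 1.0
    for i in range(n):
        p_all_unique *= (ROUTER_ID_SPACE - i) / ROUTER_ID_SPACE
    return 1.0 - p_all_unique


if __name__ == "__main__":
    print(p_new_router_collides(400))
    print(p_any_pair_collides(400))
```

Under these assumptions, a single router joining 400 existing ones collides with a probability of roughly one in ten million, which fits the "well under one in 1M" reading. The chance that any two of 400 routers collide is larger, on the order of one in 50,000, so the two readings should not be confused.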