New theory! Might work :) My assumptions:

1) R1 and R2 are your ABRs. R2's link into the backbone is a dial-on-demand link used only when R1's link fails.

2) Due to the above, the primary problem is that when the non-backbone area becomes partitioned, R1 cannot deliver to certain nets south of R2, because it does not see R2 as a valid hop toward those nets (it doesn't see the type 1/2 advertisements from that area). In this case, R1 either forwards via default toward the core and loops traffic for those unreachable nets, or matches a null0 route for the summary and discards.

3) R2 will have this problem only when R1 loses connectivity to the core _and_ the non-backbone area becomes partitioned. Hence, fixing this problem is less important than fixing #2.

Solution: Disable the creation of a null0 route for the aggregate on R1 and instead add a static route for the aggregate on R1 toward R2.

With this config, if the area becomes partitioned while R1's ethernet toward the core is live, R1 will still pull traffic, based on the summary, for nets behind R2 that it can no longer reach. The static route then pushes that traffic toward R2. Should R2 not be able to reach those nets, they can safely be considered unreachable, and R2's null0 route will discard the traffic, thereby eliminating loops. The only downside is that some truly unreachable traffic might transit the R1-R2 link before being eliminated.

This will not help the situation where the area is partitioned and R1 loses core connectivity, but that is a much less likely occurrence. Plus, in this case your dialup link might be strained anyway, so dropping a bunch of traffic might be helpful :)

In summary, assume 192.168/16 is the summary:

R1
 ip route 192.168.0.0 255.255.0.0 R2

R2
 ip route 192.168.0.0 255.255.0.0 null0

Adding the cable is also helpful, but costs money and requires you to touch a bunch of routers.

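To make the lookup behaviour concrete, here is a minimal Python sketch of longest-prefix matching on R1 and R2 while the area is partitioned. Only the 192.168/16 summary comes from the example above. The /24 site prefixes and the names site-A and site-B are made up for illustration, and the model ignores administrative distance, the default route, and everything else a real router does. Treat it as a way to reason about the idea, not as a model of IOS.

```python
import ipaddress

def net(prefix):
    return ipaddress.ip_network(prefix)

def lookup(table, dst):
    """Longest-prefix match: next hop of the most specific route covering dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(prefix, hop) for prefix, hop in table.items() if addr in prefix]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

def trace(tables, start, dst, max_hops=8):
    """Follow next hops from router to router until traffic leaves the model."""
    path = [start]
    router = start
    for _ in range(max_hops):
        hop = lookup(tables[router], dst)
        path.append(hop)
        if hop not in tables:  # Null0, a site, or no route at all
            break
        router = hop
    return path

# Hypothetical tables during the partition: each ABR only has specifics
# for its own sites, plus a route for the 192.168/16 summary.
r1_discard = {net("192.168.0.0/16"): "Null0", net("192.168.1.0/24"): "site-A"}
r1_static = {**r1_discard, net("192.168.0.0/16"): "R2"}  # proposed change on R1
r2 = {net("192.168.0.0/16"): "Null0", net("192.168.2.0/24"): "site-B"}

# Site behind R2, R1 still holding the null0 summary route: discarded at R1.
blackholed = trace({"R1": r1_discard, "R2": r2}, "R1", "192.168.2.10")
# Same destination with the static route toward R2: delivered.
delivered = trace({"R1": r1_static, "R2": r2}, "R1", "192.168.2.10")
# Destination neither ABR can reach: crosses the R1-R2 link, dropped at R2.
dropped = trace({"R1": r1_static, "R2": r2}, "R1", "192.168.3.10")

print(blackholed)  # ['R1', 'Null0']
print(delivered)   # ['R1', 'R2', 'site-B']
print(dropped)     # ['R1', 'R2', 'Null0']
```

The third trace is the downside mentioned above: truly unreachable traffic transits the R1-R2 link once before R2's null0 route eliminates it, but it never loops.
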
At 09:04 AM 4/5/2002 -0500, Peter van Oene wrote:
>Adding a point to point link between ABR's would enhance the resiliency
>between the two and tend to protect against Area partitioning. Depending
>on the capabilities of the backbone routers, letting more specifics into
>the backbone might be helpful as well as it would deliver more optimal
>routing and also help solve this problem.
>
>Shorter answer is, ya, thats a good idea in my opinion :)
>
>Pete
>
>
>At 01:39 PM 4/4/2002 -0500, you wrote:
> >At 11:59 AM 4/4/02, Chuck wrote:
> > >that was going to be my guess as well. I've done a number of lab
> > >experiments with similar themes, and have in my own mind at least,
> > >confirmed what is stated in the RFC - that the only serious routing
> > >issue with partitioned non-backbone areas results from overlapping
> >
> >She does seem to have overlapping summarization, if that makes sense. She
> >said:
> >
> >The area range statements on Rtr2 are...
> >[various area 0 range statements snipped]
> > area 2.1.0.0 range 2.0.0.0 255.128.0.0
> > area 2.2.0.0 range 2.128.0.0 255.224.0.0
> >
> >On Rtr1 the statements are...
> >[same area 0 range statements snipped]
> > area 2.1.0.0 range 2.0.0.0 255.128.0.0
> >
> >If you look at her ASCII art e-mail, you'll see that the WAN links were not
> >assigned contiguously unless I'm missing something. Rt1 has 2.101.0.0/16
> >and 2.109.0.0/16. Rtr 2 has 2.120.0.0/16, 2.104.0.0/16, and 2.130.0.0/16
> >
> >It's probably too late now, but perhaps if all the WAN links connected to
> >Rtr 1 had been summarizable into a group that was distinct from the WAN
> >links connected to Rtr 2, she wouldn't have the problem?? (Of course, she
> >has that area 2.2.0.0 to deal with too, but perhaps it could be something
> >different entirely....)
> >
> >But I don't think she's looking for a redesign. She's looking for a quick
> >fix for now.
> >What did you guys think of the idea of adding another direct
> >connection between the two switches and putting it in area 2.1.0.0?
> >
> >Priscilla
> >
> > >
> > >Chuck
> > >
> > >"Peter van Oene" wrote in message
> > >news:[EMAIL PROTECTED]...
> > > > HI Jenny,
> > > >
> > > > Is it safe to say that your problem is that when your non backbone
> > > > area becomes partitioned, you lose reachability to one side of the
> > > > partition? When you use large summarizes to describe entire areas
> > > > and have multiple entry points into those areas themselves, this is
> > > > a normal occurrence. If this is the problem, the solution likely
> > > > involves the use of less specific summaries per ABR, and/or greater
> > > > L2 resiliency to protect against partitions. If that's not the
> > > > problem, can you indicate where I've misread the problem
> > > > description?
> > > >
> > > > Thanks
> > > >
> > > > Pete
> > > >
> > > >
> > > >
> > > > At 09:05 PM 4/2/2002 -0500, [EMAIL PROTECTED] wrote:
> > > > >Hi all,
> > > > >
> > > > >This is actually a real-life scenario, but I think it throws up some
> > > > >interesting points about OSPF that some people may not have come
> > > > >across. And it has a couple of bits that I don't understand. Please
> > > > >excuse the verbosity.
> > > > >
> > > > >Currently, (part of) this particular network is as described below.
> > > > >It normally works fine, but during certain types of failures,
> > > > >connectivity breaks although there is still a physical path. I am
> > > > >contemplating what the best way to fix it would be, and would be
> > > > >interested in comments.
> > > > >
> > > > >Set-up - I don't think my ascii art is up to this but I'll give it a
> > > > >go if the description isn't clear enough:
> > > > >
> > > > >Two ABRs (Rtr1 and Rtr2), running IOS 12.1, connected to each other
> > > > >by a direct ethernet cable in area 0, and also by several local
> > > > >ethernet networks in area 2.1.0.0. The details of the local
> > > > >ethernets can probably remain a fluffy cloud, but note that failure
> > > > >of a single component can potentially cause all area 2.1.0.0
> > > > >neighbour connectivity between Rtr1 and Rtr2 to be lost, although
> > > > >the local ethernets may remain up on one or both routers.
> > > > >
> > > > >Both routers have a connection back to the core of the network (on
> > > > >Rtr2 it is dialup, so not usually active), which is in area 0. Both
> > > > >routers have WAN links to several sites (not dual-homed - each site
> > > > >has a link to only one ABR), in area 2.1.0.0. Rtr2 may also have
> > > > >WAN links to several sites in area 2.2.0.0, but that's probably not
> > > > >too relevant.
> > > > >
> > > > >Both ABRs summarise the networks in area 2.1.0.0 to a single summary
> > > > >network (Rtr2 summarises the networks in 2.2.0.0, if any, to another
> > > > >summary network).
> > > > >
> > > > >This usually works fine - traffic from the core to sites connected
> > > > >to Rtr2 (in area 2.1.0.0) travels from Rtr1 to Rtr2 across the local
> > > > >ethernets (area 2.1.0.0), and in reverse from Rtr2 to Rtr1 across
> > > > >the Area 0 ethernet. This, while perhaps not ideal, is as expected,
> > > > >and works well under normal circumstances.
> > > > >(If you're not sure why this is expected, read up on hot potato
> > > > >routing policy - Howard gave a good description in the context of
> > > > >stub areas in
> > > > >http://www.groupstudy.com/archives/cisco/200001/msg01579.html)
> > > > >
> > > > >The problem happens if the area 2.1.0.0 neighbour connections
> > > > >between Rtr1 and Rtr2 are lost. Even though there is still an area 0
> > > > >link between them, area 2.1.0.0 sites connected to rtr2 lose
> > > > >connectivity to the core. Area 2.2.0.0 sites are OK (this is good -
> > > > >I'd be really confused if they lost it too).
> > > > >Despite Doyle claiming that partitioned non-backbone areas are not a
> > > > >problem (he does, on page 462 of Routing TCP/IP Vol 1), it seems
> > > > >they can be. As far as I can see, it's because when summarising the
> > > > >2.1.0.0 networks, Rtr1 also installs a route to null0 for the
> > > > >summary route - which overrides the summary route that Rtr2
> > > > >generates (and which would otherwise cover the 'lost' sites).
> > > > >
> > > > >I can see a couple of possibilities for fixing this...
> > > > >1) Install a second direct ethernet cable between Rtr1 and Rtr2, in
> > > > >area 2.1.0.0. This may not be particularly elegant, but it should be
> > > > >comparatively easy to do and effective (there are plenty of spare
> > > > >ethernet ports). It also has the useful side-effect of getting the
> > > > >through traffic off the local ethernets.
> > > > >
> > > > >2) Use the "no discard-route internal" command - this doesn't appear
> > > > >to be documented but is mentioned at
> > > > >http://www.cisco.com/warp/public/104/3.html#12.0
> > > > >I haven't tested it, but I think it should prevent the null0 route
> > > > >from being installed by Rtr1, so my theory is that then the summary
> > > > >generated by Rtr2 should come into play.
> > > > >This, of course, goes against all Cisco recommendations, which say
> > > > >that having the null0 route is A Good Thing to prevent routing
> > > > >loops.
> > > > >
> > > > >3) Muck about with the arrangement of switches within the internal
> > > > >networks. I think this will cause more trouble than it's worth,
> > > > >since any rearrangement has to be duplicated at twenty sites. In
> > > > >theory at least, the whole network may be redesigned from scratch
> > > > >over the next year or so, so a quick and dirty fix isn't necessarily
> > > > >a problem.
> > > > >
> > > > >BUT... I am also not positive that my understanding of what is
> > > > >happening and why is correct, because the support guys have told me
> > > > >that this problem has been around since we were running IOS 11.2 on
> > > > >the ABRs (not that long ago, believe it or not), and I'm pretty sure
> > > > >that no route to null0 was being generated then (summarisation was
> > > > >the same).
> > > > >So can anyone explain to me why connectivity would fail even if no
> > > > >null0 route was being generated? What am I missing?
> > > > >And does anyone feel like commenting on the options for fixing it?
> > > > >
> > > > >JMcL
> >________________________
> >
> >Priscilla Oppenheimer
> >http://www.priscilla.com

Message Posted at:
http://www.groupstudy.com/form/read.php?f=7&i=40595&t=40269

