Hi John,
Trying to dismantle this…

We are saying that a site is integral. You are asking: what happens if a site becomes partitioned so that some prefixes are accessible through one GW and some through another?

Consider a site with a set of prefixes S.
Consider two GWs: GW1 and GW2.

Initially GW1 and GW2 discover each other. So
   GW1 advertises reachability to S, and by the way GW2 exists
   GW2 advertises reachability to S, and by the way GW1 exists

Now the site becomes partitioned so that GW1 can reach S1 and GW2 can reach S2.
(S = S1 ∪ S2, S1 ∩ S2 = ∅)

You ask:
1. What happens to packets for S2 arriving at GW1?
2. What is the remedy in the protocol?

My answer to 1. is that the packets will be black-holed either at GW1 or inside the site.
My observation is that:
a. GW1 cannot reach GW2 inside the site. If it could, then S2 would be reachable via GW1
b. It is contrary to BCP38 for GW1 to forward a packet back into the external AS to be routed to GW2

My answer to 2. is that when the site becomes partitioned:
* GW1 will stop advertising the whole of S and will fall back to advertising just S1
* GW2 will stop advertising the whole of S and will fall back to advertising just S2
* Initially, GW1 and GW2 will still advertise each other’s existence, but will “soon” un-auto-discover each other

At this point the site is effectively two sites that use the same site identifier. How quickly this takes place depends on precisely what the failure case is, how fast the failure detection is done, and how fast BGP converges.

*Perhaps* there is a wrinkle *if* the autodetection advertisements are sent external to the site. In this case, GW1 would continue to discover GW2 and so would readvertise it (and vice versa). This would continue to lead to the broken condition you noted. I think we assumed that the peering between GW1 and GW2 would be internal to the site (because otherwise it would constitute traffic leaving the site and re-entering it, breaking BCP38 again).
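
The sequence above can be sketched as a toy model. This is only an illustration of the behaviour as described in this mail, not text or an algorithm from the draft: the `Gateway` class, its `advertise` method, and the example prefixes are all hypothetical, and auto-discovery is reduced to a set of peer names.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Hypothetical model of a site gateway (names are assumptions)."""
    name: str
    reachable: set = field(default_factory=set)  # prefixes reachable inside the site
    peers: set = field(default_factory=set)      # gateways currently auto-discovered

    def advertise(self):
        """Map each reachable prefix to the gateways listed alongside it."""
        gateways = sorted({self.name} | self.peers)
        return {prefix: gateways for prefix in sorted(self.reachable)}


# Example prefixes, chosen only for illustration.
S1 = {"10.0.1.0/24"}
S2 = {"10.0.2.0/24"}
S = S1 | S2

# Initially both gateways reach all of S and have discovered each other.
gw1 = Gateway("GW1", set(S), {"GW2"})
gw2 = Gateway("GW2", set(S), {"GW1"})
before = gw1.advertise()

# The site partitions: each gateway falls back to the prefixes it still reaches.
gw1.reachable, gw2.reachable = set(S1), set(S2)
window = gw1.advertise()  # GW2 is still listed until it is un-discovered

# Auto-discovery inside the site times out ("soon").
gw1.peers.clear()
gw2.peers.clear()
after_gw1 = gw1.advertise()
after_gw2 = gw2.advertise()
```

In this sketch `window` is the transient state: GW1 has already narrowed to S1 but still names GW2. Once the peers are cleared, the two halves behave like two separate sites sharing one site identifier.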

If it would help, we could make this point clear by saying that the peering between GW1 and GW2 must be within the site.

Cheers,
Adrian

From: John Scudder <[email protected]>
Sent: 14 May 2021 22:25
To: Adrian Farrel <[email protected]>
Cc: The IESG <[email protected]>; [email protected]; [email protected]; [email protected]; Matthew Bocci <[email protected]>
Subject: Re: John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

Having re-read Section 3 carefully (and skimmed the rest) I still think what the document says (as opposed to what’s in the authors’ heads?) is the first description I give below. Let me know if you want me to walk through my reasoning in detail with reference to the document.

--John

On May 14, 2021, at 4:12 PM, John Scudder <[email protected]> wrote:

Hi Adrian,

Thanks for your reply. Pressed for time at the moment but one partial response:

On May 14, 2021, at 1:04 PM, Adrian Farrel <[email protected]> wrote:

> Agree with you that "stuff happens." I think that what you have described is a window not a permanent situation. When GW2 knows it can't reach X any more, it will stop advertising X, and GW1 will receive that and will update what it advertises on behalf of GW2.

Ah, perhaps I have badly misunderstood the way this works. I had thought it went something like this:

- GW1 knows it can reach GW2 because of GW2’s auto discovery route
- GW1 knows the set S of internal prefixes it can reach
- GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel attribute

In the description above, there’s no notion of GW2 telling GW1 what internal prefixes GW2 can reach, or GW1 caring.

Now I suppose you are telling me that it goes:

- GW1 knows it can reach GW2 because of GW2’s auto discovery route
- GW1 knows the full set of prefixes GW2 can reach. _How does it know this?_
- GW1 constructs each advertisement listing only the correct set of gateways in the tunnel attribute

The key question is the one I’ve highlighted: how does GW1 come to know GW2’s internally-reachable prefixes? I didn’t notice any of this in the spec. Maybe it was just my sloppy reading, I’ll look again.

> Further, if GW1 can no longer receive advertisements from GW2 then it will stop advertising on behalf of GW2.

Yes, that’s understood, but I was positing a case where just because GW1 can reach GW2 stably, and just because GW1 can reach X stably, it does not imply GW2 can reach X.

--John
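
The two readings contrasted in the message above can be set side by side in a small sketch. Everything here is an assumption made for illustration: the function names, the `reach` table, and the prefixes X and Y are hypothetical, and the second function simply takes GW2's reachable prefixes as given, which is exactly the information whose source is being questioned.

```python
def advertise_all_discovered(local, reach, discovered):
    """Reading 1: every prefix the local gateway reaches is advertised
    with all discovered gateways, whatever those gateways can reach."""
    gateways = sorted({local} | discovered)
    return {prefix: gateways for prefix in sorted(reach[local])}


def advertise_per_prefix(local, reach, discovered):
    """Reading 2: each prefix lists only the gateways that can reach it.
    This needs reach[peer] for every discovered peer (assumed known here)."""
    candidates = {local} | discovered
    return {
        prefix: sorted(g for g in candidates if prefix in reach[g])
        for prefix in sorted(reach[local])
    }


# The case posited above: GW1 reaches GW2 and X stably, but GW2 cannot reach X.
reach = {"GW1": {"X", "Y"}, "GW2": {"Y"}}
discovered = {"GW2"}

reading1 = advertise_all_discovered("GW1", reach, discovered)
reading2 = advertise_per_prefix("GW1", reach, discovered)
```

Under the first reading X is advertised with GW2 in the list even though GW2 cannot reach it. Under the second it is not, but only because the sketch hands GW1 a complete `reach` table.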

_______________________________________________
BESS mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/bess
