Yes, it really is common to announce sink routes via BGP from destination services / proxies, and to have those announcements depend dynamically on service viability.
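For anyone who hasn't seen the pattern, here is a minimal sketch of it, not anyone's production setup. Everything in it is an assumption for illustration: the class name, the 192.0.2.53/32 documentation prefix, and the failure threshold are made up, and the command strings are modelled on the text commands a helper process hands to a BGP speaker such as ExaBGP, so check the exact syntax against whatever speaker you actually run.

```python
from typing import Optional


class AnycastAnnouncer:
    """Tracks service health and decides when to announce or withdraw a prefix.

    Hypothetical helper for illustration only. The returned strings are meant
    to be written to the BGP speaker's control channel (e.g. stdout of a
    helper process); the exact command syntax is an assumption.
    """

    def __init__(self, prefix: str, fail_threshold: int = 3) -> None:
        self.prefix = prefix
        self.fail_threshold = fail_threshold  # consecutive failures before withdrawing
        self.failures = 0
        self.announced = False

    def update(self, healthy: bool) -> Optional[str]:
        """Feed in one health-check result; return a command, or None if nothing changes."""
        if healthy:
            self.failures = 0
            if not self.announced:
                self.announced = True
                return f"announce route {self.prefix} next-hop self"
            return None

        self.failures += 1
        # Only withdraw after several consecutive failures, to avoid flapping
        # the route on a single missed check.
        if self.announced and self.failures >= self.fail_threshold:
            self.announced = False
            return f"withdraw route {self.prefix} next-hop self"
        return None


# Scripted health results standing in for a real check (e.g. a local DNS query).
announcer = AnycastAnnouncer("192.0.2.53/32", fail_threshold=2)
for result in [True, True, False, False, True]:
    command = announcer.update(result)
    if command is not None:
        print(command, flush=True)
```

Note that this is purely local: each node decides on its own, which is exactly why every node can reach the same conclusion at once and withdraw together. The sketch does nothing to address that.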

On Wed, Oct 6, 2021, 12:56 Jared Mauch <ja...@puck.nether.net> wrote:

> It is quite common to tie an underlying service announcement to BGP
> announcements in an Anycast or similar environment. It doesn’t have to be
> externally visible like this event for that to be the case.
>
> I would say it is more that application availability caused the BGP
> routes to be withdrawn.
>
> I know several network operators that run DNS internally (even on
> Raspberry Pi devices) and may have OSPF or BGP announcements internally to
> ensure things work well. If the process dies (crash, etc.) they want to
> route to the next nearest cluster.
>
> Of course, if they are all down there are negative outcomes.
>
> - Jared
>
> > On Oct 6, 2021, at 1:42 PM, Michael Thomas <m...@mtcc.com> wrote:
> >
> > So if I understand their post correctly, their DNS servers have the
> > ability to withdraw routes if they determine they are sub-optimal
> > (fsvo). I can certainly understand for the DNS servers to not give
> > answers they think are unreachable, but there is always the problem
> > that they may be partitioned and not the routes themselves. At a
> > minimum, I would think they'd need some consensus protocol that says
> > that it's broken across multiple servers.
> >
> > But I just don't understand why this is a good idea at all. Network
> > topology is not DNS's bailiwick, so using it as a trigger to withdraw
> > routes seems really strange and fraught with unintended consequences.
> > Why is it a good idea to withdraw the route if it doesn't seem
> > reachable from the DNS server? Give answers that are reachable, sure,
> > but to actually make a topology decision? Yikes. And what happens to
> > the cached answers that still point to the supposedly dead route?
> > They're going to fail until the TTL expires anyway, so why is it
> > preferable to withdraw the route too?
> >
> > My guess is that their post, while clearer than most, doesn't go into
> > enough detail, but is it me or does it seem like this is a really
> > weird thing to do?
> >
> > Mike
> >
> >
> > On 10/5/21 11:56 PM, Bjørn Mork wrote:
> >> Masataka Ohta <mo...@necom830.hpcl.titech.ac.jp> writes:
> >>
> >>> As long as name servers with expired zone data won't serve
> >>> request from outside of facebook, whether BGP routes to the
> >>> name servers are announced or not is unimportant.
> >> I am not convinced this is true. You'd normally serve some semi-static
> >> content, especially wrt stuff you need yourself to manage your network.
> >> Removing all DNS servers at the same time is never a good idea, even in
> >> the situation where you believe they are all failing.
> >>
> >> The problem is of course that you can't let the servers take the
> >> decision to withdraw from anycast if you want to prevent this
> >> catastrophe. The servers have no knowledge of the rest of the network.
> >> They only know that they've lost contact with it. So they all make the
> >> same stupid decision.
> >>
> >> But if the servers can't withdraw, then they will serve stale content if
> >> the data center loses backbone access. And with a large enough network
> >> then that is probably something which happens on a regular basis.
> >>
> >> This is a very hard problem to solve.
> >>
> >> Thanks a lot to facebook for making the detailed explanation available
> >> to the public. I'm crossing my fingers hoping they follow up with
> >> details about the solutions they come up with. The problem affects any
> >> critical anycast DNS service. And it doesn't have to be as big as
> >> facebook to be locally critical to an enterprise, ISP or whatever.
> >>
> >>
> >>
> >> Bjørn