Re: Openbgpd routing for redundancy.
On Fri, May 06, 2005 at 10:36:38PM +0100, Stuart Henderson wrote:
> --On 06 May 2005 14:35 -0600, Abraham Al-Saleh wrote:
> > uptime, and our SLA only guarantees us 99.999%. So, I'm currently
>
> You sometimes find that SLA means something like we'll charge you more so that when things break, we can pay some of it back...
>
> > talking with several companies to have another T1 brought in, and I'm planning on using OpenBGPD to provide fault tolerance. The only problem? I've never done anything like this before. I'm already
>
> While BGP can be used to improve reliability, it also gives you interesting and varied ways to break your network. What's more, it's quite possible to break your connectivity for extended periods of time (through flap dampening), and there's nothing that can be done to fix it, you just have to sit it out. So it must be done with thought and care.

Everything you plug into your network gives you interesting and varied ways to break your network. Btw, route flap dampening is considered evil; it was invented to protect the lousy underpowered routers created by Cisco. On a redundant setup route flap dampening should never kick in, as the announced network never disappears, it just switches between two different paths.

For a good redundant setup you need more than one router. Every uplink goes to one independent OpenBGPD box. From there you should use an IBGP mesh and carp(4) to connect the servers redundantly to your backbone.

Last note: even BGP normally needs some time to reroute traffic, so getting real 100% connectivity from all over the world is impossible. E.g. the default holdtime is 90 sec, and it may take that long before a dead connection is considered down.

> > good resources on bgp in particular (books, websites,
>
> See http://www.bgp4.as/books - maybe look at Stewart BGP4, van Beijnum BGP, Halabi Internet routing architectures. Typically, config examples are given for IOS, but many concepts are portable. van Beijnum is probably the easier read, Stewart has good information about the protocol (probably will help you to understand the RFC better), Halabi is published by Cisco Press so understandably IOS-centric, quite a lot of good material.
>
> A test network is pretty much essential to help you get to grips with things...

Absolutely. Without a real test lab where you can play through different scenarios, you may end up with a worse solution. I remember people connecting fully redundant servers to the same breaker, or getting two independent uplinks but using the same in-house cable duct. In both cases there was a longer downtime because of this bad design.

--
:wq Claudio
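
The 90-second figure can be made concrete with a small Python sketch of hold-timer arithmetic. It assumes the peer fails silently (no NOTIFICATION, no link-down event), that keepalives are sent at exactly one third of the hold time (the ratio suggested in the BGP specification), and that no other BGP messages are in flight. The function names are hypothetical and this is not OpenBGPD code.

```python
def negotiated_hold_time(local, remote):
    """Hold time in seconds used for a session: the smaller of both proposals."""
    hold = min(local, remote)
    # The BGP specification allows 0 (timers disabled) or at least 3 seconds.
    if hold != 0 and hold < 3:
        raise ValueError("hold time must be 0 or >= 3 seconds")
    return hold


def detection_window(hold):
    """(best, worst) seconds between a silent peer failure and hold-timer expiry."""
    if hold == 0:
        raise ValueError("hold time 0 disables the timer; nothing expires")
    # Assumption: keepalives arrive exactly every hold/3 seconds.
    keepalive = hold / 3
    # Best case: the peer died just before its next keepalive was due.
    # Worst case: the peer died right after sending a keepalive.
    return hold - keepalive, hold


if __name__ == "__main__":
    hold = negotiated_hold_time(90, 180)
    best, worst = detection_window(hold)
    print(f"hold time {hold}s: failure noticed after {best:.0f}s to {worst:.0f}s")
```

Under these assumptions a silently dead peer goes unnoticed for roughly 60 to 90 seconds with a 90-second hold time, and the resulting route changes still have to propagate to the rest of the Internet after that.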
Re: Openbgpd routing for redundancy.
On Fri, 06 May 2005 16:58:39 -0600, Abraham Al-Saleh [EMAIL PROTECTED] wrote:
> I should additionally add (sorry about that), that it's not something that hasn't been considered in the past, and I'm considering it again, I just need to weigh costs for this with the costs for making our internet connection redundant, as well as the man power required, time it will take, and risks associated with each, which is why I came to the list asking for more information on using openbgpd, or bgp in particular.

Abe,

I suggest you reconsider your stance on collocation. The answer (due to HIPAA) may not be a provider of collocation facilities but actually having another physical site controlled by your company. I haven't actually read all of the HIPAA requirements, but thanks to friends I've got a good idea how much of a pain in the ass they can be.

The reason for collocating is logical. Sure, you may have a pair of APC Matrix 5000 units and a generator at your current site... But heck, even my garage has the very same equipment! The difference is that life and death decisions are not made based on the ability to access the machines in my garage. In your business, any inability to access medical records could cause people to die. You're in a totally different league and have to face a ton of liability if something goes wrong.

Let's say you go through the expense of full redundancy at your single site, and when I say full I mean everything from multiple power drops from different chunks of the local grid, to at least pairs of generators, custom redundant wiring/circuits, and staged UPSes all the way down the proverbial power line to the CPUs... You're still vulnerable. The reason is simple: with anything from a major disaster in Farmington, Utah, to something as trivial as a fiber cut (i.e. someone with a backhoe accidentally ripping out network lines), you're still hosed.

Having multiple sites is the same logic as having redundant APC Matrix 5K units, but it's applied on a more effective scale: if one gets hosed, you cross your fingers and hope the second will pick up the load. If you have only one site, you still have a single point of failure regardless of how many redundant lines you attach to it.

I understand the costs involved with having a second site, but in general the industry understands HIPAA compliance is expensive and, worse yet, liability is even more expensive. The multi-site redundancy, though costly, would be a sales advantage due to the reduced liability it offers. Even if you cannot afford to do it now, it would still be worthwhile to have plans in place on how it (eventually) will be done. If the legal department of some HMO client/partner requires site redundancy, you add implementing your plan to the costs of their contract... ;-)

JCR
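
The single-point-of-failure argument can be illustrated with basic availability arithmetic. The Python sketch below assumes failures are statistically independent unless a shared component is modelled explicitly, and the availability figures are invented for illustration, not measurements of any real link or site.

```python
def parallel(*availabilities):
    """Availability of redundant components where any one of them suffices."""
    unavailable = 1.0
    for a in availabilities:
        unavailable *= (1.0 - a)  # all of them must be down at once
    return 1.0 - unavailable


def series(*availabilities):
    """Availability of a chain where every component is required."""
    total = 1.0
    for a in availabilities:
        total *= a
    return total


if __name__ == "__main__":
    link = 0.999          # hypothetical availability of one uplink
    shared_site = 0.9995  # hypothetical availability of the one site both links share

    two_links = parallel(link, link)
    print(f"two independent links:      {two_links:.6f}")
    print(f"same links behind one site: {series(shared_site, two_links):.6f}")
```

However many links are added in parallel, the combined figure cannot exceed that of the component they all share, which is the point about a single site made above.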
Openbgpd routing for redundancy.
Alright, before I go too far, I'm going to present what I know, what I need, and what I've read so far.

We had a recent scare at my company: we lost connectivity with our ISP for about ten minutes because of a glitch. Due to the nature of our company, we have to have 100% uptime, and our SLA only guarantees us 99.999%. So, I'm currently talking with several companies to have another T1 brought in, and I'm planning on using OpenBGPD to provide fault tolerance. The only problem? I've never done anything like this before.

I'm already comfortable with OpenBSD, as we've been using it on all of OUR routers (not the managed router from our T1 provider... but that's going to be going away if we do this) and we've been very happy with it due to the likes of carp and pf. I've read the bgpd, bgpd.conf, and bgpctl man pages, I've skimmed RFC 1771, I've read the slides presented by Henning Brauer to the Chaos Communication Congress, and I've been googling like mad.

What I'm looking for is a thorough overview of implementing OpenBGPD in a situation like mine, good resources on BGP in particular (books, websites, anything anyone else has found useful), or just general tips that anyone would be willing to give me. I'll be the first to admit that I haven't spent a lot of time in the down deeps of routing, but I'm not against reading large technical manuals.

Any help would be hot, thank you everyone.
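
To put the figures in this post in perspective, here is a minimal Python sketch of the arithmetic. It assumes a 365.25-day year and treats availability as a plain yearly average; the function names are made up for illustration, and real SLAs usually define their own measurement windows and exclusions.

```python
# Downtime arithmetic for an availability target (illustrative sketch).
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def downtime_budget_minutes(availability_percent):
    """Minutes of downtime per year allowed by an availability target."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def availability_after_outage(outage_minutes):
    """Yearly availability (percent) if the only outage lasted outage_minutes."""
    return (1 - outage_minutes / MINUTES_PER_YEAR) * 100


if __name__ == "__main__":
    print(f"99.999% allows {downtime_budget_minutes(99.999):.2f} min/year")
    print(f"one 10 min outage leaves {availability_after_outage(10):.4f}%")
```

Under these assumptions, five nines leaves a little over five minutes of downtime a year, so a single ten-minute outage already exceeds that budget.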
Re: Openbgpd routing for redundancy.
eric wrote:
> On Fri, 2005-05-06 at 14:35:09 -0600, Abraham Al-Saleh proclaimed...
> > Alright, before I go too far, I'm going to present what I know, what I need, and what I've read so far. We had a recent scare at my company: we lost connectivity with our ISP for about ten minutes because of a glitch. Due to the nature of our company, we have to have 100% uptime, and our SLA only guarantees us 99.999%.
>
> At best you'll get 5 9's. Why don't you look at multiple locations? After all, if your business is that critical, a power outage due to a bad circuit in the street outside where there is *supposed* to be redundancy but there isn't will cause pain.
>
> I'd also be curious to know what kind of location you're at if you need 100% uptime with T1 links.
>
> - Eric

We have a backup generator that will run for five days and can be refilled while in operation, as well as dual Matrix 5000 UPSes. We're working on an online medical prescribing and patient management solution, but we're currently small; we don't have the staff or the money to support two locations (yet).

--
Cordially,
Abraham Al-Saleh
Systems Administrator
CaduRx
Re: Openbgpd routing for redundancy.
On Fri, 2005-05-06 at 14:54:31 -0600, Abraham Al-Saleh proclaimed...
> We have a backup generator that will run for five days and can be refilled while in operation, as well as dual Matrix 5000 UPSes. We're working on an online medical prescribing and patient management solution, but we're currently small; we don't have the staff or the money to support two locations (yet).

Cool, it's good to see what kind of markets obsd can get into. Hopefully you find two ISPs that don't have simultaneous failures!
Re: Openbgpd routing for redundancy.
eric wrote:
> On Fri, 2005-05-06 at 14:54:31 -0600, Abraham Al-Saleh proclaimed...
> > We have a backup generator that will run for five days and can be refilled while in operation, as well as dual Matrix 5000 UPSes. We're working on an online medical prescribing and patient management solution, but we're currently small; we don't have the staff or the money to support two locations (yet).
>
> Cool, it's good to see what kind of markets obsd can get into. Hopefully you find two ISPs that don't have simultaneous failures!

Yes, there's only so much I can do to keep everything redundant at present, something that will change later when we have sufficient money. A big concern is that someone might dig out our local loop with a backhoe; there's nothing I can do about that at present. I'm just trying to minimize as many risks as possible.
Re: Openbgpd routing for redundancy.
Stuart Henderson wrote:
> --On 06 May 2005 14:35 -0600, Abraham Al-Saleh wrote:
> > uptime, and our SLA only guarantees us 99.999%. So, I'm currently
>
> You sometimes find that SLA means something like we'll charge you more so that when things break, we can pay some of it back...
>
> > talking with several companies to have another T1 brought in, and I'm planning on using OpenBGPD to provide fault tolerance. The only problem? I've never done anything like this before. I'm already
>
> While BGP can be used to improve reliability, it also gives you interesting and varied ways to break your network. What's more, it's quite possible to break your connectivity for extended periods of time (through flap dampening), and there's nothing that can be done to fix it, you just have to sit it out. So it must be done with thought and care.
>
> > good resources on bgp in particular (books, websites,
>
> See http://www.bgp4.as/books - maybe look at Stewart BGP4, van Beijnum BGP, Halabi Internet routing architectures. Typically, config examples are given for IOS, but many concepts are portable. van Beijnum is probably the easier read, Stewart has good information about the protocol (probably will help you to understand the RFC better), Halabi is published by Cisco Press so understandably IOS-centric, quite a lot of good material.
>
> A test network is pretty much essential to help you get to grips with things...

Thanks for the tips. The funny thing is I just sent a request to my boss to purchase the books by Stewart and van Beijnum. And thanks for the advice about testing, I was pretty sure that my weekends and evenings were shot for a while anyway...
Re: Openbgpd routing for redundancy.
On Fri, 6 May 2005, Abraham Al-Saleh wrote:
> Yes, there's only so much I can do to keep everything redundant at present, something that will change later when we have sufficient money, a big concern is that someone might dig out our local loop with a back hoe, nothing I can do about that at present. I'm just trying to minimize as many risks as possible.

Why haven't you CoLo'd a set of backup servers? Doesn't cost much to have a rack on a backbone, .. you can even shop for one in a different part of the country/world.

Lee

Leland V. Lammert [EMAIL PROTECTED]
Chief Scientist
Omnitec Corporation
Network/Internet Consultants
www.omnitec.net
Re: Openbgpd routing for redundancy.
L. V. Lammert wrote:
> On Fri, 6 May 2005, Abraham Al-Saleh wrote:
> > Yes, there's only so much I can do to keep everything redundant at present, something that will change later when we have sufficient money, a big concern is that someone might dig out our local loop with a back hoe, nothing I can do about that at present. I'm just trying to minimize as many risks as possible.
>
> Why haven't you CoLo'd a set of backup servers? Doesn't cost much to have a rack on a backbone, .. you can even shop for one in a different part of the country/world.
>
> Lee
>
> Leland V. Lammert [EMAIL PROTECTED]
> Chief Scientist
> Omnitec Corporation
> Network/Internet Consultants
> www.omnitec.net

Because, with the type of colocation we require, it DOES cost much. We can't stick our servers on a simple rack; we have to have cage space, and that costs more than a little. We store medical data, and HIPAA compliance (if you've ever heard of that?) is a bitch, to put it simply. What's worse is that many medical organizations misunderstand it, and put in place even more stringent practices that must be adhered to.

--
Cordially,
Abraham Al-Saleh
Systems Administrator
CaduRx
Re: Openbgpd routing for redundancy.
--On 06 May 2005 14:35 -0600, Abraham Al-Saleh wrote:
> uptime, and our SLA only guarantees us 99.999%. So, I'm currently

You sometimes find that SLA means something like we'll charge you more so that when things break, we can pay some of it back...

> talking with several companies to have another T1 brought in, and I'm planning on using OpenBGPD to provide fault tolerance. The only problem? I've never done anything like this before. I'm already

While BGP can be used to improve reliability, it also gives you interesting and varied ways to break your network. What's more, it's quite possible to break your connectivity for extended periods of time (through flap dampening), and there's nothing that can be done to fix it, you just have to sit it out. So it must be done with thought and care.

> good resources on bgp in particular (books, websites,

See http://www.bgp4.as/books - maybe look at Stewart BGP4, van Beijnum BGP, Halabi Internet routing architectures. Typically, config examples are given for IOS, but many concepts are portable. van Beijnum is probably the easier read, Stewart has good information about the protocol (probably will help you to understand the RFC better), Halabi is published by Cisco Press so understandably IOS-centric, quite a lot of good material.

A test network is pretty much essential to help you get to grips with things...