Re: Openbgpd routing for redundancy.

2005-05-07 Thread Claudio Jeker
On Fri, May 06, 2005 at 10:36:38PM +0100, Stuart Henderson wrote:
 --On 06 May 2005 14:35 -0600, Abraham Al-Saleh wrote:
 
 uptime, and our SLA only guarantees us 99.999%. So, I'm currently
 
 You sometimes find that SLA means something like we'll charge you more 
 so that when things break, we can pay some of it back...
 
 talking with several companies to have another T1 brought in, and I'm
 planning on using OpenBGPD to provide fault tolerance. The only
 problem? I've never done anything like this before. I'm already
 
 While BGP can be used to improve reliability, it also gives you 
 interesting and varied ways to break your network. What's more, it's 
 quite possible to break your connectivity for extended periods of time 
 (through flap dampening), and there's nothing that can be done to fix 
 it, you just have to sit it out. So it must be done with thought and 
 care.
 

Everything you plug into your network gives you interesting and varied
ways to break your network. Btw, route flap dampening is considered evil;
it was invented to protect the lousy underpowered routers created by
Cisco. In a redundant setup route flap dampening should never kick in, as
the announced network never disappears; it just switches between two
different paths.

For a good redundant setup you need more than one router. Every uplink
goes to its own independent OpenBGPD box. From there you should use an IBGP
mesh and carp(4) to connect the servers redundantly to your backbone.

Last note: even BGP normally needs some time to reroute traffic, so getting
real 100% connectivity from all over the world is impossible.
E.g. the default holdtime is 90 seconds, and it may take that long until
your connection is considered down.
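
To put rough numbers on the holdtime remark, here is a minimal Python
sketch. It assumes the peer simply goes silent (no link-down event reaches
the router) and that keepalives arrive every third of the holdtime, which is
a common convention but an assumption here. The function name is made up for
illustration and is not part of OpenBGPD.

```python
def detection_window(holdtime=90.0, keepalive=None):
    """Return (best, worst) seconds until a silently dead peer is noticed.

    Assumption: the hold timer restarts on every received keepalive and the
    session is dropped when it expires. keepalive defaults to holdtime / 3.
    """
    if keepalive is None:
        keepalive = holdtime / 3.0
    if not 0 < keepalive <= holdtime:
        raise ValueError("keepalive must be in (0, holdtime]")
    # Peer dies just before its next keepalive was due: the timer has
    # already been running for almost one keepalive interval.
    best = holdtime - keepalive
    # Peer dies right after a keepalive arrived: the full holdtime must
    # run out before the session is torn down.
    worst = holdtime
    return best, worst


if __name__ == "__main__":
    best, worst = detection_window()
    print(f"silent failure noticed after {best:.0f} to {worst:.0f} seconds")
```

With the 90 second default this gives a window of 60 to 90 seconds during
which traffic may still be sent toward the dead link.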

 good resources on bgp in particular (books, websites,
 
 See http://www.bgp4.as/books - maybe look at Stewart BGP4, van 
 Beijnum BGP, Halabi Internet routing architectures. Typically, 
 config examples are given for IOS, but many concepts are portable. van 
 Beijnum is probably the easier read, Stewart has good information about 
 the protocol (probably will help you to understand the RFC better), 
 Halabi is published by Cisco Press so understandably IOS-centric, quite 
 a lot of good material.
 
 A test network is pretty much essential to help you get to grips with 
 things...
 

Absolutely. Without a real test lab where you can play through different
scenarios, you may end up with a worse solution.
I remember people connecting fully redundant servers to the same breaker or
getting two independent uplinks but using the same in-house cable duct.
In both cases there was a longer downtime because of this bad design.

-- 
:wq Claudio



Re: Openbgpd routing for redundancy.

2005-05-07 Thread J.C. Roberts
On Fri, 06 May 2005 16:58:39 -0600, Abraham Al-Saleh [EMAIL PROTECTED]
wrote:

I should additionally add (sorry about that), that it's not something 
that hasn't been considered in the past, and I'm considering it again, I 
just need to weigh costs for this with the costs for making our internet 
connection redundant, as well as the man power required, time it will 
take, and risks associated with each, which is why I came to the list 
asking for more information on using openbgpd, or bgp in particular.


Abe,

I suggest you reconsider your stance on collocation. The answer (due
to HIPAA) may not be a provider of collocation facilities but
actually having another physical site controlled by your company. I
haven't actually read all of the HIPAA requirements but due to
friends, I've got a good idea how much of a pain in the ass they can
be.

The reason for collocating is logical. Sure, you may have a pair of
APC Matrix 5000 units and a generator at your current site... -But
heck, even my garage has the very same equipment! The difference is
life and death decisions are not made based on the ability to access
the machines in my garage. In your business, any inability to access
medical records could cause people to die. You're in a totally
different league and have to face a ton of liability if something goes
wrong.

Let's say you go through the expense of full redundancy at your single
site, and when I say full I mean everything from multiple power drops
from different chunks of the local grid, to at least pairs of
generators, custom redundant wiring/circuits, and staged UPSes all the way
down the proverbial power line to the CPUs. You're still vulnerable.
The reason is simple: with anything from a major disaster in
Farmington, Utah, to something as trivial as a fiber cut (i.e. someone
with a backhoe accidentally ripping out network lines), you're still
hosed.

Having multiple sites is the same logic as having redundant APC Matrix
5K units, but it's applied on a more effective scale: if one gets
hosed, you cross your fingers and hope the second will pick up the
load. If you have only one site, you still have a single point of
failure regardless of how many redundant lines you attach to it.
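
The single-point-of-failure argument can be checked with a little
arithmetic. This is a minimal Python sketch, assuming failures are
independent (which real sites and carriers often are not) and using
made-up availability figures purely for illustration.

```python
from math import prod


def parallel(*availabilities):
    """Redundant parts: the group is down only if every part is down."""
    return 1.0 - prod(1.0 - a for a in availabilities)


def series(*availabilities):
    """Chained parts: the whole is up only if every part is up."""
    return prod(availabilities)


if __name__ == "__main__":
    uplink = 0.999   # hypothetical availability of one line
    site = 0.9995    # hypothetical availability of the building itself

    # Two lines into one site: the site stays in series with everything.
    one_site = series(site, parallel(uplink, uplink))
    # Two complete sites, each with a single line.
    two_sites = parallel(series(site, uplink), series(site, uplink))

    print(f"two lines, one site: {one_site:.6f}")
    print(f"two complete sites : {two_sites:.6f}")
```

However many lines are added to `parallel`, the one-site figure can never
exceed the availability of the site itself.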

I understand the costs involved with having a second site, but in
general the industry understands HIPAA compliance is expensive and
worse yet, liability is even more expensive. The multi-site
redundancy, though costly, would be a sales advantage due to the
reduced liability it offers. Even if you can not afford to do it now,
it would still be worthwhile to have plans in place on how it
(eventually) will be done. If the legal department of some HMO
client/partner requires site redundancy, you add implementing your
plan to the costs of their contract... ;-)

JCR



Openbgpd routing for redundancy.

2005-05-06 Thread Abraham Al-Saleh
Alright, before I go too far, I'm going to present what I know, what I 
need, and what I've read so far. We had a recent scare at my company: we 
lost connectivity with our ISP for about ten minutes because of a glitch. 
Due to the nature of our company, we have to have 100% uptime, and our 
SLA only guarantees us 99.999%. So, I'm currently talking with several 
companies to have another T1 brought in, and I'm planning on using 
OpenBGPD to provide fault tolerance. The only problem? I've never done 
anything like this before. I'm already comfortable with OpenBSD, as 
we've been using it on all of OUR routers (not the managed router from 
our T1 provider... but that's going to be going away if we do this) and 
we've been very happy with it due to the likes of carp and pf. I've read 
the bgpd, bgpd.conf, and bgpctl man pages, I've skimmed RFC 1771, I've 
read the slides presented by Henning Brauer to the Chaos Communication 
Congress, and I've been googling like mad. What I'm looking for is a 
thorough overview of implementing openbgpd in a situation like mine, 
good resources on bgp in particular (books, websites, anything anyone 
else has found useful), or just general tips that anyone would be 
willing to give me. I'll be the first to admit that I haven't spent a 
lot of time in the down deeps of routing, but I'm not against reading 
large technical manuals.

Any help would be hot, thank you everyone.
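
For scale, the gap between the ten-minute outage and a 99.999% guarantee
can be worked out directly. This is a minimal Python sketch, assuming the
SLA is measured over a non-leap calendar year (many contracts measure per
month instead); the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def allowed_downtime(availability_percent, period=SECONDS_PER_YEAR):
    """Seconds of outage permitted in `period` at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (1.0 - availability_percent / 100.0) * period


if __name__ == "__main__":
    budget = allowed_downtime(99.999)
    outage = 10 * 60  # the ten-minute glitch, in seconds
    print(f"99.999% allows about {budget / 60:.1f} minutes per year")
    print(f"a ten-minute outage is {outage / budget:.1f}x that budget")
```

Under these assumptions five nines leaves roughly five minutes a year, so a
single ten-minute outage already exceeds it.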


Re: Openbgpd routing for redundancy.

2005-05-06 Thread Abraham Al-Saleh
eric wrote:
On Fri, 2005-05-06 at 14:35:09 -0600, Abraham Al-Saleh proclaimed...

Alright, before I go too far, I'm going to present what I know, what I 
need, and what I've read so far. We had a recent scare at my company: we 
lost connectivity with our ISP for about ten minutes because of a glitch. 
Due to the nature of our company, we have to have 100% uptime, and our 
SLA only guarantees us 99.999%. 

At best you'll get 5 9's. Why don't you look at multiple locations? After
all, if your business is that critical, a power outage due to a bad circuit
in the street outside where there is *supposed* to be redundancy but there
isn't will cause pain.
I'd also be curious to know what kind of location you're at if you need 100%
uptime with T1 links.
- Eric

We have a backup generator that will run for five days and can be 
refilled while in operation, as well as dual Matrix 5000 UPSes. We're 
working on an online medical prescribing and patient management 
solution, but we're currently small; we don't have the staff or the 
money to support two locations (yet).

--
Cordially,
Abraham Al-Saleh
Systems Administrator
CaduRx


Re: Openbgpd routing for redundancy.

2005-05-06 Thread eric
On Fri, 2005-05-06 at 14:54:31 -0600, Abraham Al-Saleh proclaimed...

 We have a backup generator that will run for five days and can be 
 refilled while in operation, as well as dual matrix 5000 UPS'. We're 
 working on an online medical prescribing and patient management 
 solution, but we're currently small, we don't have the staff or the 
 money to support two locations (yet).

Cool, it's good to see what kind of markets obsd can get into.

Hopefully you find two ISP's that don't have simultaneous failures!



Re: Openbgpd routing for redundancy.

2005-05-06 Thread Abraham Al-Saleh
eric wrote:
On Fri, 2005-05-06 at 14:54:31 -0600, Abraham Al-Saleh proclaimed...

We have a backup generator that will run for five days and can be 
refilled while in operation, as well as dual matrix 5000 UPS'. We're 
working on an online medical prescribing and patient management 
solution, but we're currently small, we don't have the staff or the 
money to support two locations (yet).

Cool, it's good to see what kind of markets obsd can get into.

Hopefully you find two ISP's that don't have simultaneous failures!

Yes, there's only so much I can do to keep everything redundant at 
present; that will change later when we have sufficient money. A big 
concern is that someone might dig up our local loop with a backhoe, 
and there's nothing I can do about that at present. I'm just trying to 
minimize as many risks as possible.



Re: Openbgpd routing for redundancy.

2005-05-06 Thread Abraham Al-Saleh
Stuart Henderson wrote:
--On 06 May 2005 14:35 -0600, Abraham Al-Saleh wrote:
uptime, and our SLA only guarantees us 99.999%. So, I'm currently

You sometimes find that SLA means something like we'll charge you more 
so that when things break, we can pay some of it back...

talking with several companies to have another T1 brought in, and I'm
planning on using OpenBGPD to provide fault tolerance. The only
problem? I've never done anything like this before. I'm already

While BGP can be used to improve reliability, it also gives you 
interesting and varied ways to break your network. What's more, it's 
quite possible to break your connectivity for extended periods of time 
(through flap dampening), and there's nothing that can be done to fix 
it, you just have to sit it out. So it must be done with thought and care.

good resources on bgp in particular (books, websites,

See http://www.bgp4.as/books - maybe look at Stewart BGP4, van 
Beijnum BGP, Halabi Internet routing architectures. Typically, 
config examples are given for IOS, but many concepts are portable. van 
Beijnum is probably the easier read, Stewart has good information about 
the protocol (probably will help you to understand the RFC better), 
Halabi is published by Cisco Press so understandably IOS-centric, quite 
a lot of good material.

A test network is pretty much essential to help you get to grips with 
things...


Thanks for the tips. The funny thing is I just sent a request to my boss 
to purchase the books by Stewart and van Beijnum. And thanks for the advice 
about testing; I was pretty sure that my weekends and evenings were shot 
for a while anyway...



Re: Openbgpd routing for redundancy.

2005-05-06 Thread L. V. Lammert
On Fri, 6 May 2005, Abraham Al-Saleh wrote:

 Yes, there's only so much I can do to keep everything redundant at
 present, something that will change later when we have sufficient money,
 a big concern is that someone might dig out our local loop with a back
 hoe, nothing I can do about that at present. I'm just trying to minimize
 as many risks as possible.

Why haven't you CoLo'd a set of backup servers? Doesn't cost much to have
a rack on a backbone; you can even shop for one in a different part of
the country/world.

Lee



  Leland V. Lammert[EMAIL PROTECTED]
Chief Scientist Omnitec Corporation
 Network/Internet Consultants   www.omnitec.net




Re: Openbgpd routing for redundancy.

2005-05-06 Thread Abraham Al-Saleh
L. V. Lammert wrote:
On Fri, 6 May 2005, Abraham Al-Saleh wrote:

Yes, there's only so much I can do to keep everything redundant at
present, something that will change later when we have sufficient money,
a big concern is that someone might dig out our local loop with a back
hoe, nothing I can do about that at present. I'm just trying to minimize
as many risks as possible.

Why haven't you CoLo'd a set of backup servers? Doesn't cost much to have
a rack on a backbone, .. you can even shop for one in a different part of
the country/world.

Lee



Because, with the type of colocation we require, it DOES cost much. We 
can't stick our servers on a simple rack; we have to have cage space, 
and that costs more than a little. We store medical data, and HIPAA 
compliance (if you've ever heard of that?) is a bitch, to put it simply. 
What's worse is that many medical organizations misunderstand it, and 
put even more stringent practices in place that must be adhered to.

--
Cordially,
Abraham Al-Saleh
Systems Administrator
CaduRx


Re: Openbgpd routing for redundancy.

2005-05-06 Thread Stuart Henderson
--On 06 May 2005 14:35 -0600, Abraham Al-Saleh wrote:
uptime, and our SLA only guarantees us 99.999%. So, I'm currently

You sometimes find that SLA means something like we'll charge you more 
so that when things break, we can pay some of it back...

talking with several companies to have another T1 brought in, and I'm
planning on using OpenBGPD to provide fault tolerance. The only
problem? I've never done anything like this before. I'm already

While BGP can be used to improve reliability, it also gives you 
interesting and varied ways to break your network. What's more, it's 
quite possible to break your connectivity for extended periods of time 
(through flap dampening), and there's nothing that can be done to fix 
it, you just have to sit it out. So it must be done with thought and 
care.

good resources on bgp in particular (books, websites,

See http://www.bgp4.as/books - maybe look at Stewart BGP4, van 
Beijnum BGP, Halabi Internet routing architectures. Typically, 
config examples are given for IOS, but many concepts are portable. van 
Beijnum is probably the easier read, Stewart has good information about 
the protocol (probably will help you to understand the RFC better), 
Halabi is published by Cisco Press so understandably IOS-centric, quite 
a lot of good material.

A test network is pretty much essential to help you get to grips with 
things...