On Fri, 21 Mar 2003, Gordon Messmer wrote:

> Jon Nelson wrote:
> > On Thu, 20 Mar 2003, Gordon Messmer wrote:
> > 
> >>The new system cost about $15000.  It is built with an NFS backend 
> >>running Red Hat Linux 7.3 on a 1TB RAID 5 set attached to a 3ware 7500 
> >>card, one 1.8 GHz CPU and 1GB of RAM.  There are two Courier servers 
> >>configured identically, load balanced with DNS round-robin.  Each has an 
> > 
> > 
> > I would strongly suggest taking a good look at:
> > http://www.linuxvirtualserver.org/Documents.html
> 
> I'm well aware of LVS techniques.  I can't, however, see fit to justify 
> throwing in two servers (without failover, LVS becomes a single point of 
> failure) in order to provide load-balancing and fail-over to two other 
> servers.
> 
> At some point that will likely change, but when it does the LVS boxes 
> will be providing service to other services in the network, like our 
> LDAP boxes, in addition to the email servers.
> 
> > DNS round-robin has /so/ many problems -- you have to set the ttl
> > incredibly low for it to work at all
> 
> That's not correct.  "ping mail-test.real.com" ten times and you should 
> get about half of the lookups to one box, and half to the other.

The problems I'm talking about involve /caching/ of the response.
The ttl on a response is typically 24 hours.  Even /if/ your
network is set up such that clients ask the server directly when
resolving, /and/ the clients *don't* do any caching, you *still* get
roughly 50% of the answers wrong whenever one of the two servers is
down.  By "wrong" I mean I'll get an IP for a server that isn't up.

> > and /many/ email clients /cache/
> > the IP beyond the ttl.  Thus, if you name your servers A and B, and A
> > goes down (and A is the "primary"), many clients will continue trying to
> > contact A despite it being down and the ttl having long expired.
> 
> There's no "primary" in a round-robin.  Each server is equal.  Clients 
> that we've tested work as intended in the event of failure.  HA will be 
> introduced later on.

That's exactly the problem.  Server A goes down. Client X says, "resolve
mail.domain for me", and gets the /IP/ for A, roughly 50% of the time.
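
To make the arithmetic concrete, here is a rough sketch in Python.  The
addresses are made-up documentation addresses standing in for A and B,
and the strict rotation is an assumption: a real DNS server may shuffle
its answers, and a real resolver may cache them.  It only counts how
often a rotating answer points at a host that is down.

```python
from itertools import cycle


def round_robin_answers(addresses, lookups):
    """Return the address a strictly rotating resolver hands out per lookup."""
    rotation = cycle(addresses)
    return [next(rotation) for _ in range(lookups)]


def failed_fraction(addresses, down, lookups):
    """Fraction of lookups whose answer is an address in the `down` set."""
    answers = round_robin_answers(addresses, lookups)
    failures = sum(1 for ip in answers if ip in down)
    return failures / lookups


# Hypothetical addresses for servers A and B (RFC 5737 documentation range).
servers = ["192.0.2.10", "192.0.2.11"]

# Server A is down; ten lookups alternate between A and B.
print(failed_fraction(servers, down={"192.0.2.10"}, lookups=10))  # prints 0.5
```

Nothing in the rotation knows or cares that A is down, which is the
whole point.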

By your own statements, if I ping mail-test.real.com ten times, roughly
50% of my ICMP packets go to one host and the remainder to the other.
If one of those hosts is /down/, DNS round-robin *doesn't change the
fact that roughly 50% of my packets will be destined for a downed host*.
Are you performing some type of availability test /on/ the DNS server
such that, if A goes down, resolutions for mail-test.real.com always
return B?
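
The kind of availability test I have in mind could look something like
the sketch below.  This is not a feature of any particular DNS server;
it is a hypothetical helper that something (a cron job, say) would run
before regenerating the records for the name.  The function names, the
IMAP port 143, and the two-second timeout are all assumptions made for
illustration.

```python
import socket


def tcp_probe(address, port=143, timeout=2.0):
    """Return True if a TCP connection to address:port succeeds in time.

    Port 143 (IMAP) and the timeout are assumed values, not from the thread.
    """
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def publishable_addresses(addresses, probe=tcp_probe):
    """Keep only addresses that pass the probe.

    If every probe fails, return the full list rather than publishing
    an empty record set.
    """
    alive = [ip for ip in addresses if probe(ip)]
    return alive or list(addresses)


# Usage with a stand-in probe so the example needs no network access:
# pretend only B (192.0.2.11, a hypothetical address) is answering.
only_b_up = lambda ip: ip == "192.0.2.11"
print(publishable_addresses(["192.0.2.10", "192.0.2.11"], probe=only_b_up))
# prints ['192.0.2.11']
```

Even with a check like this, clients that have already cached A's
address keep trying A until their cached answer expires, so the ttl
still matters.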

> > The LVS/NAT and LVS/DR solutions are much "better" from a high level
> > perspective. Heck, you could probably get away with a Pentium 200 level
> > machine as the NAT/DR "router" - it just passes and mangles packets.
> 
> What sense does it make to spend 15K on a cluster of boxes and then 
> skimp on the HA gateways?

Who says you are skimping?  If you need a certain amount of horsepower
to perform a job, why bother grossly exceeding that requirement?

--
Applying computer technology is simply finding the right wrench to
  pound in the correct screw.

Jon Nelson <[EMAIL PROTECTED]>
C and Python Code Gardener


_______________________________________________
courier-users mailing list
[EMAIL PROTECTED]
Unsubscribe: https://lists.sourceforge.net/lists/listinfo/courier-users