Re: [Linux-HA] Single Point of Failure

Miles Fidelman Thu, 12 Jan 2012 12:59:52 -0800

Paul O'Rorke wrote:
> Please excuse me if this is documented and I failed to find it.  I have
> been investigating ha-linux to provide Business Continuity in our mail
> server.  Currently we have a single mail server in our main office.  We
> would like to set up a second server in a geographically different location
> and make a cluster of 2 nodes so that we can continue doing business if one
> fails.
>
> As far as I can tell ha-linux with Pacemaker is ideally suited to this.  My
> question is around how the cluster handles requests to the mail server(s).
> Can anyone suggest some appropriate reading for where/how this is
> handled?   My concern is that should there be a failure at the location
> that is receiving the requests how does it know to use the second node?  Is
> this typically done through zone files and a priority?  Obviously I am
> missing some important reading here because it would seem to me that there
> could still be a single point of failure and that doesn't seem right.


Not really.  ha-linux is primarily for local clusters, not 
geographically dispersed ones.  You can use ha-linux to make a single 
mail server more reliable, but you need to do something different for 
geographic redundancy - which is really a question for the list 
associated with whatever mail software you're using (e.g., 
postfix-users).  But... having said that:

If you want to make a single mail server more reliable (or make each 
distributed server more redundant), then you can do what we do, and run 
a stack that looks roughly like this:
   mail server & list manager (and antivirus and antispam, ...)
   xen-VM
   DRBD
   pacemaker/etc.

If a node fails, the entire VM fails over to another node - all the data 
(mail spool, inboxes) is replicated by DRBD, all the IP addresses 
migrate with a failover, and everything keeps humming along.

----
For geographic redundancy, you need to deal with several things that 
have to do with DNS records and mail server configs:

1.  For outgoing mail:

- it depends on whether mail clients do DNS lookups and send mail 
directly to their destinations, or whether the clients route everything 
through your central server -- if the former, you don't need to do 
anything; if the latter:

- set up a 2nd server and either configure your clients to know about it 
(not always possible), or set up the DNS name for your outgoing server 
so that it has address records for both servers -- for the most part, 
this will take care of things, with three caveats:
-- depending on how the clients do DNS lookups, and how they do retries 
if they can't reach a server, stuff might sit in queue for a while
-- if mail is in transit between client and server, and the server 
fails, that message might get lost (depends on the client behavior)
-- mail that's queued on the server, when it fails, will probably get 
sent when the server comes back up, but also might get lost, depending 
on the type of failure (note: there are some ways to configure some 
servers, so that a mail session does not complete until the mail goes to 
the next hop - not sure off the top of my head if you can set things up 
so that an incoming session is kept open until mail has made it through 
the server to its next hop)
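
To make the first caveat concrete, here is a rough Python sketch of a 
client that resolves one outgoing-server name to several addresses and 
tries each in turn. It is only an illustration: the function names 
(resolve_all, try_connect, first_reachable) are made up for this 
example, and real mail clients differ in whether and how they retry.

```python
import socket
from typing import Callable, Iterable, List, Optional, Tuple

Address = Tuple[str, int]  # (ip address, port)


def resolve_all(hostname: str, port: int = 25) -> List[Address]:
    """Return every distinct address the name resolves to, in resolver order."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses: List[Address] = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        addr = (sockaddr[0], sockaddr[1])
        if addr not in addresses:
            addresses.append(addr)
    return addresses


def try_connect(addr: Address, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to addr succeeds within the timeout."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(
    addresses: Iterable[Address],
    connect: Callable[[Address], bool] = try_connect,
) -> Optional[Address]:
    """Walk the address list and return the first one that accepts a connection."""
    for addr in addresses:
        if connect(addr):
            return addr
    return None  # nothing reachable: the message stays queued on the client
```

A client that behaves like first_reachable(resolve_all(name)) rides 
through the loss of one server; a client that only ever tries the first 
address returned is the one that leaves stuff sitting in its queue.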

2a. For incoming mail - case 1: SMTP, mailboxes, POP, IMAP  all on the 
same server:

- first off, make sure that you have some redundancy on that server, so 
that you don't lose mail

- you can set up a secondary server (give it an MX record with lower 
priority than the primary server, i.e. a higher preference number) - it 
will receive mail when the primary goes down; and you can set up the 
mail config to forward stuff automatically to the primary server when it 
comes back up -- people won't be able to get to their mail until the 
primary comes back up, but mail will get accepted and will eventually 
get delivered

- if you want to have geographic failover for mailboxes/POP/IMAP, things 
get a lot trickier (e.g. replication of mail directories, configuring 
DNS and/or clients to know about alternate locations)
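
As an illustration of how sending hosts use MX records, here is a small 
Python sketch of the order in which a sender would try your servers: 
lowest preference number first, and a random order among records with 
the same preference. The function name mx_attempt_order and the 
example.com hostnames are made up for this example; it models the 
selection rule only and does no DNS lookups.

```python
import random
from typing import Dict, List, Optional, Tuple

MxRecord = Tuple[int, str]  # (preference number, hostname)


def mx_attempt_order(
    records: List[MxRecord], rng: Optional[random.Random] = None
) -> List[str]:
    """Return hostnames in the order a sending host would try them."""
    rng = rng or random.Random()
    # Group hostnames by preference number.
    by_pref: Dict[int, List[str]] = {}
    for pref, host in records:
        by_pref.setdefault(pref, []).append(host)
    order: List[str] = []
    # Lower preference number means higher priority, so try those first.
    for pref in sorted(by_pref):
        hosts = list(by_pref[pref])
        rng.shuffle(hosts)  # equal preference: spread the load randomly
        order.extend(hosts)
    return order


if __name__ == "__main__":
    records = [(20, "backup.example.com"), (10, "primary.example.com")]
    print(mx_attempt_order(records))
```

With the two records shown, the primary is always tried first and the 
backup only sees mail when the primary can't be reached.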

2b. For incoming mail - case 2: SMTP on one machine, mailboxes/POP/IMAP 
on other machines (e.g., incoming mail host forwards to local servers):

- this is easy: set up MX records for each incoming server (they can be 
of equal priority if you want, for load leveling)

- incoming mail will go to one server or the other, and get forwarded by 
whichever server handles the message, to the local destinations

- if one server fails, mail will continue to flow through the other one 
-- the only stuff that will get delayed or lost is stuff that's been 
queued on the server, but not yet forwarded (and, as noted above, it may 
be possible to set up your servers so that failure recovery is pushed 
back to the sending host - i.e., if the mail hasn't been forwarded, the 
incoming transaction fails and the sending host tries again)
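
The idea of pushing failure recovery back to the sending host can be 
sketched in Python as follows. This is not the configuration of any 
particular mail server; handle_incoming and the forward callback are 
hypothetical names used to show the logic: the incoming server only 
answers with a 250 after the next hop has taken the message, and 
otherwise answers with a temporary 451 so the sender keeps the message 
queued and retries.

```python
from typing import Callable


def handle_incoming(message: bytes, forward: Callable[[bytes], bool]) -> str:
    """Return the SMTP reply to send once the message body has been received.

    forward() is assumed to hand the message to the next hop and return
    True only if that hop accepted it.
    """
    try:
        delivered = forward(message)
    except OSError:
        # Next hop unreachable: treat it as a temporary failure.
        delivered = False
    if delivered:
        # Safe to acknowledge: the next hop now holds the message.
        return "250 OK"
    # Nothing was acknowledged, so the sending host still owns the message.
    return "451 Temporary failure, try again later"
```

The tradeoff is that the incoming session stays open for as long as the 
forward takes, so a slow next hop slows down every sender.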


Hope this helps,

Miles Fidelman

-- 
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
