> Today, I strongly advice *against* using backup MX servers
> because...
> I wonder what the general recommendation for backup MX is among the
> list members.
My response is that there's a difference between having a "warm"
backup MX (published, plugged in, but not accepting mail until
triggered) and a "hot" backup MX (published and always active).
If you have a hot backup, agreed, you have to enforce the exact same
anti-abuse policies as the primary, including recipient validation.
If, either due to workload or technical reasons, you can't keep a hot
backup in basic sync, then yes, you shouldn't try to have one. There's an
in-between area where each MX has the exact same policies defined, but
due to geographic distribution and/or different management, they can't
share state of self-adjusting policies such as Bayes and greylisting
databases; you can go either way in this case. (Sharing either of
these databases via live SQL over T-1-class links is not feasible. We
tried both in the WAN simulator lab and found performance dismal; I
dismiss untested theories to the contrary.)
If you have a warm backup and a reliable monitoring script that only
brings it online when the primary is unavailable, it's a ton less
important to share policies exactly, though it is obviously still
preferable during periods of extended downtime to not have to explain
that you've failed over to your backup and are thus seeing more FPs
and FNs than usual. With a warm backup, you can perform periodic
imports, or even unidirectional real-time replication, of
self-adjusting policy databases over slow links (block-level
unidirectional replication takes a fraction of the bandwidth of live
SQL reads/writes from a remote location), keeping those policies
current as well. A warm backup can be a win-win for maintenance and
effectiveness.
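
A minimal sketch of the kind of monitoring logic a warm backup needs,
assuming the backup polls the primary's SMTP port and only changes state
after several consecutive results. The class name, thresholds, and probe
function are hypothetical; how you actually start or stop the backup's
SMTP listener is site-specific and not shown.

```python
import socket
from typing import Callable


def smtp_port_open(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class WarmBackupMonitor:
    """Decide when a warm backup MX should accept mail (hypothetical helper).

    The backup is activated only after `fail_threshold` consecutive failed
    probes and deactivated after `recover_threshold` consecutive successes,
    so a single dropped probe does not flap the backup on and off.
    """

    def __init__(self, probe: Callable[[], bool],
                 fail_threshold: int = 3, recover_threshold: int = 3) -> None:
        self.probe = probe
        self.fail_threshold = fail_threshold
        self.recover_threshold = recover_threshold
        self.backup_active = False
        self._failures = 0
        self._successes = 0

    def tick(self) -> bool:
        """Run one probe and return whether the backup should be active."""
        if self.probe():
            self._successes += 1
            self._failures = 0
            if self.backup_active and self._successes >= self.recover_threshold:
                self.backup_active = False
        else:
            self._failures += 1
            self._successes = 0
            if not self.backup_active and self._failures >= self.fail_threshold:
                self.backup_active = True
        return self.backup_active
```

In use, you would build the monitor with something like
`WarmBackupMonitor(lambda: smtp_port_open("mx1.example.com"))` (the host
name is a placeholder), call `tick()` on a timer, and enable or disable
the backup's listener whenever the returned value changes.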
With either option, I don't agree with your perception that "If you
use a backup MX, the sender has no way to notice that his mail is
delayed. If your mail server has a problem, have the *sender* Mailhost
retry, rather than a third (backup MX) system." The same arguments
could be made if your internal mailbox server is down, but your MX is
still accepting mail. I don't think they are persuasive. Whether
you've got a hot or warm backup MX, one *intent* of having a backup is
that senders don't get hard bounces if your primary is down for a
short time, where "short" = "greater than the sender's queue lifetime,
though less than either side's effective screaming threshold." [1]
So if a sender's MTA tries 8 times over 2 hours -- IMO, a reasonable
average expiration time for business communication, neither giving the
expectation of instantaneous delivery, nor delaying failure
unreasonably, and allowing alternate means of communication to be
pursued if necessary -- and you go down for 2.5 hours in the late
evening, you're going to be triggering bounces that will be
embarrassing and confusing for business communication. If you had a
backup MX (either hot or warm) during that period, the senders
wouldn't be the wiser, and if your recipients don't check e-mail until
the next day, they wouldn't know, either. Better yet, if your internal
mailbox server wasn't down (if you didn't have a site-wide outage, but
just the primary MX at the primary site was down), the recipient, if
they were checking, would have gotten the message immediately! You'll
have saved yourself two potential support calls. That
store-and-forward backup is your *ally*, not your enemy, in these
cases, so I could never conclude that a warm backup MX is a poor
investment.
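
The arithmetic in the example above can be sketched roughly as follows.
This is a simplified model, not how any particular MTA schedules retries:
it assumes evenly spaced attempts, a message submitted at the moment the
outage starts, and a backup MX that accepts on the first attempt. The
function names are made up for illustration.

```python
from typing import List


def retry_times(attempts: int, lifetime_min: float) -> List[float]:
    """Evenly spaced attempt times in minutes: first at 0, last at lifetime.

    Real MTAs usually back off unevenly; even spacing is an assumption.
    """
    if attempts < 2:
        return [0.0]
    return [i * lifetime_min / (attempts - 1) for i in range(attempts)]


def outcome(attempts: int, lifetime_min: float,
            outage_min: float, backup_mx: bool) -> str:
    """Describe what the sender sees for a message sent as the outage begins."""
    if backup_mx:
        # The sender's MTA falls through to the next MX on its first attempt.
        return "accepted by backup MX at 0 min"
    for t in retry_times(attempts, lifetime_min):
        if t >= outage_min:
            return f"delivered to primary after {t:g} min"
    # Every attempt landed inside the outage window.
    return f"hard bounce after {lifetime_min:g} min"


# 8 tries over 2 hours against a 2.5-hour outage, without and with a backup.
print(outcome(8, 120, 150, backup_mx=False))
print(outcome(8, 120, 150, backup_mx=True))
```

With no backup MX, all eight attempts fall inside the 150-minute outage,
so the sender gets a hard bounce once the 120-minute lifetime expires;
with a backup, the first attempt is accepted.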
There are other areas to consider. That backup MXs have a reputation
for not enforcing policies makes them a strong attractor for spammers,
as you note. But that *also* means that when your primary MX is up,
the backup becomes a great spam honeypot. With the primary known to be
up, you can harvest and block addresses that hit only your backup with
an extremely high degree of accuracy. And deflecting abuse from your
primary to your secondary is obviously beneficial for performance.
Such a setup requires a hot backup MX -- in a sense, it's a
super-hot backup, with policies in excess of the primary, which can then
cool down to a standard hot backup if the primary is detected down.
[2]
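
As a rough illustration of the honeypot idea (including the exclusion in
footnote [2]), here is a sketch that compares client IPs seen on the
backup against the primary's log for the same window. The log shape
(IP, SMTP reply code) and the function name are assumptions made for the
example, and the result only means anything if the primary was known to
be up for the whole window.

```python
from typing import Iterable, Set, Tuple


def classify_backup_hits(backup_ips: Iterable[str],
                         primary_log: Iterable[Tuple[str, int]]
                         ) -> Tuple[Set[str], Set[str]]:
    """Split backup-MX client IPs into (suspects, expected_retries).

    suspects: IPs that never contacted the primary in the same window.
    expected_retries: IPs the primary answered with a 4xx, which a
    well-behaved MTA may then retry against the next MX.
    """
    primary_seen: Set[str] = set()
    tempfailed: Set[str] = set()
    for ip, code in primary_log:
        primary_seen.add(ip)
        if 400 <= code <= 499:
            tempfailed.add(ip)
    backup = set(backup_ips)
    return backup - primary_seen, backup & tempfailed


# Documentation-range addresses used purely as sample data.
backup_hits = ["192.0.2.10", "198.51.100.7", "203.0.113.5"]
primary_hits = [("198.51.100.7", 451), ("203.0.113.5", 250)]
suspects, retries = classify_backup_hits(backup_hits, primary_hits)
```

Only `suspects` would be candidates for blocking; anything in `retries`
is a hit you triggered yourself by 4xxing on the primary.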
Personally, what I like to do is publish one virtual IP as the primary
MX and load-balance that invisibly across some physical boxes.
Off-site, I keep a warm backup waiting, polling the virtual IP. A
spammer can't hammer on any particular box that way and can't force
lots of junk across the WAN wire, etc.
In all, people use lots of perfectly workable MX configs... built-in
DNS round-robin and warm backup MXs... dynamic DNS-based
load-balancing based on uptime, utilization, and geolocation...
network-layer load balancing... etc. I'd advise thinking about both
senders and recipients as your client base and modeling a setup whose
benefits/consequences you could put in lay terms. The
benefit/detriment of having a backup MX at any one time of day or in
any one business (or, frustratingly, for a set of business
relationships within a business) will vary, but you have to play the
averages.
--Sandy
[1] Note that I feel that sending transient delay notifications to
senders ("Your message has been delayed...") for such short queue
lifetimes is a bad idea. IME more users misinterpret it as a hard
bounce than take the time to read it through and take any comfort from
it. YMMV there. It's true that *if* a given sender could be counted on
to read and deal with the transient bounce in a productive way (like a
phone call, not a blunt resend!), some of my logic would break in
that case. Contrariwise, if the MTA starts sending transient bounces
after 1 hour for a 2-hour queue lifetime (or 2 hours for a 4-hour
queue lifetime, etc.) that _would have_ led to confusion, then the
usefulness of your backup MX is actually _increased_, as it starts to
actively save your sanity after 1 hour of downtime on the primary.
[2] In such a setup, any backoff policy enforced through 4xxs on the
primary predicts a legit hit on the super-hot secondary; the secondary
is still a honeypot, but you have to exclude any duplicate hits you in
fact triggered yourself by 4xxing on the primary... this can get
hairier than you feel like managing, esp. if your spam catch rate is
already acceptable.
------------------------------------
Sanford Whiteman, Chief Technologist
Broadleaf Systems, a division of
Cypress Integrated Systems, Inc.