>  Today,   I  strongly  advise  *against*  using  backup  MX  servers
> because...

> I  wonder what the general recommendation for backup MX is among the
> list members.

My  response  is  that  there's  a  difference between having a "warm"
backup  MX  (published,  plugged  in,  but  not  accepting  mail until
triggered) and a "hot" backup MX (published and always active).

If  you  have a hot backup, agreed, you have to enforce the exact same
anti-abuse  policies  as  the primary, including recipient validation.
If,  either due to workload or technical reasons, you can't keep a hot
backup  in basic sync, yep, you shouldn't try to have one. There's an
in-between area where each MX has the exact same policies defined, but
due to geographic distribution and/or different management, they can't
share  state  of self-adjusting policies such as Bayes and greylisting
databases;  you can go either way in this case. (Sharing either one of
these  dbs over SQL and T-1 type speeds is not feasible. We tried both
in  the  WAN  simulator  lab  and  found performance dismal; I dismiss
untested theories to the contrary.)

If  you  have a warm backup and a reliable monitoring script that only
brings  it  online  when  the  primary is unavailable, it's a ton less
important  to  share  policies  exactly,  though it is obviously still
preferable  during periods of extended downtime to not have to explain
that  you've  failed  over to your backup and are thus seeing more FPs
and  FNs  than  usual.  With  a  warm backup, you can perform periodic
imports,    or   even   unidirectional   real-time   replication,   of
self-adjusting   policy   databases   over   slow  links  (block-level
unidirectional  replication  takes a fraction of the bandwidth of live
SQL  reads/writes  from  a  remote  location),  keeping those policies
current  as  well.  A warm backup can be a win-win for maintenance and
effectiveness.
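
For anyone wondering what I mean by a "reliable monitoring script," here is a minimal sketch in Python, not what I actually run. It assumes the check is a plain TCP connect to port 25 plus a look at the 220 greeting, and that you want a few consecutive failures before failing over so one dropped packet doesn't wake the backup. The names (primary_is_up, WarmBackupController, the thresholds) are made up for illustration, and the point where you'd start or stop the backup's SMTP listener is left as a hypothetical hook.

```python
import socket
import time


def primary_is_up(host, port=25, timeout=10.0):
    """Return True if the primary accepts a TCP connection on its SMTP
    port and greets with a 220 banner."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            banner = conn.recv(512)
            return banner.startswith(b"220")
    except OSError:
        # Refused, unreachable, and timed-out probes all count as "down".
        return False


class WarmBackupController:
    """Decides when the warm backup should accept mail.

    Fails over only after several consecutive failed probes, and stands
    down only after several consecutive successful ones.
    """

    def __init__(self, fail_threshold=3, recover_threshold=3):
        self.fail_threshold = fail_threshold
        self.recover_threshold = recover_threshold
        self.failures = 0
        self.successes = 0
        self.backup_active = False

    def record_probe(self, primary_up):
        """Feed in one probe result; return whether the backup should be active."""
        if primary_up:
            self.successes += 1
            self.failures = 0
            if self.backup_active and self.successes >= self.recover_threshold:
                self.backup_active = False
        else:
            self.failures += 1
            self.successes = 0
            if not self.backup_active and self.failures >= self.fail_threshold:
                self.backup_active = True
        return self.backup_active


def run(host, interval=60, probe=primary_is_up):
    """Poll the primary forever and report state changes."""
    controller = WarmBackupController()
    while True:
        was_active = controller.backup_active
        active = controller.record_probe(probe(host))
        if active != was_active:
            # Hypothetical hook: start or stop the backup's SMTP listener here.
            print("backup MX", "ACTIVATED" if active else "deactivated")
        time.sleep(interval)
```

Keeping the decision logic apart from the probe means you can test the failover thresholds without touching the network, and swap in a deeper check (a full EHLO, say) later.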

With  either  option,  I don't agree with your perception that "If you
use  a  backup  MX,  the  sender has no way to notice that his mail is
delayed. If your mail server has a problem, have the *sender* Mailhost
retry,  rather  than  a  third (backup MX) system." The same arguments
could  be made if your internal mailbox server is down, but your MX is
still  accepting  mail.  I  don't  think  they are persuasive. Whether
you've got a hot or warm backup MX, one *intent* of having a backup is
that  senders  don't  get  hard  bounces if your primary is down for a
short time, where "short" = "greater than the sender's queue lifetime,
though less than either side's effective screaming threshold." [1]

So  if  a sender's MTA tries 8 times over 2 hours -- IMO, a reasonable
average expiration time for business communication, neither giving the
expectation   of   instantaneous   delivery,   nor   delaying  failure
unreasonably,  and  allowing  alternate  means  of communication to be
pursued  if  necessary  --  and  you go down for 2.5 hours in the late
evening,   you're   going  to  be  triggering  bounces  that  will  be
embarrassing  and  confusing  for business communication. If you had a
backup  MX  (either  hot  or  warm)  during  that  period, the senders
wouldn't be the wiser, and if your recipients don't check e-mail until
the next day, they wouldn't know, either. Better yet, if your internal
mailbox server wasn't down (if you didn't have a site-wide outage, but
just  the  primary MX at the primary site was down), the recipient, if
they  were checking, would have gotten the message immediately! You'll
have    saved    yourself    two   potential   support   calls.   That
store-and-forward  backup  is  your  *ally*,  not your enemy, in these
cases,  so  I  could  never  conclude  that a warm backup MX is a poor
investment.
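
To put rough numbers on that scenario, here is a small Python sketch. It is a simplification under stated assumptions: the message is sent just as the primary goes down, the sender retries right up to its queue lifetime, and the backup (hot, or warm and already triggered) accepts on the sender's next attempt. The function name and the delay_notice_after parameter are mine, the latter standing in for the transient delay notifications discussed in footnote [1].

```python
from datetime import timedelta


def sender_experience(outage, queue_lifetime, has_backup_mx, delay_notice_after=None):
    """List what the sender sees for a message sent as the primary goes down."""
    if has_backup_mx:
        # The backup takes the message, so the sender's queue never ages out.
        return ["accepted by backup MX"]
    events = []
    # The message sits in the sender's queue until the outage ends or the
    # queue lifetime expires, whichever comes first.
    still_queued_until = min(outage, queue_lifetime)
    if delay_notice_after is not None and delay_notice_after < still_queued_until:
        events.append("delay notification")
    events.append("hard bounce" if outage > queue_lifetime else "delivered on retry")
    return events


two_hours = timedelta(hours=2)
late_evening_outage = timedelta(hours=2, minutes=30)

print(sender_experience(late_evening_outage, two_hours, has_backup_mx=False))
# ['hard bounce']
print(sender_experience(late_evening_outage, two_hours, has_backup_mx=True))
# ['accepted by backup MX']
print(sender_experience(late_evening_outage, two_hours, False, timedelta(hours=1)))
# ['delay notification', 'hard bounce']
```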

There're  other  areas  to consider. That backup MXs have a reputation
for not enforcing policies makes them a strong attractor for spammers,
as  you  note.  But that *also* means that when your primary MX is up,
the backup becomes a great spam honeypot. With the primary known to be
up,  you can harvest and block addresses only hitting your backup with
an  extremely  high degree of accuracy. And deflecting abuse from your
primary  to  your  secondary  is  clearly beneficial for performance.
Such  a  setup  requires  a  hot  backup  MX -- in a sense, it's a
super-hot  backup, with policies stricter than the primary's, that can
cool  down  to  a standard hot backup if the primary is detected down.
[2]
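
The harvesting step can be sketched roughly like this in Python. This is an illustration, not a description of any product's log format: it assumes you have already pulled the connecting IPs out of each MX's log for the same time window, and that you know independently that the primary was reachable for that whole window. The function name and the min_hits parameter are hypothetical.

```python
from collections import Counter


def honeypot_candidates(backup_hits, primary_hits, primary_was_up, min_hits=1):
    """Return source IPs that hit the backup MX but never the primary.

    backup_hits, primary_hits: iterables of source IP strings, one entry
        per connection, taken from the same time window on each MX.
    primary_was_up: True only if the primary was reachable throughout.
    min_hits: how many backup-only connections an IP needs before it is
        reported.
    """
    if not primary_was_up:
        # With the primary down, legitimate senders land on the backup too.
        return set()
    seen_on_primary = set(primary_hits)
    counts = Counter(ip for ip in backup_hits if ip not in seen_on_primary)
    return {ip for ip, n in counts.items() if n >= min_hits}


backup_log = ["192.0.2.10", "192.0.2.10", "198.51.100.7", "203.0.113.5"]
primary_log = ["198.51.100.7", "198.51.100.8"]

print(sorted(honeypot_candidates(backup_log, primary_log, primary_was_up=True)))
# ['192.0.2.10', '203.0.113.5']
```

Because anything that touched the primary at all is excluded, a sender the primary deferred with a 4xx and who then tried the backup from the same IP drops out, which covers the simplest form of the duplicate-hit problem in [2]. It does nothing for senders who retry from a different IP.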

Personally, what I like to do is publish one virtual IP as the primary
MX  and  load-balance  that  invisibly  across  some  physical  boxes.
Off-site,  I  keep  a  warm  backup waiting, polling the virtual IP. A
spammer  can't  hammer  on any particular box that way and can't force
lots of junk across the WAN wire, etc., etc.

In  all,  people use lots of perfectly workable MX configs... built-in
DNS   round-robin   and   warm   backup   MXs...   dynamic   DNS-based
load-balancing   based  on  uptime,  utilization,  and  geolocation...
network-layer  load  balancing...  etc. I'd advise thinking about both
senders  and recipients as your client base and modeling a setup whose
benefits/consequences    you    could    put   in   lay   terms.   The
benefit/detriment  of  having a backup MX at any one time of day or in
any   one   business   (or,  frustratingly,  for  a  set  of  business
relationships  within  a business) will vary, but you have to play the
averages.

--Sandy


[1]  Note  that  I  feel that sending transient delay notifications to
senders  ("Your  message  has  been  delayed...") for such short queue
lifetimes  is  a  bad  idea.  IME more users misinterpret it as a hard
bounce than take the time to read it through and take any comfort from
it. YMMV there. It's true that *if* a given sender could be counted on
to read and deal with the transient bounce in a productive way (like a
phone  call,  not  a  blunt  resend!),  some of my logic would break.
Contrariwise,   if   the   MTA  starts  sending  transient  bounces
after  1  hour  for  a  2-hour queue lifetime (or 2 hours for a 4-hour
queue  lifetime,  etc.)  that  _would have_ led to confusion, then the
usefulness  of your backup MX is actually _increased_, as it starts to
actively save your sanity after 1 hour of downtime on the primary.

[2]  In  such a setup, any backoff policy enforced through 4xxs on the
primary predicts a legit hit on the super-hot secondary; the secondary
is still a honeypot, but you have to exclude any duplicate hits you in
fact  triggered  yourself  by  4xxing  on  the primary... this can get
hairier  than  you feel like managing, esp. if your spam catch rate is
already acceptable.


------------------------------------
Sanford Whiteman, Chief Technologist
Broadleaf Systems, a division of
Cypress Integrated Systems, Inc.
e-mail: [EMAIL PROTECTED]

