2014-02-18, Jason Haar wrote:
We have a geographically distributed edge mail relay network (some in
the US and some in Europe) and I'm wondering if the new REDIS support
could be used to centralize our Bayes?

If you have a fast and reliable connection between the two,
then in principle it could work, although even the roundtrip
time across the globe is several times the time needed for a
local transaction, so this is probably not a desirable setup.
One server in each continent might be acceptable, but hasn't
been tried.
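
As a rough back-of-the-envelope illustration of why the round trip
matters, here is a small sketch. All the numbers in it (the two RTT
values and the number of round trips per message) are assumptions
chosen for illustration, not measurements of SpamAssassin or redis.

```python
def bayes_network_ms(rtt_ms, round_trips):
    """Network time spent per message: one RTT for each round trip."""
    return rtt_ms * round_trips

# All three values below are illustrative assumptions, not measurements.
LOCAL_RTT_MS = 0.5            # assumed RTT to a redis on the same LAN
INTERCONTINENTAL_RTT_MS = 90.0  # assumed RTT between US and Europe
ROUND_TRIPS = 3               # assumed round trips per scanned message

local = bayes_network_ms(LOCAL_RTT_MS, ROUND_TRIPS)
remote = bayes_network_ms(INTERCONTINENTAL_RTT_MS, ROUND_TRIPS)
print(f"local: {local} ms, remote: {remote} ms per message")
```

Plugging in your own measured RTT (for example from ping) gives a
first estimate of the per-message cost of a remote server.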

Bear in mind that a redis server offers no access controls of
its own, so IP restrictions need to be handled by a firewall
if redis binds to a publicly reachable interface.

Is anything special required to be done to get 4-6 spamd servers
to use the same REDIS backend?

No, this is normal. It is no different than having multiple spamd
or amavisd child processes under a single master process: each
process accesses the database completely independently.

Will network outages (which will happen) cause
corruption that could impact the others? (eg what if spamd is trying
to upload 3 records to redis and only the first two go through)

No corruption can happen due to network problems. Cases where some
but not all tokens are learned, or where tokens are learned but the
'seen' entry is not added, are harmless if they don't happen too often.
Token updates usually fit within a single IP packet, so in most
cases either all of the transaction gets committed or none,
even in case of network problems.

A full network breakdown (or a server being down) would cause
SpamAssassin to log warnings for each mail message, but it would move
on anyway, just without Bayes checks. Depending on the mail traffic
rate and the duration of the outage, the volume of such warnings may
be undesirable. Intermittent network problems or slowness would be
more problematic, as they could slow down mail checking substantially:
timeouts for failing rules and checks are rather large.
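
To illustrate why a stalled connection hurts more than a dead one,
here is a minimal sketch of the "give up after a deadline and carry
on without the check" idea. It is not SpamAssassin code: the helper
check_with_deadline and the two lookup functions are hypothetical
names made up for this example, and the deadline values are arbitrary.

```python
import concurrent.futures
import time

def check_with_deadline(check, deadline_s, default=None):
    """Run check(), but give up after deadline_s seconds and return default."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(check)
    try:
        return future.result(timeout=deadline_s)
    except Exception:
        # Timeout, or an error raised inside check(): skip this check.
        return default
    finally:
        # Do not wait here for a worker that may still be stuck.
        pool.shutdown(wait=False)

def fast_bayes_lookup():
    # Hypothetical stand-in for a lookup against a healthy server.
    return 0.99

def slow_bayes_lookup():
    # Hypothetical stand-in for a lookup over a stalled connection.
    time.sleep(0.5)
    return 0.99

print(check_with_deadline(fast_bayes_lookup, deadline_s=1.0))
print(check_with_deadline(slow_bayes_lookup, deadline_s=0.1))
```

The point of the sketch is that each message pays at most the
deadline when the backend is slow, so a large deadline multiplied
by the mail rate is what makes intermittent slowness expensive.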

  Mark

