On 2/22/2011 10:29 PM, Robert Goodyear wrote:
I know this topic has been flogged to death, and perhaps for good reason, but
I'm trying to determine the best outbound high-volume ecosystem for Postfix.
As I understand it, the relayhost parameter accepts an FQDN that, when
enclosed in square brackets, skips the MX lookup and uses the host's address
records directly. If I use round-robin A records for my mesh of MTAs out in my
datacenters, I've got a reasonable randomization going on. If I use MX records
of equal weight, Postfix will do the randomization (assuming I've not disabled
that globally with smtp_randomize_addresses).
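
To make the equal-weight MX case concrete, here is a rough Python model of the ordering an SMTP client applies to an MX answer: sort by preference, then shuffle hosts that share a preference. It is only a sketch of the behaviour described above, not Postfix's actual code; the function name, the randomize flag (standing in for smtp_randomize_addresses), and the example.com hostnames are all made up for illustration.

```python
import random
from itertools import groupby


def order_mx_targets(mx_records, randomize=True, rng=random):
    """Return hostnames in the order a client would try them.

    mx_records is a list of (preference, hostname) tuples; a lower
    preference value is tried first. Hosts that share a preference are
    shuffled when randomize is true (a stand-in for Postfix's
    smtp_randomize_addresses, modelled here only approximately).
    """
    ordered = []
    by_pref = sorted(mx_records, key=lambda rec: rec[0])
    for _, group in groupby(by_pref, key=lambda rec: rec[0]):
        hosts = [host for _, host in group]
        if randomize:
            rng.shuffle(hosts)  # spread load across equal-preference hosts
        ordered.extend(hosts)
    return ordered


# Hypothetical relay mesh: two equal-weight relays plus one fallback.
records = [
    (10, "mta1.example.com"),
    (10, "mta2.example.com"),
    (20, "backup.example.com"),
]
print(order_mx_targets(records))
```

The two preference-10 hosts come out in random order on each call, while the preference-20 host is always last. That fallback tier is what a bracketed round-robin A name cannot express.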
I know that I could theoretically do some hijinks with a transport map, but
that doesn't seem wise.
So, the downside of directly addressing the round-robin A record of my MTAs
might be the lack of fallback that my MX records would have described. But
for HA outbound stuff, I've got bigger problems if one of my MTAs on the edge
goes down, so let's ignore that for now.
With that assumption, it seems like MX versus round-robin A addressing of my
MTAs is pretty much equally performant, save for the failover inherent in MX.
However, is the handling of backoff/retry responses from these MTAs an
argument for letting DNS and MX do their thing properly? If so, it would seem
that round-robin A addressing is considered harmful. Is there a difference in
TTLs or anything else that I should consider from the origin Postfix server
which might weigh in here?
Obviously, true load balancing would be the best option, perhaps by leveraging
RabbitMQ and some realtime health metrics for each relay to steer the routing
from Postfix to that relay, but then we're getting into a completely different
architecture. Using MX and SMTP negotiation the way it was intended is a lot
simpler, but I'm just looking to optimize everything I can here.
Thanks in advance for any clarification.
Disclaimer: there is some hackishness in my suggestion, but:
You could leverage the built-in priority of MX records with a custom DNS
load balancer. The Postfix server could be configured to use as its only
resolver a deliberately fickle, private DNS server that serves the MX
records for your relays with a short TTL of, say, 1 minute, and otherwise
resolves the public Internet domain space normally. This private DNS
server's zone file could be regenerated by a script that queries (SNMP?
MySQL tables?) the queue lengths of your MTAs and translates them into MX
priority values. If this is a good idea, it's probably been done before,
and you can copy an existing technique.
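
A minimal sketch of the translation step, assuming the queue lengths have already been collected by some means (SNMP, MySQL, whatever). The function names, the relay.example.com owner name, the mtaN.example.com hosts, and the step of 10 between priorities are all hypothetical. How the private DNS server reloads the result is left out.

```python
def queue_lengths_to_mx(queue_lengths, step=10):
    """Map {hostname: queue_length} to {hostname: mx_preference}.

    The shortest queue gets the lowest (most preferred) MX value.
    Hosts with identical queue lengths share a preference, so the
    client is free to randomize between them.
    """
    ranked = sorted(queue_lengths.items(), key=lambda item: (item[1], item[0]))
    prefs = {}
    pref = 0
    prev_len = None
    for host, length in ranked:
        if length != prev_len:
            pref += step  # start a new, less preferred tier
            prev_len = length
        prefs[host] = pref
    return prefs


def render_mx_records(owner, prefs, ttl=60):
    """Render zone-file MX lines with a short TTL (60 s = the 1 minute above)."""
    ordered = sorted(prefs.items(), key=lambda kv: (kv[1], kv[0]))
    return [f"{owner} {ttl} IN MX {pref} {host}." for host, pref in ordered]


# Hypothetical snapshot of relay queue lengths.
snapshot = {
    "mta1.example.com": 120,
    "mta2.example.com": 5,
    "mta3.example.com": 120,
}
for line in render_mx_records("relay.example.com.", queue_lengths_to_mx(snapshot)):
    print(line)
```

One caveat with this mapping: MX preference is strict, so the sender tries the most preferred relay first and only falls back on failure. Each refresh therefore steers traffic toward the least-loaded relay instead of spreading it proportionally; only relays that end up tied share the load.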
-Daniel