On 2/22/2011 9:29 PM, Robert Goodyear wrote:
I know this topic has been flogged to death, and perhaps for good reason, but
I'm trying to determine the best outbound high-volume ecosystem for Postfix.
As I understand it, the relayhost parameter accepts an FQDN that, when enclosed
in brackets, skips the MX lookup and uses the host's address records directly.
If I use round-robin A records for my mesh of MTAs out in my datacenters, I've
got reasonable randomization going on. If I use MX records of equal weight,
Postfix will do the randomization (assuming I've not disabled that in general
with smtp_randomize_addresses elsewhere).
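For reference, the two addressing styles described above might look like this in main.cf. This is only a sketch: relays.example.com is a placeholder name, not a host from this thread.

```
# /etc/postfix/main.cf (sketch; relays.example.com is a hypothetical name)

# Bracketed form: no MX lookup, the name's address records are used directly.
relayhost = [relays.example.com]

# Unbracketed form: Postfix looks up MX records for the name first.
#relayhost = relays.example.com

# Randomize the order of equal-preference addresses (this is the default).
smtp_randomize_addresses = yes
```

Only one relayhost line should be active at a time; the second is left commented out for comparison.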
I know that I could theoretically do some hijinks with a transport map, but that
doesn't seem wise.
So, the downside of direct addressing of the roundrobin A record of my MTAs
might be a lack of fallback that would have been described in my MX record. But
for HA outbound stuff, I've got bigger problems if one of my MTAs on the edge
goes down, so let's ignore that for now.
With that assumption, it seems like MX versus roundrobin A addressing of my
MTAs is pretty much equally performant, save for the failover inherent in MX.
However, should I consider backoff/retry requests from these MTAs as a winning
proposition for letting DNS and MX do their thing properly? If so, it would
seem that roundrobin A addressing is therefore considered harmful. Is there a
difference in TTLs or anything else that I should consider from the origin
Postfix server which might weigh in here?
Obviously, true load balancing would be the best option, perhaps by leveraging
RabbitMQ with some realtime health metrics for each relay that would affect
the routing from Postfix to that relay, but then we're getting into a
completely different architecture. Using MX and SMTP negotiation the way
it was intended is a lot simpler, but I'm just looking to optimize everything I
can here.
Thanks in advance for any clarification.
Postfix will internally randomize either A records or
equal-weight MX records, so it doesn't make too much
difference which you use. A transport_maps entry that
resolves to either multiple A records or multiple equal-weight
MX records will perform about the same as a relayhost setting
(assuming the normal case of negligible time spent on
transport_maps lookup).
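As an illustration of the transport_maps alternative, a minimal sketch could look like the following. The file path, the domain example.org, and the relay name relays.example.com are all assumptions chosen for the example.

```
# /etc/postfix/main.cf (sketch)
transport_maps = hash:/etc/postfix/transport

# /etc/postfix/transport (sketch; names are hypothetical)
# Mail for example.org goes to the bracketed relay name, with no MX lookup.
example.org    smtp:[relays.example.com]
```

After editing the table, it needs to be rebuilt with "postmap /etc/postfix/transport" so that Postfix sees the change.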
The Postfix connection caching algorithm will automatically
limit the damage caused by a subset of slow-responding relay hosts.
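The connection cache settings involved are roughly these. On-demand caching is already on by default, so the first line only makes that explicit; the second line is optional, and relays.example.com is again a placeholder.

```
# /etc/postfix/main.cf (sketch)

# Cache connections on demand for destinations with a high volume of
# queued mail (default: yes).
smtp_connection_cache_on_demand = yes

# Optionally cache connections to a named relay host at all times.
# The relay host name is given here without brackets.
smtp_connection_cache_destinations = relays.example.com
```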
You can increase concurrency for relayhosts under your direct
control if they can handle the load (it's impolite to open
dozens/hundreds of connections to someone else's server
without prior agreement).
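A sketch of raising concurrency when all mail goes through your own relays via the smtp transport. The value 50 is an arbitrary illustration, not a recommendation; size it to what the relays can actually handle.

```
# /etc/postfix/main.cf (sketch; 50 is an illustrative value)

# Maximum number of parallel deliveries to the same destination via the
# smtp transport. When unset, it follows default_destination_concurrency_limit.
smtp_destination_concurrency_limit = 50
```

With a relayhost setting, the relay is the destination as far as this limit is concerned, so the setting caps parallel connections to the relay name as a whole.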
The default settings should give very good performance. For
knobs to twist, please see:
http://www.postfix.org/TUNING_README.html#mailing_tips
http://www.postfix.org/QSHAPE_README.html
http://www.postfix.org/QSHAPE_README.html#backlog
-- Noel Jones