On 2/22/2011 9:29 PM, Robert Goodyear wrote:
I know this topic has been flogged to death, and perhaps for good reason, but 
I'm trying to determine the best outbound high-volume ecosystem for Postfix.

As I understand it, the relayhost parameter accepts an FQDN that, when 
bracketed, skips the MX lookup and uses only the address (A record) lookup. If 
I use round-robin A records for my mesh of MTAs out in my datacenters, I've got 
a reasonable randomization going on. If I use MX records of equal weight, 
Postfix will do the randomization (assuming I haven't disabled that globally 
with smtp_randomize_addresses).

I know that I could theoretically do some hijinx with a transport map, but that 
doesn't seem wise.

So, the downside of addressing my MTAs directly through a round-robin A record 
might be the lack of the fallback that an MX record would have described. But 
for HA outbound stuff, I've got bigger problems if one of my MTAs on the edge 
goes down, so let's ignore that for now.

With that assumption, it seems like MX versus round-robin A addressing of my 
MTAs performs pretty much equally, save for the failover inherent in MX. 
However, should I count backoff/retry responses from these MTAs as an argument 
for letting DNS and MX do their thing properly? If so, it would seem that 
round-robin A addressing is therefore considered harmful. Is there a 
difference in TTLs or anything else that I should consider from the origin 
Postfix server which might weigh in here?

Obviously, true load balancing would be the best option, perhaps by leveraging 
RabbitMQ and having realtime health metrics for each relay affect the routing 
from Postfix to that relay, but then we're getting into a completely different 
architecture. Using MX and SMTP negotiation the way they were intended is a lot 
simpler, but I'm just looking to optimize everything I can here.

Thanks in advance for any clarification.


Postfix will internally randomize either A records or equal-weight MX records, so it doesn't make too much difference which you use. A transport_maps entry that resolves to either multiple A records or multiple equal-weight MX records will perform about the same as a relayhost setting (assuming the normal case of negligible time spent on transport_maps lookup).
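
As a rough sketch of the variants compared above, the settings might look like one of the following. The host names (relay.example.net, example.net, example.org) and the transport file path are placeholders for illustration, not values from this thread.

```
# /etc/postfix/main.cf (pick ONE variant; host names are placeholders)

# Variant 1: bracketed name. No MX lookup; Postfix resolves the
# address records of relay.example.net and randomizes among them.
relayhost = [relay.example.net]

# Variant 2: unbracketed name. Postfix looks up the MX records of
# example.net and randomizes among those with equal preference.
#relayhost = example.net

# Variant 3: the next hop expressed through a transport map,
# here for a single destination domain.
#transport_maps = hash:/etc/postfix/transport
#
# /etc/postfix/transport would then contain a line such as:
#   example.org    smtp:[relay.example.net]
# followed by running: postmap /etc/postfix/transport
```

In all three cases the randomization relies on smtp_randomize_addresses being left at its default of yes.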

The Postfix connection caching algorithm will automatically limit the damage caused by a subset of slow-responding relayhosts.
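
For reference, the connection cache is governed by a handful of main.cf parameters. The values below are what I believe the stock defaults to be, shown only to name the knobs; check `postconf -d` on your own version rather than trusting this sketch.

```
# Cache connections automatically when a destination has a high
# volume of mail in the active queue.
smtp_connection_cache_on_demand = yes

# How long an unused cached connection is kept open.
smtp_connection_cache_time_limit = 2s

# Upper bound on how long one connection may be reused.
smtp_connection_reuse_time_limit = 300s
```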

You can increase concurrency for relayhosts under your direct control if they can handle the load (it's impolite to open dozens/hundreds of connections to someone else's server without prior agreement).
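
A minimal example of raising that concurrency, assuming the mail leaves through the standard smtp transport. The value 50 is an arbitrary illustration, not a recommendation; the number of smtp client processes is still capped by default_process_limit (100 by default) unless that is raised as well.

```
# /etc/postfix/main.cf
# Per-destination concurrency for the smtp transport.
# The stock default is 20 (inherited from
# default_destination_concurrency_limit).
smtp_destination_concurrency_limit = 50
```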

The default settings should give very good performance. For knobs to twist, please see:
http://www.postfix.org/TUNING_README.html#mailing_tips
http://www.postfix.org/QSHAPE_README.html
http://www.postfix.org/QSHAPE_README.html#backlog


  -- Noel Jones
