I know this topic has been flogged to death, and perhaps for good reason, but I'm trying to determine the best setup for high-volume outbound relaying with Postfix.
As I understand it, the relayhost parameter accepts an FQDN that, when enclosed in square brackets, skips the MX lookup and resolves the name's address records directly. If I use round-robin A records for my mesh of MTAs out in my datacenters, I've got a reasonable randomization going on. If I use MX records of equal preference, Postfix will do the randomization (assuming I've not disabled that globally with smtp_randomize_addresses). I know that I could theoretically do some hijinks with a transport map, but that doesn't seem wise.

So, the downside of pointing directly at the round-robin A record of my MTAs might be the lack of the fallback that an MX record set would have described. But for HA outbound stuff, I've got bigger problems if one of my MTAs on the edge goes down, so let's ignore that for now. With that assumption, it seems like MX versus round-robin A addressing of my MTAs performs pretty much equally, save for the failover inherent in MX.

However, should I count the handling of backoff/retry responses from these MTAs as a point in favor of letting DNS and MX do their thing properly? If so, it would seem that round-robin A addressing should be considered harmful. Is there a difference in TTLs, or anything else on the origin Postfix server, that I should weigh here?

Obviously, true load balancing would be the best option, perhaps by leveraging RabbitMQ and letting real-time health metrics for each relay affect the routing from Postfix to that relay, but then we're getting into a completely different architecture. Using MX and SMTP negotiation the way they were intended is a lot simpler, but I'm just looking to optimize everything I can here.

Thanks in advance for any clarification.
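
P.S. To make the MX behaviour I'm describing concrete, here is a rough Python sketch of how I picture the candidate ordering: sort by preference, then shuffle within each equal-preference group. This is only my mental model of the standard MX selection rule, not Postfix's actual code. The function name order_mx_candidates and the example.com hostnames are made up for illustration.

```python
import random
from itertools import groupby


def order_mx_candidates(mx_records, rng=None):
    """Order (preference, hostname) pairs the way I assume an SMTP client would.

    Lower preference values are tried first; hosts sharing a preference
    are shuffled so load spreads across them. Hypothetical helper, not
    taken from Postfix.
    """
    if rng is None:
        rng = random.Random()
    ordered = []
    # Sort on preference only, then walk each equal-preference group.
    by_pref = sorted(mx_records, key=lambda rec: rec[0])
    for _pref, group in groupby(by_pref, key=lambda rec: rec[0]):
        hosts = [host for _p, host in group]
        rng.shuffle(hosts)  # randomize only within the same preference
        ordered.extend(hosts)
    return ordered


if __name__ == "__main__":
    # Placeholder records: two equal-weight relays plus a fallback.
    records = [
        (10, "mta1.example.com"),
        (10, "mta2.example.com"),
        (20, "backup.example.com"),
    ]
    print(order_mx_candidates(records))
```

With a bracketed relayhost there is no preference tier at all, so as far as I can tell the "backup" entry above simply has no equivalent in the round-robin A setup.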
