Re: LoadShared Failover
Hi List,

I have now looked all over the web to try to find the best possible solution for me (a redundant, load-shared, sending-only mail gateway). This is currently what I am thinking of doing:

1. Set up 2 Postfix servers in 2 physically different locations with the same configuration (handled by our HostConfigurationManagementSystem).
2. DNS will be configured like:

    ; zone file fragment
            IN  MX  10  mail.example.com.
    mail    IN  A   10.10.10.100
            IN  A   10.10.20.100

3. Clients will use mail.example.com as server.

The only problem I see now is when one of the Postfix servers dies. Clients will still try to send mail to it, as they are DNS RR'ed, but will of course get no response if they hit the dead one.

(How) do I handle this? Or will I just have to live with the time lost when clients connect to the dead Postfix server and have to retry?

I can compensate a bit by setting a low DNS TTL (like 15 minutes) and removing the dead DNS entry manually when our monitoring system alerts about the port not responding, but I would like to implement a real redundant system if at all possible. How do I do this? Is there any howto I might have missed?

Thanks in advance :) !
~maymann

2012/3/13 Stan Hoeppner s...@hardwarefreak.com
> On 3/12/2012 1:29 PM, Michael Maymann wrote:
>> Hi,
>> Stan: thanks for your reply. I was talking about NIC bonding:
>> http://www.howtoforge.com/nic_bonding
>> But if that is not the way to go, then that won't matter anymore... and no need for RedHat support either...
>
> NIC bonding isn't applicable to your dual relay host scenario.
>
>> I'm a simple SMTP/PostFix beginner and just trying to learn as I go along - thought the mailing list would be a good place to get some initial answers so I can start looking in the right places - first things first... :) !
>
> You have it backwards. The Postfix mailing list is a last resort resource and is meant more for troubleshooting than system design assistance or education. You are expected to read all applicable Postfix and RFC/BCP documentation and troubleshoot issues until you are sure you cannot resolve them on your own. *Then* post a help query on the Postfix list. It is not a teaching resource. Please don't treat it as such.
>
>> If RR DNS is the way forward, then I guess I would need to configure:
>>
>>     ; zone file fragment
>>             IN  MX  10  mail.example.com.
>>     mail    IN  A   192.168.0.4
>>             IN  A   192.168.0.5
>>
>> and point all my MUAs to mail.example.com
>>
>> Just to try and understand better how this communication would work:
>> 1. Do the MUAs then just retry if they don't get an answer from one of the MTAs ?
>> 2. If so, will this then always generate a new nslookup / will it use a cache / do I need to configure this on the MUAs ?
>> 3. Is there a default number of retries (and does this differ from MUA to MUA), or are mails just queued forever on the MUAs until properly delivered to a responsive MTA ?
>
> See the bind manual, or the manual of whichever DNS server daemon you happen to be using, and other applicable guides to round robin DNS.
>
> --
> Stan
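The monitoring idea in the message above (stop publishing an A record once the relay's port no longer answers) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `port_is_open` and `addresses_to_publish` are hypothetical, port 25 is assumed, and the step that actually rewrites the zone is deliberately left out because it depends on the DNS server in use.

```python
import socket
from typing import Callable, Iterable, List


def port_is_open(address: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to address:port succeeds within timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def addresses_to_publish(
    candidates: Iterable[str],
    probe: Callable[[str], bool] = port_is_open,
) -> List[str]:
    """Return the candidate addresses whose relay currently answers.

    Assumption: if every probe fails, all candidates are returned unchanged,
    on the theory that an empty record set is worse than a stale one.
    """
    candidates = list(candidates)
    alive = [addr for addr in candidates if probe(addr)]
    return alive if alive else candidates
```

Even with such a check in place, clients keep using a cached answer until the TTL expires, so this shortens the outage window rather than removing it.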
Re: LoadShared Failover
Michael Maymann:
> ; zone file fragment
>         IN  MX  10  mail.example.com.
> mail    IN  A   10.10.10.100
>         IN  A   10.10.20.100
>
> 3. Clients will use mail.example.com as server.
>
> Only problem I see now is when one of the postfix servers dies. Clients will still try to send mails to it as they are DNS RR'ed, but would get no response of course if they hit the dead one.

In that case the client should try the other IP address.

Wietse
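What "try the other IP address" means on the client side can be sketched like this. It is a minimal illustration, not a description of any particular mail client: the function name `connect_to_first_reachable` and the `connect` parameter are hypothetical, and port 25 and the 10-second timeout are assumptions. Python's own `socket.create_connection` already behaves this way when given a hostname that resolves to several addresses; the loop is written out only to make the behaviour visible.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple


def connect_to_first_reachable(
    addresses: Iterable[str],
    port: int = 25,
    timeout: float = 10.0,
    connect: Callable[..., object] = socket.create_connection,
) -> Tuple[str, object]:
    """Try each address in order and return (address, connection) for the first success."""
    last_error: Optional[OSError] = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout=timeout)
        except OSError as exc:
            # Dead or unreachable server: remember the error, move on to the next address.
            last_error = exc
    if last_error is None:
        raise ValueError("no addresses given")
    raise last_error
```

The cost of hitting the dead server first is bounded by the connect timeout, which is the "time-loss" asked about in the question.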
RE: LoadShared Failover
From: owner-postfix-us...@postfix.org [mailto:owner-postfix-us...@postfix.org] On Behalf Of Michael Maymann
Sent: Thursday, March 29, 2012 4:01 AM
To: postfix-users@postfix.org
Subject: Re: LoadShared Failover

> Hi List,
> Only problem I see now is when one of the postfix servers dies. Clients will still try to send mails to it as they are DNS RR'ed, but would get no response of course if they hit the dead one.
> (How) Do I handle this ? or will I just have to live with the time-loss, clients connecting to dead postfix server, gives me when it has to retry ?

[Aaron Bennett]
Or buy a commercial load balancer, or build one out of something like the linux-ha project (http://www.linux-ha.org/wiki/Main_Page).
Re: LoadShared Failover
Hi,

Stan: My question is not how I set up the solution, but how I *BEST* (best practice) set up the loadshared/failover postfix solution I described earlier.
If there isn't a nice howto already, I guess I can figure this out myself. Bonding is easy, if this is the preferred solution for a postfix install like mine, but if it is, how do you cope with the question I asked earlier:
- How do I solve client-server communication, when requests will not get answered from the same IP - or can they be - and if so: how do I do this, is there a how-to on setting this up on RHEL6 ?

I would just like to hear the list's opinion before going in any specific direction and figuring out that it was the wrong one... :) !

Best regards
~maymann

2012/3/12 Stan Hoeppner s...@hardwarefreak.com
> On 3/10/2012 8:30 AM, Michael Maymann wrote:
>> How do I best setup a loadshared failover postfix mailrelay solution for this on RHEL6 ?
>
> You consult the RHEL6 documentation. If you don't find the answer there, you contact Red Hat support who will point you in the right direction. Isn't this why you use a paid commercial Linux distro?
>
> --
> Stan
Re: LoadShared Failover
On 3/12/2012 2:28 AM, Michael Maymann wrote:
> Hi,
> Stan: My question is not how I setup the solution, but how I *BEST* (best practice) setup the loadshared/failover postfix solution I described earlier.

I dunno if there is a BCP covering smtp submission/relay server load balancing/fail over. I'd make an educated guess that just about everyone with more than one submission/relay server is using round robin DNS.

> If there isn't a nice howto already, I guess I can figure this out myself -

There are many. You can Google faster than I can point you to lmgtfy. I'd have thought you'd have already done so...

> bonding is easy, if this is the prefered solution for a postfix install

What kind of bonding are you referring to here?

> like mine - but if it is: how do you cope with the question I asked earlier:

Like yours? You have two outbound submission/relay servers, correct? Nothing unique here.

> - How do I solve client-server communication, when requests will not get answered from same IP - or can it be - and if so: how do I do this, is

You're over thinking this.

> there a how-to on setting this up on RHEL6 ?

My point was that you've already paid for support. Simply call and ask RHEL support the question you're asking here. Surely they'd point you in the right direction.

> Would just like to hear the lists opinion before going in any specific direction, and figuring out that was the wrong one...:) !

This question doesn't come up very often. When it does the OP is working at scale (think dozens of relays) and he's after parallel performance optimization, not simple fail over redundancy. The extremely low frequency of this question should tell you something about the solution people are using, and the level of difficulty required to implement it. I.e. this requirement is mundane, has been around forever, as has the solution, which is round robin DNS.

--
Stan
Re: LoadShared Failover
Stan Hoeppner wrote:
> On 3/12/2012 2:28 AM, Michael Maymann wrote:
>> Hi,
>> Stan: My question is not how I setup the solution, but how I *BEST* (best practice) setup the loadshared/failover postfix solution I described earlier.
>
> I dunno if there is a BCP covering smtp submission/relay server load balancing/fail over. I'd make an educated guess that just about everyone with more than one submission/relay server is using round robin DNS.

*raises hand* We're using Linux Virtual Servers for load-balancing all public-facing bits (and a handful of internal bits) of our entire mail cluster. Once you beat the ARP configuration into shape to prevent a random real server from taking over the load-balanced IP it seems to work well.

We found that DNS-based round-robin strategies didn't actually balance the load very well.

-kgd
Re: LoadShared Failover
Kris Deugau:
> We found that DNS-based round-robin strategies didn't actually balance the load very well.

This looks like the same problem that was found (and solved) with Postfix outbound connection caching; if a destination host became slow for whatever reason, it became a fatal attractor for connections.

For example, twice as slow - twice as many clients.

With outbound connection caching, this was solved in the Postfix SMTP client, by limiting the total duration of an SMTP session.

For example, twice as slow - half the number of sessions.

With inbound SMTP, it is not possible to tell clients to go somewhere else except by interposition, for example with a layer-3 proxy (nginx), with a layer 2 switch/nat/etc, or by interposing at the DNS level (adjust DNS replies according to server load). I don't know if the last is in use for SMTP.

Wietse
Re: LoadShared Failover
There is one correction, in-line.

> Kris Deugau:
>> We found that DNS-based round-robin strategies didn't actually balance the load very well.
>
> This looks like the same problem that was found (and solved) with Postfix outbound connection caching; if a destination host became slow for whatever reason, it became a fatal attractor for connections.
>
> For example, twice as slow - twice as many clients.
>
> With outbound connection caching, this was solved in the Postfix SMTP client, by limiting the total duration of an SMTP session.
>
> For example, twice as slow - half the number of sessions

Correction: half the number of deliveries.

> With inbound SMTP, it is not possible to tell clients to go somewhere else except by interposition, for example with a layer-3 proxy (nginx), with a layer 2 switch/nat/etc, or by interposing at the DNS level (adjust DNS replies according to server load). I don't know if the last is in use for SMTP.

Wietse
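The "twice as slow - twice as many clients" effect, and the corrected "half the number of deliveries", can be checked with a little arithmetic. The sketch below is only an illustration: it uses Little's law for the inbound case and a simple division for the session time limit, and every number in it (20 connections per second, a 10-second session limit, 1 and 2 second handling times) is a made-up assumption rather than a Postfix default.

```python
def concurrent_sessions(arrivals_per_second: float, session_seconds: float) -> float:
    """Little's law: average sessions in progress = arrival rate * session duration."""
    return arrivals_per_second * session_seconds


def deliveries_per_session(session_limit_seconds: float, seconds_per_delivery: float) -> int:
    """Deliveries that fit into one session when its total duration is capped."""
    return int(session_limit_seconds // seconds_per_delivery)


# Round robin DNS splits new connections evenly, regardless of server speed.
total_rate = 20.0                      # hypothetical connections per second
per_server_rate = total_rate / 2       # two servers behind one name

fast = concurrent_sessions(per_server_rate, 1.0)   # healthy server, 1 s per session
slow = concurrent_sessions(per_server_rate, 2.0)   # degraded server, 2 s per session
print(f"fast server holds {fast:.0f} sessions, slow server holds {slow:.0f}")

# Outbound side: with a capped session duration, a slow destination gets
# fewer deliveries per session instead of holding the connection longer.
limit = 10.0                           # hypothetical session time limit in seconds
print(deliveries_per_session(limit, 1.0), "deliveries to the fast destination")
print(deliveries_per_session(limit, 2.0), "deliveries to the slow destination")
```

The inbound half shows why plain round robin does not balance load: the split is by connection count, so the slower server ends up with more work in progress, not less.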
Re: LoadShared Failover
On 3/10/2012 8:30 AM, Michael Maymann wrote:
> How do I best setup a loadshared failover postfix mailrelay solution for this on RHEL6 ?

You consult the RHEL6 documentation. If you don't find the answer there, you contact Red Hat support who will point you in the right direction. Isn't this why you use a paid commercial Linux distro?

--
Stan
Re: LoadShared Failover
If RoundRobin is the best practice/preferred solution, should I then do:

    ; zone file fragment
            IN  MX  10  mail.example.com.
    mail    IN  A   192.168.0.4
            IN  A   192.168.0.5
            IN  A   192.168.0.6

or

    ; zone file fragment
            IN  MX  10  mail.example.com.
            IN  MX  10  mail1.example.com.
            IN  MX  10  mail2.example.com.
    mail    IN  A   192.168.0.4
    mail1   IN  A   192.168.0.5
    mail2   IN  A   192.168.0.6

I think I would prefer the first solution, as a single hostname can be distributed to end users.
Will this automatically interfere with our corporate mail on the same domain? Is there anything DHCP/DNS MX-to-clients update-wise I should be aware of ?

Thanks in advance :) !
~maymann

2012/3/10 Michael Maymann mich...@maymann.org
> Hi List,
>
> I would like to set up a LoadShared Failover internal mail-relay solution (only for sending mail internal-external).
>
> My thoughts:
> - Set up virtual+physical server in same VLAN (different physical locations) with same OS+Postfix+config
> - Configure DNS RoundRobin
> - Have logging from both servers pointing to same NFS-dir and have awstats create statistics from there
>
> Internal traffic:
> - Requests would all be received on RoundRobin_IP, and therefore LoadShared between the servers
> - Answers would all be sent through Server_IP
>
> External traffic:
> - All traffic is done through Server_IP
>
> 1. Are the clients ok with answers coming from a different IP than the one they sent to (or how do I prevent this from disrupting client-server communication - some PostFix/other magic ?)
> 2. What happens if one of my servers dies ? Will RoundRobin still try to send traffic to it, and if so how will clients react to this ?
> 3. Would Bonding be a better solution for my purpose ?
> 4. Is there already a RHEL6 howto somewhere that you can recommend ?
> 5. What is best practice ?
>
> Thanks in advance :-) !
> ~maymann
Re: LoadShared Failover
Michael Maymann:
> If RoundRobin is best practise/preferred solution, should I then do:
>
>     ; zone file fragment
>             IN  MX  10  mail.example.com.
>     mail    IN  A   192.168.0.4
>             IN  A   192.168.0.5
>             IN  A   192.168.0.6
>
> or
>
>     ; zone file fragment
>             IN  MX  10  mail.example.com.
>             IN  MX  10  mail1.example.com.
>             IN  MX  10  mail2.example.com.
>     mail    IN  A   192.168.0.4
>     mail1   IN  A   192.168.0.5
>     mail2   IN  A   192.168.0.6
>
> I think I would prefer the first solution - as a single hostname can be distributed to endusers.

MX lookups are for MTAs, end-user mail clients should connect to the A record on port 587.

Wietse
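For a home-grown client, "connect to the A record on port 587" could look roughly like the sketch below. It is an illustration under assumptions, not a recommended configuration: the helper names `build_message` and `submit` and the `smtp_factory` parameter are hypothetical, the relay is assumed to offer STARTTLS, and authentication is left out. No MX lookup is involved; `smtplib` simply resolves the hostname's address records and connects.

```python
import smtplib
from email.message import EmailMessage
from typing import Callable


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Build a simple plain-text message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def submit(
    msg: EmailMessage,
    host: str = "mail.example.com",
    port: int = 587,
    smtp_factory: Callable[..., smtplib.SMTP] = smtplib.SMTP,
) -> None:
    """Hand the message to the relay on the submission port.

    Assumption: the relay supports STARTTLS and accepts mail from this
    client without authentication (e.g. based on its network address).
    """
    with smtp_factory(host, port, timeout=30) as smtp:
        smtp.starttls()
        smtp.send_message(msg)
```

A call such as `submit(build_message("app@example.com", "user@example.org", "test", "hello"))` would then go to whichever address behind mail.example.com accepts the connection.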
Re: LoadShared Failover
Hi,

Wietse: thanks for your quick reply :) !

We have the following internal clients:
- RD Linux sendmail clients
- some special_home_brew websolutions that end users maintain
- NetApp storage systems
- etc.

Mail path:
Internal_clients -> my_postfix_mailrelay(s) -> external_receiving_mailserver

We're not receiving any external mails...!

How do I best set up a loadshared failover postfix mailrelay solution for this on RHEL6 ?

Thanks in advance :-) !
~maymann

2012/3/10 Wietse Venema wie...@porcupine.org
> Michael Maymann:
>> If RoundRobin is best practise/preferred solution, should I then do:
>>
>>     ; zone file fragment
>>             IN  MX  10  mail.example.com.
>>     mail    IN  A   192.168.0.4
>>             IN  A   192.168.0.5
>>             IN  A   192.168.0.6
>>
>> or
>>
>>     ; zone file fragment
>>             IN  MX  10  mail.example.com.
>>             IN  MX  10  mail1.example.com.
>>             IN  MX  10  mail2.example.com.
>>     mail    IN  A   192.168.0.4
>>     mail1   IN  A   192.168.0.5
>>     mail2   IN  A   192.168.0.6
>>
>> I think I would prefer the first solution - as a single hostname can be distributed to endusers.
>
> MX lookups are for MTAs, end-user mail clients should connect to the A record on port 587.
>
> Wietse
Re: LoadShared Failover
Michael Maymann:
> How do I best setup a loadshared failover postfix mailrelay solution for this on RHEL6 ?

To repeat my previous response:

- MX records are useful only for MTAs.
- If you have end-user clients, use A records.

Perhaps surprisingly, that response still stands. The mail protocols and implementations haven't changed in the few minutes since I first replied.

It is up to you to decide what is best for your network.

Wietse
Re: LoadShared Failover
On 2012-03-10 09:47, Michael Maymann wrote:
> ; zone file fragment
>         IN  MX  10  mail.example.com.
> mail    IN  A   192.168.0.4
>         IN  A   192.168.0.5
>         IN  A   192.168.0.6

Don't list RFC 1918 IPs in MX records, but if it's just a question of the model, go for this solution.
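Whether an address falls in RFC 1918 space can be checked mechanically. The sketch below is only an illustration; the helper name `is_rfc1918` is hypothetical, and the three networks are written out explicitly because `ipaddress`'s `is_private` attribute also covers other reserved ranges such as loopback.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address lies in one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


for candidate in ("192.168.0.4", "10.10.10.100", "172.32.0.1"):
    print(candidate, "is RFC 1918" if is_rfc1918(candidate) else "is not RFC 1918")
```

All the example addresses used in this thread (192.168.0.x and 10.10.x.100) fall inside these ranges.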
Re: LoadShared Failover
Hi,

Wietse: always nice with a bit of humor... :) ! I guess I then only need A records, as this will be our only mail server in-house for RD.

Benny: I guess this is not needed then, but just out of curiosity: for an internal sending-only mail relay, why can't I use RFC 1918 IPs ?

1. Is it best practice to set this up with bonding then ?
2. How do I solve client-server communication, when requests will not get answered from the same IP - or can they be - and if so: how do I do this, is there a how-to on setting this up on RHEL6 ?

Thanks in advance :-) !
~maymann

2012/3/10 Benny Pedersen m...@junc.org
> On 2012-03-10 09:47, Michael Maymann wrote:
>> ; zone file fragment
>>         IN  MX  10  mail.example.com.
>> mail    IN  A   192.168.0.4
>>         IN  A   192.168.0.5
>>         IN  A   192.168.0.6
>
> Don't list RFC 1918 IPs in MX records, but if it's just a question of the model, go for this solution.