Re: [vchkpw] My single point of failure... failed
DAve wrote:
> Tren Blackburn wrote:
> > Hi DAve;
> >
> > > -----Original Message-----
> > > From: DAve [mailto:[EMAIL PROTECTED]
> > > Sent: Friday, October 05, 2007 11:39 AM
> > > To: vpopmail
> > > Subject: [vchkpw] My single point of failure... failed
> > >
> > > I got bit hard this morning and I am looking for a solution. I have been slowly getting our email system up to snuff moving from a pair of servers to two gateway AV scanners, three vpopmail toasters, and two outbound qmail servers. The toasters mount the Maildirs via NFS, the AV scanners talk to the toasters via milter-ahead, and the NFS mailstore hosts MySQL for vpopmail.
> > >
> > > I've just gotten load balancers installed and moved the outbound traffic there first, getting a good load test on vpopmaild for smtp-auth. I had promised to provide the scripts and now I am actually seeing how well they work.
> > >
> > > Problems arose when my NFS server went stupid this morning and all mail stopped. AV scanners couldn't verify mailboxes because the toasters couldn't see MySQL, the outbound servers couldn't do smtp-auth for the same reason. It wouldn't have mattered anyway because my Maildirs were offline. NFS is my single point of failure, even though it is RAID5, dual NIC, dual power supply (SUN Enterprise 250), it went offline.
> > >
> > > I need to fix that, I can cluster MySQL but I am looking for ways to have either a clustered NFS with rw permissions and appropriate locking/syncing, or NFS failover from the toasters.
> > >
> > > I am looking at GFS and active/active NFS and HaNFS. Has anyone gone down this path yet?
> >
> > I have. There's a couple ways of doing this. I've never played with GFS so I can't comment on that. The easiest solution I've found is doing an Active/Standby configuration between 2 nodes using DRBD to replicate the data in real time. There's quite a few solutions out there to handle resource seizure on node failure. If you want absolutely simple, go heartbeat v1. If you want to break your mailstore into 2 pieces (I have no idea how large of a mailstore you're working with. Mine is breaking 70G pretty soon) then you can do an Active/Active configuration using the High Availability manager from LinuxHA.net. I like that product mainly because it's written specifically for 2 node active/active clusters. And if you really want to muddy the waters, you can go with heartbeat v2 (I still have a bad taste in my mouth from it though)
> >
> > It's always best to keep major components on their own sets of boxen. My MySQL servers are a 2 node load balanced multi-master replicated pair. My Mailstore is a 2 node Active/Passive pair as described above (I cheat a bit and do some iSCSI exports on the "passive" box to the Windows people who demanded I share my storage with them. It's also handled by the HA software, so if the box exporting the iSCSI targets goes down, it shuffles across to the NFS box, and vice-versa) My inbound/outbound SMTP is across 4 dedicated load-balanced boxen. IMAP4(s)/POP3(s) is on its own pair and same with Web.
> >
> > If any of this seems useful to you let me know. No one should have to go through the nightmare of a key server going down. I hate getting yelled at. :)
>
> I am at least on the right or similar track. Here is some more background.
>
> Currently the gateways run MailScanner/sendmail/spamassassin/clamav/bitdefender, we have vpopmail/chkuser on the eclusters (toasters) providing pop and webmail, and the outbound servers provide smtp and smtp-auth (to become smtp-auth only) also running spamassassin and clamav via simscan. Everything sits behind a PIX and everything will eventually sit behind two Coyote Point EQ350si devices. Right now only the outbound servers are being load balanced.
>
> I am liking the look of HaNFS and DRBD but I have to look toward the future which involves sending half my mail system to a remote NOC. We have a dedicated 1GB fiber to provide a private LAN between the NOCs. My concern is over resyncing the mailstores after a fiber failure, which I KNOW will happen sooner or later. Not real sure if active/active or active/passive will be the best option, resyncing in general doesn't look inviting. My mailstore is only 60GB, few clients use webmail, most download everything all day. But it would certainly be a concern.
>
> When I set up MySQL as a cluster I will also be installing a local RO slave on each ecluster (toaster), just for auth purposes.
>
> I am assuming you found no problems running vpopmail/qmail on your mailstores? How do you handle failover? Any problems with qmail-local during deliveries?
>
> Thanks for the response.
>
> DAve

This is my setup, it seems to work fairly well. I was using NFS for the mail stores at one point but because I couldn't get a handle on my performance problems I dropped it and put the mail stores on the local machine.

I have two machines with two drives in each machine. Disk sda1 on each machine is the OS, sda2 is config
Re: [vchkpw] My single point of failure... failed
Tren Blackburn wrote:
> Hi DAve;
>
> > -----Original Message-----
> > From: DAve [mailto:[EMAIL PROTECTED]
> > Sent: Friday, October 05, 2007 11:39 AM
> > To: vpopmail
> > Subject: [vchkpw] My single point of failure... failed
> >
> > I got bit hard this morning and I am looking for a solution. I have been slowly getting our email system up to snuff moving from a pair of servers to two gateway AV scanners, three vpopmail toasters, and two outbound qmail servers. The toasters mount the Maildirs via NFS, the AV scanners talk to the toasters via milter-ahead, and the NFS mailstore hosts MySQL for vpopmail.
> >
> > I've just gotten load balancers installed and moved the outbound traffic there first, getting a good load test on vpopmaild for smtp-auth. I had promised to provide the scripts and now I am actually seeing how well they work.
> >
> > Problems arose when my NFS server went stupid this morning and all mail stopped. AV scanners couldn't verify mailboxes because the toasters couldn't see MySQL, the outbound servers couldn't do smtp-auth for the same reason. It wouldn't have mattered anyway because my Maildirs were offline. NFS is my single point of failure, even though it is RAID5, dual NIC, dual power supply (SUN Enterprise 250), it went offline.
> >
> > I need to fix that, I can cluster MySQL but I am looking for ways to have either a clustered NFS with rw permissions and appropriate locking/syncing, or NFS failover from the toasters.
> >
> > I am looking at GFS and active/active NFS and HaNFS. Has anyone gone down this path yet?
>
> I have. There's a couple ways of doing this. I've never played with GFS so I can't comment on that. The easiest solution I've found is doing an Active/Standby configuration between 2 nodes using DRBD to replicate the data in real time. There's quite a few solutions out there to handle resource seizure on node failure. If you want absolutely simple, go heartbeat v1. If you want to break your mailstore into 2 pieces (I have no idea how large of a mailstore you're working with. Mine is breaking 70G pretty soon) then you can do an Active/Active configuration using the High Availability manager from LinuxHA.net. I like that product mainly because it's written specifically for 2 node active/active clusters. And if you really want to muddy the waters, you can go with heartbeat v2 (I still have a bad taste in my mouth from it though)
>
> It's always best to keep major components on their own sets of boxen. My MySQL servers are a 2 node load balanced multi-master replicated pair. My Mailstore is a 2 node Active/Passive pair as described above (I cheat a bit and do some iSCSI exports on the "passive" box to the Windows people who demanded I share my storage with them. It's also handled by the HA software, so if the box exporting the iSCSI targets goes down, it shuffles across to the NFS box, and vice-versa) My inbound/outbound SMTP is across 4 dedicated load-balanced boxen. IMAP4(s)/POP3(s) is on its own pair and same with Web.
>
> If any of this seems useful to you let me know. No one should have to go through the nightmare of a key server going down. I hate getting yelled at. :)

I am at least on the right or similar track. Here is some more background.

Currently the gateways run MailScanner/sendmail/spamassassin/clamav/bitdefender, we have vpopmail/chkuser on the eclusters (toasters) providing pop and webmail, and the outbound servers provide smtp and smtp-auth (to become smtp-auth only) also running spamassassin and clamav via simscan. Everything sits behind a PIX and everything will eventually sit behind two Coyote Point EQ350si devices. Right now only the outbound servers are being load balanced.

I am liking the look of HaNFS and DRBD but I have to look toward the future which involves sending half my mail system to a remote NOC. We have a dedicated 1GB fiber to provide a private LAN between the NOCs. My concern is over resyncing the mailstores after a fiber failure, which I KNOW will happen sooner or later. Not real sure if active/active or active/passive will be the best option, resyncing in general doesn't look inviting. My mailstore is only 60GB, few clients use webmail, most download everything all day. But it would certainly be a concern.

When I set up MySQL as a cluster I will also be installing a local RO slave on each ecluster (toaster), just for auth purposes.

I am assuming you found no problems running vpopmail/qmail on your mailstores? How do you handle failover? Any problems with qmail-local during deliveries?

Thanks for the response.

DAve

--
Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
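As a rough yardstick for the resync concern above, here is a back-of-the-envelope sketch in Python. It assumes the "1GB fiber" means a 1 gigabit per second link, uses decimal gigabytes, and treats the `efficiency` factor as a made-up knob for protocol overhead and competing traffic. The function name and the 5% figure are purely illustrative. Block-level replication such as DRBD normally only resends blocks that changed while the link was down, so the full-copy number is a worst case rather than the expected cost.

```python
def resync_seconds(data_bytes, link_bits_per_second, efficiency=1.0):
    """Estimate the time to push data_bytes across a link.

    efficiency is the assumed fraction of raw link speed that is
    actually usable (1.0 means perfect line rate).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return data_bytes * 8 / (link_bits_per_second * efficiency)


MAILSTORE_BYTES = 60 * 10**9   # the 60GB mailstore mentioned above
LINK_BPS = 10**9               # assumption: 1 gigabit per second

# Worst case: the whole mailstore has to be copied again.
full_copy = resync_seconds(MAILSTORE_BYTES, LINK_BPS)

# Same copy if only half the line rate is really available.
half_rate = resync_seconds(MAILSTORE_BYTES, LINK_BPS, efficiency=0.5)

# Hypothetical outage during which 5% of the mailstore (3 GB) changed.
partial = resync_seconds(3 * 10**9, LINK_BPS)

print(f"full copy at line rate: {full_copy / 60:.0f} min")
print(f"full copy at half rate: {half_rate / 60:.0f} min")
print(f"5% changed, line rate:  {partial:.0f} s")
```

Under these assumptions the raw transfer time is minutes, not hours, so the harder question is likely what happens to writes made on both sides while the fiber is down, not how long the copy takes.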
Re: [vchkpw] My single point of failure... failed
DAve wrote (at Fri, Oct 05, 2007 at 02:39:21PM -0400):
> I am looking at GFS and active/active NFS and HaNFS. Has anyone gone
> down this path yet?

I haven't yet traveled this path, but I have strongly considered it. Please let us know what you come up with and how it works out.

--
Casey Zacek
Network Services
NeoSpire, Inc.
1807 Ross Ave., Ste. 300
Dallas, TX 75201
www.neospire.net -- Managed Hosting Solutions
P. 214-468-0768
F. 214-720-1836
RE: [vchkpw] My single point of failure... failed
Hi DAve;

> -----Original Message-----
> From: DAve [mailto:[EMAIL PROTECTED]
> Sent: Friday, October 05, 2007 11:39 AM
> To: vpopmail
> Subject: [vchkpw] My single point of failure... failed
>
> I got bit hard this morning and I am looking for a solution. I have been slowly getting our email system up to snuff moving from a pair of servers to two gateway AV scanners, three vpopmail toasters, and two outbound qmail servers. The toasters mount the Maildirs via NFS, the AV scanners talk to the toasters via milter-ahead, and the NFS mailstore hosts MySQL for vpopmail.
>
> I've just gotten load balancers installed and moved the outbound traffic there first, getting a good load test on vpopmaild for smtp-auth. I had promised to provide the scripts and now I am actually seeing how well they work.
>
> Problems arose when my NFS server went stupid this morning and all mail stopped. AV scanners couldn't verify mailboxes because the toasters couldn't see MySQL, the outbound servers couldn't do smtp-auth for the same reason. It wouldn't have mattered anyway because my Maildirs were offline. NFS is my single point of failure, even though it is RAID5, dual NIC, dual power supply (SUN Enterprise 250), it went offline.
>
> I need to fix that, I can cluster MySQL but I am looking for ways to have either a clustered NFS with rw permissions and appropriate locking/syncing, or NFS failover from the toasters.
>
> I am looking at GFS and active/active NFS and HaNFS. Has anyone gone down this path yet?

I have. There's a couple ways of doing this. I've never played with GFS so I can't comment on that. The easiest solution I've found is doing an Active/Standby configuration between 2 nodes using DRBD to replicate the data in real time. There's quite a few solutions out there to handle resource seizure on node failure. If you want absolutely simple, go heartbeat v1. If you want to break your mailstore into 2 pieces (I have no idea how large of a mailstore you're working with. Mine is breaking 70G pretty soon) then you can do an Active/Active configuration using the High Availability manager from LinuxHA.net. I like that product mainly because it's written specifically for 2 node active/active clusters. And if you really want to muddy the waters, you can go with heartbeat v2 (I still have a bad taste in my mouth from it though)

It's always best to keep major components on their own sets of boxen. My MySQL servers are a 2 node load balanced multi-master replicated pair. My Mailstore is a 2 node Active/Passive pair as described above (I cheat a bit and do some iSCSI exports on the "passive" box to the Windows people who demanded I share my storage with them. It's also handled by the HA software, so if the box exporting the iSCSI targets goes down, it shuffles across to the NFS box, and vice-versa) My inbound/outbound SMTP is across 4 dedicated load-balanced boxen. IMAP4(s)/POP3(s) is on its own pair and same with Web.

If any of this seems useful to you let me know. No one should have to go through the nightmare of a key server going down. I hate getting yelled at. :)

HTH,

Tren
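To make the Active/Standby idea above a little more concrete, here is a toy Python model of the standby side: it listens for heartbeats from the active node and seizes the resources once the peer has been silent for longer than a dead time. This is only a sketch of the general technique, not how heartbeat itself is implemented. The class name, the resource names, and their order are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StandbyMonitor:
    """Toy model of the standby node in a two-node active/standby pair."""

    deadtime: float  # seconds of silence before the peer is declared dead
    # Hypothetical resource list; a real cluster defines its own.
    resources: tuple = ("virtual_ip", "drbd_primary",
                        "mount_mailstore", "nfs_export")
    last_seen: float = 0.0
    active: bool = False
    log: list = field(default_factory=list)

    def heartbeat(self, now):
        """Record that the active node was heard from at time `now`."""
        self.last_seen = now

    def tick(self, now):
        """Check the peer; take over the resources if it has gone quiet."""
        if not self.active and now - self.last_seen > self.deadtime:
            for resource in self.resources:
                self.log.append(f"acquire {resource}")
            self.active = True
        return self.active


monitor = StandbyMonitor(deadtime=10.0)
monitor.heartbeat(0.0)
monitor.tick(5.0)       # peer heard 5 s ago: stay standby
monitor.heartbeat(8.0)
monitor.tick(17.0)      # 9 s of silence: still within the dead time
monitor.tick(18.5)      # 10.5 s of silence: take over
print(monitor.log)
```

The dead time is the main tuning choice in a setup like this: too short and a brief network hiccup triggers a needless takeover, too long and mail stays down while the standby waits.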
[vchkpw] My single point of failure... failed
I got bit hard this morning and I am looking for a solution. I have been slowly getting our email system up to snuff moving from a pair of servers to two gateway AV scanners, three vpopmail toasters, and two outbound qmail servers. The toasters mount the Maildirs via NFS, the AV scanners talk to the toasters via milter-ahead, and the NFS mailstore hosts MySQL for vpopmail.

I've just gotten load balancers installed and moved the outbound traffic there first, getting a good load test on vpopmaild for smtp-auth. I had promised to provide the scripts and now I am actually seeing how well they work.

Problems arose when my NFS server went stupid this morning and all mail stopped. AV scanners couldn't verify mailboxes because the toasters couldn't see MySQL, the outbound servers couldn't do smtp-auth for the same reason. It wouldn't have mattered anyway because my Maildirs were offline. NFS is my single point of failure, even though it is RAID5, dual NIC, dual power supply (SUN Enterprise 250), it went offline.

I need to fix that, I can cluster MySQL but I am looking for ways to have either a clustered NFS with rw permissions and appropriate locking/syncing, or NFS failover from the toasters.

I am looking at GFS and active/active NFS and HaNFS. Has anyone gone down this path yet?

Thanks,

DAve

--
Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
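The failure cascade described in this message can be sketched as a small dependency graph. The Python below is an illustration only: the component names are invented labels for the pieces mentioned above, and the edges are one reading of the description (for example, that the outbound servers rely on the toasters for smtp-auth). Given a failed component, it walks the graph to list everything that goes down with it.

```python
from collections import deque

# Hypothetical labels; each entry reads "component: what it depends on".
DEPENDS_ON = {
    "av_gateways": ["toasters"],        # mailbox checks via milter-ahead
    "outbound_smtp": ["toasters"],      # smtp-auth (assumed to go via the toasters)
    "toasters": ["mysql", "maildirs"],  # vpopmail lookups and NFS-mounted Maildirs
    "mysql": ["nfs_server"],            # MySQL is hosted on the NFS box
    "maildirs": ["nfs_server"],         # Maildirs are exported over NFS
    "nfs_server": [],
}


def affected_by(failed, depends_on):
    """Return every component that directly or indirectly needs `failed`."""
    # Invert the graph: for each component, who depends on it?
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(name)

    # Breadth-first walk outward from the failed component.
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for name in dependents[current]:
            if name not in seen:
                seen.add(name)
                queue.append(name)
    return seen


print(sorted(affected_by("nfs_server", DEPENDS_ON)))
print(sorted(affected_by("mysql", DEPENDS_ON)))
```

In this model, losing `nfs_server` takes out every other component, while moving MySQL to its own cluster only removes one of the two paths from the toasters back to the NFS box.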