Michael Haardt wrote:
> On Wed, Jan 10, 2007 at 04:26:29PM +0100, Florian Weimer wrote:
>
>> If you want to throw money at the problem, a RAID controller with a
>> battery-backed cache is a good option as well.
>>
>
> You completely miss the point, so let me rephrase it: I am _not_
> talking about regular operation. I am talking about cleaning up a mess,
> e.g. after an attack or double/triple fault that managed to kill all
> redundancy. Additionally, exotic applications benefit from disabling
> fsync().
>
> It's not economical to run systems at 10% of their maximum performance
> just to have enough if shit happens, unless of course you just run a
> small site, where the economic disadvantage of doing so can be tolerated.
>
Errrrrr. I am somewhat concerned about your last statement.

I run the mail system for the University here, which isn't really a big
site, but we see over a million attempts to deliver mail a day, which
translates into about 46,000 real mail messages after greylisting. We
have internal mail servers which accept email from local users and
handle all internal communications, and we have a pair of external mail
servers which talk to the outside world.

Our mail servers are running at a fraction of their capacity precisely
because bad things happen too often. All it takes is some annoying
spammer out on the internet using one of our users as a fake "From"
address and we will see hundreds of thousands of error messages heading
our way. We've also seen cases of a trojan getting onto a local user's
PC, which then sent hundreds of thousands of email messages off site.
We've also had cases where our ISP, or the firewalls, or some other
sysadmin-type mistake has taken us down for a weekend, which means we
get three days of email on Monday.

So we always plan for the unexpected, and even though a mess happens
several times a year, I don't need to do anything to fix it.

I have tried to run a mail system in the way that you are trying to, and
I'm very happy that we have the resources here to run ours with lots of
spare capacity, because it makes my life simpler.

Having said that, here's what I used to do.

1. Find a way to stop whatever was generating the mess.

2. Move the input queue out of the way and restart exim.

   That at least gets you to the point where current email is flowing.

3. Move the valid email back into the real queue.

   Easier said than done, but judicious use of "grep" on the header
   files usually results in a short list of real email, and then it's
   just a case of moving the header and data files back into the normal
   queue space.

4. Delete the old queue that is now full of junk.
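For what it's worth, step 3 can be sketched in a few lines of Python.
This is only a rough illustration, not something I'd run blind: it
assumes the usual spool layout where each message is a pair of files
named <id>-H (headers) and <id>-D (data) in a single input directory
(no split_spool_directory), and the function name, directory arguments
and pattern are all made up for the example. Try it on a copy of the
parked queue first.

```python
import re
import shutil
from pathlib import Path


def rescue_messages(parked_input, live_input, header_pattern):
    """Move message pairs whose -H file matches header_pattern.

    parked_input: directory holding the queue that was moved aside
    live_input:   the input directory Exim is currently using
    Both names and the pattern are hypothetical, for illustration.
    Returns the list of message IDs that were moved.
    """
    parked = Path(parked_input)
    live = Path(live_input)
    # Unanchored search: lines in a -H file carry extra spool
    # bookkeeping, so "^From:" style anchors may not match.
    wanted = re.compile(header_pattern, re.IGNORECASE)
    moved = []

    for header_file in sorted(parked.glob("*-H")):
        msg_id = header_file.name[:-2]          # strip the "-H" suffix
        data_file = parked / (msg_id + "-D")
        if not data_file.exists():
            continue                            # incomplete pair, leave it
        text = header_file.read_text(errors="replace")
        if not wanted.search(text):
            continue                            # looks like junk, leave it
        # Move the data file first so the live queue never holds a
        # header file without its matching data file.
        shutil.move(str(data_file), str(live / data_file.name))
        shutil.move(str(header_file), str(live / header_file.name))
        moved.append(msg_id)

    return moved
```

Whatever is left in the parked directory afterwards is what step 4
deletes. Choosing the pattern is the hard part; the script just does
the same thing as grep followed by a pair of mv commands per message.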

If you have more than one mail server then you could take the queue onto
another system and run it there rather than slowing down the main
server. In theory you could move it onto a tmpfs filesystem and perform
an exim queue run specifically on that input queue to avoid the fsync()
delays.

Jon.

-- 
## List details at http://www.exim.org/mailman/listinfo/exim-dev Exim details at http://www.exim.org/ ##
