Nathan Kennedy wrote:
>> Fyodor was unresponsive for a while today. This morning around 7:00 EST I
>> found load averages upwards of 30, and just now the system was totally
>> unresponsive. I seem to have caught it at the worst time when ssh and
>> http were totally unavailable for a while.
>>
>> [EMAIL PROTECTED]:~$ uptime
>> 14:57:48 up 3 days, 12:51, 11 users, load average: 101.25, 144.79, 92.69
>>
>> spamd seems to be working pretty hard still but the load average is
>> dropping down to 13 now. Whatever was causing the system to become
>> unresponsive must have finished or gotten killed.
>>
>>
>> I'd like to hear if anyone has ideas about why this happened to see if we
>> can prevent it on the new server configuration.
>>
>
> Argh, did you run ps or top? It would seem unlikely that spamd alone
> could bring fyodor to its knees, but perhaps we got a particularly bad
> barrage? I don't know. I personally enabled spamassassin on my account
> recently and have added my extensive spam traffic to Spamassassin's load,
> but the extra work that Spamassassin does is offset by the reduced imap
> work.
>
>

This morning I noticed lots of mysql and php processes, especially on tanveer's user account. I would have liked to correlate this with the hit rate on a particular page but didn't have time to do so. Running vmstat later today showed about 10 processes blocked, perhaps waiting for disk I/O.

I have said this before, but I really think that the disk is the bottleneck on fyodor and that everything chokes while waiting for data to be read from or written to the disk. This could account for the super-high load averages, I think (although I don't know exactly how load averages are calculated). We could compare average iostat rates with the rates during the times when fyodor is bogged down to see if this is the case.
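On the load-average question: on Linux the figure counts tasks in uninterruptible sleep (typically waiting on disk) as well as tasks that are runnable, so a pile of blocked processes can push it very high even when the CPU is mostly idle. Below is a minimal sketch of how one might check this next time, assuming a Linux /proc filesystem. The function names and the "more blocked than running" rule of thumb are made up for illustration, not an established test.

```python
from pathlib import Path


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into (1min, 5min, 15min) floats."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def parse_proc_stat(text):
    """Extract procs_running and procs_blocked from the contents of /proc/stat."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("procs_running", "procs_blocked"):
            counts[parts[0]] = int(parts[1])
    return counts


def looks_io_bound(blocked, running):
    """Hypothetical rule of thumb: more tasks waiting on I/O than wanting CPU."""
    return blocked > running


def snapshot():
    """Read the live values on a Linux host and summarise them in one line."""
    load = parse_loadavg(Path("/proc/loadavg").read_text())
    stat = parse_proc_stat(Path("/proc/stat").read_text())
    blocked = stat.get("procs_blocked", 0)
    running = stat.get("procs_running", 0)  # includes the process doing the read
    verdict = "possibly I/O bound" if looks_io_bound(blocked, running) else "not obviously I/O bound"
    return f"load={load} running={running} blocked={blocked}: {verdict}"
```

Calling `snapshot()` from cron every minute and appending the result to a log would give us something to line up against the web and mail logs after the next incident. It only samples one instant, so it complements vmstat and iostat rather than replacing them.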
> I set up my .forward file to send all spam > 9.0 to /dev/null, as a result
> a large proportion of my email is never even delivered to a mailbox.
>
> However, this may indicate that in the further future we will probably
> want to have mail and web on separate servers. Mail is a batch process
> whereas web transactions should be highly responsive. But immediately
> speaking performance should be substantially better at Peer1, and from the
> last couple emails I received things are pretty much ready to go.
>
>

Yeah, this is a good point. A dedicated mail server might be necessary in the future given current levels of spam. Chances are the two servers will also have to be configured and tuned differently for the type of work that each does (e.g., the mail server probably needs fast I/O, while the web/database/file servers may need huge amounts of RAM). We had better start putting some money in the HCoop piggy bank!

_______________________________________________
HCoop-SysAdmin mailing list
[email protected]
http://hcoop.net/cgi-bin/mailman/listinfo/hcoop-sysadmin
