Nathan Kennedy wrote:
>> Fyodor was unresponsive for a while today.  This morning around 7:00 EST I
>> found load averages upwards of 30, and just now the system was totally
>> unresponsive.  I seem to have caught it at the worst time when ssh and
>> http were totally unavailable for a while.
>>
>> [EMAIL PROTECTED]:~$ uptime
>>  14:57:48 up 3 days, 12:51, 11 users,  load average: 101.25, 144.79, 92.69
>>
>> spamd seems to be working pretty hard still but the load average is
>> dropping down to 13 now.  Whatever was causing the system to become
>> unresponsive must have finished or gotten killed.
>>
>>
>> I'd like to hear if anyone has ideas about why this happened to see if we
>> can prevent it on the new server configuration.
>>     
>
> Argh, did you run ps or top?  It would seem unlikely that spamd alone
> could bring fyodor to its knees, but perhaps we got a particularly bad
> barrage?  I don't know.  I personally enabled SpamAssassin on my account
> recently and have added my extensive spam traffic to SpamAssassin's load,
> but the extra work that SpamAssassin does is offset by the reduced IMAP
> work.
>
>   
This morning I noticed lots of mysql and php processes, especially on 
tanveer's user account.  I would have liked to correlate this to a hit 
rate on a particular page but didn't have time to do this.  Running 
vmstat later today showed about 10 processes blocked, perhaps waiting 
for disk I/O.  I have said this before, but I really think that the disk 
is the bottleneck on fyodor and everything chokes while waiting for data 
to be read from or written to the disk.  This could account for the 
super-high load averages, since on Linux the load average counts 
processes blocked in uninterruptible I/O waits as well as runnable ones.  
We could compare typical iostat rates with those during the times when 
fyodor is bogged down to see if this is the case.
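
One rough way to check the disk theory is to record the load average next to the kernel's count of blocked processes and see whether they rise together. The Python below is only a minimal sketch: it assumes fyodor runs a Linux kernel that exposes procs_blocked in /proc/stat, and the function names are made up for illustration.

```python
import re

def parse_uptime_load(line):
    """Return the 1-, 5- and 15-minute load averages from a line of `uptime` output."""
    match = re.search(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", line)
    if match is None:
        raise ValueError("no load average in %r" % line)
    return tuple(float(value) for value in match.groups())

def parse_procs_blocked(proc_stat_text):
    """Return the procs_blocked count (processes waiting on I/O) from /proc/stat text."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "procs_blocked":
            return int(fields[1])
    raise ValueError("procs_blocked not found")

def sample():
    """Read the current 1-minute load and blocked-process count (Linux only)."""
    with open("/proc/loadavg") as f:
        load_1min = float(f.read().split()[0])
    with open("/proc/stat") as f:
        blocked = parse_procs_blocked(f.read())
    return load_1min, blocked
```

Applied to the uptime line quoted above, parse_uptime_load gives (101.25, 144.79, 92.69). Logging the output of sample() once a minute from cron, then comparing quiet periods with the bogged-down ones, would show whether blocked processes make up most of the load.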

> I set up my .forward file to send all spam scoring above 9.0 to
> /dev/null; as a result, a large proportion of my email is never even
> delivered to a mailbox.
>
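
The rule described above boils down to reading the score that SpamAssassin has already stamped on a message and dropping anything over the threshold. The actual .forward would be written in the mail server's own filter syntax, which is not shown in this thread; the Python below is only a sketch of the decision logic. It assumes spamd has added an X-Spam-Status header of the form "score=12.3" (older releases wrote "hits="), and the function names are hypothetical.

```python
import email
import re

DISCARD_THRESHOLD = 9.0  # the cutoff mentioned in the thread

def spam_score(raw_message):
    """Return the SpamAssassin score from X-Spam-Status, or None if it is missing."""
    msg = email.message_from_string(raw_message)
    status = str(msg.get("X-Spam-Status", ""))
    match = re.search(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)", status)
    return float(match.group(1)) if match else None

def should_discard(raw_message, threshold=DISCARD_THRESHOLD):
    """True when the message carries a score strictly above the threshold."""
    score = spam_score(raw_message)
    return score is not None and score > threshold
```

Messages without the header are kept, so mail that bypassed spamd is never silently dropped.
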
> However, this may indicate that further in the future we will probably
> want to have mail and web on separate servers.  Mail is a batch process
> whereas web transactions should be highly responsive.  But in the near
> term, performance should be substantially better at Peer1, and from the
> last couple of emails I received, things are pretty much ready to go.
>
>   
Yeah, this is a good point.  A dedicated mail server might be necessary 
in the future given current levels of spam.  Chances are the two servers 
will also have to be configured and tuned differently for the type of 
work that each does (i.e., the mail server probably needs fast I/O while 
the web/database/file servers may need huge amounts of RAM).  We had 
better start putting some money in the HCoop piggy bank!

_______________________________________________
HCoop-SysAdmin mailing list
[email protected]
http://hcoop.net/cgi-bin/mailman/listinfo/hcoop-sysadmin
