Re: [osol-help] 2009.06 up for one week - now I can't login over SSH or console

Thanassis Tsiodras Mon, 30 Nov 2009 04:25:45 -0800

OK, I couldn't wait any longer - so I did the dirty deed (hit the power off 
button).
Thankfully, the system did come up afterwards, and ... all seems fine.


I am, however, trying to decipher the logs to see what went wrong...
So far, it is clear that the system ran out of memory: starting Friday night 
and continuing over the weekend, /var/log/syslog has lines like this:

Nov 27 21:17:46 zeus sendmail[477]: [ID 702911 mail.info] runqueue: Skipping queue run -- fork() failed: Not enough space
Nov 27 21:33:01 zeus sendmail[477]: [ID 702911 mail.info] runqueue: Skipping queue run -- fork() failed: Not enough space

The services' logs (/var/svc/log) don't seem to have anything related to the 
disaster (other than the startup logs due to the ... forceful restart I did 
just now).

top -b shows that memory usage is...

last pid:  1055;  load avg:  0.47,  0.45,  0.34;  up 0+00:37:36        14:26:58
46 processes: 45 sleeping, 1 on cpu
CPU states: 83.0% idle,  0.2% user, 16.8% kernel,  0.0% iowait,  0.0% swap
Kernel: 4310 ctxsw, 9 trap, 3461 intr, 465 syscall, 9 flt
Memory: 3551M phys mem, 2614M free mem, 512M total swap, 512M free swap

...so it appears that swap is small; I'll enlarge it and hope that helps.

But what seriously bothers me is that I can't debug this disaster - I have no
idea why it happened, and no clue as to what to do to make sure it doesn't 
happen again - other than adding a cron job that checks memory usage every 
minute and logs it somewhere for safekeeping...
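
A minimal sketch of such a logger, assuming Python is available on the box and 
that top -b prints one display and exits, as it did above. The function names 
and the log path /var/tmp/memwatch.log are made up for illustration; the parser 
only understands the "Memory:" line format shown in the top output above.

```python
import re
import subprocess
import time

# Matches the fields of the summary line printed by top, e.g.
# "Memory: 3551M phys mem, 2614M free mem, 512M total swap, 512M free swap"
FIELD_RE = re.compile(r"(\d+)([KMG])\s+(phys mem|free mem|total swap|free swap)")
UNIT_TO_MB = {"K": 1.0 / 1024, "M": 1.0, "G": 1024.0}


def parse_memory_line(line):
    """Return {'phys mem': MB, ...} for a top 'Memory:' line ({} if nothing matches)."""
    return {name: int(num) * UNIT_TO_MB[unit]
            for num, unit, name in FIELD_RE.findall(line)}


def format_record(fields):
    """Render the parsed fields as 'free_mem=2614M free_swap=512M ...'."""
    return " ".join("%s=%dM" % (name.replace(" ", "_"), mb)
                    for name, mb in sorted(fields.items()))


def snapshot(logfile="/var/tmp/memwatch.log"):
    """Run top once and append a timestamped memory record to logfile.

    Assumption: 'top -b' exits after one display on this system. On systems
    where batch mode keeps running, a display-count option would be needed.
    """
    out = subprocess.check_output(["top", "-b"]).decode("ascii", "replace")
    for line in out.splitlines():
        if line.startswith("Memory:"):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            with open(logfile, "a") as fh:
                fh.write("%s %s\n" % (stamp, format_record(parse_memory_line(line))))
            break
```

Calling snapshot() from a one-line wrapper script in a once-a-minute crontab 
entry would leave a trail showing when free memory and free swap started to 
drop, though not which process was responsible.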

Very annoying ...
-- 
This message posted from opensolaris.org
