Hello,
Today I found the 9grid Plan 9 system under heavy load: stats reports load
~2000, syscall ~60000 and context ~22000. I was trying to find out which proc
has gone crazy, but I can't even complete a ps. Other operations work, such as
sending this email over drawterm, running stats and netstat, reading the logs,
etc., but I can't run ps or any other /proc-related tool, and I can't
kill/Kill/slay anything.
I can ls /proc:
cpu% ls -l | wc -l
573
Something like this:
cpu% for(i in `{ls}) {echo -n 'PID ' $i 'has status. . . '; cat $i/status | wc -c }
[....]
PID 1944693 has status. . . 176
PID 1944698 has status. . . 176
PID 1944699 has status. . . 176
PID 1944700 has status. . . 176
PID 1944707 has status. . .
and there it stops: I can't tell which process that is, nor kill it.
I can ls it:
cpu% ls -l /proc/1944707/
--rw-rw---- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/args
--rw-r----- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/ctl
--r--r--r-- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/fd
--rw-r----- p 0 offending_user bootes 108 Dec 1 2008 /proc/1944707/fpregs
--r--r----- p 0 offending_user bootes 76 Dec 1 2008 /proc/1944707/kregs
--rw-r----- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/mem
--rw-r----- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/note
--rw-rw-r-- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/noteid
--rw-r----- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/notepg
--r--r--r-- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/ns
--r--r----- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/proc
--r--r----- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/profile
--rw-r----- p 0 offending_user bootes 76 Dec 1 2008 /proc/1944707/regs
--r--r--r-- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/segment
--r--r--r-- p 0 offending_user bootes 176 Dec 1 2008 /proc/1944707/status
--rw-r----- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/text
--r--r----- p 0 offending_user bootes 0 Dec 1 2008 /proc/1944707/wait
I can't chmod those files either. (Is that date normal? Everything in /proc
seems to have that date :?)
Any tips on how to solve this without rebooting?
Thanks!!
gabi