iostat will help with disk activity. vmstat might be of benefit as well. Both are pretty small, so you could set up a cron job to run them regularly and dump the output into a file for later analysis.
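Something along these lines might do for the cron job. It is only a rough sketch: the STATLOG variable, the snapshot function and the log path are names made up for this example, and the sample counts are arbitrary.

```shell
#!/bin/sh
# Sketch: append timestamped vmstat/iostat samples to a log file.
# STATLOG and snapshot are hypothetical names chosen for this example.
STATLOG="${STATLOG:-/tmp/sysstat-snapshots.log}"

snapshot() {
    # Skip quietly if the tool is not installed (iostat comes with sysstat).
    command -v "$1" >/dev/null 2>&1 || return 0
    {
        # Header line: timestamp plus the command that was run.
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') $*"
        "$@"
    } >> "$STATLOG" 2>&1
}

# The first report from each tool is an average since boot,
# so take a few samples to capture current activity as well.
snapshot vmstat 1 3      # memory, swap, and CPU: 3 samples, 1 second apart
snapshot iostat -d 1 3   # per-device disk activity, same cadence
```

Saved as, say, /usr/local/sbin/statsnap.sh (a hypothetical path), a crontab line like this would run it every five minutes:

    */5 * * * * /usr/local/sbin/statsnap.sh

After a crash, the last few entries in the log should show whether memory, swap or disk I/O was climbing beforehand.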
ps will be easier to manipulate from a command-line process than top.

Kev.

----- Original Message -----
From: "J. Rafael Sánchez" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, February 11, 2003 11:04 AM
Subject: Re: (clug-talk) Hi Kev - Programmer(s)/User(s) crashing my system.

> Hi Kev,
> Actually not completely.
> My machine(s) have sorta settled down a bit because the 'inhouse' program(s)
> are not being run as often. The question that's still in my mind is: are ps
> and top the only tools available to keep an eye on processes and programs
> in detail (how are they making use of memory? what type of loads are they
> causing on hard drive reads and writes, etc.)?
>
> Thanks for asking Kev,
> Rafael.
>
>
> J.Rafael.Sánchez
> Itres Research Limited
> www.itres.com
> P.403.250.9944
> F.403.250.9916
>
> ----- Original Message -----
> From: "Kevin Anderson" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Friday, February 07, 2003 11:13 AM
> Subject: Re: (clug-talk) Programmer(s)/User(s) crashing my system.
>
>
> > Was this ever resolved?
> >
> > (Cleanup day for me...)
> >
> > Kev.
> >
> >
> >
> > ----- Original Message -----
> > From: "J. Rafael Sánchez" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Tuesday, November 12, 2002 4:07 PM
> > Subject: (clug-talk) Programmer(s)/User(s) crashing my system.
> >
> >
> > > Good day all,
> > > I have user(s)/programmer(s) who are crashing one of my servers.
> > >
> > > Users have access to this RH 7.0 system over Xwin32 using XDMCP.
> > >
> > > System description: 512MB RAM, Dual Pentium III 933, 1GB swap file, 1GHZ
> > > Ethernet Card.
> > >
> > > The crash(es) are so bad that when I go to the machine, I can't even log in,
> > > no console access whatsoever; to the point that the only option is to "push
> > > the on/off button". Of course after that I have to do manual e2fsck(s) on
> > > all my 6 180GB hard drives.
> > >
> > > I have been able to pinpoint that the system crashes because they are
> > > running a home-made program using the IDL language over a GUI interface/program
> > > called ENVI. We deal with imagery a lot (huge files and outputs). Some of
> > > these programs have to break up huge amounts of image data into pieces, do
> > > some sort of processing on them and stitch them back together.
> > >
> > > It could have to do with the fact that the program(s) may not be using the
> > > resources efficiently: memory, 32-bit file system limits (2GB file size
> > > limits), etc.
> > >
> > > I'd like to help them and myself by finding out what exactly it is that they
> > > are doing or not doing. Is there a system utility or OS utility that I can
> > > use to monitor the system? I've used top. I've looked through the log files
> > > but I cannot seem to find anything important to help me.
> > >
> > > The last few lines of my /var/log/messages file from today's crash:
> > >
> > > *** real name replaced by "thishost"
> > >
> > > Nov 12 14:00:01 thishost CROND[28389]: (root) CMD ( /sbin/rmmod -as)
> > > Nov 12 14:01:00 thishost CROND[28391]: (root) CMD (run-parts /etc/cron.hourly)
> > > Nov 12 14:10:01 thishost CROND[28402]: (root) CMD ( /sbin/rmmod -as)
> > > Nov 12 14:37:12 thishost syslogd 1.3-3: restart.
> > >
> > > Output of ls of /etc/cron.hourly
> > > [root@thishost /etc]# ls -laF cron.hourly/
> > > total 16
> > > drwxr-xr-x    2 root  root  4096 Apr 24  2002 ./
> > > drwxr-xr-x   56 root  root  4096 Nov 12 15:20 ../
> > > -rwxr-xr-x    1 news  news    65 Jul 24  2000 inn-cron-nntpsend*
> > > -rwxr-xr-x    1 news  news    68 Jul 24  2000 inn-cron-rnews*
> > >
> > > Cat of inn-cron-nntpsend
> > > [root@thishost /etc]# cat cron.hourly/inn-cron-nntpsend
> > > #!/bin/sh
> > > /sbin/chkconfig innd && su - news -c /usr/bin/nntpsend
> > >
> > >
> > > Cat of inn-cron-rnews*
> > > #!/bin/sh
> > > /sbin/chkconfig innd && su - news -c '/usr/bin/rnews -U'
> > >
> > >
> > > Would this be what's crashing my system?
> > >
> > > Any suggestion would be greatly appreciated.
> > >
> > >
> > > Rafael.
> > >
> > >
> > > +=+=+=+=+=+=+=+=+=+=+=+=+
> > > j.rafael.sánchez
> > > Systems Administrator
> > > +=+=+=+=+=+=+=+=+=+=+=+=+
> > > Itres Research Limited
> > > www.itres.com
> > > Phone: 403.250.9944
> > > Fax: 403.250.9916
> > > +=+=+=+=+=+=+=+=+=+=+=+=+
