Good day all,
I have user(s)/programmer(s) who are crashing one of my servers.

Users have access to this RH 7.0 system over Xwin32 using XDMCP.

System description: 512 MB RAM, dual Pentium III 933, 1 GB swap file,
Gigabit Ethernet card.

The crash(es) are so bad that when I go to the machine I can't even log in;
there is no console access whatsoever, to the point that the only option is
to "push the on/off button". Of course, after that I have to run manual
e2fsck(s) on all six of my 180GB hard drives.

I have been able to pinpoint that the system crashes when they are running
a home-made program written in the IDL language under a GUI program called
ENVI. We deal with imagery a lot (huge files and outputs). Some of these
programs have to break up huge amounts of image data into pieces, do some
sort of processing on them and stitch them back together.
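
To make the split/process/stitch pattern concrete, here is a minimal shell
sketch. It is not the users' IDL/ENVI code, which I have not reproduced; the
function name process_in_chunks and the placeholder processing step are
hypothetical, and it assumes split and mktemp are available.

```shell
#!/bin/sh
# process_in_chunks INPUT OUTPUT CHUNK_BYTES  (hypothetical helper)
# Splits INPUT into fixed-size pieces, runs a placeholder step on each
# piece, and concatenates the results back together in order.
process_in_chunks() {
    in=$1; out=$2; chunk=$3
    work=$(mktemp -d) || return 1
    split -b "$chunk" "$in" "$work/part." || { rm -rf "$work"; return 1; }
    : > "$out"                      # start with an empty output file
    for piece in "$work"/part.*; do
        [ -f "$piece" ] || continue # no pieces if the input was empty
        # Placeholder: a real step would transform the piece here.
        cat "$piece" >> "$out"
    done
    rm -rf "$work"
}
```

Only one piece is read at a time here, so memory use stays bounded by the
chunk size, but the work directory temporarily holds a full copy of the
input on disk.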

The cause may be that the program(s) are not using resources efficiently
(memory, for example) or are running into 32-bit file system limits (the
2GB file size limit), etc.
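
As a rough way to test the 2GB suspicion, a sketch like the following could
flag output files that are getting close to the limit. It assumes the limit
is the signed 32-bit offset maximum (2^31 - 1 bytes); the function name
near_limit and the 90% default threshold are my own choices.

```shell
#!/bin/sh
# Assumed limit: largest offset a signed 32-bit value can hold.
LIMIT=2147483647

# near_limit SIZE_IN_BYTES [PERCENT]  (hypothetical helper)
# Succeeds when SIZE is at or above PERCENT (default 90) of LIMIT.
near_limit() {
    size=$1
    pct=${2:-90}
    threshold=$(( LIMIT / 100 * pct ))
    [ "$size" -ge "$threshold" ]
}
```

It could be driven from a loop over the users' output directory, for example
with size=$(wc -c < "$f") and then: near_limit "$size" && echo "$f is large".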

I'd like to help them and myself by finding out exactly what it is that they
are doing or not doing. Is there a system or OS utility that I can use to
monitor the system? I've used top and looked through the log files, but I
cannot find anything useful.
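
Since top's output is lost when the machine locks up, the kind of monitoring
I have in mind would write to disk as it goes. A minimal sketch, assuming a
Linux /proc with /proc/loadavg and MemFree/SwapFree lines in /proc/meminfo;
the log path, interval, and function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical defaults; override via the environment.
LOG=${LOG:-/var/tmp/sysmon.log}
INTERVAL=${INTERVAL:-10}

# Print one timestamped line: load averages plus free memory and swap.
snapshot() {
    load=$(cut -d' ' -f1-3 /proc/loadavg 2>/dev/null)
    mem=$(grep -E '^(MemFree|SwapFree):' /proc/meminfo 2>/dev/null |
          tr -s ' \n' ' ')
    echo "$(date '+%b %d %H:%M:%S') load=${load:-n/a} ${mem:-mem=n/a}"
}

# Append a snapshot every INTERVAL seconds and flush it to disk, so the
# last few lines have a chance of surviving a hard reset.
monitor() {
    while :; do
        snapshot >> "$LOG"
        sync
        sleep "$INTERVAL"
    done
}
```

Running monitor in the background before the users start their job would
leave a trail showing whether memory and swap ran out just before the hang.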

The last few lines of my /var/log/messages file of today's crash:

*** real name replaced by "thishost"

Nov 12 14:00:01 thishost CROND[28389]: (root) CMD (   /sbin/rmmod -as)
Nov 12 14:01:00 thishost CROND[28391]: (root) CMD (run-parts
/etc/cron.hourly)
Nov 12 14:10:01 thishost CROND[28402]: (root) CMD (   /sbin/rmmod -as)
Nov 12 14:37:12 thishost syslogd 1.3-3: restart.

Output of ls of /etc/cron.hourly
[root@thishost /etc]# ls -laF cron.hourly/
total 16
drwxr-xr-x    2 root     root         4096 Apr 24  2002 ./
drwxr-xr-x   56 root     root         4096 Nov 12 15:20 ../
-rwxr-xr-x    1 news     news           65 Jul 24  2000 inn-cron-nntpsend*
-rwxr-xr-x    1 news     news           68 Jul 24  2000 inn-cron-rnews*

Cat of inn-cron-nntpsend
[root@thishost /etc]# cat cron.hourly/inn-cron-nntpsend
#!/bin/sh
/sbin/chkconfig innd && su - news -c /usr/bin/nntpsend


Cat of inn-cron-rnews*
#!/bin/sh
/sbin/chkconfig innd && su - news -c '/usr/bin/rnews -U'


Would this be what's crashing my system?

Any suggestion would be greatly appreciated.


Rafael.


+=+=+=+=+=+=+=+=+=+=+=+=+
j.rafael.sánchez
Systems Administrator
+=+=+=+=+=+=+=+=+=+=+=+=+
Itres Research Limited
www.itres.com
Phone: 403.250.9944
Fax:   403.250.9916
+=+=+=+=+=+=+=+=+=+=+=+=+
