----- Original Message -----
From: "Kevin Anderson" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, November 12, 2002 7:43 PM
Subject: Re: (clug-talk) Programmer(s)/User(s) crashing my system.

> If you haven't already, I'd STRONGLY advise a kernel upgrade. IIRC, RH 7.0
> gave you the choice of either a 2.2 or a 2.4 Kernel. The early 2.4 series
> had some REALLY bad Virtual Memory issues. Red Hat is based on the ac
> branch, so it was better than most. Still, I'd look for something after
> about 2.4.15 or so. Gentoo offers up to 2.4.20, so something around 15
> isn't overly new.
>
> With the system you've got, what does top show for your Free Memory? and
> Swap? Are you just running out?

  1:18pm  up 22:50,  2 users,  load average: 1.44, 1.05, 0.97
83 processes: 81 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states: 21.1% user,  8.0% system,  0.0% nice, 70.3% idle
CPU1 states: 17.3% user,  8.1% system,  0.0% nice, 73.4% idle
Mem:   517056K av,  515724K used,    1332K free,    9292K shrd,  205600K buff
Swap: 1052216K av,   20556K used, 1031660K free                  169764K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 1136 jason     12   0 75336  71M  2684 R    51.5 14.2 156:34 idl
 2220 root       3   0  1192 1192   944 R     1.5  0.2   0:00 top

This is a capture of top. As you can tell, "idl" is one of the main apps that
run on this system. One license, one user. Load averages sometimes run
anywhere from 1 to 3 (by the way, how do you tell when a load average is too
high, other than by a system crash?). This system accesses local data storage
as well as NFS shares from other systems.

>
> If you can BEAT on this system for a while, I'd set up a cronjob that looks
> like this
>
> w >> test
> free >> test
>

Let me see if I understand this pseudo code: run a command that appends its
results to a text file. Would you tell me what that command is? I know how to
use cron, but I'm not sure I know what command I need to perform this task.

> and then have it run every minute. It'll be a huge hit on your system, and
> it'll create a HUGE file fairly quick, but it'll show your system stats
> within 1 minute of a crash.
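
A minimal sketch of the cron job described above, in case it helps. The two quoted lines are ordinary commands rather than pseudo code: `w` prints who is logged in plus the load averages, `free` prints memory and swap usage, and `>>` appends the output to a file. The log path, the script location in the crontab comment, and the `snapshot` function name are my own assumptions, not anything from Kev's message.

```shell
#!/bin/sh
# Append a timestamped snapshot of logins, load and memory to a log file.
# LOGFILE is a hypothetical default; point it at a disk with room to spare.
LOGFILE="${LOGFILE:-/var/tmp/sysstats.log}"

snapshot() {
    {
        echo "=== $(date) ==="   # marker so each sample can be told apart
        w                        # logged-in users and load averages
        free                     # memory and swap usage
    } >> "$LOGFILE" 2>&1
}

snapshot

# Hypothetical crontab entry (installed with "crontab -e") to run it every minute:
# * * * * * /usr/local/sbin/sysstats.sh
```

After a crash, the last few "===" blocks in the log should show what memory, swap and load looked like in the final minute or so. The file grows without bound, so it would need trimming or rotating by hand.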
>
> When you talk about a 2 Gig size limit, that shouldn't be a factor for
> Linux, at least not if it's a 2.4 kernel. I can't remember what the max
> file size was for the 2.2 kernel series. However, if you have legacy
> Windows clients, Samba sometimes had a limitation of 2 Gigs per file. As I
> remember it, that restriction existed when Linux mounted a Share on a legacy
> Windows box across the network, and then tried to copy a file. If Windows
> mounted the Linux box, and performed the copy, then the 2 Gig issue didn't
> exist. Are you using Legacy Clients? Which machine mounts what?
>

Yes, I'm using legacy Windows clients (95, 98, W2K, NT4) with Samba. However,
users don't move the BIG files around to Windows systems. They are allowed to
move them from one Linux system to another Linux system using Samba, though.

Thanks Kev.

> Something to start with...
>
> Kev.
>
>
> ----- Original Message -----
> From: "J. Rafael Sánchez" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Tuesday, November 12, 2002 4:07 PM
> Subject: (clug-talk) Programmer(s)/User(s) crashing my system.
>
>
> > Good day all,
> > I have user(s)/programmer(s) who are crashing one of my servers.
> >
> > Users have access to this RH 7.0 system over Xwin32 using XDMCP.
> >
> > System description: 512 MB RAM, Dual Pentium III 933, 1GB swap file, 1GHZ
> > Ethernet Card.
> >
> > The crash(es) are so bad that when I go to the machine, I can't even log
> > in, no console access whatsoever; to the point that the only option is to
> > "push the on/off button". Of course after that I have to do manual
> > e2fsck(s) on all my 6 180GB hard drives.
> >
> > I have been able to pinpoint that the system crashes because they are
> > running a home-made program using IDL language over a gui
> > interface/program called ENVI. We deal with imagery a lot (huge files and
> > outputs).
> > Some of these programs have to break up huge amounts of image data into
> > pieces, do some sort of processing on them and stitch them back together.
> >
> > It could have to do with the fact that the program(s) may not be using the
> > resources efficiently: memory, 32bit file system limits (2GB file size
> > limits), etc.
> >
> > I'd like to help them and myself by finding out what exactly it is that
> > they are doing or not doing. Is there a system utility or OS utility that
> > I can use to monitor the system? I've used top. I've looked through the
> > log files but I cannot seem to find anything important to help me.
> >
> > The last few lines of my /var/log/messages file of today's crash:
> >
> > *** real name replaced by "thishost"
> >
> > Nov 12 14:00:01 thishost CROND[28389]: (root) CMD ( /sbin/rmmod -as)
> > Nov 12 14:01:00 thishost CROND[28391]: (root) CMD (run-parts /etc/cron.hourly)
> > Nov 12 14:10:01 thishost CROND[28402]: (root) CMD ( /sbin/rmmod -as)
> > Nov 12 14:37:12 thishost syslogd 1.3-3: restart.
> >
> > Output of ls of /etc/cron.hourly
> > [root@thishost /etc]# ls -laF cron.hourly/
> > total 16
> > drwxr-xr-x    2 root     root         4096 Apr 24  2002 ./
> > drwxr-xr-x   56 root     root         4096 Nov 12 15:20 ../
> > -rwxr-xr-x    1 news     news           65 Jul 24  2000 inn-cron-nntpsend*
> > -rwxr-xr-x    1 news     news           68 Jul 24  2000 inn-cron-rnews*
> >
> > Cat of inn-cron-nntpsend
> > [root@thishost /etc]# cat cron.hourly/inn-cron-nntpsend
> > #!/bin/sh
> > /sbin/chkconfig innd && su - news -c /usr/bin/nntpsend
> >
> >
> > Cat of inn-cron-rnews*
> > #!/bin/sh
> > /sbin/chkconfig innd && su - news -c '/usr/bin/rnews -U'
> >
> >
> > Would this be what's crashing my system?
> >
> > Any suggestion would be greatly appreciated.
> >
> >
> > Rafael.
> >
> >
> > +=+=+=+=+=+=+=+=+=+=+=+=+
> > j.rafael.sánchez
> > Systems Administrator
> > +=+=+=+=+=+=+=+=+=+=+=+=+
> > Itres Research Limited
> > www.itres.com
> > Phone: 403.250.9944
> > Fax: 403.250.9916
> > +=+=+=+=+=+=+=+=+=+=+=+=+
