----- Original Message -----
From: "Kevin Anderson" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, November 13, 2002 1:36 PM
Subject: Re: (clug-talk) Programmer(s)/User(s) crashing my system.


> I have a server that has gone past 20 CPU loads, and it hasn't crashed
> (RH7.2)  I don't think being busy is a problem.  It never has been for me,
> at least...

Wow! That's good to know - I've seen CPU loads of 5 over here and I've been a
bit concerned. Thanks.
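
As a rough yardstick (my own rule of thumb, not something from Kev's note), I
could divide the load average by the number of CPUs, so a load of 5 on this
dual-CPU box would be about 2.5 per CPU. A minimal sketch, assuming a Linux
/proc and a getconf that understands _NPROCESSORS_ONLN:

```shell
#!/bin/sh
# Rough load-per-CPU check. "Load per CPU" is only a rule of thumb,
# not a hard limit.
load=$(cut -d ' ' -f 1 /proc/loadavg)    # 1-minute load average
# Number of online CPUs; fall back to 1 if getconf cannot report it.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
per_cpu=$(awk -v l="$load" -v c="$cpus" 'BEGIN { printf "%.2f", l / c }')
echo "load=$load cpus=$cpus per_cpu=$per_cpu"
```

A per_cpu value that stays well above 1 for a long time would suggest
processes are queuing for CPU, though it says nothing about memory or disk.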

>
> w is a command that gives similar info to the top piece of top.
> free gives info about memory usage.
>
> You could run either at the command line, and see the output.
>
> If you're running 5.2 --> 7, and you're using Samba, one thing I certainly
> hope is that you've upgraded Samba at least...  The restrictions on file

Yes, I've been careful to keep upgrading samba to the corresponding kernel
versions.

> sizes may have been uglier on older versions of Samba.  I started heavily
> testing Samba at about 2.2.2.
>
> Even RH7 had an OLD version of Samba when it came out.  You'll want at
> least 2.2.5 if you're running it as a PDC.
>
> For adding the cron stuff, you'll need to type "crontab -e" and you'll be
> facing a vi editor.  You can copy almost exactly what I typed into it, and
> when you exit vi, it will run as a cron job.  You should probably specify
> a full path for the file you're dumping the output into though.
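
If I follow, the full-path version could be wrapped in a small script like the
one below. This is only a sketch: the install path /usr/local/bin/snapshot.sh
and the default log path /var/tmp/sysstat.log are placeholders I made up, and
the date header is my own addition so each sample can be matched to the time
of a crash.

```shell
#!/bin/sh
# snapshot.sh: append one timestamped w/free sample to a log file.
# LOG is a hypothetical path; any directory with room to grow will do.
LOG="${LOG:-/var/tmp/sysstat.log}"

snapshot() {
    {
        echo "=== $(date) ==="
        w       # who is logged in, plus load averages
        free    # memory and swap usage
    } >> "$LOG" 2>&1
}

snapshot

# crontab -e entry to run it every minute (hypothetical install path):
#   * * * * * /usr/local/bin/snapshot.sh
```

After a crash, the tail of the log file should show the last sample taken
before the machine stopped responding.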
>
> Did you set up the server (like I did on my first production box) with a /
> and a swap partition?  (yeah, yeah, laugh your heart out, I'm screwed, I
> know...)

Really? Well, when I first started learning Linux I had a boss who was an
excellent teacher. Among other things, he always recommended a partition
scheme such as: /, /tmp, /usr, swap(s), /home, /usr/local and /var.
Sometimes I leave /usr/local out, except where programmers want some extra
space to put their programs.

By the way, my assumption has always been to put the swap partitions where
they're most needed (/usr, /home). Is this a good assumption?


Thanks again Kev.
Rafael.


>
> If not, upgrades should be no problem.
>
> Kev.
>
>
>
> ----- Original Message -----
> From: "J. Rafael Sánchez" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Wednesday, November 13, 2002 12:23 PM
> Subject: Re: (clug-talk) Programmer(s)/User(s) crashing my system.
>
>
> >
> > ----- Original Message -----
> > From: "Kevin Anderson" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Tuesday, November 12, 2002 7:43 PM
> > Subject: Re: (clug-talk) Programmer(s)/User(s) crashing my system.
> >
> >
> > > If you haven't already, I'd STRONGLY advise a kernel upgrade.  IIRC,
> > > RH 7.0 gave you the choice of either a 2.2 or a 2.4 Kernel.  The early
> > > 2.4 series had some REALLY bad Virtual Memory issues.  Red Hat is based
> > > on the ac branch, so it was better than most.  Still, I'd look for
> > > something after about 2.4.15 or so.  Gentoo offers up to 2.4.20, so
> > > something around 15 isn't overly new.
> > >
> > > With the system you've got, what does top show for your Free Memory
> > > and Swap?  Are you just running out?
> >
> >  1:18pm  up 22:50,  2 users,  load average: 1.44, 1.05, 0.97
> > 83 processes: 81 sleeping, 2 running, 0 zombie, 0 stopped
> > CPU0 states: 21.1% user,  8.0% system,  0.0% nice, 70.3% idle
> > CPU1 states: 17.3% user,  8.1% system,  0.0% nice, 73.4% idle
> > Mem:   517056K av,  515724K used,    1332K free,    9292K shrd,  205600K buff
> > Swap: 1052216K av,   20556K used, 1031660K free                  169764K cached
> >
> >   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
> >  1136 jason     12   0 75336  71M  2684 R    51.5 14.2 156:34 idl
> >  2220 root       3   0  1192 1192   944 R     1.5  0.2   0:00 top
> >
> > This is a capture of top. As you can tell, "idl" is one of the main apps
> > that run on this system. One license, one user. Load averages are
> > sometimes anywhere from 1 to 3 (by the way, how do you tell when a load
> > average is too high, other than by a system crash?)
> >
> > This system accesses local data storage as well as NFS shares from other
> > systems.
> >
> > >
> > > If you can BEAT on this system for a while, I'd set up a cronjob that
> > > looks like this
> > >
> > > w >> test
> > > free >> test
> > >
> > Let me see if I understand this pseudo code: run a command that appends
> > its results to a text file.
> > Would you tell me what that command is? I know how to use cron, but I'm
> > not sure I know what command I need to perform this task.
> >
> > > and then have it run every minute.  It'll be a huge hit on your system,
> > > and it'll create a HUGE file fairly quickly, but it'll show your system
> > > stats within 1 minute of a crash.
> > >
> > > When you talk about a 2 Gig size limit, that shouldn't be a factor for
> > > Linux, at least not if it's a 2.4 kernel.  I can't remember what the max
> > > file size was for the 2.2 kernel series.  However, if you have legacy
> > > Windows clients, Samba sometimes had a limitation of 2 Gigs per file.
> > > As I remember it, that restriction existed when Linux mounted a Share
> > > on a legacy Windows box across the network, and then tried to copy a
> > > file.  If Windows mounted the Linux box, and performed the copy, then
> > > the 2 Gig issue didn't exist.  Are you using Legacy Clients?  Which
> > > machine mounts what?
> > >
> > Yes, I'm using legacy Windows clients (95, 98, W2K, NT4) with Samba.
> > However, users don't move the BIG files around to Windows systems. They
> > are allowed to move them from one Linux system to another using Samba,
> > though.
> >
> > Thanks Kev.
> >
> >
> > > Something to start with...
> > >
> > > Kev.
> > >
> > >
> > > ----- Original Message -----
> > > From: "J. Rafael Sánchez" <[EMAIL PROTECTED]>
> > > To: <[EMAIL PROTECTED]>
> > > Sent: Tuesday, November 12, 2002 4:07 PM
> > > Subject: (clug-talk) Programmer(s)/User(s) crashing my system.
> > >
> > >
> > > > Good day all,
> > > > I have user(s)/programmer(s) who are crashing one of my servers.
> > > >
> > > > Users have access to this RH 7.0 system over Xwin32 using XDMCP.
> > > >
> > > > System description: 512 MB RAM, dual Pentium III 933, 1GB swap file,
> > > > 1GHZ Ethernet card.
> > > >
> > > > The crash(es) are so bad that when I go to the machine, I can't even
> > > > log in: no console access whatsoever, to the point that the only
> > > > option is to "push the on/off button". Of course, after that I have
> > > > to run e2fsck manually on all six of my 180GB hard drives.
> > > >
> > > > I have been able to pinpoint that the system crashes because they
> > > > are running a home-made program, using the IDL language, over a GUI
> > > > interface/program called ENVI. We deal with imagery a lot (huge
> > > > files and outputs). Some of these programs have to break up huge
> > > > amounts of image data into pieces, do some sort of processing on
> > > > them and stitch them back together.
> > > >
> > > > It could have to do with the fact that the program(s) may not be
> > > > using the resources efficiently: memory, 32-bit file system limits
> > > > (2GB file size limits), etc.
> > > >
> > > > I'd like to help them and myself by finding out what exactly it is
> > > > that they are doing or not doing. Is there a system utility or OS
> > > > utility that I can use to monitor the system? I've used top. I've
> > > > looked through the log files but I cannot seem to find anything
> > > > important to help me.
> > > >
> > > > The last few lines of my /var/log/messages file of today's crash:
> > > >
> > > > *** real name replaced by "thishost"
> > > >
> > > > Nov 12 14:00:01 thishost CROND[28389]: (root) CMD (  /sbin/rmmod -as)
> > > > Nov 12 14:01:00 thishost CROND[28391]: (root) CMD (run-parts /etc/cron.hourly)
> > > > Nov 12 14:10:01 thishost CROND[28402]: (root) CMD (  /sbin/rmmod -as)
> > > > Nov 12 14:37:12 thishost syslogd 1.3-3: restart.
> > > >
> > > > Output of ls of /etc/cron.hourly
> > > > [root@thishost /etc]# ls -laF cron.hourly/
> > > > total 16
> > > > drwxr-xr-x    2 root     root         4096 Apr 24  2002 ./
> > > > drwxr-xr-x   56 root     root         4096 Nov 12 15:20 ../
> > > > -rwxr-xr-x    1 news     news           65 Jul 24  2000 inn-cron-nntpsend*
> > > > -rwxr-xr-x    1 news     news           68 Jul 24  2000 inn-cron-rnews*
> > > >
> > > > Cat of inn-cron-nntpsend
> > > > [root@thishost /etc]# cat cron.hourly/inn-cron-nntpsend
> > > > #!/bin/sh
> > > > /sbin/chkconfig innd && su - news -c /usr/bin/nntpsend
> > > >
> > > >
> > > > Cat of inn-cron-rnews*
> > > > #!/bin/sh
> > > > /sbin/chkconfig innd && su - news -c '/usr/bin/rnews -U'
> > > >
> > > >
> > > > Would this be what's crashing my system?
> > > >
> > > > Any suggestion would be greatly appreciated.
> > > >
> > > >
> > > > Rafael.
> > > >
> > > >
> > > > +=+=+=+=+=+=+=+=+=+=+=+=+
> > > > j.rafael.sánchez
> > > > Systems Administrator
> > > > +=+=+=+=+=+=+=+=+=+=+=+=+
> > > > Itres Research Limited
> > > > www.itres.com
> > > > Phone: 403.250.9944
> > > > Fax:   403.250.9916
> > > > +=+=+=+=+=+=+=+=+=+=+=+=+
> > > >
> > > >
> > > >
> >
> >
> >
