Well, BSDI's "top" reporting is crap. The box actually has 128 MB of
RAM in it... I don't think it's really swapping as much as it shows,
either.
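
For what it's worth, here is a rough sanity check on that Memory line, as a minimal Python sketch. The helper name parse_top_memory is made up for illustration, and the sketch assumes two things: the line has exactly the layout shown in the top output quoted below, and the Free figure counts physical memory. Under those assumptions, the 32M figure cannot be the total RAM in the box, since more than that is reported free.

```python
import re

# The "Memory:" line exactly as it appears in the quoted top output.
TOP_MEMORY_LINE = "Memory: Real: 15M/32M Virt: 78M/254M Free: 72M"

def parse_top_memory(line):
    """Return the five figures on a top "Memory:" line as integers (megabytes).

    Hypothetical helper: assumes every figure is a whole number followed by "M".
    """
    pattern = r"Real:\s*(\d+)M/(\d+)M\s+Virt:\s*(\d+)M/(\d+)M\s+Free:\s*(\d+)M"
    m = re.search(pattern, line)
    if m is None:
        raise ValueError("unrecognised Memory line: %r" % line)
    keys = ("real_a", "real_b", "virt_a", "virt_b", "free")
    return dict(zip(keys, map(int, m.groups())))

mem = parse_top_memory(TOP_MEMORY_LINE)
# If 32M were all the RAM in the box, 72M could not be free at the same time
# (assumption: "Free" counts physical memory).
free_exceeds_real = mem["free"] > mem["real_b"]
print(mem, free_exceeds_real)
```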
Ricky
On Wed, 20 Dec 2000, Peter R. Hubberstey wrote:
> Am I right in thinking you are running Apache, sendmail AND POP on a
> machine with only 32 MB of RAM? How many users do you expect this to serve?
> One ;-)
>
> I would be inclined to up the amount of physical RAM you have since I think
> running in daemon mode requires more memory.
>
> Also it's a newer version.... so more memory used again.
>
> You're swapping quite a bit, but check your paging - you can use 'procinfo'
> for this. Page out is usually bad I think.
>
> I am sure this will be a contributory factor if not THE reason!
>
>
>
>
> -----Original Message-----
> From: Ricky Crow [mailto:[EMAIL PROTECTED]]
> Sent: 20 December 2000 15:53
> To: Peter Evans
> Cc: Subscribers of Qpopper
> Subject: Re: Serious problem....
>
>
>
>
> On Thu, 21 Dec 2000, Peter Evans wrote:
>
> > lots. But no actual information ^^;
> >
> >
> > 1 - what OS are you running, have you locked down any of the
> > non-essential crap that things like redhat/solaris
> > and the likes come with?
>
> Running BSD/OS 2.01... Yes, I know it's old, but it has been extremely
> stable over the last 4-5 years on the same machine.
> Yes, the system has been locked down, as well.
>
> > 2 - when it craps out (for want of a better word) what else is the
> > system doing? Commands that may help you here:
> >
> > top
>
> Nothing really serious or unusual, here... I am experiencing the problem
> as of right now, and here's what top shows:
>
>
> load averages: 0.46, 0.41, 0.36    09:43:37
> 90 processes: 2 running, 88 sleeping
> Cpu states: 2.0% user, 0.0% nice, 3.0% system, 0.0% interrupt, 95.0% idle
> Memory: Real: 15M/32M Virt: 78M/254M Free: 72M
>
>   PID USERNAME PRI NICE  SIZE   RES STATE   TIME   WCPU    CPU COMMAND
> 12362 root       2    0 5184K 5084K sleep   1:37  1.61%  1.61% named
> 23724 root       2    0 1148K 1040K sleep   0:00  6.00%  0.29% sendmail
> 23721 root      28    0  256K  444K run     0:00  1.40%  0.20% top
> 23325 nobody     2    0 1900K  972K sleep   0:00  0.10%  0.10% httpd
> 23656 nobody     2    0 1900K  944K sleep   0:00  0.05%  0.05% httpd
> 19761 root       2    0 1044K  260K sleep   0:05  0.05%  0.05% sendmail
> 23131 root      28    0  536K  388K run     0:00  0.00%  0.00% ftpd
>  1991 root      18    0 1876K 1000K sleep   0:24  0.00%  0.00% httpd
> 15981 rickyc    18    0  592K  764K sleep   0:00  0.00%  0.00% tcsh
> 17909 root      18    0  584K  744K sleep   0:00  0.00%  0.00% tcsh
> 19574 root      18    0  540K  704K sleep   0:00  0.00%  0.00% tcsh
> 15159 rickyc    18    0  536K  660K sleep   0:00  0.00%  0.00% tcsh
> 13145 rickyc    18    0  536K  652K sleep   0:00  0.00%  0.00% tcsh
>   126 root      18  -12  352K  416K sleep   0:00  0.00%  0.00% xntpd
>  2099 root      18    0  340K  220K sleep   0:02  0.00%  0.00% cron
>
>
> > iostat
>
> ns1: {44} % iostat
> tty sd0 sd1 sd2 sd3 cpu
> tin tout sps tps msps sps tps msps sps tps msps sps tps msps us ni sy id
> 0 38 87 3 4.9 0 0 5.0 0 0 0.0 351 24 3.8 8 0 21 0 71
>
>
> I don't know EXACTLY what all of that means on iostat, but that's what it
> shows right now, too.
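
As far as I understand the 4.4BSD-style iostat layout (an assumption worth checking against the iostat man page on the box itself), each disk gets three columns: sps is sectors transferred per second, tps is transfers per second, and msps is milliseconds per average seek. Here is a minimal Python sketch that pulls the per-disk numbers out of the sample line above. The helper name split_disk_columns is hypothetical, and the trailing cpu fields are deliberately left alone.

```python
DISKS = ["sd0", "sd1", "sd2", "sd3"]
# Data line from the iostat output above, rejoined onto one line.
SAMPLE = "0 38 87 3 4.9 0 0 5.0 0 0 0.0 351 24 3.8 8 0 21 0 71"

def split_disk_columns(line, disks):
    """Map each disk name to its sps, tps and msps figures.

    Assumes the first two fields are the tty tin/tout counters and that
    each disk contributes exactly three fields, as in the header shown.
    """
    fields = line.split()
    per_disk = {}
    for i, disk in enumerate(disks):
        sps, tps, msps = fields[2 + 3 * i : 5 + 3 * i]
        per_disk[disk] = {"sps": int(sps), "tps": int(tps), "msps": float(msps)}
    return per_disk

stats = split_disk_columns(SAMPLE, DISKS)
# The disk with the most transfers per second is the first place to look.
busiest = max(stats, key=lambda d: stats[d]["tps"])
print(busiest, stats[busiest])
```

On this sample, sd3 carries nearly all the disk traffic while sd1 and sd2 are idle.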
>
> > netstat
>
> There is nothing unusual in there....and no connections on port 110 right
> now, either.
>
> > ps -ef (-auxww or whatever)
>
> Nothing unusual.. There are probably too many processes listed to copy
> and paste into this email, but there isn't anything that makes me
> suspicious or looks unusual.
>
> > lsof
>
> I don't have that command on this machine for some reason......
>
> > These should give you hints about things like resource-starvation,
> > strange crap and so on.
>
> Nothing strange... Can't figure this out... Any other ideas?
>
> > 3 - look in the system logs for clues.
> >
> > This is probably number 2a, not 3.
>
> Been looking in the logs....even doing a tail -f to watch the log as it's
> happening, then I keep testing mail in another window and waiting for it
> to "crap out" and nothing.... Nothing unusual. No inetd messages telling
> me that it is shutting down that service or anything.... It's frustrating
> me to no end.
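
Instead of testing mail by hand in another window, the check could be automated so that every failure gets a timestamp to line up against the logs. This is only a sketch under stated assumptions: the function names are made up for illustration, the default port is the standard POP3 port 110, and a server counts as healthy if its first line starts with "+OK", which is how RFC 1939 defines the POP3 greeting.

```python
import socket
import time

def pop3_greeting_ok(banner):
    """A healthy POP3 server opens with a line starting "+OK" (RFC 1939)."""
    return banner.startswith(b"+OK")

def probe_pop3(host, port=110, timeout=10.0):
    """Connect once and return (ok, detail); never raises on network errors."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            banner = conn.recv(512)
    except OSError as exc:  # refused, timed out, unreachable, ...
        return False, "connect/read failed: %s" % exc
    if not banner:
        return False, "connection closed before any greeting"
    return pop3_greeting_ok(banner), banner.decode("latin-1").strip()

def watch(host, interval=30.0, rounds=10):
    """Print one timestamped line per probe so failures can be matched to logs."""
    for _ in range(rounds):
        ok, detail = probe_pop3(host)
        print(time.strftime("%Y-%m-%d %H:%M:%S"), "OK" if ok else "FAIL", detail)
        time.sleep(interval)
```

Running something like watch("localhost") next to the tail -f window would show exactly when the greeting stops arriving, and whether the connection is refused, times out, or is closed without a greeting.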
>
>
> > There, that should get you looking in the right direction.
> > It could be something as simple as "not using server-mode/
> > noupdateonabort/nostatus and having allowed your lusers to
> > build up 900 mb mailboxes."
>
> We have a few people with mailboxes approaching 10 megs on the system, but
> by and large, we don't have all that many that get that big.
>
> > Oh, and we have 30000 lusers on a Linux box using qp3.1+ldap,
> > without so much as a hiccup, so I suspect something silly.
>
> Damn.... I wish I could say that.....
> I've only got a couple of thousand on this box, and it's giving me fits.
>
> Ricky
>