Well, thanks for the hints. I have a couple of responses below.

Garl

Bob Miller wrote:

> Garl R. Grigsby wrote:

<SNIP>
me rambling
<SNIP>

>
> Short answer: I don't have a clue. (-:
>
> Long answer: Here are some ideas...
>
> 1. Some app is misconfigured and is constantly restarting/dumping
>    core, and that's keeping your disk busy.  This happened on one
>    machine I put 7.1 on -- the problem was an interaction between
>    Apache and mod_ssl.  I didn't figure it out; I just removed Apache,
>    since I didn't want a web server there anyway.
>
>    Have you checked syslog and root's mailbox?

Yup, and no joy.

>
>
> 2. A kernel or driver bug is leaking memory.  If this is the case,
>    your machine will probably eventually hang.
>
>    Have you checked dmesg?

Yes. There are two errors that are each repeated a couple of times, but I have not
taken the time to hunt them down. I do not believe they are related, but I am the
first to admit I spend a lot of time being wrong. Here they are for those who are
interested (NOTE: these come at the very bottom of dmesg):

> scsi0: MEDIUM ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 00 02 4d 1f 00 00 38 00
> Info fld=0x24d2f, Current sd08:01: sense key Medium Error
> Additional sense indicates Unrecovered read error
> scsidisk I/O error: dev 08:01, sector 150768
> scsi0: MEDIUM ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 00 02 5a 57 00 00 10 00
> Info fld=0x25a63, Current sd08:01: sense key Medium Error
> Additional sense indicates Unrecovered read error
> scsidisk I/O error: dev 08:01, sector 154144
> scsi0: MEDIUM ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 00 02 5a 5f 00 00 08 00
> Info fld=0x25a63, Current sd08:01: sense key Medium Error
> Additional sense indicates Unrecovered read error
> scsidisk I/O error: dev 08:01, sector 154144
> scsi0: MEDIUM ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 00 02 5a 5f 00 00 08 00
> Info fld=0x25a63, Current sd08:01: sense key Medium Error
> Additional sense indicates Unrecovered read error
> scsidisk I/O error: dev 08:01, sector 154144
> ISO 9660 Extensions: Microsoft Joliet Level 3
> ISO 9660 Extensions: RRIP_1991A
> end_request: I/O error, dev 02:00 (floppy), sector 0
> scsi0: MEDIUM ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 00 02 5a 5f 00 00 08 00
> Info fld=0x25a63, Current sd08:01: sense key Medium Error
> Additional sense indicates Unrecovered read error
> scsidisk I/O error: dev 08:01, sector 154144
> end_request: I/O error, dev 02:00 (floppy), sector 0
> end_request: I/O error, dev 02:00 (floppy), sector 0
> end_request: I/O error, dev 02:00 (floppy), sector 0
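
For anyone who wants to see which devices and sectors keep coming up in output like the above, here is a rough Python sketch that tallies I/O errors per (device, sector) pair. The regular expression is my own guess at the two message formats shown in this excerpt and will likely need adjusting for other kernels; the function name and the sample text are made up for illustration. (Device 08:01 is major 8, minor 1, which is normally the first partition of the first SCSI disk.)

```python
import re
from collections import Counter

# Matches the two error formats that appear in the dmesg excerpt:
#   "scsidisk I/O error: dev 08:01, sector 154144"
#   "end_request: I/O error, dev 02:00 (floppy), sector 0"
# This pattern is an assumption based on those lines only.
IO_ERROR = re.compile(
    r"I/O error[:,] dev ([0-9a-fA-F]{2}:[0-9a-fA-F]{2}).*?sector (\d+)"
)


def tally_io_errors(dmesg_text):
    """Count I/O errors per (device, sector) pair in dmesg-style text."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# A few lines in the same shape as the excerpt, for illustration only.
SAMPLE = """\
scsidisk I/O error: dev 08:01, sector 150768
scsidisk I/O error: dev 08:01, sector 154144
scsidisk I/O error: dev 08:01, sector 154144
end_request: I/O error, dev 02:00 (floppy), sector 0
"""

for (device, sector), count in tally_io_errors(SAMPLE).most_common():
    print(f"dev {device} sector {sector}: {count} error(s)")
```

To run it on a live machine, feed it the text of dmesg instead of SAMPLE. A sector that shows up again and again on the same device is the one worth chasing first.
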
>
> 3. A userland app is leaking memory.  Top would show this -- start top
>    and type 'M' (uppercase) to sort by memory use.  Type 'i'
>    (lowercase) if you don't see a whole screenful of processes.
>
>    The kde tools are known to leak, but not at the rate you're seeing.
>    I generally have to restart kfm every couple of weeks.

X is not running. I typically run at init level 3. I only run X if my HP-UX box is
down and I need to get something done quickly. Otherwise I run everything from an
rlogin or telnet session.
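
As an alternative to sorting in top, a minimal Python sketch along these lines could list processes by resident memory from a plain telnet session. It assumes a Linux-style /proc where each process directory has a status file with Name: and VmRSS: lines; the function name and the proc_root parameter are my own inventions for illustration.

```python
import os


def rss_by_process(proc_root="/proc"):
    """Return [(rss_kb, pid, name)] sorted largest first, from <proc_root>/<pid>/status."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        name, rss_kb = "?", None
        try:
            with open(os.path.join(proc_root, entry, "status")) as status:
                for line in status:
                    if line.startswith("Name:"):
                        name = line[5:].strip()
                    elif line.startswith("VmRSS:"):
                        rss_kb = int(line.split()[1])
        except OSError:
            continue  # process exited (or is unreadable) while we were scanning
        if rss_kb is not None:  # kernel threads have no VmRSS line
            results.append((rss_kb, int(entry), name))
    return sorted(results, reverse=True)


if __name__ == "__main__" and os.path.isdir("/proc"):
    # Show the ten largest processes by resident set size.
    for rss_kb, pid, name in rss_by_process()[:10]:
        print(f"{rss_kb:>8} kB  pid {pid:<6} {name}")
```

If no single process stands out here either, that points away from a userland leak and toward the kernel side (cache, buffers, or a driver).
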

> 4. Nothing is wrong; you're misreading the tea leaves.

Maybe I should start drinking tea again. Then I could figure this out. Hmmm........

> You say top
>    shows all memory in use?  You aren't basing that on the 4th line
>    that shows "Mem: ######K av, ######K used", I hope.  Those two
>    numbers are always nearly equal.  The Linux kernel caches
>    aggressively -- it does its best to keep memory full so it won't
>    have to read from disk.

I am basing this on /proc/meminfo. When I look at the MemFree line, it will show
something like 6k free. Looking at the swap free line, I usually see something like
50% of my swap used (256 MB total).
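
To put Bob's point about caching next to the MemFree number, here is a rough Python sketch that parses /proc/meminfo-style text and adds buffers and page cache back onto MemFree. Treat the sum as a loose estimate only, since not all cached memory can actually be reclaimed. The function names are mine, and the sample figures are hypothetical, not readings from my machine.

```python
def parse_meminfo(text):
    """Parse 'Key:   12345 kB' lines from /proc/meminfo text into {key: kB}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only "Name: <number> kB" lines; older kernels also print
        # summary rows ("Mem:", "Swap:") with several numbers, which we skip.
        if sep and len(fields) == 2 and fields[0].isdigit() and fields[1] == "kB":
            values[key.strip()] = int(fields[0])
    return values


def effectively_free_kb(info):
    """MemFree plus buffers and page cache, as a rough upper estimate."""
    return info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0)


# Hypothetical numbers for illustration, not taken from the machine in this thread.
SAMPLE = """\
MemTotal:    257000 kB
MemFree:       6000 kB
Buffers:      20000 kB
Cached:      150000 kB
SwapTotal:   262000 kB
SwapFree:    131000 kB
"""

info = parse_meminfo(SAMPLE)
print("MemFree alone:         ", info["MemFree"], "kB")
print("Free + buffers + cache:", effectively_free_kb(info), "kB")
print("Swap in use:           ", info["SwapTotal"] - info["SwapFree"], "kB")
```

On a real box, pass in open("/proc/meminfo").read() instead of SAMPLE. A tiny MemFree with large Buffers and Cached values would fit the "nothing is wrong" theory; a tiny MemFree with small cache and half the swap in use would not.
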

>    To see the paging rate, try vmstat.  Here's some sample vmstat
>    output under heavy vm load.

I will have to do some reading on vmstat. I have never used this command before.
Thanks for the hint.

>
> jogger-egg> vmstat 5
>    procs                      memory    swap          io     system         cpu
>  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
>  0  0  0  43316   3476   2896  46388   0   0     3     1    1    18   1   0   6
>  0  0  0  43316   3476   2896  46388   0   0     0    11  127   124   0   0  99
>  1  0  0  43316   1684   2132  29360   0   0     0     0  106   133   0   3  96
>  2  0  0  56024   1548    220  15860   0 2542    1   636 5188   140   3  93   4
>  2  0  0  69540   1620    220  16012   0 2703    0   676 5509   124   5  95   0
>  2  0  1  81888   1024    220  16004   0 2470    0   617 5042   111   4  96   0
>  1  0  0  82288  76864    220  15168   0 1005    0   251 2112   119   2  33  65
>  0  0  0  82288  76864    220  15168   0   0     0     0  101   112   0   1  99
> ^C
>
>    The interesting columns are si and so (amount swapped in and out, in
>    Kb) and bi/bo (amount of block I/O in Kb).  Also note how cache
>    fell from 46 Mb to 15 Mb as memory got tight.
>
>    The run you see is from a single PII/450MHz, 192 Mb RAM, and single
>    IDE disk.  I ran a simple memory thrashing program for 20 seconds.
>    If I'd run it longer, you would have seen si nonzero.
>
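
For reference, a small Python sketch of picking the si/so columns out of vmstat output like the run quoted above, so the intervals with swap activity stand out. It reads the column names from the header row instead of assuming fixed positions, since the layout differs between vmstat versions; the function names and the zero threshold are arbitrary choices of mine.

```python
def parse_vmstat(text):
    """Turn vmstat output into a list of {column: int} dicts, one per sample row."""
    columns, rows = None, []
    for line in text.splitlines():
        # Drop any email quote prefix ("> ") before splitting into fields.
        fields = line.lstrip("> ").split()
        if "si" in fields and "so" in fields:
            columns = fields  # the header row naming each column
        elif columns and len(fields) == len(columns) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(columns, map(int, fields))))
    return rows


def swapping_rows(rows, threshold=0):
    """Rows where swap-in or swap-out exceeded the threshold."""
    return [r for r in rows if r["si"] > threshold or r["so"] > threshold]


# Header plus three rows copied from the sample run quoted above.
SAMPLE = """\
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0  43316   3476   2896  46388   0   0     0    11  127   124   0   0  99
 2  0  0  56024   1548    220  15860   0 2542    1   636 5188   140   3  93   4
 1  0  0  82288  76864    220  15168   0 1005    0   251 2112   119   2  33  65
"""

for row in swapping_rows(parse_vmstat(SAMPLE)):
    print(f"swpd={row['swpd']}  so={row['so']}  cache={row['cache']}")
```

Saving the output of "vmstat 5" to a file for a few minutes and running it through parse_vmstat would show whether the box is actively paging or just sitting on a full cache.
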
> > So now to my question(s).
> >
> > 1) How do I find out what app is tying up all of my memory? Top does not
> > list any one app as chewing up all the RAM.
>
> See above.
>
> > 2) Is there a way to force a memory cleanup? (IE force all of the apps
> > to "claim" their allocated memory?)
>
> "kill -9 ..." (-:

This I am familiar with. I spend a lot of time with my friend 'kill -9'. <grin>

> > 3) Is this a problem with Mandrake (v7.1)? (Or is this a PEBKAC
> > problem?)
>
> Unknown.
>
> > 4) Am I going crazy?
>
> Unknown.  Ask the voices.

They have been less than helpful these days. I think I ticked them off. Maybe that
was a mista.......

> > 5) Or could this be a problem with the SMP? This is the first time I
> > have ever run Linux on a dual processor machine, so I have no idea if
> > this could be related (I really doubt it, but in my line of work I have
> > learned to question all things computer related)
>
> I doubt it too. (-:
>
> --
>                                         K<bob>
> [EMAIL PROTECTED], http://www.jogger-egg.com/

--
=============================================================================
Garl R. Grigsby
Customer Applications Engineering - Analysis Team
-----------------------------------------------------------------------------
Structural Dynamics Research Corporation      Phone: (800)242-7372
TAO Americas Support Center                   FAX: (541)342-8277
1750 Willow Creek Circle                      Email:  [EMAIL PROTECTED]
Eugene, OR 97402                              Internet:  http://www.sdrc.com
=============================================================================
-FEA makes a good engineer great, and a poor engineer dangerous-
=============================================================================
PGP ID: 0xF2D845E7
PGP Fingerprint: 9C40 CB5E 1C51 CF58 E3F9  3F2C 8F1F F3EF F2D8 45E7
=============================================================================
