Thanks to everyone who responded to this post.

I chased up the links on lwn.net and ended up upgrading to
kernel 2.4.12, which seems to have fixed things :)

The latest Red Hat kernel (2.4.3-12) wasn't swapping *anything*
on a high-memory machine :(

Consequently, under high load, intensive I/O (which can only use low
memory) was causing a severe shortage of resources.

The new stock kernel swaps properly and testing under heavy loads showed
no problems.
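
For anyone wanting to check the same thing on their own box, here is a
rough sketch (not something I ran at the time) that reads the
"Name: value kB" lines from /proc/meminfo and reports how much swap is in
use and how much low memory is free. The function names are my own
invention, and the LowTotal/LowFree fields are assumed to be present,
which is only the case on a highmem-enabled kernel.

```python
import re


def parse_meminfo(text):
    """Return {field: kilobytes} for lines of the form 'Name:  123 kB'.

    Any other line (for example the older byte-count summary table)
    is skipped.
    """
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^(\w+):\s+(\d+)\s+kB\s*$", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def swap_used_kb(fields):
    """Swap in use, in kB. Zero under heavy load suggests nothing is swapping."""
    return fields["SwapTotal"] - fields["SwapFree"]


def low_free_fraction(fields):
    """Fraction of low memory still free, or None if the kernel
    does not report LowTotal (assumption: highmem kernels do)."""
    low_total = fields.get("LowTotal")
    if not low_total:
        return None
    return fields["LowFree"] / low_total
```

On the machine itself you would feed it the contents of /proc/meminfo,
e.g. parse_meminfo(open("/proc/meminfo").read()), and watch both numbers
while the backup is running.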

I think Red Hat needs to *seriously* look at this problem, as the errors
occurred with both the supplied SMP kernel and a custom-compiled kernel
built from the kernel-source rpm.

-------------------------------------

Steve Batson
System Administrator
Victorian Institute of Animal Science
Victoria, Australia
Email: [EMAIL PROTECTED]

--------------------------------------

>Hi,
>
>I'm responsible for looking after a couple of machines that
>run some fairly large statistical analysis jobs.
>
>One machine (Dell PowerEdge 6300, quad Pentium III, 4GB RAM, RH-7.1)
>does most of the processing, often running jobs as large as 1.8GB/process.
>
>Since upgrading to RH-7.1, the machine occasionally runs out of resources.
>This seems to happen while our backup is running (we use BRU to back up to
>a DDS-4).
>
>I can ping the machine but cannot log in and a hard reset is needed.
>
>Before the reset, I usually see messages like this on the console:
>
>(scsi2:A:6): 20.000MB/s transfers (20.000MHz, offset 15)
>st0: Block limits 1 - 16777215 bytes.
>mm: critical shortage of bounce buffers.
>(scsi1:A:1:0): Locking max tag count at 64
>(scsi1:A:3:0): Locking max tag count at 64
>
>Sometimes there are references to killed jobs because of lack of resources.
>
>The SCSI controller for the disks is an Adaptec aic7890/91 Ultra2.
>The tape drive uses an Adaptec aic7860 SCSI adapter.
>
>The reference to: mm: critical shortage of bounce buffers
>can be found in mm/highmem.c.
>
>The strange thing is, at no point does swap seem to get used, no matter how
>heavily loaded the machine is.
>
>If anyone can shed some light on this it would be much appreciated.
>
>-------------------------------------
>
>Steve Batson
>System Administrator
>Victorian Institute of Animal Science
>Victoria, Australia
>Email: [EMAIL PROTECTED]
>
>--------------------------------------
>

_______________________________________________
Seawolf-list mailing list
[EMAIL PROTECTED]
https://listman.redhat.com/mailman/listinfo/seawolf-list