(The following message was posted to 
news://news.redhat.com/redhat.kernel.general on April 29 with 
message ID <[EMAIL PROTECTED]> - but something seems 
to be wrong with the news-server; the message is not there.)

I have a Linux box where the rc5des client is normally running 
(see http://www.distributed.net/rc5/).

However, I have now discovered the following weird behaviour 
which has made me kill the rc5des client. Unfortunately, I 
think that my findings show a more general problem.

When the rc5des software is not running, the following procedure 
takes around one minute:
time cat /dev/hda1 >/dev/null
(CPU usage is around 30%)

When the rc5des software is running, the same procedure takes 
15 minutes ('time' reports CPU usage as 0%)

hda1 is a partition of around 360 MB on an IDE disk (PIO mode 4). 
The rc5des client has no relation to the partition whatsoever 
(the partition is not mounted at all).
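
To put the two timings in perspective, here is a rough sketch of the average throughput they imply. It assumes the partition is exactly 360 MB and that the runs took exactly 60 and 900 seconds (the figures above are approximate); the function name throughput_kb_s is made up for this illustration.

```shell
#!/bin/sh
# throughput_kb_s SIZE_MB SECONDS
# Print the average throughput in kB/s (integer division) for reading
# SIZE_MB megabytes in SECONDS seconds.
throughput_kb_s() {
    size_mb=$1
    seconds=$2
    echo $(( size_mb * 1024 / seconds ))
}

# Assumed figures: 360 MB partition, about 1 minute vs. about 15 minutes.
throughput_kb_s 360 60     # rc5des not running
throughput_kb_s 360 900    # rc5des running
```

With these assumed inputs the first call prints 6144 and the second prints 409, i.e. roughly 6 MB/s against roughly 0.4 MB/s.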

I'm running the rc5des client as a special user - and at the 
lowest priority possible, I should think. I start the rc5des 
software with the following line in /etc/rc.d/rc.local:
/usr/local/sbin/desstart &

desstart looks like this:
#!/bin/sh
su -c "cd /home/des; nice /home/des/rc5des >/home/des/log-1 2>/home/des/log-2" des

I have also tried this su line without any change in behaviour:
su -c "cd /home/des; nice /home/des/rc5des >/home/des/log-1 2>&1" des

(one line - again, my browser may have wrapped the line)

When I run top, I'm reassured that rc5des is running at low 
priority and as the special "des" user:
USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
des       20  19   284  284   208 R N     0 98.0  0.9   0:12 rc5des
des        0   0   736  736   540 S       0  0.0  2.3   0:00 su
des        0   0   652  652   536 S       0  0.0  2.1   0:00 bash
des       19  19   284  284   208 S N     0  0.0  0.9   0:00 rc5des
des       19  19   284  284   208 S N     0  0.0  0.9   0:00 rc5des

As far as I know, the rc5des software should have extremely low 
priority and should not be able to influence performance in any 
way (except some memory usage).

However, the rc5des software seems to be able to slow down a 
root-initiated process (the cat command) by a factor of 15!

When rc5des is running and I'm cat'ing from hda1, 
/proc/<cat's PID>/status looks like this in 19 
out of 20 tries:

Name:   cat
State:  D (disk sleep)
[cut]

Once in a while, I'm able to get a 
/proc/<cat's PID>/status looking like this:

Name:   cat
State:  R (running)
[cut]

As soon as I kill rc5des, cat's status seems to be D about as 
often as it is R (roughly a 1:1 ratio).
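
The D/R sampling described above was done by hand; a minimal sketch of how it could be automated follows. The function names proc_state and sample_states are hypothetical, and the one-second sampling interval is an arbitrary assumption.

```shell
#!/bin/sh
# proc_state FILE
# Print the one-letter state code (e.g. D or R) from a file in the
# format of /proc/<PID>/status.
proc_state() {
    awk '$1 == "State:" { print $2; exit }' "$1"
}

# sample_states PID COUNT
# Sample the state of process PID once per second, COUNT times, and
# print how often each state was seen.
sample_states() {
    pid=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        proc_state "/proc/$pid/status"
        sleep 1
        i=$((i + 1))
    done | sort | uniq -c
}

# Usage: sample_states <PID of the running cat> 20
```

Running it once with rc5des active and once without would give the two distributions to compare.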

The system is Intel-Redhat 5.0 with all the latest official 
Redhat updates.

I have tried with Redhat's latest kernel sources. I have 
tried with a clean 2.0.33. And I have tried with the 
latest pre-release of 2.0.34 (11b).

I have also tried various configurations of the IDE 
controller/harddisk (changing parameters in the hdparm 
utility). Nothing is able to improve the situation.

A hdparm -i /dev/hda gives the following output:
/dev/hda:

 Model=QUANTUM FIREBALL_TM3200A, FwRev=A6B.1T00, SerialNo=29562753
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>5Mbs TrkOff }
 RawCHS=6232/16/63, TrkSize=32256, SectSize=512, ECCbytes=4
 BuffType=3(DualPortCache), BuffSize=76kB, MaxMultSect=16,MultSect=16
 DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=2(fast)
 CurCHS=6232/16/63, CurSects=6281856, LBA=yes, LBAsects=6281856
 tDMA={min:120,rec:120}, DMA modes: sword0 sword1 sword2 mword0 mword1 *mword2
 IORDY=on/off, tPIO={min:300,w/IORDY:120}, PIO modes: mode3 mode4

'hdparm -W1 /dev/hda' does not improve the situation.

More system info about the computer is at 
http://www.studmed.ku.dk:8000/
I do not suspect any hardware defects.

I consider the phenomenon a potential denial-of-service 
problem: I think that a normal user should not be able to 
bring down the performance of basic disk access to this 
dramatic degree - even when running the program 'nicely'.

(On another Linux computer, 'cat /dev/disk-unit >/dev/null' 
has a constant throughput of 6MB/s, regardless of whether 
rc5des is running. This is an Intel system using SCSI.)

Any ideas/comments?

Could this be a kernel bug? A glibc bug? Some other bug?
Is there something utterly wrong with time-slicing to the 
IDE-controller?

Could there be a connection between my observations, i.e. 
between the IDE throughput/priority/time-slicing problem and 
the non-responsive Apache?


Greetings from Troels Arvin, Copenhagen, Denmark
http://www.mdb.ku.dk/tarvin/

