Bug#362146: iostat: wrong avgqu-sz and %util values (fwd)

2006-04-25 Thread Gabor Gombas
On Fri, Apr 21, 2006 at 11:25:21AM +0200, Sebastien Godard wrote:

 Because of a Linux kernel bug, iostat -x may display huge I/O response times
 (svctm) and a bandwidth utilization (%util) of 100% for some devices. Indeed
 these devices have a value for the field #9 (beginning after the device name)
 in /proc/{partitions,diskstats} which is always different from 0, and even
 negative sometimes. Yet this field should go to zero, since it gives the
 number of I/Os currently in progress (it is incremented as requests are
 submitted, and decremented as they finish).

Yep, this seems to be the case. It seems that some requests decrement
in_flight multiple times so it goes negative pretty quickly.
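
To illustrate why a counter that never returns to zero would pin %util at
100%, here is a toy model in Python. It is only a sketch, not kernel code: the
function name, the event format and the rule "elapsed time counts as busy
whenever in_flight is non-zero" are assumptions made for illustration.

```python
def simulate(events, end_ms):
    """Toy accounting model (hypothetical, not the kernel's code).

    events: list of (time_ms, "submit" | "finish"), sorted by time.
    Returns (in_flight, busy_ms) at end_ms.
    """
    in_flight = 0
    busy_ms = 0
    stamp_ms = 0

    def account(now_ms):
        # Assumption: time since the last update is charged as busy
        # whenever the in-flight counter is non-zero.
        nonlocal busy_ms, stamp_ms
        if in_flight != 0:
            busy_ms += now_ms - stamp_ms
        stamp_ms = now_ms

    for time_ms, kind in events:
        account(time_ms)
        if kind == "submit":
            in_flight += 1
        elif kind == "finish":
            in_flight -= 1
        else:
            raise ValueError("unknown event kind: %r" % kind)

    account(end_ms)
    return in_flight, busy_ms


# One request, completed once: the disk is busy for 10 ms out of 1000 ms.
healthy = simulate([(0, "submit"), (10, "finish")], end_ms=1000)

# Same request, but its completion is counted twice: in_flight ends at -1
# and every later millisecond is charged as busy.
buggy = simulate([(0, "submit"), (10, "finish"), (10, "finish")], end_ms=1000)
```

In this model the doubly decremented case stays "busy" for the whole interval
even though the device is idle, which matches the 100% figure in the report.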

 To (temporarily) solve the problem, you should reboot your system to reset the
 counters in /proc/{partitions,diskstats}.

That does not help: by the time the boot process finishes, in_flight is
already negative.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences,
 Laboratory of Parallel and Distributed Systems
 Address   : H-1132 Budapest Victor Hugo u. 18-22. Hungary
 Phone/Fax : +36 1 329-78-64 (secretary)
 W3: http://www.lpds.sztaki.hu
 -


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#362146: iostat: wrong avgqu-sz and %util values (fwd)

2006-04-21 Thread Sebastien Godard

Hi,

Robert Luberda wrote:

Forwarding yet another bug report.

- Forwarded message from Gabor Gombas [EMAIL PROTECTED] -

I just noticed that iostat -x -d 2 reports bogus values for avgqu-sz
and %util:

Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  rkB/s  wkB/s  avgrq-sz   avgqu-sz  await   svctm   %util
sda        0.00    5.00  0.00  6.00    0.00   88.00   0.00  44.00     14.67  978275.89   8.67  166.75  100.05
sdb        0.00    5.00  0.00  6.00    0.00   88.00   0.00  44.00     14.67  978275.87   4.58  166.75  100.05
md0        0.00    0.00  0.00  5.50    0.00   44.00   0.00  22.00      8.00       0.00   0.00    0.00    0.00
md4        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
md3        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
md2        0.00    0.00  0.00  1.50    0.00   12.00   0.00   6.00      8.00       0.00   0.00    0.00    0.00
md1        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
sdc        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00

AMD64 has similar problems, just the numbers are larger:

Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  rkB/s  wkB/s  avgrq-sz             avgqu-sz  await   svctm   %util
sda        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00    0.00  100.55
sdb        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00    0.00  100.55
sdc        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00    0.00  100.55
sdd        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00    0.00  100.55
md0        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00    0.00    0.00
md3        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00    0.00    0.00
md2        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00    0.00    0.00
md1        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00    0.00    0.00
  
I wonder whether this problem is linked to the following Linux kernel bug,
as explained in the sysstat FAQ:


*** BEGIN FAQ ***
3.6. iostat -x displays huge numbers for some fields...

Because of a Linux kernel bug, iostat -x may display huge I/O response times
(svctm) and a bandwidth utilization (%util) of 100% for some devices. Indeed
these devices have a value for the field #9 (beginning after the device name)
in /proc/{partitions,diskstats} which is always different from 0, and even
negative sometimes. Yet this field should go to zero, since it gives the
number of I/Os currently in progress (it is incremented as requests are
submitted, and decremented as they finish).
To (temporarily) solve the problem, you should reboot your system to reset the
counters in /proc/{partitions,diskstats}.
*** END FAQ ***

This could explain why we get such a value for %util (100%).
Gabor: could you please send me the contents of your /proc/diskstats
file so that I can check it?
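
The check described in the FAQ quoted above can be sketched in a few lines of
Python. This is an illustration only: the function names are made up, and the
layout assumed here (major, minor, device name, then the numbered fields) is my
reading of "field #9, beginning after the device name". Lines with fewer than
nine fields after the name are skipped.

```python
def in_flight_by_device(diskstats_text):
    """Return {device_name: field #9} for each usable line.

    Assumes each line is "major minor name field1 field2 ...", so field #9
    (I/Os currently in progress) is the 9th token after the device name.
    """
    result = {}
    for line in diskstats_text.splitlines():
        tokens = line.split()
        if len(tokens) < 3 + 9:
            # Not enough fields after the name; nothing to check here.
            continue
        name = tokens[2]
        result[name] = int(tokens[2 + 9])
    return result


def suspicious_devices(diskstats_text):
    """Devices whose in-progress counter is not zero in this snapshot."""
    counters = in_flight_by_device(diskstats_text)
    return {name: value for name, value in counters.items() if value != 0}
```

On a Linux system it could be fed with open("/proc/diskstats").read(). A
non-zero value in a single snapshot is normal on a busy disk; the symptom
described in the FAQ is a value that stays non-zero (or is negative) while the
device is idle, so several snapshots are needed.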


PS: Note that a problem with huge avgqu-sz values was also reported for
64-bit machines on LKML.
Although iostat could have been changed to handle this problem, it was
decided to fix it in the kernel's disk_stats structure instead (patch from
Ben Woodard, finally included in 2.6.17-rc1).
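
One way such a huge figure could arise, sketched in Python. This is a guess at
the mechanism, not something established in this thread: it assumes that
avgqu-sz is the difference between two samples of a cumulative kernel counter
divided by the interval in milliseconds, and that the difference is taken in
unsigned arithmetic. The function names and the sample numbers are hypothetical.

```python
def unsigned_delta(prev, cur, bits):
    """Difference cur - prev as an unsigned integer of the given width."""
    return (cur - prev) % (1 << bits)


def avg_queue_size(prev_ticks, cur_ticks, interval_ms, bits):
    """Assumed formula: counter delta divided by the interval in ms."""
    return unsigned_delta(prev_ticks, cur_ticks, bits) / interval_ms


# A counter that grows normally gives a small value.
normal = avg_queue_size(4000, 5000, interval_ms=2000, bits=64)

# A counter that went backwards slightly (made-up numbers) wraps around
# to almost 2**64 before the division.
wrapped = avg_queue_size(5000, 4000, interval_ms=1990, bits=64)

# The avgqu-sz reported on AMD64, compared with 2**64.
reported = 9269720640049696
ratio = 2**64 / reported
```

Under these assumptions the wrapped value is of the same order as the AMD64
figure in the report, and 2**64 divided by the reported value is very close to
1990, which would fit a sampling interval of about 2 seconds.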


Regards,

--
Sébastien Godard (sysstat at wanadoo.fr)
http://perso.wanadoo.fr/sebastien.godard/






Bug#362146: iostat: wrong avgqu-sz and %util values (fwd)

2006-04-19 Thread Robert Luberda
Hi,

Forwarding yet another bug report.


Regards,
robert

- Forwarded message from Gabor Gombas [EMAIL PROTECTED] -

From: Gabor Gombas [EMAIL PROTECTED]
Subject: Bug#362146: iostat: wrong avgqu-sz and %util values
To: Debian Bug Tracking System [EMAIL PROTECTED]
X-Mailer: reportbug 3.20
Date: Wed, 12 Apr 2006 16:16:03 +0200

Package: sysstat
Version: 6.1.1-1
Severity: normal


Hi,

I just noticed that iostat -x -d 2 reports bogus values for avgqu-sz
and %util:

Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  rkB/s  wkB/s  avgrq-sz   avgqu-sz  await   svctm   %util
sda        0.00    5.00  0.00  6.00    0.00   88.00   0.00  44.00     14.67  978275.89   8.67  166.75  100.05
sdb        0.00    5.00  0.00  6.00    0.00   88.00   0.00  44.00     14.67  978275.87   4.58  166.75  100.05
md0        0.00    0.00  0.00  5.50    0.00   44.00   0.00  22.00      8.00       0.00   0.00    0.00    0.00
md4        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
md3        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
md2        0.00    0.00  0.00  1.50    0.00   12.00   0.00   6.00      8.00       0.00   0.00    0.00    0.00
md1        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00
sdc        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00

AMD64 has similar problems, just the numbers are larger:

Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  rkB/s  wkB/s  avgrq-sz             avgqu-sz  await   svctm   %util
sda        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00    0.00  100.55
sdb        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00    0.00  100.55
sdc        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00    0.00  100.55
sdd        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00  9269720640049696.00   0.00    0.00  100.55
md0        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00    0.00    0.00
md3        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00    0.00    0.00
md2        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00    0.00    0.00
md1        0.00    0.00  0.00  0.00    0.00    0.00   0.00   0.00      0.00                 0.00   0.00    0.00    0.00

Gabor


-- System Information:
Debian Release: testing/unstable
  APT prefers unstable
  APT policy: (990, 'unstable'), (500, 'testing'), (500, 'stable'), (101, 'experimental')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.16libata
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

Versions of packages sysstat depends on:
ii  debconf [debconf-2.0]  1.4.72     Debian configuration management sy
ii  libc6                  2.3.6-6    GNU C Library: Shared libraries
ii  lsb-base               3.1-3      Linux Standard Base 3.1 init scrip
ii  ucf                    2.007      Update Configuration File: preserv

Versions of packages sysstat recommends:
ii  cron                   3.0pl1-94  management of regular background p

-- debconf information:
  sysstat/notice:
  sysstat/remove_files: true
* sysstat/enable: true


- End forwarded message -

