Hello,
Have you allocated the whole "disk" file before creating the
DomU or are you using a sparse file? I can run an iozone test in DomU
without any problem (which certainly is much more disk IO intensive than
scp). Even using LVM, MD and other setups in DomU does not appear to
cause problems.
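
For anyone wanting to check which case applies: a sparse image reports fewer allocated blocks than its apparent size. Below is a minimal sketch in Python, assuming a file-backed DomU disk. The function names are illustrative, and any image path you pass in is your own; nothing here comes from the setup described in this thread.

```python
import os

def looks_sparse(st_size, st_blocks):
    """Return True if fewer bytes are allocated on disk than the apparent size."""
    # On Linux, st_blocks is counted in 512-byte units regardless of the
    # filesystem's own block size.
    allocated = st_blocks * 512
    return allocated < st_size

def is_sparse(path):
    """Apply the check above to a file on disk (path is supplied by the caller)."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

Calling is_sparse() on the image file returns True if it was created sparse and has not been fully written yet. Comparing 'du' with 'ls -l' on the same file shows the same discrepancy.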
My HW/SW setup is the same as yours, but I am running everything
on 64-bit... it may behave differently on 32-bit.
Daniel
-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ben
Sent: Friday, January 18, 2008 5:53 PM
To: RHEL5 mailing list
Subject: [rhelv5-list] Pegged IOwait and spiraling load average with
XenDomUs
Hi there,
You may remember me from such emails as "Issues with RHEL5 and PV Xen
RHEL4" and "Fast reaping, slow processes!". Now I'm back with something
new!
The issue is that whenever we do any kind of IO-intensive operation
with our Xen DomU instances (for example 'scp' copies to the filesystems
of the DomU from a remote machine) the IOwait rapidly rises to 100% and
pretty much pegs there with occasional drops to 97% or so before
returning to 100% again.
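
For reference, the IOwait percentage that tools such as 'top' and 'iostat' report can be derived from two samples of the aggregate "cpu" line in /proc/stat. The sketch below assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, and steal on newer kernels); the function names are illustrative only.

```python
import time

def parse_cpu_line(line):
    """Turn the aggregate 'cpu' line of /proc/stat into a list of tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]

def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    # Only the first eight counters (user..steal) are summed; on newer kernels
    # the trailing guest counters are already included in user time.
    deltas = [y - x for x, y in zip(a, b)][:8]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    # iowait is the fifth counter (index 4).
    return 100.0 * deltas[4] / total

def sample_iowait(interval=5.0, path="/proc/stat"):
    """Read the 'cpu' line twice, 'interval' seconds apart, and compare."""
    with open(path) as f:
        before = f.readline()
    time.sleep(interval)
    with open(path) as f:
        after = f.readline()
    return iowait_percent(before, after)
```

Running sample_iowait() inside the DomU and on the Dom0 at the same time gives directly comparable figures for the two.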
Console/SSHD'd shell response is almost nil but sometimes works. 'ls'
or other commands which require disk access are fine unless the result
hasn't been cached, in which case the response can take anything from
one minute to ten.
When pegged at 100% IOwait the load average (reported by 'top') spirals
slowly but inexorably upwards (on the 1 VCPU instance we got as high as
12.45).
If left long enough the 'scp' (a copy to a DomU in this instance,
initiated on the remote (push) machine) _will_ complete. After the
prompt is returned on the remote machine the IOwait on the DomU remains
at 100% for another five to ten minutes before dropping back down to
pretty much 0% (idle). The load average also returns to ~0.00. Only
then is console/SSHD'd shell access properly responsive again.
During these IOwait slowdowns I've managed to get a 'ps -auxwww' output
on a DomU which shows between four and six '[pdflush]' processes as well
as sadc running in the D state. I'm not clever enough to know if this
is important.
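
For context, D is uninterruptible sleep, which usually means the task is waiting on IO. On Linux such tasks count towards the load average, which would be consistent with the load climbing on a 1 VCPU instance. A minimal sketch for listing them from /proc follows; the function names are illustrative and it assumes the standard /proc/<pid>/stat layout.

```python
import os

def parse_stat(text):
    """Split the contents of /proc/<pid>/stat into (pid, comm, state)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so use the first "(" and the last ")".
    start = text.index("(")
    end = text.rindex(")")
    pid = int(text[:start].strip())
    comm = text[start + 1:end]
    state = text[end + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (IOError, OSError, ValueError):
            # The process exited between listdir() and open(), or the
            # entry was not readable; skip it.
            continue
        if state == "D":
            blocked.append((pid, comm))
    return blocked
```

Calling blocked_tasks() repeatedly during a slowdown shows whether the same pdflush and sadc processes stay blocked or whether different ones cycle through.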
Also of note is that IOwait on the Dom0 never rises above 25%, despite
activity on the other DomU at the same time (which may itself be at
100% IOwait due to a similar copy to its filesystem, this time a
straight 'cp' from an NFS share to the local disk).
Can someone tell me what's going on please? I'd be hugely grateful.
Thanks!
Setup
-----
SunFire x4200 M2, 16 GB RAM, 4 cores, RAID 1 (x2 136GB)
Dom0 - RHEL 5.1 32-bit (kernel 2.6.18-53.1.4.el5xen)
DomU - RHEL 4 Update 6 32-bit (kernel 2.6.9-67.0.1.ELxenU)
    1 VCPU, 4G, 41GB, 1 physical chassis NIC (dedicated)
    No LVM, plain ext3 (/boot (100MB), swap (2GB), / (39GB))
DomU - RHEL 4 Update 6 32-bit (kernel 2.6.9-67.0.1.ELxenU)
    3 VCPUs, 8G, 81GB, 1 NIC bridged/shared with Dom0
    No LVM, plain ext3 (/boot (100MB), swap (2GB), / (79GB))
Dom0:
# rpm -qa | egrep 'xen|virt' | grep -v rhn
xen-3.0.3-41.el5
kernel-xen-2.6.18-53.el5
python-virtinst-0.103.0-3.el5_1.1
kernel-xen-2.6.18-53.1.4.el5
xen-libs-3.0.3-41.el5
libvirt-0.2.3-9.el5
kernel-xen-devel-2.6.18-53.1.4.el5
libvirt-python-0.2.3-9.el5
DomU (both):
# rpm -qa | egrep 'xen|virt'
kernel-xenU-2.6.9-67.0.1.EL
kernel-xenU-devel-2.6.9-67.0.1.EL
Tools used:
iostat
top
mpstat
Ben
--
Unix Support, MISD, University of Cambridge, England
Plugger of wire, typer of keyboard, imparter of Clue
Life Is Short. It's All Good.
_______________________________________________
rhelv5-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/rhelv5-list