On 07/01/2014 01:29 PM, Andras Horvath wrote:
On Mon, 30 Jun 2014 16:23:45 -0400
Lamar Owen <[email protected]> wrote:
On 06/30/2014 03:52 PM, Andras Horvath wrote:
Actually the drive has its own power supply, so it is not USB powered. I
cannot tell whether the drive spins down (it did not occur to me to check),
but the CPU stays at 100% I/O wait the whole time after this happens.
I was told the disk is a WD RED, but I'll check the power mode later
with hdparm.
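
For anyone wanting to put a number on the "100% I/O wait" figure without top or dstat, here is a rough Python sketch that computes it from two samples of the aggregate "cpu" line in /proc/stat. The function names are made up for illustration, and it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time


def cpu_times(stat_text):
    """Return the numeric fields of the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Assumed field order: user nice system idle iowait irq softirq steal ...
    # Only the first eight fields are summed; guest time, where present,
    # is already included in user/nice.
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, 'interval' seconds apart (Linux only)."""
    with open(path) as f:
        first = cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = cpu_times(f.read())
    return iowait_percent(first, second)
```

Calling sample_iowait() on the affected box while the copy is stuck should give roughly the same figure that top reports as "wa".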
The only time I've personally run into the 100% I/O wait issue with EL6
was when I was trying to RAID a Seagate 1.5TB internal SATA drive with a
WD GREEN 1.5TB SATA drive. The system was basically unusable, with
frequent and long forays into 100% iowait territory. Replacing the WD
GREEN drive with another 1.5TB Seagate fixed that. It could be WD's
TLER/non-TLER (Time-Limited Error Recovery) handling doing this. More
info on this at http://www.wdc.com/en/library/other/2579-001098.pdf and
googling 'WD TLER' yields a lot of hits.
Another possibility is that the idle timer is set up on the disk; I
would think that it would hit you sooner, though, if it was that issue.
I ran into that sort of issue with an eSATA Seagate a long time ago,
where throughput was good but after a while it would error out. For
some reason the standard Linux write caching and the timeout interacted
badly. There's more about the WD RED and GREEN drives and this idle
timer at
http://forums.freenas.org/index.php?threads/hacking-wd-greens-and-reds-with-wdidle3-exe.18171/
with an open source tool at http://idle3-tools.sourceforge.net/
A note:

  hdparm -I /dev/sda | grep -i pow
     *  Power Management feature set
        Power-Up In Standby feature set
     *  SET_FEATURES required to spinup after power up
     *  Host-initiated interface power management
        Device-initiated interface power management
I cannot access the power levels through the USB interface. I'll check the
eSATA connection tomorrow.
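
As a reading aid for the hdparm output: in the "hdparm -I" feature list, a leading "*" marks a feature that is currently enabled, while an unmarked line is supported but not enabled. A small Python sketch that splits such lines accordingly; the function name is made up, and it assumes the text has already been filtered down to feature lines (as the grep does).

```python
def split_features(hdparm_text):
    """Split 'hdparm -I' feature lines into (enabled, supported_only) lists.

    Assumes every non-empty input line is a feature line, e.g. the
    result of piping 'hdparm -I' through grep.
    """
    enabled, supported_only = [], []
    for raw in hdparm_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            # '*' marks an enabled feature; drop the marker itself.
            enabled.append(line.lstrip("*").strip())
        else:
            supported_only.append(line)
    return enabled, supported_only
```

Applied to the five lines quoted here, it would put "Power-Up In Standby feature set" and "Device-initiated interface power management" in the supported-only list and the other three in the enabled list.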
I restarted the copy, and within a minute the CPU was stuck at 100% I/O wait again. The
"iotop" output shows absolutely nothing, as if there were no load on the disks
at all. Interrupts and context switches are around 20-50, so almost nothing (dstat output).
Disk activity is zero. The load average is 5.01. The rsync processes that I'm using for the copy
cannot be killed or force-killed.
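
One common explanation for unkillable processes combined with a high load average and idle disks is that the processes are in uninterruptible sleep (state "D"), waiting on I/O that never completes. Such processes count toward the load average and do not act on signals, SIGKILL included, until the I/O call returns. A rough Python sketch to list them from /proc follows; the function names are made up for illustration, and whether this is what is happening here is only a guess.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible(stat_lines):
    """Keep only the processes whose state field is 'D'."""
    parsed = (parse_stat(line) for line in stat_lines)
    return [(pid, comm) for pid, comm, state in parsed if state == "D"]


def scan_proc(proc_root="/proc"):
    """Collect the stat line of every process and filter for state 'D'."""
    lines = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                lines.append(f.read())
        except OSError:
            continue  # the process exited while we were scanning
    return uninterruptible(lines)
```

If scan_proc() lists the rsync processes, they are blocked inside the kernel, and the place to look is the device or its driver rather than rsync itself.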
Any ideas? Thanks.
Andras
Circling back around to the "is it spinning" question, for externals in
a workable enclosure, I've found the "Jurassic Park" test to be rather
trustworthy.[1]
Does dmesg report anything interesting?
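
To narrow down "interesting", here is a minimal Python sketch that keeps only the kernel log lines containing a few keywords. The keyword list is an assumption for illustration (hung-task warnings, I/O errors, resets, timeouts), not an exhaustive or authoritative set, and the function name is made up.

```python
# Hypothetical keyword list; extend it for the hardware in question.
KEYWORDS = ("blocked for more than", "i/o error", "reset", "timed out")


def interesting_lines(log_text, keywords=KEYWORDS):
    """Return the log lines that contain any keyword, ignoring case."""
    hits = []
    for line in log_text.splitlines():
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(line)
    return hits
```

Feeding it the text that dmesg prints should surface hung-task warnings for rsync or resets on the USB link, if there are any.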
Pat
[1] https://www.youtube.com/watch?v=1koa2xAxCAw
--
Pat Riehecky
Scientific Linux developer
http://www.scientificlinux.org/