On Tue, 1 Jul 2014 14:24:48 -0500 Pat Riehecky <[email protected]> wrote:
> On 07/01/2014 01:29 PM, Andras Horvath wrote:
> > On Mon, 30 Jun 2014 16:23:45 -0400
> > Lamar Owen <[email protected]> wrote:
> >
> >> On 06/30/2014 03:52 PM, Andras Horvath wrote:
> >>> Actually the drive has its own power so it is not USB powered. I
> >>> cannot tell if the drive spins down (did not get the idea to check
> >>> it), but the CPU is in 100% I/O wait all the time after this happens.
> >>> I was told the disk is a WD RED, but I'll check the power mode later
> >>> with hdparm.
> >> The only time I've personally run into the 100% I/O wait issue with EL6
> >> was when I was trying to RAID a Seagate 1.5TB internal SATA drive with a
> >> WD GREEN 1.5TB SATA drive. The system was basically unusable, with
> >> frequent and long forays into 100% iowait territory. Replacing the WD
> >> GREEN drive with another 1.5TB Seagate fixed that. It could be WD's
> >> TLER/non-TLER (Time-Limited Error Recovery) handling doing this. More
> >> info on this at http://www.wdc.com/en/library/other/2579-001098.pdf and
> >> googling 'WD TLER' yields a lot of hits.
> >>
> >> Another possibility is that the idle timer is set up on the disk; I
> >> would think that it would hit you sooner, though, if it was that issue.
> >> I ran into that sort of issue with an eSATA Seagate a long time ago,
> >> where throughput was good but after a while it would error out. For
> >> some reason the standard Linux write caching and the timeout interacted
> >> badly. There's more about the WD RED and GREEN drives and this idle
> >> timer at
> >> http://forums.freenas.org/index.php?threads/hacking-wd-greens-and-reds-with-wdidle3-exe.18171/
> >> with some open source tool at http://idle3-tools.sourceforge.net/
> > A note:
> >
> > hdparm -I /dev/sda | grep -i pow
> >    *    Power Management feature set
> >         Power-Up In Standby feature set
> >    *    SET_FEATURES required to spinup after power up
> >    *    Host-initiated interface power management
> >         Device-initiated interface power management
> >
> > I cannot access the power levels through the USB interface. I'll check the
> > eSATA connection tomorrow.
> >
> > I restarted copying again, and in a minute the CPU hung again with 100% I/O
> > wait. The "iotop" output shows absolutely nothing, as if there was no load
> > on the disks at all. Interrupt and context switch is around 20-50, so
> > almost nothing (dstat output). Disk operation is zero. Load is at 5.01. The
> > rsync processes that I'm using for the copy cannot be killed or force
> > killed.
> >
> > Any idea? Thanks.
> >
> >
> > Andras
>
> Circling back around to the "is it spinning" question, for externals in
> a workable enclosure, I've found the "Jurassic Park" test to be rather
> trustworthy.[1]
>
> Does dmesg report anything interesting?
>
> Pat
>
>
> [1] https://www.youtube.com/watch?v=1koa2xAxCAw

This part of the dmesg output repeats forever:

sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 41 19 e0 00 00 f0 00
__ratelimit: 20 callbacks suppressed
Buffer I/O error on device sda1, logical block 46670396
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 46670397
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 46670398
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 46670399
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 46670400
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 46670401
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 46670402
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 46670403
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 46670404
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 46670405
lost page write due to I/O error on sda1
usb 1-4: reset high speed USB device number 2 using ehci_hcd
usb 1-4: reset high speed USB device number 2 using ehci_hcd
usb 1-4: reset high speed USB device number 2 using ehci_hcd

I'll have physical access to the disk only tomorrow. Will report back.

Andras
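
For reference, the failing Write(10) command in the dmesg output can be decoded by hand to see which sectors it was aimed at. Below is a minimal sketch in Python, assuming the standard SCSI Write(10) layout (opcode 0x2a in byte 0, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8). The function name decode_write10 is made up for illustration, and the 512-byte sector and 4 KiB buffer block sizes used in the last step are assumptions that the thread does not confirm.

```python
def decode_write10(cdb_hex):
    """Decode a SCSI Write(10) CDB given as space-separated hex bytes.

    Returns (lba, transfer_length), both counted in device sectors.
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte Write(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: number of sectors
    return lba, length


# CDB copied from the dmesg output above.
lba, length = decode_write10("2a 00 16 41 19 e0 00 00 f0 00")
print(lba, length)  # 373365216 240

# Assumption: 512-byte sectors and 4 KiB buffer blocks, so 8 sectors per block.
first_failed_block = 46670396  # first "logical block" reported for sda1
print(lba - first_failed_block * 8)  # 2048
```

Under those assumptions the aborted write starts exactly at the first failed logical block if sda1 begins at sector 2048, which would mean the buffer errors and the aborted command describe the same write rather than two separate problems. The actual partition start can be checked with fdisk -l /dev/sda.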
