Tejun Heo wrote:
> Jonas Lundgren wrote:
> [--snip--]
>> Also, it doesn't matter if I enable AHCI in the BIOS. (But with AHCI
>> enabled the disks spin down/power down when I boot, only to power up
>> again a few seconds later. The boot process freezes until the disks
>> have spun up again. This happens when the kernel probes the SATA
>> controller ports at bootup: the disks all spin down at the same time,
>> but spin up one by one as they're probed.)
>
> Likely fix is pending for this problem.
>
>> I've tried changing the I/O scheduler; the only noticeable difference is
>> with "noop", where I get about 20 MB/s write instead of 4 MB/s. I have
>> no idea why this is :P
>>
>> Example of what I mean with crappy performance:
>> dd if=/dev/zero of=test232 bs=1M count=100; time sync
>> 100+0 records in
>> 100+0 records out
>> 104857600 bytes (105 MB) copied, 0.130424 s, 804 MB/s
>> real 0m21.104s
>> user 0m0.000s
>> sys 0m0.011s
>>
>> 21 seconds to do a sequential write of 100 MB. And during this time ALL
>> other disk IO gets starved; I can't do anything that uses disk IO for
>> the duration (not even `ls`).
>
> What does the kernel say during this writing? Can you post the result
> of the following?
>
> 1. reboot
> 2. dmesg -c
> 3. time dd if=/dev/zero.. blah
> 4. dmesg
>
> Also, does 'mount -o remount,barrier=0 /' change anything?
I will post this info as soon as I can "reproduce" the error.
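In the meantime, the 4 MB/s figure is just the 100 MB divided by the dd time plus the sync time, since dd itself only measures the copy into the page cache. A small Python sketch of that arithmetic (the function name is made up for illustration, and the numbers are the ones from the dd example above):

```python
def effective_write_rate_mb_s(num_bytes, dd_seconds, sync_seconds):
    """Return the write rate in MB/s (10**6 bytes), counting the sync time.

    dd reports only the time to copy into the page cache; the data is on
    disk only after sync returns, so both times are added together.
    """
    total_seconds = dd_seconds + sync_seconds
    return num_bytes / total_seconds / 1e6


if __name__ == "__main__":
    # Values taken from the dd + "time sync" run quoted above.
    rate = effective_write_rate_mb_s(104857600, 0.130424, 21.104)
    print(f"effective write rate: {rate:.2f} MB/s")
```

With the values above this works out to a bit under 5 MB/s, nowhere near the 804 MB/s that dd prints.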
>
>> Yet, a hdparm shows a decent read
>> hdparm -tT /dev/md4
>> /dev/md4:
>> Timing cached reads: 8060 MB in 1.99 seconds = 4042.19 MB/sec
>> Timing buffered disk reads: 400 MB in 3.00 seconds = 133.28 MB/sec
>>
>> dd if=1GBzeroFile of=/dev/null bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 11.4335 s, 91.7 MB/s
>>
>> This is the cpu usage stats I get from top when running the dd write:
>> Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 99.0%wa, 0.5%hi, 0.5%si, 0.0%st
>> Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>
>> Pretty crappy read speeds compared to what I got on my previous mobo
>> (around 140 MB/s), but still a lot better than the 4 MB/s I get when
>> writing.
>
> Which controller did you use on your previous mobo? If you're using
> ata_piix and hook two hard drives as primary and secondary on the same
> channel, some level of performance degradation is expected. ata_piix
> can issue commands to only one of the two drives at a time. Is the
> read performance still bad in ahci mode?
Atm I run the ICH8 SATA ports in AHCI mode with "IDE bus master" turned
off in the BIOS. (To be honest, I don't really know what this option
does; there is no info about it in the BIOS or the mobo manual.) The
drives are connected to ports 1, 3, 6 and 8: raptor + raptor on 1+3, and
WD 250G + WD 250G (also a raid0) on 6+8.
>
> [--snip--]
>> Dmesg output from the error(s): (sda and sdb are 2 * 74GB raptor SATA
>> drives in a Linux software raid0)
>>
>> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata1.00: (BMDMA stat 0x20)
>> ata1.00: tag 0 cmd 0xca Emask 0x4 stat 0x40 err 0x0 (timeout)
>
> This might be a missed interrupt. It's a write. The DMA engine has
> finished transferring all the data and the device is ready for the next
> command, but the interrupt never arrived.
>
>> ata1: port is slow to respond, please be patient
>> ata1: port failed to respond (30 secs)
>> ata1: soft resetting port
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ATA: abnormal status 0xD0 on port 0xFA07
>> ata1.00: qc timeout (cmd 0xec)
>> ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>> ata1.00: revalidation failed (errno=-5)
>> ata1: failed to recover some devices, retrying in 5 secs
>
> But this is weird. If it were a missed interrupt, softreset should have
> recovered it instantly. Something fishy is going on.
>
> [--snip--]
>> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata1.00: (BMDMA stat 0x21)
>> ata1.00: tag 0 cmd 0xc8 Emask 0x4 stat 0x40 err 0x0 (timeout)
>
> Same thing for read.
>
>> ata1: port is slow to respond, please be patient
>> ata1: port failed to respond (30 secs)
>
> Again, pre-reset wait times out. Weird.
>
>> ata1: soft resetting port
>> ata1.00: