R: ahci timeout

2011-01-13 Thread Barbara



From : nde...@gmail.com

On 27 Dec, 2010, at 21:20 , Barbara wrote:

 
 As my old PATA hard disk was failing, I had to replace it with a new SATA
 drive, onto which I moved my FreeBSD installations, as PATA drives are not
 easy to find these days.
 So I had to move one of my data drives from a VIA8237A SATA controller to
 the last free SATA slot on a Marvell 88SX6121 to make room for the new hd.
 The hd I moved was working perfectly when connected to the VIA controller.
 Now, with the Marvell, I'm getting messages like the following two while
 using the disk:
ahcich0: Timeout on slot 10
ahcich0: is  cs 3800 ss 3c00 rs 3c00 tfd 50010040 serr
 
ahcich0: Timeout on slot 5
ahcich0: is  cs 0180 ss 01e0 rs 01e0 tfd 50040040 serr
 
 
 This doesn't happen regularly. For example, downloading from a slow
 website onto it, at a few kB/s, is OK.
 But if I copy files from the disk attached to the Marvell controller to
 another disk, or for example run md5 on some files, it's very likely to
 happen.
 The process accessing the disk cannot be killed even with -9, ^C does
 nothing, and umount doesn't exit.
 If I'm copying files onto it from another disk, it can't be unmounted
 either, as the unkillable process has it in use.
 On shutdown many disks don't get unmounted, so there are a lot of fscks
 on boot, and on CURRENT (last built yesterday) FreeBSD enters the debugger
 as it fails to flush the disk caches.
 
 Relevant part from dmesg:
 
 atapci0: Marvell 88SX6121 UDMA133 controller port 0xdc00-0xdc07,0xd880-0xd883,0xd800-0xd807,0xd480-0xd483,0xd400-0xd40f mem 0xfbdffc00-0xfbdf irq 28 at device 0.0 on pci6
 ahci0: Marvell 88SX6121 AHCI SATA controller on atapci0
 ahci0: AHCI v1.00 with 2 3Gbps ports, Port Multiplier supported
 ahcich0: AHCI channel at channel 0 on ahci0
 ahcich1: AHCI channel at channel 1 on ahci0
 ata2: ATA channel 0 on atapci0
 atapci1: VIA 8237A SATA150 controller port 0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f,0xb000-0xb0ff irq 21 at device 15.0 on pci0
 ata3: ATA channel 0 on atapci1
 ata4: ATA channel 1 on atapci1
 atapci2: VIA 8237A UDMA133 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 15.1 on pci0
 ata0: ATA channel 0 on atapci2
 ata1: ATA channel 1 on atapci2
 
 ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
 ada0: ST31000528AS CC44 ATA-8 SATA 2.x device
 ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
 ada0: Command Queueing enabled
 ada0: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
 ada1 at ata3 bus 0 scbus3 target 0 lun 0
 ada1: WDC WD2500KS-00MJB0 02.01C03 ATA-7 SATA 2.x device
 ada1: 150.000MB/s transfers (SATA 1.x, UDMA5, PIO 8192bytes)
 ada1: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
 ada2 at ata4 bus 0 scbus4 target 0 lun 0
 ada2: ST3500320AS SD1A ATA-8 SATA 1.x device
 ada2: 150.000MB/s transfers (SATA 1.x, UDMA5, PIO 8192bytes)
 ada2: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
 ada3 at ata0 bus 0 scbus5 target 0 lun 0
 ada3: MAXTOR STM3160212A 3.AAJ ATA-7 device
 ada3: 100.000MB/s transfers (UDMA5, PIO 8192bytes)
 ada3: 152627MB (312581808 512 byte sectors: 16H 63S/T 16383C)
 
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Just to add a me too.

I'm running -STABLE but have the same problems with Marvell 88SX6121 giving 
ahci timeout messages.

Regards,
Nikolay


Nikolay, thanks for the feedback, even if unfortunately it's negative...
I see the problem on both 8-STABLE (8.2-PRERELEASE now) and 9.0-CURRENT,
both rebuilt no more than a week ago.
I've also tried rebuilding the kernel with:
   options CAMDEBUG
   options CAM_DEBUG_BUS=0
   options CAM_DEBUG_TARGET=0
   options CAM_DEBUG_LUN=0
   options CAM_DEBUG_FLAGS=CAM_DEBUG_TRACE
(0 is the bus/target/lun of the Marvell controller and the attached hd)
and ran a test computing md5 for about 60GB of files.
Obviously, as the debug options were active, it ran successfully without
any problems :)
Is there any other info I should provide, or any other test that I can do?

Thanks
Barbara
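
The md5 read test described above can be approximated with a short script that also prints the read rate per file, which may help to spot the uneven read rates reported in this thread. This is only a sketch: the function names, the 1 MiB chunk size and the output layout are all made up for illustration, and the rates will look too good for any file that is already in the OS cache.

```python
import hashlib
import os
import sys
import time

CHUNK = 1024 * 1024  # read size per call (1 MiB); an arbitrary choice


def md5_with_rate(path):
    """Return (hex digest, bytes read, seconds taken) for one file."""
    digest = hashlib.md5()
    total = 0
    start = time.monotonic()
    with open(path, "rb") as fh:
        while True:
            block = fh.read(CHUNK)
            if not block:
                break
            digest.update(block)
            total += len(block)
    return digest.hexdigest(), total, time.monotonic() - start


def scan(root):
    """Hash every regular file under root, yielding (path, digest, size, secs)."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                yield (path,) + md5_with_rate(path)


# Only runs when a directory is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for path, digest, size, secs in scan(sys.argv[1]):
        rate = size / secs / 1e6 if secs > 0 else float("inf")
        print(f"{digest}  {rate:8.1f} MB/s  {path}")
```

Pointing it at a directory on the disk attached to the Marvell controller gives one line per file, so a hang or a slow region can be tied to the file being read at that moment.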


Maybe the attached disk has some problem which isn't handled when it's
attached to an 88SE6121 controller?
From what I can see, when it is connected to the other internal SATA
controller, which is a VIA 8237A, or to an external PCIe Sil3132 controller
using the siis driver, those timeouts don't happen.
Anyway, I see something which I don't understand.
I tried reading some files (~1 GB) from the same slice (the 1st one, ~200 GB)
while looking at gstat.
Some files are read at 100 MB/s, others at about 4 MB/s. It seems
that a zone of the disk is very slow.
The disk is a Seagate 7200.12 and neither smartmontools nor SeaTools (Seagate 
diagnostic) 

R: ahci timeout

2010-12-28 Thread Barbara



As my old PATA hard disk was failing, I had to replace it with a new SATA
drive, onto which I moved my FreeBSD installations, as PATA drives are not
easy to find these days.
So I had to move one of my data drives from a VIA8237A SATA controller to the
last free SATA slot on a Marvell 88SX6121 to make room for the new hd.
The hd I moved was working perfectly when connected to the VIA controller.
Now, with the Marvell, I'm getting messages like the following two while
using the disk:
ahcich0: Timeout on slot 10
ahcich0: is  cs 3800 ss 3c00 rs 3c00 tfd 50010040 serr

ahcich0: Timeout on slot 5
ahcich0: is  cs 0180 ss 01e0 rs 01e0 tfd 50040040 serr


This doesn't happen regularly. For example, downloading from a slow website
onto it, at a few kB/s, is OK.
But if I copy files from the disk attached to the Marvell controller to
another disk, or for example run md5 on some files, it's very likely to
happen.
The process accessing the disk cannot be killed even with -9, ^C does
nothing, and umount doesn't exit.
If I'm copying files onto it from another disk, it can't be unmounted either,
as the unkillable process has it in use.
On shutdown many disks don't get unmounted, so there are a lot of fscks on
boot, and on CURRENT (last built yesterday) FreeBSD enters the debugger as it
fails to flush the disk caches.
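
For anyone reading the two timeout reports above: if cs, ss and rs are taken to be per-slot bitmasks (bit n set meaning command slot n), the values can be expanded into slot numbers as below. That reading of the field names is an assumption and is not stated anywhere in this thread, and slots_from_mask is just a hypothetical helper for illustration.

```python
def slots_from_mask(mask_hex):
    """Return the slot numbers whose bits are set in a hex bitmask string."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(32) if mask & (1 << bit)]


# Values copied from the two timeout reports in this message.
reports = [
    {"slot": 10, "cs": "3800", "ss": "3c00", "rs": "3c00"},
    {"slot": 5, "cs": "0180", "ss": "01e0", "rs": "01e0"},
]

for report in reports:
    print("timeout on slot", report["slot"])
    for name in ("cs", "ss", "rs"):
        print(" ", name, report[name], "->", slots_from_mask(report[name]))
```

Under that assumption, in both reports the slot that timed out is the lowest one set in ss and rs, and it is not set in cs. Whether that is significant is a question for someone who knows the driver.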

Relevant part from dmesg:

atapci0: Marvell 88SX6121 UDMA133 controller port 0xdc00-0xdc07,0xd880-0xd883,0xd800-0xd807,0xd480-0xd483,0xd400-0xd40f mem 0xfbdffc00-0xfbdf irq 28 at device 0.0 on pci6
ahci0: Marvell 88SX6121 AHCI SATA controller on atapci0
ahci0: AHCI v1.00 with 2 3Gbps ports, Port Multiplier supported
ahcich0: AHCI channel at channel 0 on ahci0
ahcich1: AHCI channel at channel 1 on ahci0
ata2: ATA channel 0 on atapci0
atapci1: VIA 8237A SATA150 controller port 0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb807,0xb480-0xb483,0xb400-0xb40f,0xb000-0xb0ff irq 21 at device 15.0 on pci0
ata3: ATA channel 0 on atapci1
ata4: ATA channel 1 on atapci1
atapci2: VIA 8237A UDMA133 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 15.1 on pci0
ata0: ATA channel 0 on atapci2
ata1: ATA channel 1 on atapci2

ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: ST31000528AS CC44 ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada1 at ata3 bus 0 scbus3 target 0 lun 0
ada1: WDC WD2500KS-00MJB0 02.01C03 ATA-7 SATA 2.x device
ada1: 150.000MB/s transfers (SATA 1.x, UDMA5, PIO 8192bytes)
ada1: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada2 at ata4 bus 0 scbus4 target 0 lun 0
ada2: ST3500320AS SD1A ATA-8 SATA 1.x device
ada2: 150.000MB/s transfers (SATA 1.x, UDMA5, PIO 8192bytes)
ada2: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
ada3 at ata0 bus 0 scbus5 target 0 lun 0
ada3: MAXTOR STM3160212A 3.AAJ ATA-7 device
ada3: 100.000MB/s transfers (UDMA5, PIO 8192bytes)
ada3: 152627MB (312581808 512 byte sectors: 16H 63S/T 16383C)


I've tried the following settings in /boot/loader.conf:
hw.pci.enable_msix=0
hw.pci.enable_msi=0
kern.cam.ada.default_timeout=60
with no luck.
I had to hard reset while playing a video from the hd connected to the Marvell
controller because, after running shutdown, it was stuck trying to umount all
the partitions.
Even ctrl+alt+del or a short press of the power button didn't turn it off.


I've also run smartctl -t long on the disk and no errors are reported:
$ smartctl -l selftest /dev/ada0
# 1  Extended offline    Completed without error       00%      5542         -
$ smartctl -l error /dev/ada0
No Errors Logged

Here's my verbose dmesg:
http://pastebin.com/sp6Js9Yj

Btw, why is the controller identified as 88SX6121?
Shouldn't it be 88SE6121 (s/X/E/)?
That is what is reported on the ASUS website, in the mb manual and so on, and
even when running lshal!
There is no 88SX6121 here:
http://en.wikipedia.org/wiki/List_of_Marvell_Technology_Group_chipsets


Thanks
Barbara
