Arno J. Klaassen wrote:
Hello,
Alexander Sabourenkov [EMAIL PROTECTED] writes:
Hello.
I have ported the workaround for the hardware bug that causes data
corruption on Promise SATA300 TX4 cards to RELENG_7.
Bug description:
The SATA300 TX4 hardware chokes if the last PRD entry (in a DMA transfer) is
larger than 164 bytes. This was found while analysing the vendor-supplied
Linux driver.
Workaround:
Split the trailing PRD entry if it is larger than 164 bytes.
The two supplied patches fix the problem on my machine.
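For readers who do not have the patches at hand, the workaround can be sketched roughly as follows. This is only an illustration, not the code from the patches: struct prd_entry, prd_split_last() and MAX_LAST_PRD_BYTES are hypothetical names, and details of a real PRD table such as the end-of-table flag are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; these are not the driver's structures. */
#define MAX_LAST_PRD_BYTES 164u	/* limit quoted in the bug description */

struct prd_entry {
	uint32_t addr;		/* bus address of the segment */
	uint32_t count;		/* segment length in bytes */
};

/*
 * Split the trailing entry of a table that holds n entries (capacity cap)
 * so that the final entry is at most MAX_LAST_PRD_BYTES long.
 * Returns the new entry count, or 0 if the table has no room for the
 * extra entry.
 */
static size_t
prd_split_last(struct prd_entry *tbl, size_t n, size_t cap)
{
	assert(tbl != NULL && n >= 1);

	struct prd_entry *last = &tbl[n - 1];

	if (last->count <= MAX_LAST_PRD_BYTES)
		return (n);		/* already short enough */
	if (n >= cap)
		return (0);		/* caller has to handle a full table */

	/* Keep the head in place and move the final 164 bytes to a new entry. */
	uint32_t head = last->count - MAX_LAST_PRD_BYTES;

	tbl[n].addr = last->addr + head;
	tbl[n].count = MAX_LAST_PRD_BYTES;
	last->count = head;
	return (n + 1);
}
```

With a 512-byte trailing segment, this leaves a 348-byte entry followed by a 164-byte entry that covers the same memory.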
Definitely an improvement, but not sufficient (for my setup):
amd64-releng_6 on an ASUS A8V UP (the box ran rock-stable
for years on i386-releng_5 with the same hardware, apart from the TX4 and
the drives).
From dmesg:
atapci0: Promise PDC40718 SATA300 controller port 0xe000-0xe07f,0xd800-0xd8ff
mem 0xfbb0-0xfbb00fff,0xfba0-0xfba1 irq 18 at device 13.0 on pci0
ata2: ATA channel 0 on atapci0
ata3: ATA channel 1 on atapci0
ata4: ATA channel 2 on atapci0
ata5: ATA channel 3 on atapci0
atapci1: VIA 6420 SATA150 controller port
0xd400-0xd407,0xd000-0xd003,0xc800-0xc807,0xc400-0xc403,0xc000-0xc00f,0xb800-0xb8ff
irq 20 at device 15.0 on pci0
ata6: ATA channel 0 on atapci1
ata7: ATA channel 1 on atapci1
atapci2: VIA 8237 UDMA133 controller port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 15.1 on pci0
ata0: ATA channel 0 on atapci2
ata1: ATA channel 1 on atapci2
[ ... ]
ad0: 38166MB Seagate ST3402111A 3.AAJ at ata0-master UDMA100
ad6: 476940MB WDC WD5000AAKS-00TMA0 12.01C01 at ata3-master SATA300
ad12: 305245MB WDC WD3200JD-22KLB0 08.05J08 at ata6-master SATA150
Booting from ad0, with a simple gconcat over ad6 and ad12.
Improvement: I can now fsck /dev/concat/data without
ad6 being detached.
Persistent problem: when I rsync an NFS-mounted disk to /dev/concat/data,
I get the following after some gigabytes of data have been transferred:
Nov 2 16:39:55 charlotte kernel: ad6: WARNING - WRITE_DMA UDMA ICRC error
(retrying request) LBA=268435392
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE
taskqueue timeout - completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE
taskqueue timeout - completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE
taskqueue timeout - completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE
taskqueue timeout - completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SET_MULTI taskqueue timeout -
completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: TIMEOUT - WRITE_DMA retrying (0 retries
left) LBA=268435392
Nov 2 16:40:50 charlotte kernel: ad6: FAILURE - WRITE_DMA
status=ff<BUSY,READY,DMA_READY,DSC,DRQ,CORRECTABLE,INDEX,ERROR>
error=ff<ICRC,UNCORRECTABLE,MEDIA_CHANGED,NID_NOT_FOUND,MEDIA_CHANGE_REQEST,ABORTED,NO_MEDIA,ILLEGAL_LENGTH>
LBA=268435392
Nov 2 16:40:50 charlotte kernel:
g_vfs_done():concat/data[WRITE(offset=137438920704, length=131072)]error = 5
Nov 2 16:40:50 charlotte kernel: ad6: TIMEOUT - WRITE_DMA48 retrying (1 retry
left) LBA=268435648
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - WRITE_DMA48 UDMA ICRC error
(retrying request) LBA=268435648
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE
taskqueue timeout - completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE
taskqueue timeout - completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE
taskqueue timeout - completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE
taskqueue timeout - completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: WARNING - SET_MULTI taskqueue timeout -
completing request directly
Nov 2 16:40:50 charlotte kernel: ad6: FAILURE - WRITE_DMA48 timed out
LBA=268435648
Nov 2 16:40:50 charlotte kernel:
g_vfs_done():concat/data[WRITE(offset=137439051776, length=131072)]error = 5
...
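A side note on the numbers in this log: the g_vfs_done() offsets divided by 512 give exactly the LBAs that ad6 reports, and the first failing 131072-byte write starts 64 sectors below 2^28, the first LBA that a 28-bit command such as WRITE_DMA cannot address. The small sketch below only reproduces that arithmetic. It assumes 512-byte sectors, the helper names are made up for illustration, and whether this boundary has anything to do with the failure is not established in this thread.

```c
#include <stdint.h>

#define SECTOR_BYTES	512u			/* assumed sector size */
#define LBA28_LIMIT	(UINT64_C(1) << 28)	/* first LBA beyond 28-bit addressing */

/* Convert a byte offset on the provider to a sector number. */
static uint64_t
offset_to_lba(uint64_t byte_offset)
{
	return (byte_offset / SECTOR_BYTES);
}

/*
 * Nonzero if a transfer starts below the 28-bit limit and ends above it.
 * length is in bytes and assumed to be a multiple of SECTOR_BYTES.
 */
static int
crosses_lba28_limit(uint64_t byte_offset, uint64_t length)
{
	uint64_t first = offset_to_lba(byte_offset);
	uint64_t end = first + length / SECTOR_BYTES;	/* one past the last sector */

	return (first < LBA28_LIMIT && end > LBA28_LIMIT);
}
```

For the first failing request (offset=137438920704, length=131072), offset_to_lba() gives 268435392 and crosses_lba28_limit() is true. The second request (offset=137439051776) starts at LBA 268435648, already above the limit, and the log shows it being issued as WRITE_DMA48.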
I will test again with #define PDC_MAXLASTSGSIZE 32*4 (just to see
if that makes a difference)
Regards, Arno
Just a guess here: I bet that patch helped, but there are compound
problems regarding SATA on amd64 in 7.x. Do a quick search for [sata]
(especially g_vfs_done) in the PR database. Hopefully this removed a
layer of bugs so the other ones are easier to fix.