RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-24 Thread Daniel Eriksson
Munenori Ohuchi wrote:

 Could you try the following patch?
 ...
 If you have a device like 'ad4' which is detected as 
 'udma=UDMA100', this patch will work.

The drive in question looked like this in the verbose boot-log:

ata2-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad4: 715404MB WDC WD7500AAKS-00RBA0 30.04G30 at ata2-master SATA300
ad4: 1465149168 sectors [1453521C/16H/63S] 16 sectors/interrupt 1 depth
queue
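
The negotiated link speed is the last field of each disk line in the verbose boot log, so it can be pulled out with a small filter. This is a minimal sketch, assuming the log lines have the same shape as the ones quoted in this thread; the function name `sata_speeds` is made up for illustration.

```sh
#!/bin/sh
# sata_speeds: hypothetical helper. Reads boot-log text on stdin and prints
# "<disk> <channel> <speed>" for every adN line that ends in a SATAnnn tag.
sata_speeds() {
    sed -n 's/^\(ad[0-9][0-9]*\): .* at \(ata[0-9][0-9]*-[a-z][a-z]*\) \(SATA[0-9][0-9]*\)$/\1 \2 \3/p'
}

# Typical use on a live system:
#   dmesg | sata_speeds
```

Lines that do not end in a SATA speed tag (for example the geometry line that follows) are ignored.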

I tried the patch even though it probably isn't applicable to my
problem, and unsurprisingly it didn't help. :-)

Thanks for trying to help though!

___
Daniel Eriksson (http://www.toomuchdata.com/)
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-23 Thread Daniel Eriksson
I wrote:

 I am having problems with silent data corruption on (some) drives
 connected to an MCP55 SATA controller.

The original problem showed up when talking to (brand new) Samsung 1TB
drives in SATA-300 mode hooked up to the onboard controller. I have now
tested with a 750GB Seagate drive in both SATA-300 and SATA-150 mode.
Unfortunately the problem was not Samsung-related or SATA-300 specific.

This points to a driver problem with the chipset/controller combination,
or possibly some sort of strange interaction with other hardware
(interrupts?). I have no idea how to troubleshoot this any further.

___
Daniel Eriksson (http://www.toomuchdata.com/)


Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-23 Thread Jeremy Chadwick
On Wed, Jul 23, 2008 at 02:32:01PM +0200, Daniel Eriksson wrote:
  I am having problems with silent data corruption on (some) drives
  connected to an MCP55 SATA controller.
 
 The original problem showed up when talking to (brand new) Samsung 1TB
 drives in SATA-300 mode hooked up to the onboard controller. I have now
 tested with a 750GB Seagate drive in both SATA-300 and SATA-150 mode.
 Unfortunately the problem was not Samsung-related or SATA-300 specific.
 
 This points to a driver problem with the chipset/controller combination,
 or possibly some sort of strange interaction with other hardware
 (interrupts?). I have no idea how to troubleshoot this any further.

Or it could just be a bad motherboard.  (I'd need to go re-read the
thread to remember if you were seeing this on more than one board.)

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-23 Thread Manjunath Ranganathaiah
On 7/1/08, Daniel Eriksson [EMAIL PROTECTED] wrote:

 The server with 570 SLI chipset has a bunch of new SATA-300 drives
 hooked up to the MCP55 controller and it is giving me silent data
 corruption (easily detectable by running ZFS scrub, every time I run it
 new checksum errors show up).


Could be in-memory data corruption. How much RAM is installed on the
system?

-Manjunath


RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-23 Thread Daniel Eriksson
Manjunath Ranganathaiah wrote:

 Could be in-memory data corruption. How much RAM is installed on the
 system?

I doubt it. If it was a RAM problem then all drives would be affected.

___
Daniel Eriksson (http://www.toomuchdata.com/)


Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-23 Thread Munenori Ohuchi

Hi Daniel,

Could you try the following patch?
You can apply this patch on FreeBSD 7.0 just by copying and
pasting it into your shell.

Before you apply the patch, you can check whether it applies to
your environment as follows.


1. Set bootverbose mode.

cat >> /boot/loader.conf << EOF
boot_verbose="YES"
EOF

2. Reboot your machine.
3. Check the dmesg log of your HDDs as follows.

dmesg | grep ata
.
.
ata2-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=40 wire

ad4: 115328MB Super Talent Tech 02.10103 at ata2-master SATA150
ata3-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad6: 953869MB Hitachi HDS721010KLA330 GKAOA51D at ata3-master SATA150

If you have a device like 'ad4' which is detected as 
'udma=UDMA100', this patch will work.
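
The check in step 3 can be narrowed to just the affected channels. This is a minimal sketch, assuming the verbose dmesg lines look like the ones shown above; the helper name `udma100_channels` is made up for illustration.

```sh
#!/bin/sh
# udma100_channels: hypothetical helper. Reads verbose dmesg text on stdin
# and prints the channel name (e.g. ata2-master) of every device that
# reports udma=UDMA100, the case this patch is meant for.
udma100_channels() {
    awk '/^ata[0-9]+-(master|slave):/ {
        for (i = 2; i <= NF; i++) {
            if ($i == "udma=UDMA100") {
                sub(/:$/, "", $1)   # drop the trailing colon from the name
                print $1
                next
            }
        }
    }'
}

# Typical use on a live system:
#   dmesg | udma100_channels
```

An empty result suggests that no attached device is detected as UDMA100, in which case the patch is unlikely to change anything.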


patch start
cd /usr/src/sys/dev/ata
cat > ata-chipset.c.patch << EOF
--- ata-chipset.c.orig  2008-04-02 00:20:49.0 +0900
+++ ata-chipset.c   2008-07-18 19:15:24.0 +0900
@@ -377,6 +377,7 @@
ata_sata_setmode(device_t dev, int mode)
{
struct ata_device *atadev = device_get_softc(dev);
+struct ata_params *atacap = &atadev->param;

/*
 * if we detect that the device isn't a real SATA device we limit
@@ -390,7 +391,7 @@

   /* on some drives we need to set the transfer mode */
   ata_controlcmd(dev, ATA_SETFEATURES, ATA_SF_SETXFER, 0,
-  ata_limit_mode(dev, mode, ATA_UDMA6));
+  ata_limit_mode(dev, mode, ata_umode(atacap)));

   /* query SATA STATUS for the speed */
if (ch->r_io[ATA_SSTATUS].res && 
EOF

patch -l < ata-chipset.c.patch
patch end

Best regards,
--
Munenori Ohuchi [EMAIL PROTECTED]
Internet Initiative Japan Inc.



Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-03 Thread Zaphod Beeblebrox
On Tue, Jul 1, 2008 at 5:01 AM, Daniel Eriksson [EMAIL PROTECTED]
wrote:


 I am having problems with silent data corruption on (some) drives
 connected to an MCP55 SATA controller.



I have an MCP55 controller here running most of my RAID array.  When I
originally loaded this machine, I had many problems until I figured out I
could only use every other SATA port with any degree of reliability.

It turned out that with the first two drives I bought, this every-other rule
was true. These drives are:

ad4: 238475MB SAMSUNG SP2504C VT100-33 at ata2-master SATA300

But when I bought new drives, they happily used every channel:

ad10: 715404MB WDC WD7500AAKS-00RBA0 30.04G30 at ata5-master SATA300

The difference, I'm led to believe, is that the Samsung drive is a PATA
drive with a SATA to PATA bridge on it.  The newer true SATA drives work
fine.


RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-02 Thread Daniel Eriksson
Andrey V. Elsukov wrote:

 I have two motherboards with MCP55. They work well and I didn't
 see any data corruption.

Do you have SATA-150 or SATA-300 drives connected to the motherboards?

___
Daniel Eriksson (http://www.toomuchdata.com/)


Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-02 Thread Andrey V. Elsukov

Daniel Eriksson wrote:
  I have two motherboards with MCP55. They work well and I didn't
  see any data corruption.

 Do you have SATA-150 or SATA-300 drives connected to the motherboards?


All drives are SATA-300:
FreeBSD 6.2:
1x WDC WD3200YS-01PGB0/21.00M21
FreeBSD 8.0:
5x WDC WD5001ABYS-01YNA0/59.01D01
1x WDC WD1200JS-00MHB0/02.01C03


--
WBR, Andrey V. Elsukov


Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-02 Thread Jeremy Chadwick
On Wed, Jul 02, 2008 at 11:17:48AM +0400, Andrey V. Elsukov wrote:
 Daniel Eriksson wrote:
 I have two motherboards with MCP55. They work well and I didn't
 see any data corruption.

 Do you have SATA-150 or SATA-300 drives connected to the motherboards?

 All drives are SATA-300:
 FreeBSD 6.2:
 1x WDC WD3200YS-01PGB0/21.00M21
 FreeBSD 8.0:
 5x WDC WD5001ABYS-01YNA0/59.01D01
 1x WDC WD1200JS-00MHB0/02.01C03

Which makes me wonder if there are multiple revisions of the MCP55, or if
Samsung drives simply don't behave properly with that chipset (this has
my vote).

Can the OP get some non-Samsung disks for testing?

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-02 Thread Andrey V. Elsukov

Jeremy Chadwick wrote:

 Which makes me wonder if there are multiple revisions of the MCP55, or if
 Samsung drives simply don't behave properly with that chipset (this has
 my vote).


Daniel has the same revision (as I can see from pciconf).

--
WBR, Andrey V. Elsukov


RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-02 Thread Daniel Eriksson
Jeremy Chadwick wrote:

 Can the OP get some non-Samsung disks for testing?

I've got a 750 GB Western Digital that I've been planning to use to
verify if it's a SATA-150 / SATA-300 problem (it can be jumpered to
SATA-150), but the drive is packed with valuable data that I'd have to
move elsewhere first.

I'll get to it eventually, but maybe not this week.

___
Daniel Eriksson (http://www.toomuchdata.com/)


RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-02 Thread Chris Rees
 Date: Wed, 2 Jul 2008 10:55:07 +0200
 Daniel Eriksson [EMAIL PROTECTED] wrote:
 Jeremy Chadwick wrote:

 Can the OP get some non-Samsung disks for testing?

 I've got a 750 GB Western Digital that I've been planning to use to
 verify if it's a SATA-150 / SATA-300 problem (it can be jumpered to
 SATA-150), but the drive is packed with valuable data that I'd have to
 move elsewhere first.

 I'll get to it eventually, but maybe not this week.

 ___
 Daniel Eriksson (http://www.toomuchdata.com/)


Looks like I'm the guinea pig for now, I'll post in about half an hour
with the results :)

This is a clean install; it works perfectly with the restriction
jumper on, now it comes off.

Chris


Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-02 Thread Chris Rees
On 02/07/2008, Chris Rees [EMAIL PROTECTED] wrote:
  Date: Wed, 2 Jul 2008 10:55:07 +0200
   Daniel Eriksson [EMAIL PROTECTED] wrote:
   Jeremy Chadwick wrote:
  
   Can the OP get some non-Samsung disks for testing?
  
   I've got a 750 GB Western Digital that I've been planning to use to
   verify if it's a SATA-150 / SATA-300 problem (it can be jumpered to
   SATA-150), but the drive is packed with valuable data that I'd have to
   move elsewhere first.
  
   I'll get to it eventually, but maybe not this week.

 
   ___
   Daniel Eriksson (http://www.toomuchdata.com/)
  


 Looks like I'm the guinea pig for now, I'll post in about half an hour
  with the results :)

  This is a clean install; it works perfectly with the restriction
  jumper on, now it comes off.


  Chris


Real sorry fellas, can't reproduce on my M2N-SLI; chipset 570 SLI.
I've been building ports for hours on here, everything's working
perfectly I'm afraid.

# pciconf -lv

- snip -

[EMAIL PROTECTED]:0:5:0:class=0x010185 card=0x82391043 chip=0x037f10de
rev=0xa3 hdr=0x00
vendor = 'Nvidia Corp'
device = 'MCP55 SATA Controller'
class  = mass storage
subclass   = ATA

- snip -

# FreeBSD hydra.bayofrum.net 7.0-RELEASE-p2 FreeBSD 7.0-RELEASE-p2 #1:
Wed Jul  2 16:50:49 UTC 2008
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/HYDRA  amd64


Looks like mine's hardware revision (rev=) 0xa3; yours however is 0xa2.
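
Comparing revisions across machines is easier with the value extracted directly. This is a minimal sketch, assuming the one-line-per-device output of `pciconf -l` and the chip ID 0x037f10de shown in the listings in this thread; the helper name `mcp55_rev` is made up for illustration, and other MCP55 SATA functions may carry a different chip ID.

```sh
#!/bin/sh
# mcp55_rev: hypothetical helper. Reads `pciconf -l` output on stdin and
# prints the rev= value of every device whose chip ID is 0x037f10de.
mcp55_rev() {
    sed -n 's/.*chip=0x037f10de.*rev=\(0x[0-9a-fA-F][0-9a-fA-F]*\).*/\1/p'
}

# Typical use on a live system:
#   pciconf -l | mcp55_rev
```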

I had PS/2 port trouble on Linux, and needed a BIOS update, perhaps it
came with that? Is the revision a firmware or hardware property?

Good luck tracking that down, anyway, hope my tests helped :)

Chris

-- 
R $h !  $- ! $+  $@ $2  @ $1 .UUCP.  (sendmail.cf)


Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Søren Schmidt

Hi

I'll look into that, provided I can find HW to work on. IIRC I have
one in the ATA collection, but I have to verify that when I get to the lab.


-Søren

On 1 Jul 2008, at 11:01, Daniel Eriksson wrote:

 I am having problems with silent data corruption on (some) drives
 connected to an MCP55 SATA controller.

 I have two servers, both running RELENG_7_0/amd64. One has the 570 Ultra
 chipset, the other has 570 SLI. Both chipsets have the MCP55 SATA
 controller.

 The server with 570 Ultra chipset has a bunch of older 250GB SATA-150
 drives hooked up to the MCP55 controller and it is working just fine.
 The server with 570 SLI chipset has a bunch of new SATA-300 drives
 hooked up to the MCP55 controller and it is giving me silent data
 corruption (easily detectable by running ZFS scrub, every time I run it
 new checksum errors show up). I know the drives are good because when
 they are hooked up to another controller they work just fine.

 Unfortunately the drives do not have a jumper for setting SATA-150
 speed (they are Samsung 1 TB drives), and trying to force the drives to
 SATA-150 speed with the patch provided by the manufacturer does not
 seem to work (the drives still negotiate SATA-300 speed). I will try to
 get my hands on another older SATA-150 drive (or a new one that can be
 jumpered) to verify if the culprit is the MCP55 revision (see below) or
 the interface speed.


 NOT working (570 SLI)
 ---------------------
 [EMAIL PROTECTED]:0:5:0: class=0x010185 card=0x72501462 chip=0x037f10de
 rev=0xa2 hdr=0x00
    vendor = 'Nvidia Corp'
    device = 'MCP55 SATA Controller'
    class  = mass storage
    subclass   = ATA

 Working (570 Ultra)
 -------------------
 [EMAIL PROTECTED]:0:5:0: class=0x010185 card=0xcb8410de chip=0x037f10de
 rev=0xa3 hdr=0x00
    vendor = 'Nvidia Corp'
    device = 'MCP55 SATA Controller'
    class  = mass storage
    subclass   = ATA

 This is most likely related to kern/120296
 (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/120296) and kern/121396
 (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/121396).


 If someone else is having data corruption problems with drives connected
 to an MCP55 controller it might be worth testing if limiting the drives
 to SATA-150 makes a difference. It will most likely take me a while
 before I can verify this.

 ---
 Daniel Eriksson (http://www.toomuchdata.com/)



-Søren








Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Andrey V. Elsukov

Daniel Eriksson wrote:

 Unfortunately the drives do not have a jumper for setting SATA-150
 speed (they are Samsung 1 TB drives), and trying to force the drives to
 SATA-150 speed with the patch provided by the manufacturer does not
 seem to work (the drives still negotiate SATA-300 speed). I will try to
 get my hands on another older SATA-150 drive (or a new one that can be
 jumpered) to verify if the culprit is the MCP55 revision (see below) or
 the interface speed.


Which patch did you use?

--
WBR, Andrey V. Elsukov


RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Daniel Eriksson
Andrey V. Elsukov wrote:

 Which patch did you use?

I used BDM_SpeedSwitch1.zip
(http://www.samsung.com/global/system/business/hdd/faq/2007/10/29/184337
BDM_SpeedSwitch1.zip).

/Daniel


Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Jeremy Chadwick
On Tue, Jul 01, 2008 at 11:01:17AM +0200, Daniel Eriksson wrote:
 The server with 570 Ultra chipset has a bunch of older 250GB SATA-150
 drives hooked up to the MCP55 controller and it is working just fine.
 The server with 570 SLI chipset has a bunch of new SATA-300 drives
 hooked up to the MCP55 controller and it is giving me silent data
 corruption (easily detectable by running ZFS scrub, every time I run it
 new checksum errors show up). I know the drives are good because when
 they are hooked up to another controller they work just fine.

With the same cables?  Not that I want to use cables as a scapegoat, but
in this case it seems applicable.

 Unfortunately the drives do not have a jumper for setting SATA-150
 speed (they are Samsung 1 TB drives), and trying to force the drives to
 SATA-150 speed with the patch provided by the manufacturer does not
 seem to work (the drives still negotiate SATA-300 speed). I will try to
 get my hands on another older SATA-150 drive (or a new one that can be
 jumpered) to verify if the culprit is the MCP55 revision (see below) or
 the interface speed.

Can you provide atacontrol cap output for one of the drives?

I know in the case of Maxtor drives, there is a bug that exists in one
of their disk firmwares which causes silent data corruption and/or SATA
bus lockups when NCQ is used on nForce 4 chipsets.  Maxtor provides a
firmware update which fixes the bug.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Søren Schmidt

Hi

OK, the only modern nVidia board I have is MCP51 based; however, it
uses the same codepath as the MCP55.
Anyhow, there have been fixes for these in -current that are not in any
of the RELENG branches yet.


Please try the attached patch, or even better try a -current kernel.

-Søren



ff
Description: Binary data





RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Daniel Eriksson
Jeremy Chadwick wrote:

 With the same cables?  Not that I want to use cables as a 
 scapegoat, but in this case it seems applicable.

With the same cables, yes.

 Can you provide atacontrol cap output for one of the drives?

# atacontrol cap ad4

Protocol              Serial ATA II
device model          SAMSUNG HD103UJ
firmware revision     1AA01112
cylinders             16383
heads                 16
sectors/track         63
lba supported         268435455 sectors
lba48 supported       1953525168 sectors
dma supported
overlap not supported

Feature                      Support  Enable    Value           Vendor
write cache                    yes      yes
read ahead                     yes      yes
Native Command Queuing (NCQ)   yes       -      31/0x1F
Tagged Command Queuing (TCQ)   no       no      31/0x1F
SMART                          yes      yes
microcode download             yes      yes
security                       yes      no
power management               yes      yes
advanced power management      yes      no      0/0x00
automatic acoustic management  yes      no      0/0x00  254/0xFE


 I know in the case of Maxtor drives, there is a bug that exists in one
 of their disk firmwares which causes silent data corruption and/or
 SATA bus lockups when NCQ is used on nForce 4 chipsets.  Maxtor
 provides a firmware update which fixes the bug.

Connecting (some of) the drives to a JMicron JMB363 SATA300 controller
or a Promise PDC20318 SATA150 controller makes them work just fine.

FreeBSD itself does not seem to notice any data corruption. I only
noticed it because zpool status reported checksum errors after I had
written almost 3 TB to the array. I then issued a zpool scrub, and
within a couple of minutes I already had dozens of corrupt files (so I
stopped the scrub, deleted the pool and started fault-finding).
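
Since the kernel itself logs nothing, the per-device CKSUM column of `zpool status` is the only signal here. This is a minimal sketch of a filter for that column, assuming the usual NAME/STATE/READ/WRITE/CKSUM layout of the config section; the helper name `cksum_errors` and the pool name `tank` are made up for illustration.

```sh
#!/bin/sh
# cksum_errors: hypothetical helper. Reads `zpool status` output on stdin
# and prints "<device> <count>" for every config row whose CKSUM column
# is not 0.
cksum_errors() {
    awk '
        $1 == "NAME" && $5 == "CKSUM" { in_config = 1; next }  # table header
        in_config && NF == 0          { in_config = 0 }        # table ends
        in_config && NF >= 5 && $5 != "0" { print $1, $5 }
    '
}

# Typical use (pool name is an assumption):
#   zpool scrub tank
#   zpool status tank | cksum_errors
```

Rows with fewer than five columns are skipped, so only entries that carry error counters are reported.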

---
Daniel Eriksson (http://www.toomuchdata.com/)


RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Daniel Eriksson
Søren Schmidt wrote:

 Please try the attached patch, or even better try a -current kernel.

The patch made no difference on RELENG_7_0 unfortunately. (And I cannot try 
CURRENT on this server.)

___
Daniel Eriksson (http://www.toomuchdata.com/)


Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Chris Rees
 Date: Tue, 1 Jul 2008 11:01:17 +0200
 From: Daniel Eriksson [EMAIL PROTECTED]

 I am having problems with silent data corruption on (some) drives
 connected to an MCP55 SATA controller.

 I have two servers, both running RELENG_7_0/amd64. One has the 570 Ultra
 chipset, the other has 570 SLI. Both chipsets have the MCP55 SATA
 controller.

 The server with 570 Ultra chipset has a bunch of older 250GB SATA-150
 drives hooked up to the MCP55 controller and it is working just fine.
 The server with 570 SLI chipset has a bunch of new SATA-300 drives
 hooked up to the MCP55 controller and it is giving me silent data
 corruption (easily detectable by running ZFS scrub, every time I run it
 new checksum errors show up). I know the drives are good because when
 they are hooked up to another controller they work just fine.

 Unfortunately the drives do not have a jumper for setting SATA-150
 speed (they are Samsung 1 TB drives), and trying to force the drives to
 SATA-150 speed with the patch provided by the manufacturer does not
 seem to work (the drives still negotiate SATA-300 speed). I will try to
 get my hands on another older SATA-150 drive (or a new one that can be
 jumpered) to verify if the culprit is the MCP55 revision (see below) or
 the interface speed.


 NOT working (570 SLI)
 -
 [EMAIL PROTECTED]:0:5:0: class=0x010185 card=0x72501462 chip=0x037f10de
 rev=0xa2 hdr=0x00
vendor = 'Nvidia Corp'
device = 'MCP55 SATA Controller'
class  = mass storage
subclass   = ATA

 Working (570 Ultra)
 ---
 [EMAIL PROTECTED]:0:5:0: class=0x010185 card=0xcb8410de chip=0x037f10de
 rev=0xa3 hdr=0x00
vendor = 'Nvidia Corp'
device = 'MCP55 SATA Controller'
class  = mass storage
subclass   = ATA

 This is most likely related to kern/120296
 (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/120296) and kern/121396
 (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/121396).


 If someone else is having data corruption problems with drives connected
 to an MCP55 controller it might be worth testing if limiting the drives
 to SATA-150 makes a difference. It will most likely take me a while
 before I can verify this.

 ---
 Daniel Eriksson (http://www.toomuchdata.com/)


I have a 570 SLI too (Asus M2N-SLI Deluxe), I've been looking for an
excuse to put FreeBSD on here :)

I'll start installing it, anything I should do to make this error more obvious?

My hard drive is a WDC WD2000JS-00SGB0;
http://www.wdc.com/en/library/sata/2879-001146.pdf


Chris


RE: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Daniel Eriksson
Chris Rees wrote:

 I have a 570 SLI too (Asus M2N-SLI Deluxe), I've been looking for an
 excuse to put FreeBSD on here :)
 
 I'll start installing it, anything I should do to make this 
 error more obvious?

No, if it is a common problem with this chipset and/or MCP55 controller
revision (which I think it is) then you should run into problems pretty much
as soon as you start reading from the drive (on initial boot after
install for example).
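
Where the corruption is less obvious and no checksumming filesystem is in use, a copy-and-compare check is one way to surface it. This is a minimal sketch using POSIX `cksum`; the helper name `verify_copy` is made up for illustration. A read-back that is served from the buffer cache never touches the disk, so the file should be larger than RAM or compared again after a reboot.

```sh
#!/bin/sh
# verify_copy: hypothetical helper. Copies SRC to DST, then compares the
# CRC and size reported by cksum for both files.
# Returns 0 if they match, 1 if they differ, 2 if the copy itself failed.
verify_copy() {
    src=$1
    dst=$2
    cp "$src" "$dst" || return 2
    # Reading from stdin makes cksum print only "<crc> <size>".
    want=$(cksum < "$src")
    got=$(cksum < "$dst")
    [ "$want" = "$got" ]
}

# Typical use (paths are assumptions):
#   verify_copy /path/to/large.file /mnt/suspect-disk/large.file || echo MISMATCH
```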

I'm not sure if the problem is amd64 specific or not, but it seems the
people that have reported problems have all run amd64 (and not i386).
This might be a coincidence though.

___
Daniel Eriksson (http://www.toomuchdata.com/)


Re: MCP55 SATA data corruption in FreeBSD 7

2008-07-01 Thread Andrey V. Elsukov

Daniel Eriksson wrote:

 I'm not sure if the problem is amd64 specific or not, but it seems the
 people that have reported problems have all run amd64 (and not i386).
 This might be a coincidence though.


I have two motherboards with MCP55. They work well and I didn't
see any data corruption.

1. ASUS M2N32 WS Pro (nForce 590 Ultra)
FreeBSD 8.0-CURRENT amd64
[EMAIL PROTECTED]:0:13:0:class=0x010185 card=0x81fb1043 chip=0x037f10de 
rev=0xa2 hdr=0x00

vendor = 'Nvidia Corp'
device = 'MCP55 SATA Controller'
class  = mass storage
subclass   = ATA

2. EPOX (don't remember version) (nForce 570 Ultra)
FreeBSD 6.2-STABLE amd64
[EMAIL PROTECTED]:5:0:   class=0x010185 card=0x10261695 chip=0x037f10de 
rev=0xa2 hdr=0x00

vendor = 'NVIDIA Corporation'
class  = mass storage
subclass   = ATA

Both have been working on amd64 for many months.

--
WBR, Andrey V. Elsukov