Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-11-04 Thread grant beattie
Ed Saipetch wrote:
 To answer a number of questions:
 
 Regarding different controllers, I've tried 2 Syba Sil 3114 controllers 
 purchased about 4 months apart.  I've tried 5.4.3 firmware with one and 
 5.4.13 with another.  Maybe Syba makes crappy Sil 3114 cards but it's the 
 same one that someone on blogs.sun.com used with success.  I had weird 
 problems flashing the first card I got, hence the order of another one.  I'm 
 not sure how I could get 2 different controllers 4 months apart and then use 
 them in 2 completely different computers and both controllers be bad.

another data point..

I run two SiI 3114 based cards in my home fileserver running s10u3. I 
was having ZFS data corruption issues and I suspected the SiI cards - 
that was until I replaced the motherboard/CPU/memory. I didn't have the 
time or patience to try to determine which component was at fault, but I 
swapped the motherboard/CPU/memory and stressed it for a few hours and 
the data corruption problem was gone.

before that, I was seeing data corruption issues within minutes. maybe 
it was just memory, but I'll never know. I junked the old kit after I 
confirmed I had eliminated the problem.

grant.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-31 Thread Nigel Smith
Ok, this is a strange problem!
You seem to have tried and eliminated all the possible issues
that the community has suggested!

I was hoping you would see some errors logged in
'/var/adm/messages' that would give a clue.

Your original 'zpool status' said 140 errors.
Over what time period are these occurring?
I'm wondering if the errors are occurring at a
constant steady rate or if there are bursts of errors?
Maybe you could monitor 'zpool status' while generating
activity with dd or similar.
You could use 'zpool iostat' with an interval argument to monitor
bandwidth and see if it is reasonably steady or erratic.
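
A minimal sketch of the monitoring idea above, in Python. It assumes the pool is named tank and that the config rows of 'zpool status' have five whitespace-separated columns (NAME STATE READ WRITE CKSUM), as in the output posted in this thread. The helper names and the "bursty" threshold are illustrative assumptions, not part of any ZFS tool.

```python
import subprocess


def read_status(pool="tank"):
    """Run `zpool status <pool>` and return its text (needs a host with ZFS)."""
    result = subprocess.run(
        ["zpool", "status", pool], capture_output=True, text=True, check=True
    )
    return result.stdout


def cksum_total(zpool_status_text, pool="tank"):
    """Return the CKSUM counter from the row whose NAME column equals `pool`."""
    for line in zpool_status_text.splitlines():
        fields = line.split()
        # Config rows look like: NAME STATE READ WRITE CKSUM
        if len(fields) == 5 and fields[0] == pool and fields[4].isdigit():
            return int(fields[4])
    return None


def error_deltas(samples):
    """Turn cumulative CKSUM samples into per-interval increases."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def looks_bursty(deltas, ratio=3.0):
    """Heuristic (an assumption): bursty if the largest per-interval
    increase is more than `ratio` times the mean increase."""
    if not deltas or sum(deltas) == 0:
        return False
    mean = sum(deltas) / len(deltas)
    return max(deltas) > ratio * mean
```

In use, one would call cksum_total(read_status()) every few seconds while dd is writing to the pool, collect the values in a list, and pass error_deltas(...) of that list to looks_bursty.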

From your prtconf -D we see the 3114 card is using
the ata driver, as expected.
I believe the driver can talk to the disk drive
in either PIO or DMA mode, so you could try 
changing that in the ata.conf file. See here for details:
http://docs.sun.com/app/docs/doc/819-2254/ata-7d?a=view

I've just had a quick look at the source code for
the ata driver, and there does seem to be specific support
for the Silicon Image chips in the drivers:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/dktp/controller/ata/sil3xxx.c
and
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/dktp/controller/ata/sil3xxx.h
The file sil3xxx.h does mention:
  Errata Sil-AN-0109-B2 (Sil3114 Rev 0.3)
  To prevent erroneous ERR set for queued DMA transfers
  greater then 8k, FIS reception for FIS0cfg needs to be set
  to Accept FIS without Interlock
..which I read as meaning there have been some 'issues'
with this chip. And it sounds similar to the issue mentioned on
the link that Tomasz supplied:
http://home-tj.org/wiki/index.php/Sil_m15w

If you decide to try a different SATA controller card, possible options are:

1. The si3124 driver, which supports SiI-3132 (PCI-E)
   and SiI-3124 (PCI-X) devices.
   
2. The AHCI driver, which supports the Intel ICH6 and later devices, often
   found on motherboards.

3. The NV_SATA driver, which supports Nvidia ck804/mcp55 devices.

Regards
Nigel Smith
 
 
This message posted from opensolaris.org


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-31 Thread Edward Saipetch
Nigel,

Thanks for the response!  Basically my last method of testing was to 
sftp a few 50-100MB files to /tank over a couple of minutes and force a 
scrub after.  The very first time this happened, I was using it as a NAS 
device dumping data to it for over a week.  I went to a customer's site 
to show him how cool zfs was and upon running zpool status, I saw the 
data corruption status telling me to restore from a backup.  Running 
zpool status without a scrub shows no errors.
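
Since the errors only show up after a scrub, the check can be scripted. A small Python sketch, assuming the scrub line of 'zpool status' has the form shown in this thread ("scrub: scrub completed with 140 errors on ..."); the function name is made up for illustration, and other ZFS releases may word this line differently.

```python
import re

# Matches the completed-scrub summary as it appears in the output quoted
# in this thread; the wording on other releases is an unknown here.
SCRUB_RE = re.compile(r"scrub completed with (\d+) errors")


def scrub_errors(zpool_status_text):
    """Return the error count from a completed scrub, or None if the
    status text has no completed-scrub line (e.g. scrub still running)."""
    match = SCRUB_RE.search(zpool_status_text)
    return int(match.group(1)) if match else None
```

The test cycle would then be: copy the files, run 'zpool scrub tank', wait until scrub_errors() returns a number rather than None, and record it.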

I tried mirrored devices, no raid whatsoever and raidz, all with the 
same results.  All the motherboards I've been using only have PCI since 
I was hoping I could create a low cost solution as a POC.  I'll test 
changing the transfer mode a bit later.  Other people have had better 
luck; what other debugging can be done?  I'm willing to even let someone 
have remote access to the box if they want.



Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-31 Thread Mario Goebbels
I haven't seen the beginning of this discussion, but seeing SiI sets the
fire alarm off here.

The Silicon Image chipsets are notorious for being crap and causing data
corruption. At least the variants that usually go onto mainboards. Based
on this, I suggest that you get a different card.

-mg





Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-31 Thread Edward Saipetch
Mario,

I don't have any issues getting a new card.  The root of the discussion 
started because people did indeed post that they had good luck with 
them.  In fact, when I went out there and googled to find which cards 
would work well, it seemed to be at the top of the list.  I'm 
interested to know if it's something I can help resolve so other people 
don't have this problem or make sure people don't run into the same 
issue I did.



Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Nigel Smith
First off, can we just confirm the exact version of the Silicon Image card
and which driver Solaris is using?

Use 'prtconf -pv' and '/usr/X11/bin/scanpci'
to get the PCI vendor and device ID information.

Use 'prtconf -D' to confirm which drivers are being used by which devices.

And 'modinfo' will tell you the version of the drivers.

The above commands will give details for all the devices
in the PC.  You may want to edit down the output before
posting it back here, or alternatively put the output into an
attached file.
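
One way to edit the 'prtconf -D' output down is to keep only the lines that name a bound driver. A Python sketch, assuming each such line ends in "(driver name: <driver>)"; the helper names are hypothetical and the sample text in the usage note is illustrative, not captured from a real machine.

```python
import re

# A device-tree line with a bound driver is assumed to end in
# "(driver name: <driver>)".
DRIVER_RE = re.compile(r"^\s*(?P<node>.+?)\s*\(driver name: (?P<driver>[^)]+)\)\s*$")


def driver_bindings(prtconf_text):
    """Return (node, driver) pairs for every line that names a driver."""
    pairs = []
    for line in prtconf_text.splitlines():
        match = DRIVER_RE.match(line)
        if match:
            pairs.append((match.group("node"), match.group("driver")))
    return pairs


def nodes_using(prtconf_text, driver):
    """Return the device nodes bound to the given driver, e.g. 'ata'."""
    return [node for node, bound in driver_bindings(prtconf_text) if bound == driver]
```

For example, given a line such as "ide, instance #2 (driver name: ata)", nodes_using(text, "ata") returns ["ide, instance #2"], which is short enough to paste into a reply.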

See this link for an example of this sort of information
for a different hard disk controller card:
http://mail.opensolaris.org/pipermail/storage-discuss/2007-September/003399.html

Regards
Nigel Smith
 
 


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Nigel Smith
And are you seeing any error messages in '/var/adm/messages'
indicating any failure on the disk controller card?
If so, please post a sample back here to the forum.
 
 


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Tomasz Torcz
On 10/30/07, Neal Pollack [EMAIL PROTECTED] wrote:
  I'm experiencing major checksum errors when using a syba silicon image 3114 
  based pci sata controller w/ nonraid firmware.  I've tested by copying data 
  via sftp and smb.  With everything I've swapped out, I can't fathom this 
  being a hardware problem.
 Even before ZFS, I've had numerous situations where various si3112 and
 3114 chips would corrupt data on UFS and PCFS, with very simple copy and
 checksum test scripts, doing large bulk transfers.

  Those SIL chips are really broken when used with certain Seagate drives.
But I have had data corrupted by them with a WD drive also.
Linux can work around this bug by reducing transfer sizes (and thus
dramatically impacting speed). Solaris probably doesn't have a workaround.
With this quirk enabled (on Linux), I get at most 20 MB/s from the drives,
but ZFS does not report any corruption. Before, I had corruption hourly.

More info about SIL issue: http://home-tj.org/wiki/index.php/Sil_m15w
I have an SiI 3112, but despite SIL's claims, other chips seem to be affected too.


-- 
Tomasz Torcz
[EMAIL PROTECTED]


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Stephen Usher
One thing to check before you blame your controller:

Are the SATA cables close together for an extended length?

Basically, most SATA cables will generate massive levels of cross-talk between 
them if they're tied together or run parallel in close proximity for a part of 
their run-length.

A friend found this sort of problem a couple of months ago and it was cured by 
separating the cables.

Steve
-- 
---
Computer Systems Administrator,E-Mail:[EMAIL PROTECTED]
Department of Earth Sciences, Tel:-  +44 (0)1865 282110
University of Oxford, Parks Road, Oxford, UK. Fax:-  +44 (0)1865 272072


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Frank . Hofmann
On Tue, 30 Oct 2007, Tomasz Torcz wrote:

 On 10/30/07, Neal Pollack [EMAIL PROTECTED] wrote:
 I'm experiencing major checksum errors when using a syba silicon image 3114 
 based pci sata controller w/ nonraid firmware.  I've tested by copying data 
 via sftp and smb.  With everything I've swapped out, I can't fathom this 
 being a hardware problem.
 Even before ZFS, I've had numerous situations where various si3112 and
 3114 chips would corrupt data on UFS and PCFS, with very simple copy and
 checksum test scripts, doing large bulk transfers.

  Those SIL chips are really broken when used with certain Seagate drives.
 But I have had data corrupted by them with a WD drive also.
 Linux can work around this bug by reducing transfer sizes (and thus
 dramatically impacting speed). Solaris probably doesn't have a workaround.

Might be slightly off-topic for the whole, but _this_ specific thing 
(reducing transfer sizes) is possible on Solaris as well. As documented 
here:

http://docs.sun.com/app/docs/doc/819-2724/chapter2-29?a=view

You can also read a bit more on the following thread:

http://www.opensolaris.org/jive/thread.jspa?threadID=6866

It's possible to limit this system-wide or per-LUN.

Best regards,
FrankH.



--
No good can come from selling your freedom, not for all the gold in the world,
for the value of this heavenly gift far exceeds that of any fortune on earth.
--


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Al Hopper
On Mon, 29 Oct 2007, MC wrote:

 Here's what I've done so far:

 The obvious thing to test is the drive controller, so maybe you should do 
 that :)


Also - while you're doing swapTronics - don't forget the Power Supply 
(PSU).  Ensure that your PSU has sufficient capacity on its 12Volt 
rails (older PSUs didn't even tell you how much current they can push 
out on the 12V outputs).

See also: http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Graduate from sugar-coating school?  Sorry - I never attended! :)


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Ed Saipetch
Tried that... completely different cases with different power supplies.



Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Mauro Mozzarelli
Hi,

I have the same sil3114 based controller, installed in a dual Opteron box. I 
have installed Solaris x86 and have had no problem with it; however, I hardly 
used that box with Solaris, as my installation was only to try out Solaris on my 
Opteron workstation. Instead, on that workstation I constantly run Linux, and 
twice in a few months I came across (while running Fedora Linux) several I/O 
errors on the SATA disk attached to that controller. I thought at first that the 
hard drive was gone, but then I swapped that controller with a sil3112 and the 
I/O errors stopped. I swapped back the sil3114 and have had no errors since. I 
reckon that it might have been due to one of the SATA cables (power or data?) 
not making perfect contact. SATA connectors are of extremely poor quality and 
they fail to hold in place as well as the older IDE, SCSI or molex power 
connectors. I noticed as well that they crack easily if inadvertently pulled or 
pushed while working inside the computer case.
 
 


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread Neal Pollack
Ed Saipetch wrote:
 Hello,

 I'm experiencing major checksum errors when using a syba silicon image 3114 
 based pci sata controller w/ nonraid firmware.  I've tested by copying data 
 via sftp and smb.  With everything I've swapped out, I can't fathom this 
 being a hardware problem.  

I can.  But I suppose it could also be in some unknown way a driver issue.
Even before ZFS, I've had numerous situations where various si3112 and 3114
chips would corrupt data on UFS and PCFS, with very simple copy and checksum
test scripts, doing large bulk transfers.
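
A copy-and-checksum test of the kind described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the script that was actually used: the function names, the choice of SHA-256 and the file naming are all made up here. Note that reading a file back straight after writing it may be served from memory rather than from the disk, so a meaningful run needs files larger than RAM (or a remount between the write and the verify).

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in 1 MiB chunks so large files don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, dest_dir, copies=4):
    """Copy `source` into `dest_dir` several times and return the paths of
    any copies whose checksum differs from the source's."""
    expected = sha256_of(source)
    mismatches = []
    for index in range(copies):
        target = os.path.join(dest_dir, f"copy-{index}.bin")
        shutil.copyfile(source, target)
        if sha256_of(target) != expected:
            mismatches.append(target)
    return mismatches
```

Pointing dest_dir at a filesystem behind the suspect controller and source at a large file on a known-good disk gives an empty list on healthy hardware; any path in the result is a corrupted copy.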

Si chips are best used to clean coffee grinders.  Go buy a real SATA 
controller.

Neal

 There have been quite a few blog posts out there with people having a similar 
 config and not having any problems.

 Here's what I've done so far:
 1. Changed solaris releases from S10 U3 to NV 75a
 2. Switched out motherboards and cpus from AMD sempron to a Celeron D
 3. Switched out memory to use completely different dimms
 4. Switched out sata drives (2-3 250gb hitachi's and seagates in RAIDZ, 
 3x400GB seagates RAIDZ and 1x250GB hitachi with no raid)

 Here's output of a scrub and the status (ignore the date and time, I haven't 
 reset it on this new motherboard) and please point me in the right direction 
 if I'm barking up the wrong tree.

 # zpool scrub tank
 # zpool status
   pool: tank
  state: ONLINE
 status: One or more devices has experienced an error resulting in data
         corruption.  Applications may be affected.
 action: Restore the file in question if possible.  Otherwise restore the
         entire pool from backup.
    see: http://www.sun.com/msg/ZFS-8000-8A
  scrub: scrub completed with 140 errors on Sat Sep 15 02:07:35 2007
 config:

         NAME        STATE     READ WRITE CKSUM
         tank        ONLINE       0     0   293
           c0d1      ONLINE       0     0   293

 errors: 140 data errors, use '-v' for a list
  
  


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread MC
 Here's what I've done so far:

The obvious thing to test is the drive controller, so maybe you should do that :)
 
 


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread Edward Saipetch
Neal Pollack wrote:
 Ed Saipetch wrote:
 Hello,

 I'm experiencing major checksum errors when using a syba silicon  
 image 3114 based pci sata controller w/ nonraid firmware.  I've  
 tested by copying data via sftp and smb.  With everything I've  
 swapped out, I can't fathom this being a hardware problem.

 I can.  But I suppose it could also be in some unknown way a driver  
 issue.
 Even before ZFS, I've had numerous situations where various si3112 and
 3114 chips would corrupt data on UFS and PCFS, with very simple copy and
 checksum test scripts, doing large bulk transfers.

 Si chips are best used to clean coffee grinders.  Go buy a real SATA  
 controller.

 Neal

I have no problem ponying up money for a better SATA controller.  I saw
a bunch of blog posts that people were successful using the card so I
thought maybe I had a bad card with corrupt firmware nvram.  Is it worth
trying to trace down the bug?  If this type of corruption exists, nobody
should be using this card.  As a side note, what SATA cards are people
having luck with?




Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread Will Murnane
On 10/30/07, Edward Saipetch [EMAIL PROTECTED] wrote:
 As a side note, what SATA cards are people having luck with?
Running b74, I'm very happy with the Marvell mv88sx6081-based Supermicro card:
http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm
http://www.newegg.com/Product/Product.aspx?Item=N82E16815121009&Tpk=aoc-sat2
http://www.wiredzone.com/xq/asp/ic.10016527/qx/itemdesc.htm
It hypothetically supports port multipliers, but I haven't tested this myself.

On earlier releases (b69, specifically) I had problems with disks
occasionally disappearing.  Those appear to have been completely
resolved; the box has most recently been up for 16 days with no
errors.

Will


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread James C. McPherson
Will Murnane wrote:
 On 10/30/07, Edward Saipetch [EMAIL PROTECTED] wrote:
 As a side note, what SATA cards are people having luck with?
 Running b74, I'm very happy with the Marvell mv88sx6081-based Supermicro card:
 http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm
 http://www.newegg.com/Product/Product.aspx?Item=N82E16815121009&Tpk=aoc-sat2
 http://www.wiredzone.com/xq/asp/ic.10016527/qx/itemdesc.htm
 It hypothetically supports port multipliers, but I haven't tested this myself.
 
 On earlier releases (b69, specifically) I had problems with disks
 occasionally disappearing.  Those appear to have been completely
 resolved; the box has most recently been up for 16 days with no
 errors.

We don't currently have support for SATA port multipliers in
Solaris or OpenSolaris. I know this because people in my team
are working on it (no ETA as yet) and we discussed it last week.



James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread Neal Pollack
Edward Saipetch wrote:
 Neal Pollack wrote:
 Ed Saipetch wrote:
 Hello,

 I'm experiencing major checksum errors when using a syba silicon 
 image 3114 based pci sata controller w/ nonraid firmware.  I've 
 tested by copying data via sftp and smb.  With everything I've 
 swapped out, I can't fathom this being a hardware problem.  

 I can.  But I suppose it could also be in some unknown way a driver 
 issue.
 Even before ZFS, I've had numerous situations where various si3112 and
 3114 chips would corrupt data on UFS and PCFS, with very simple copy and
 checksum test scripts, doing large bulk transfers.

 Si chips are best used to clean coffee grinders.  Go buy a real SATA 
 controller.

 Neal
 I have no problem ponying up money for a better SATA controller.  I 
 saw a bunch of blog posts that people were successful using the card 
 so I thought maybe I had a bad card with corrupt firmware nvram.  Is 
 it worth trying to trace down the bug?

Of course it is.  File a bug so someone on the SATA team can study it.

 If this type of corruption exists, nobody should be using this card.  
 As a side note, what SATA cards are people having luck with?

A lot of people are happy with the 8 port PCI SATA card made by 
SuperMicro that has the Marvell chip on it.
Don't buy other Marvell cards on ebay, because Marvell dumped a ton of 
cards that ended up with an earlier rev of the silicon that can corrupt 
data.  But all the cards made by SuperMicro and sold by them have the 
c rev or later silicon and work great.

That said, I wish someone would investigate the Silicon Image issues, 
but there are only so many engineers, with so little time.
