Re: [zfs-discuss] Disk failure chokes all the disks attached to the failing disk HBA

2012-06-01 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Richard Elling
 
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Antonio S. Cofiño
 
 My question is: is there any way to anticipate this choking situation when
 a disk is failing, to avoid the general failure?
 
 No.

Yes - but not necessarily with the setup you are currently using; that part is
not quite clear from your original email.

If you have 4 HBAs, you want to arrange your redundancy such that you could
survive the complete loss of an entire HBA.  That would mean building your
pool out of a bunch of 4-disk raidz vdevs, or perhaps a bunch of 8-disk
raidz2 vdevs, with each vdev spread across all four HBAs.

The whole problem you're facing is that some bad disk brings down the whole
bus with it...  Make your redundancy able to survive the loss of a bus.
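
For example (a sketch only - the cNtNd0 names below are placeholders, not your
actual device paths), a layout along these lines takes one disk from each of
the four HBAs per vdev, so a whole HBA or backplane can drop out without
losing any vdev:

  # one disk per HBA in every 4-disk raidz vdev (c1..c4 standing in for the 4 HBAs)
  zpool create tank \
    raidz  c1t0d0 c2t0d0 c3t0d0 c4t0d0 \
    raidz  c1t1d0 c2t1d0 c3t1d0 c4t1d0
  # ...and so on; for 8-disk raidz2 vdevs, take two disks from each HBA instead.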



Re: [zfs-discuss] Disk failure chokes all the disks attached to the failing disk HBA

2012-05-31 Thread Weber, Markus

Antonio S. Cofiño wrote:
 [...]
 The system is a supermicro motherboard X8DTH-6F in a 4U chassis
 (SC847E1-R1400LPB) and an external SAS2 JBOD (SC847E16-RJBOD1).
 That makes a system with a total of 4 backplanes (2x SAS + 2x SAS2),
 each of them connected to a different HBA (2x LSI 3081E-R (1068
 chip) + 2x LSI SAS9200-8e (2008 chip)).
 This system has a total of 81 disks (2x SAS (SEAGATE ST3146356SS)
 + 34 SATA 3Gb/s (Hitachi HDS722020ALA330) + 45 SATA 6Gb/s (Hitachi HDS723020BLA642)).

 The issue arises when one of the disks starts to fail, producing very
 long access times. After some time (minutes, but I'm not sure) all the
 disks connected to the same HBA start to report errors. This situation
 produces a general failure in ZFS, making the whole pool unavailable.
 [...]


Have been there and gave up in the end[1]. Could reproduce it (even though
it took a bit longer) under most Linux versions (incl. using the latest LSI
drivers) with the LSI 3081E-R HBA.

Is it just mpt causing the errors or also mpt_sas?

In a lab environment the LSI 9200 HBA behaved better - I/O only dropped
briefly and then continued on the other disks without generating errors.

Had a lengthy Oracle case on this, but none of the proposed workarounds
(some also collected from other forums) worked for me at all. They were:

- disabling NCQ
- adding allow-bus-device-reset=0; to /kernel/drv/sd.conf (see the sketch
  after this list)
- set zfs:zfs_vdev_max_pending=1 (in /etc/system)
- set mpt:mpt_enable_msi=0 (in /etc/system)
- keeping pool usage below 90%
- running no FMA services; temporarily did fmadm unload disk-transport and
  stopped other disk-polling tools (smartd?)
- changing the retries-timeout for the disks via sd.conf, without any
  success, and ended up setting it via mdb
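
For reference, those settings would have looked roughly like this
(reconstructed from memory - double-check the exact syntax on your release):

  # in /etc/system (takes effect after a reboot):
  set zfs:zfs_vdev_max_pending=1
  set mpt:mpt_enable_msi=0

  # appended to /kernel/drv/sd.conf:
  allow-bus-device-reset=0;

  # at runtime, to stop FMA's periodic disk polling:
  fmadm unload disk-transport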

In the end I knew the bad sector of the bad disk, and simply by dd'ing
this sector once or twice to /dev/zero I could easily bring down the
system/pool without any other load on the disk system.
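
In other words, something along these lines (device path and sector number
are placeholders for the actual bad disk and LBA):

  # read only the known-bad 512-byte sector; that alone was enough to set
  # off the reset storm
  dd if=/dev/rdsk/c3t12d0s0 of=/dev/null bs=512 skip=123456789 count=1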


General consensus from various people: don't use SATA drives on SAS
backplanes. Some SATA drives might work better, but there seems to be no
guarantee. And even for SAS-SAS, try to avoid SAS1 backplanes.

Markus



[1] Search for "What's wrong with LSI 3081 (1068) + expander + (bad) SATA
    disk?"
-- 
KPN International

Darmstädter Landstrasse 184 | 60598 Frankfurt | Germany
[T] +49 (0)69 96874-298 | [F] -289 | [M] +49 (0)178 5352346
[E] markus.we...@kpn.de | [W] www.kpn.de

KPN International is a registered trademark of KPN EuroRings B.V.

KPN Eurorings B.V. | Frankfurt am Main branch
Amtsgericht Frankfurt HRB56874 | VAT ID DE 225602449
Managing directors: Jacobus Snijder, Louis Rustenhoven



Re: [zfs-discuss] Disk failure chokes all the disks attached to the failing disk HBA

2012-05-30 Thread Richard Elling
On May 30, 2012, at 9:25 AM, Antonio S. Cofiño wrote:

 Dear All,
 
 This may not be the correct mailing list, but I'm having a ZFS issue when
 a disk is failing.
 
 The system is a Supermicro motherboard X8DTH-6F in a 4U chassis
 (SC847E1-R1400LPB) and an external SAS2 JBOD (SC847E16-RJBOD1).
 That makes a system with a total of 4 backplanes (2x SAS + 2x SAS2), each of
 them connected to a different HBA (2x LSI 3081E-R (1068 chip) + 2x LSI
 SAS9200-8e (2008 chip)).
 This system has a total of 81 disks (2x SAS (SEAGATE ST3146356SS) + 34
 SATA 3Gb/s (Hitachi HDS722020ALA330) + 45 SATA 6Gb/s (Hitachi HDS723020BLA642)).
 
 The system is controlled by OpenSolaris (snv_134) and it normally works fine.
 All the SATA disks are part of the same pool, split into raidz2 vdevs composed
 of roughly 11 disks each.
 
 The issue arises when one of the disks starts to fail, producing very long
 access times. After some time (minutes, but I'm not sure) all the disks
 connected to the same HBA start to report errors. This situation produces a
 general failure in ZFS, making the whole pool unavailable.
 
 After identifying the original failing disk that produces the access errors
 and removing it, the pool starts to resilver with no problem, and all the
 spurious errors produced by the general failure are recovered.
 
 My question is: is there any way to anticipate this choking situation when a
 disk is failing, to avoid the general failure?

No.

 Any help or suggestion is welcome.

The best, proven solution is to not use SATA disks with SAS expanders.
Since that is likely to be beyond your time and budget, consider upgrading
to the latest HBA and expander firmware.
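
(If the LSI tools are installed, checking the firmware currently on the
boards is roughly the following - sas2flash covers the SAS2008-based
9200-8e, while the 1068-based 3081E-R needs the older lsiutil.)

  # SAS2008 HBAs: lists each controller with its firmware and BIOS versions
  sas2flash -listall
  # SAS1068 HBAs: interactive menu with an "identify firmware" option
  lsiutil
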
 -- richard

--
ZFS Performance and Training
richard.ell...@richardelling.com
+1-760-896-4422









Re: [zfs-discuss] Disk failure chokes all the disks attached to the failing disk HBA

2012-05-30 Thread Jim Klimov

2012-05-30 20:25, Antonio S. Cofiño wrote:

Dear All,

This may not be the correct mailing list, but I'm having a ZFS issue
when a disk is failing.


I hope other users might help more on specific details, but while
we're waiting for their answer - please search the list archives.
Similar descriptions of the problem come up every few months, and
it seems to be a fundamental flaw of (consumerish?) SATA drives
with backplanes, leading to reset storms.

I remember the mechanism being something like this: a problematic
disk is detected and the system tries to have it reset so that it
might stop causing problems. The SATA device either ignores
the reset command or takes too long to complete/respond, so the
system goes up the stack and next resets the backplane (expander)
or ultimately the HBA itself.
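
The escalation is usually visible in the error telemetry while it is
happening, roughly along these lines:

  # soft/hard/transport error counters climb on *all* disks behind the
  # affected HBA, not just on the failing one
  iostat -En
  # one-line summary of the FMA ereports generated during the reset storm
  fmdump -e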

I am not qualified to comment on whether this issue is fundamental
(i.e. in SATA protocols) or incidental (cheap drives don't do
advanced stuff, while expensive SATAs might be ok in this regard).
There were discussions about using SATA-SAS interposers, but they
might not fit mechanically, add latency and instability, and raise
the system price to the point where native SAS disks would be
better...

Now, waiting for experts to chime in on whatever I missed ;)
HTH,
//Jim Klimov



Re: [zfs-discuss] Disk failure chokes all the disks attached to the failing disk HBA

2012-05-30 Thread Paul Kraus
On Wed, May 30, 2012 at 12:52 PM, Richard Elling
richard.ell...@gmail.com wrote:

 The best, proven solution is to not use SATA disks with SAS expanders.
 Since that is likely to be beyond your time and budget, consider upgrading
 to the latest HBA and expander firmware.

I recently had the reset-storm problem with five J4400s
loaded with SATA drives behind two Sun/Oracle dual-port SAS
controllers (LSI based). I was told the following by Oracle Support:

1. It is a known issue.
2. Software updates in Solaris 10U10 address some of it (we are at 10U9).
3. They recommended stopping the fault management service (fmd), as that is
a trigger (as well as a failing drive); see the sketch after this list.
4. The problem happens with SAS as well as SATA drives, but is much
less frequent.
5. Oracle is pushing for new FW for the HBA to address the issue.
6. A chain of three J4400s is much more likely to experience it than a
chain of two (we have one chain of two and one chain of three; the
problem occurred on the chain of three).
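
A sketch of the milder version of item 3 - unloading just the FMA disk
poller (the same workaround mentioned earlier in this thread) rather than
stopping fmd altogether; the plugin path shown is the usual location and
may differ between releases:

  fmadm config                  # list the modules fmd currently has loaded
  fmadm unload disk-transport   # stop the periodic disk polling
  # to load it again later:
  fmadm load /usr/lib/fm/fmd/plugins/disk-transport.so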

Not specifically applicable to the original poster's setup, but probably
related, and it might be of use to someone here.

-- 
Paul Kraus
- Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
- Assistant Technical Director, LoneStarCon 3 (http://lonestarcon3.org/)
- Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
- Technical Advisor, Troy Civic Theatre Company
- Technical Advisor, RPI Players