Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-20 Thread Travis Tabbal
 The latter; we run these VMs over NFS anyway and had
 ESXi boxes under test already, and we were already
 separating data exports from VM exports. We use
 an in-house developed configuration management/bare
 metal system which allows us to install new machines
 pretty easily. In this case we just provisioned the
 ESXi VMs to new VM exports on the Thor whilst
 re-using the data exports as they were...


Thanks for the info. Unfortunately, I need this box to do double duty and run 
the VMs as well. The hardware is capable; this issue with XvM and/or the mpt 
driver just needs to get fixed. Other than that, things are running great with 
this server.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-19 Thread Jeroen Roodhart
How did your migration to ESXi go? Are you using it on the same hardware or 
did you just switch that server to an NFS server and run the VMs on another 
box?

The latter; we run these VMs over NFS anyway and had ESXi boxes under test 
already, and we were already separating data exports from VM exports. We use an 
in-house developed configuration management/bare metal system which allows us 
to install new machines pretty easily. In this case we just provisioned the 
ESXi VMs to new VM exports on the Thor whilst re-using the data exports as 
they were...

Works pretty well, although the Sun x1027A 10G NICs aren't yet supported under 
ESXi 4...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-12 Thread Jeroen Roodhart
 I'm running nv126 XvM right now. I haven't tried it
 without XvM.

Without XvM we do not see these issues. We're running the VMs through NFS now 
(using ESXi)...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-12 Thread Travis Tabbal
  I'm running nv126 XvM right now. I haven't tried
 it
  without XvM.
 
 Without XvM we do not see these issues. We're running
 the VMs through NFS now (using ESXi)...

Interesting. It sounds like it might be an XvM specific bug. I'm glad I 
mentioned that in my bug report to Sun. Hopefully they can duplicate it. I'd 
like to stick with XvM as I've spent a fair amount of time getting things 
working well under it. 

How did your migration to ESXi go? Are you using it on the same hardware or did 
you just switch that server to an NFS server and run the VMs on another box?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-12 Thread James C. McPherson

Travis Tabbal wrote:

I'm running nv126 XvM right now. I haven't tried it without XvM.

Without XvM we do not see these issues. We're running
the VMs through NFS now (using ESXi)...


Interesting. It sounds like it might be an XvM specific bug. I'm glad I mentioned that in my bug report to Sun. Hopefully they can duplicate it. I'd like to stick with XvM as I've spent a fair amount of time getting things working well under it. 


How did your migration to ESXi go? Are you using it on the same hardware or did 
you just switch that server to an NFS server and run the VMs on another box?



Hi Travis,
your bug showed up - it's 6900767. Since bugs.opensolaris.org
isn't a live system, you won't be able to see it at

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6900767

until tomorrow.


cheers,
James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-03 Thread Jeroen Roodhart
We see the same issue on an x4540 Thor system with 500G disks:

lots of:
...
Nov  3 16:41:46 uva.nl scsi: [ID 107833 kern.warning] WARNING: 
/p...@3c,0/pci10de,3...@f/pci1000,1...@0 (mpt5):
Nov  3 16:41:46 encore.science.uva.nl   Disconnected command timeout for Target 7
...

This system is running nv125 XvM. It seems to occur more when we are using VMs. 
This of course causes very long interruptions on the VMs as well...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-31 Thread Travis Tabbal
I am also running 2 of the Supermicro cards. I just upgraded to b126 and it 
seems improved. I am running a large file copy locally. I get these warnings in 
the dmesg log. When I do, I/O seems to stall for about 60sec. It comes back up 
fine, but it's very annoying. Any hints? I have 4 disks per controller right 
now, different brands, sizes, everything. New SATA fanout cables and no 
expanders. 

The drives on mpt0 and mpt1 are completely different: 4x400GB Seagate drives and 
4x1.5TB Samsung drives. I get the problem from both controllers. I didn't 
notice this until about b124. I can reproduce it with rsync copying files 
locally between ZFS filesystems and with --bwlimit=1 (10MB/sec). Keeping 
the limit low does seem to help. 
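
For what it's worth, that kind of throttled local copy looks roughly like the 
following (the paths and the limit value here are placeholders, not the exact 
ones used above; rsync's --bwlimit is expressed in KB/s):

  # throttled local copy between two ZFS filesystems (placeholder paths/limit)
  rsync -a --bwlimit=10240 /tank/sourcefs/ /tank/destfs/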

---

Oct 31 23:05:32 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@3/pci15d9,a...@0 (mpt1):
Oct 31 23:05:32 nas Disconnected command timeout for Target 7
Oct 31 23:09:42 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@2/pci15d9,a...@0 (mpt0):
Oct 31 23:09:42 nas Disconnected command timeout for Target 1
Oct 31 23:16:23 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@2/pci15d9,a...@0 (mpt0):
Oct 31 23:16:23 nas Disconnected command timeout for Target 3
Oct 31 23:18:43 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@3/pci15d9,a...@0 (mpt1):
Oct 31 23:18:43 nas Disconnected command timeout for Target 6
Oct 31 23:27:24 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@3/pci15d9,a...@0 (mpt1):
Oct 31 23:27:24 nas Disconnected command timeout for Target 7
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-26 Thread David Turnbull
I'm having similar issues with two AOC-USAS-L8i Supermicro 1068e cards 
(mpt2 and mpt3), running 1.26.00.00 IT firmware.

It seems to only affect a specific revision of disk. (???)

sd67  Soft Errors: 0 Hard Errors: 127 Transport Errors: 3416
Vendor: ATA  Product: WDC WD10EACS-00D Revision: 1A01 Serial No:
Size: 1000.20GB 1000204886016 bytes

sd58  Soft Errors: 0 Hard Errors: 83 Transport Errors: 2087
Vendor: ATA  Product: WDC WD10EACS-00D Revision: 1A01 Serial No:
Size: 1000.20GB 1000204886016 bytes

There are 8 other disks on the two controllers:
6xWDC WD10EACS-00Z Revision: 1B01 (no errors)
2xSAMSUNG HD103UJ  Revision: 1113 (no errors)

The two EACS-00D disks are in separate enclosures with new SAS-SATA 
fanout cables.


Example error messages:

Oct 27 14:26:05 fleet scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/ 
pci1002,5...@2/pci15d9,a...@0 (mpt2):

Oct 27 14:26:05 fleet   wwn for target has changed

Oct 27 14:25:56 fleet scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/ 
pci1002,5...@3/pci15d9,a...@0 (mpt3):

Oct 27 14:25:56 fleet   wwn for target has changed

Oct 27 14:25:57 fleet scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/ 
pci1002,5...@2/pci15d9,a...@0 (mpt2):
Oct 27 14:25:57 fleet   mpt_handle_event_sync: IOCStatus=0x8000,  
IOCLogInfo=0x31110d00


Oct 27 14:25:48 fleet scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/ 
pci1002,5...@3/pci15d9,a...@0 (mpt3):
Oct 27 14:25:48 fleet   mpt_handle_event_sync: IOCStatus=0x8000,  
IOCLogInfo=0x31110d00


Oct 27 14:26:01 fleet scsi: [ID 365881 kern.info] /p...@0,0/ 
pci1002,5...@2/pci15d9,a...@0 (mpt2):

Oct 27 14:26:01 fleet   Log info 0x31110d00 received for target 1.
Oct 27 14:26:01 fleet   scsi_status=0x0, ioc_status=0x804b,  
scsi_state=0xc


Oct 27 14:25:51 fleet scsi: [ID 365881 kern.info] /p...@0,0/ 
pci1002,5...@3/pci15d9,a...@0 (mpt3):

Oct 27 14:25:51 fleet   Log info 0x31120403 received for target 2.
Oct 27 14:25:51 fleet   scsi_status=0x0, ioc_status=0x804b,  
scsi_state=0xc


On 22/10/2009, at 10:40 PM, Bruno Sousa wrote:


Hi all,

Recently I upgraded from snv_118 to snv_125, and suddenly I started 
to see these messages in /var/adm/messages:


Oct 22 12:54:37 SAN02 scsi: [ID 243001 kern.warning] WARNING: / 
p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:54:37 SAN02  mpt_handle_event: IOCStatus=0x8000,  
IOCLogInfo=0x3112011a
Oct 22 12:56:47 SAN02 scsi: [ID 243001 kern.warning] WARNING: / 
p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:47 SAN02  mpt_handle_event_sync: IOCStatus=0x8000,  
IOCLogInfo=0x3112011a
Oct 22 12:56:47 SAN02 scsi: [ID 243001 kern.warning] WARNING: / 
p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:47 SAN02  mpt_handle_event: IOCStatus=0x8000,  
IOCLogInfo=0x3112011a
Oct 22 12:56:50 SAN02 scsi: [ID 243001 kern.warning] WARNING: / 
p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:50 SAN02  mpt_handle_event_sync: IOCStatus=0x8000,  
IOCLogInfo=0x3112011a
Oct 22 12:56:50 SAN02 scsi: [ID 243001 kern.warning] WARNING: / 
p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:50 SAN02  mpt_handle_event: IOCStatus=0x8000,  
IOCLogInfo=0x3112011a



Is this a symptom of a disk error, or was some change made in the 
driver so that I now get more information that didn't appear in 
the past?


Thanks,
Bruno

I'm using an LSI Logic SAS1068E B3, and within lsiutil I see this 
behaviour:



1 MPT Port found

 Port Name      Chip Vendor/Type/Rev     MPT Rev  Firmware Rev  IOC
 1.  mpt0       LSI Logic SAS1068E B3      105        011a        0


Select a device:  [1-1 or 0 to quit] 1

1.  Identify firmware, BIOS, and/or FCode
2.  Download firmware (update the FLASH)
4.  Download/erase BIOS and/or FCode (update the FLASH)
8.  Scan for devices
10.  Change IOC settings (interrupt coalescing)
13.  Change SAS IO Unit settings
16.  Display attached devices
20.  Diagnostics
21.  RAID actions
22.  Reset bus
23.  Reset target
42.  Display operating system names for devices
45.  Concatenate SAS firmware and NVDATA files
59.  Dump PCI config space
60.  Show non-default settings
61.  Restore default settings
66.  Show SAS discovery errors
69.  Show board manufacturing information
97.  Reset SAS link, HARD RESET
98.  Reset SAS link
99.  Reset port
e   Enable expert mode in menus
p   Enable paged mode
w   Enable logging

Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 20

1.  Inquiry Test
2.  WriteBuffer/ReadBuffer/Compare Test
3.  Read Test
4.  Write/Read/Compare Test
8.  Read Capacity / Read Block Limits Test
12.  Display phy counters
13.  Clear phy counters
14.  SATA SMART Read Test
15.  SEP (SCSI Enclosure Processor) Test
18.  Report LUNs Test
19.  Drive firmware download
20.  Expander firmware download
21.  Read Logical Blocks
99.  Reset port
e   Enable expert mode in menus
p   Enable paged mode
w   Enable logging

Diagnostics menu, select an option:  [1-99 or e/p/w or 0 to quit] 12

Adapter Phy 0:  Link 

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-25 Thread Adam Cheal
So, while we are working on resolving this issue with Sun, let me approach this 
from another perspective: what kind of controller/drive ratio would be the 
minimum recommended to support a functional OpenSolaris-based archival 
solution? Given the following:

- the vast majority of IO to the system is going to be read-oriented, other 
than the initial load of the archive shares and possibly scrubs/resilvering 
in the case of failed drives
- we currently have one LSISAS3801E with two external ports; each port connects 
to one 23-disk JBOD
- Each JBOD has the ability to take in two external SAS connections if we 
enable the split-backplane option on it which would split the disk IO path 
between the two connectors (12 disks on one connector, 11 on the other); we do 
not currently have this enabled
- our current server platform only has 1 x PCIe-x8 slot available; we *could* 
look at changing this in the future, but I'd prefer to find a one-card solution 
if possible

Here is the math I did that shows the current IO situation (PLEASE correct this 
if I am mistaken, as I am somewhat winging it here and my head hurts) :

Based on info from:

http://storageadvisors.adaptec.com/2006/07/26/sas-drive-performance/
http://en.wikipedia.org/wiki/PCI_Express
http://support.wdc.com/product/kb.asp?modelno=WD1002FBYSx=9y=8

WD1002FBYS 1TB SATA2 7200rpm drive specs
Avg seek time = 8.9ms
Avg latency = 4.2ms
Max transfer speed = 112 MB/s
Avg transfer speed ~= 65 MB/s

Random IO scenario (theoretical numbers):
8.9ms avg seek time + 4.2ms avg latency = 13.1 ms avg access time
1/0.0131 = 76 IOPS/drive
22 (23 - 1 spare) drives x 76 IOPS/drive = 1672 IOPS/shelf
1672 IOPS/shelf x 2 = 3344 IOPS/controller
-or-
22 (23 - 1 spare) drives x 65 MB/s/drive = 1430 MB/s/shelf
1430 MB/s/shelf x 2 = 2860 MB/s controller

Pure streamed read IO scenario  (theoretical numbers):
0.0 avg seek time + 4.2ms avg latency = 4.2 ms avg access time
1/0.0042 = 238 IOPS/drive
22 (23 - 1 spare) drives x 238 IOPS/drive = 5236 IOPS/shelf
5236 IOPS/shelf x 2 = 10472 IOPS/controller
-or-
22 (23 - 1 spare) drives x 112 MB/s/drive = 2464 MB/s/shelf
2464 MB/s/shelf x 2 = 4928 MB/s controller

Max. bandwidth of a single SAS PHY interface = 270MB/s per port (300MB/s -
overhead)

LSISAS3801E has 2 x 4-port SAS connections. Each shelf gets a 4-port
connection, so:

Max controller bandwidth/shelf = 4 x 270 MB/s = 1080 MB/s
Max controller bandwidth = 2 x 1080 MB/s = 2160 MB/s

Max. bandwidth of PCIe x8 interface = 2GB/s
Typical sustained bandwidth of PCIe x8 interface (max - 5% overhead)=
1.9GB/s
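
For anyone who wants to re-run that arithmetic, the figures above reduce to a 
quick one-liner (nothing assumed beyond the numbers already listed):

  awk 'BEGIN {
      iops_rand = int(1 / (0.0089 + 0.0042));  # avg seek + avg latency -> ~76 IOPS/drive
      iops_seq  = int(1 / 0.0042);             # avg latency only       -> ~238 IOPS/drive
      printf "random:   %d IOPS/shelf, %d MB/s/shelf\n", 22 * iops_rand, 22 * 65;
      printf "streamed: %d IOPS/shelf, %d MB/s/shelf\n", 22 * iops_seq, 22 * 112;
      printf "per 4-lane SAS port: %d MB/s; per 3801E: %d MB/s\n", 4 * 270, 2 * 4 * 270;
  }'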

Summary:

Current controller cannot handle max IO load of even random IO scenario
(1430 MB/s per shelf needed, controller can only handle 1080 MB/s per
shelf). Also, PCIe bus can't push more than 1.9 GB/s sustained over a
single slot, so we are limited by the single card.

Solution:

Connecting 2 x 4-port SAS connectors to one shelf (i.e. enabling split-mode) 
would get us 2160 MB/s per shelf. This would allow us to remove the controller 
as a bottleneck
for all but the extreme cached read scenario, but the PCIe bus would
still throttle us to 1.9 GB/s per slot. So, the controller could keep up
with the shelves, but the PCIe bus would have to wait sometimes which
may (?) be a healthier situation than overwhelming the controller.

To support two shelves per controller, we could use an LSISAS31601E (4 x 4-port 
SAS connectors) but we would hit the PCIe bus limitation again. Moving to two 
(or more?) separate PCIe-x8 cards would be best, but that would require us to 
alter our server platform.

Whew. Thoughts? Comments? Suggestions?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Markus Kovero
How do you estimate the needed queue depth if one has, say, 64 to 128 disks 
sitting behind the LSI?
Is it a bad idea to have a queue depth of 1?

Yours
Markus Kovero


From: zfs-discuss-boun...@opensolaris.org 
[zfs-discuss-boun...@opensolaris.org] on behalf of Richard Elling 
[richard.ell...@gmail.com]
Sent: 24 October 2009 7:36
To: Adam Cheal
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] SNV_125 MPT warning in logfile

ok, see below...

On Oct 23, 2009, at 8:14 PM, Adam Cheal wrote:

 Here is example of the pool config we use:

 # zpool status
  pool: pool002
 state: ONLINE
 scrub: scrub stopped after 0h1m with 0 errors on Fri Oct 23 23:07:52
 2009
 config:

NAME STATE READ WRITE CKSUM
pool002  ONLINE   0 0 0
  raidz2 ONLINE   0 0 0
c9t18d0  ONLINE   0 0 0
c9t17d0  ONLINE   0 0 0
c9t55d0  ONLINE   0 0 0
c9t13d0  ONLINE   0 0 0
c9t15d0  ONLINE   0 0 0
c9t16d0  ONLINE   0 0 0
c9t11d0  ONLINE   0 0 0
c9t12d0  ONLINE   0 0 0
c9t14d0  ONLINE   0 0 0
c9t9d0   ONLINE   0 0 0
c9t8d0   ONLINE   0 0 0
c9t10d0  ONLINE   0 0 0
c9t29d0  ONLINE   0 0 0
c9t28d0  ONLINE   0 0 0
c9t27d0  ONLINE   0 0 0
c9t23d0  ONLINE   0 0 0
c9t25d0  ONLINE   0 0 0
c9t26d0  ONLINE   0 0 0
c9t21d0  ONLINE   0 0 0
c9t22d0  ONLINE   0 0 0
c9t24d0  ONLINE   0 0 0
c9t19d0  ONLINE   0 0 0
  raidz2 ONLINE   0 0 0
c9t30d0  ONLINE   0 0 0
c9t31d0  ONLINE   0 0 0
c9t32d0  ONLINE   0 0 0
c9t33d0  ONLINE   0 0 0
c9t34d0  ONLINE   0 0 0
c9t35d0  ONLINE   0 0 0
c9t36d0  ONLINE   0 0 0
c9t37d0  ONLINE   0 0 0
c9t38d0  ONLINE   0 0 0
c9t39d0  ONLINE   0 0 0
c9t40d0  ONLINE   0 0 0
c9t41d0  ONLINE   0 0 0
c9t42d0  ONLINE   0 0 0
c9t44d0  ONLINE   0 0 0
c9t45d0  ONLINE   0 0 0
c9t46d0  ONLINE   0 0 0
c9t47d0  ONLINE   0 0 0
c9t48d0  ONLINE   0 0 0
c9t49d0  ONLINE   0 0 0
c9t50d0  ONLINE   0 0 0
c9t51d0  ONLINE   0 0 0
c9t52d0  ONLINE   0 0 0
cache
  c8t2d0 ONLINE   0 0 0
  c8t3d0 ONLINE   0 0 0
spares
  c9t20d0AVAIL
  c9t43d0AVAIL

 errors: No known data errors

  pool: rpool
 state: ONLINE
 scrub: none requested
 config:

NAME  STATE READ WRITE CKSUM
rpool ONLINE   0 0 0
  mirror  ONLINE   0 0 0
c8t0d0s0  ONLINE   0 0 0
c8t1d0s0  ONLINE   0 0 0

 errors: No known data errors

 ...and here is a snapshot of the system using iostat -indexC 5
 during a scrub of pool002 (c8 is onboard AHCI controller, c9 is
 LSI SAS 3801E):

                  extended device statistics                ---- errors ----
     r/s    w/s     kr/s   kw/s wait  actv wsvc_t asvc_t  %w   %b s/w h/w trn tot device
     0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8
     0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8t0d0
     0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8t1d0
     0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8t2d0
     0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8t3d0
  8738.7    0.0 555346.1    0.0  0.1 345.0    0.0   39.5   0 3875   0   1   1   2 c9

You see 345 entries in the active queue. If the controller rolls over at
511 active entries, then it would explain why it would soon begin to
have difficulty.

Meanwhile, it is providing 8,738 IOPS and 555 MB/sec, which is quite
respectable.

   194.8    0.0  11936.9    0.0  0.0   7.9    0.0   40.3   0   87   0   0   0   0 c9t8d0

These disks are doing almost 200 read IOPS, but are not 100% busy.
Average I/O size is 66 KB, which is not bad (lots of little I/Os could be
worse), but at only 11.9 MB/s, you are not near the media bandwidth.
Average service time is 40.3 milliseconds, which

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Adam Cheal
The iostat I posted previously was from a system we had already tuned the 
zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10 in 
actv per disk).
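
(For reference, this tuning lives in /etc/system and takes effect after a 
reboot; a minimal sketch follows, with the live value checked via mdb. The 
value shown is just the one used in the test below:)

  * /etc/system: throttle ZFS's per-vdev I/O queue depth
  set zfs:zfs_vdev_max_pending = 7

  # after reboot, confirm the running value
  echo zfs_vdev_max_pending/D | mdb -k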

I reset this value in /etc/system to 7, rebooted, and started a scrub. iostat 
output showed busier disks (%b is higher, which seemed odd) but a cap of about 
7 queue items per disk, proving the tuning was effective. iostat at a 
high-water mark during the test looked like this:

extended device statistics  
   r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0.00.00.00.0  0.0  0.00.00.0   0   0 c8
0.00.00.00.0  0.0  0.00.00.0   0   0 c8t0d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c8t1d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c8t2d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c8t3d0
 8344.50.0 359640.40.0  0.1 300.50.0   36.0   0 4362 c9
  190.00.0 6800.40.0  0.0  6.60.0   34.8   0  99 c9t8d0
  185.00.0 6917.10.0  0.0  6.10.0   32.9   0  94 c9t9d0
  187.00.0 6640.90.0  0.0  6.50.0   34.6   0  98 c9t10d0
  186.50.0 6543.40.0  0.0  7.00.0   37.5   0 100 c9t11d0
  180.50.0 7203.10.0  0.0  6.70.0   37.2   0 100 c9t12d0
  195.50.0 7352.40.0  0.0  7.00.0   35.8   0 100 c9t13d0
  188.00.0 6884.90.0  0.0  6.60.0   35.2   0  99 c9t14d0
  204.00.0 6990.10.0  0.0  7.00.0   34.3   0 100 c9t15d0
  199.00.0 7336.70.0  0.0  7.00.0   35.2   0 100 c9t16d0
  180.50.0 6837.90.0  0.0  7.00.0   38.8   0 100 c9t17d0
  198.00.0 7668.90.0  0.0  7.00.0   35.3   0 100 c9t18d0
  203.00.0 7983.20.0  0.0  7.00.0   34.5   0 100 c9t19d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c9t20d0
  195.50.0 7096.40.0  0.0  6.70.0   34.1   0  98 c9t21d0
  189.50.0 7757.20.0  0.0  6.40.0   33.9   0  97 c9t22d0
  195.50.0 7645.90.0  0.0  6.60.0   33.8   0  99 c9t23d0
  194.50.0 7925.90.0  0.0  7.00.0   36.0   0 100 c9t24d0
  188.50.0 6725.60.0  0.0  6.20.0   32.8   0  94 c9t25d0
  188.50.0 7199.60.0  0.0  6.50.0   34.6   0  98 c9t26d0
  196.00.0 .90.0  0.0  6.30.0   32.1   0  95 c9t27d0
  193.50.0 7455.40.0  0.0  6.20.0   32.0   0  95 c9t28d0
  189.00.0 7400.90.0  0.0  6.30.0   33.2   0  96 c9t29d0
  182.50.0 9397.00.0  0.0  7.00.0   38.3   0 100 c9t30d0
  192.50.0 9179.50.0  0.0  7.00.0   36.3   0 100 c9t31d0
  189.50.0 9431.80.0  0.0  7.00.0   36.9   0 100 c9t32d0
  187.50.0 9082.00.0  0.0  7.00.0   37.3   0 100 c9t33d0
  188.50.0 9368.80.0  0.0  7.00.0   37.1   0 100 c9t34d0
  180.50.0 9332.80.0  0.0  7.00.0   38.8   0 100 c9t35d0
  183.00.0 9690.30.0  0.0  7.00.0   38.2   0 100 c9t36d0
  186.00.0 9193.80.0  0.0  7.00.0   37.6   0 100 c9t37d0
  180.50.0 8233.40.0  0.0  7.00.0   38.8   0 100 c9t38d0
  175.50.0 9085.20.0  0.0  7.00.0   39.9   0 100 c9t39d0
  177.00.0 9340.00.0  0.0  7.00.0   39.5   0 100 c9t40d0
  175.50.0 8831.00.0  0.0  7.00.0   39.9   0 100 c9t41d0
  190.50.0 9177.80.0  0.0  7.00.0   36.7   0 100 c9t42d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c9t43d0
  196.00.0 9180.50.0  0.0  7.00.0   35.7   0 100 c9t44d0
  193.50.0 9496.80.0  0.0  7.00.0   36.2   0 100 c9t45d0
  187.00.0 8699.50.0  0.0  7.00.0   37.4   0 100 c9t46d0
  198.50.0 9277.00.0  0.0  7.00.0   35.2   0 100 c9t47d0
  185.50.0 9778.30.0  0.0  7.00.0   37.7   0 100 c9t48d0
  192.00.0 8384.20.0  0.0  7.00.0   36.4   0 100 c9t49d0
  198.50.0 8864.70.0  0.0  7.00.0   35.2   0 100 c9t50d0
  192.00.0 9369.80.0  0.0  7.00.0   36.4   0 100 c9t51d0
  182.50.0 8825.70.0  0.0  7.00.0   38.3   0 100 c9t52d0
  202.00.0 7387.90.0  0.0  7.00.0   34.6   0 100 c9t55d0

...and sure enough about 20 minutes into it I get this (bus reset?):

scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@34,0 (sd49):
   incomplete read- retrying
scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@21,0 (sd30):
   incomplete read- retrying
scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@1e,0 (sd27):
   incomplete read- retrying
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   Rev. 8 LSI, Inc. 1068E found.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   mpt0 supports power management.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   mpt0: IOC Operational.

During the bus reset, iostat output 

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Markus Kovero
We actually hit similar issues with LSI, but under normal workload rather than 
scrub; the result is the same, but it seems to choke on writes rather than 
reads, with suboptimal performance.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6891413

Anyway, we haven't experienced this _at all_ with the RE3 version of Western 
Digital disks.
Issues seem to pop up with 750GB Seagate and 1TB WD Black-series disks; so far 
2TB green WDs seem unaffected too, so might it be related to the disks' 
firmware and how they talk to the LSI?

Also, we noticed more severe timeouts (even on RE3 and 2TB WD green) if disks 
are not forced into SATA1 mode; I believe this is a known issue with newer 2TB 
disks and some other disk controllers and may be caused by bad cabling or 
connectivity.

We have also never witnessed this behaviour with SAS disks (Fujitsu, IBM...). 
All this happens with snv 118, 122, 123 and 125.

Yours
Markus Kovero


From: zfs-discuss-boun...@opensolaris.org 
[zfs-discuss-boun...@opensolaris.org] on behalf of Adam Cheal 
[ach...@pnimedia.com]
Sent: 24 October 2009 12:49
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] SNV_125 MPT warning in logfile

The iostat I posted previously was from a system we had already tuned the 
zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10 in 
actv per disk).

I reset this value in /etc/system to 7, rebooted, and started a scrub. iostat 
output showed busier disks (%b is higher, which seemed odd) but a cap of about 
7 queue items per disk, proving the tuning was effective. iostat at a 
high-water mark during the test looked like this:

extended device statistics
   r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
0.00.00.00.0  0.0  0.00.00.0   0   0 c8
0.00.00.00.0  0.0  0.00.00.0   0   0 c8t0d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c8t1d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c8t2d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c8t3d0
 8344.50.0 359640.40.0  0.1 300.50.0   36.0   0 4362 c9
  190.00.0 6800.40.0  0.0  6.60.0   34.8   0  99 c9t8d0
  185.00.0 6917.10.0  0.0  6.10.0   32.9   0  94 c9t9d0
  187.00.0 6640.90.0  0.0  6.50.0   34.6   0  98 c9t10d0
  186.50.0 6543.40.0  0.0  7.00.0   37.5   0 100 c9t11d0
  180.50.0 7203.10.0  0.0  6.70.0   37.2   0 100 c9t12d0
  195.50.0 7352.40.0  0.0  7.00.0   35.8   0 100 c9t13d0
  188.00.0 6884.90.0  0.0  6.60.0   35.2   0  99 c9t14d0
  204.00.0 6990.10.0  0.0  7.00.0   34.3   0 100 c9t15d0
  199.00.0 7336.70.0  0.0  7.00.0   35.2   0 100 c9t16d0
  180.50.0 6837.90.0  0.0  7.00.0   38.8   0 100 c9t17d0
  198.00.0 7668.90.0  0.0  7.00.0   35.3   0 100 c9t18d0
  203.00.0 7983.20.0  0.0  7.00.0   34.5   0 100 c9t19d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c9t20d0
  195.50.0 7096.40.0  0.0  6.70.0   34.1   0  98 c9t21d0
  189.50.0 7757.20.0  0.0  6.40.0   33.9   0  97 c9t22d0
  195.50.0 7645.90.0  0.0  6.60.0   33.8   0  99 c9t23d0
  194.50.0 7925.90.0  0.0  7.00.0   36.0   0 100 c9t24d0
  188.50.0 6725.60.0  0.0  6.20.0   32.8   0  94 c9t25d0
  188.50.0 7199.60.0  0.0  6.50.0   34.6   0  98 c9t26d0
  196.00.0 .90.0  0.0  6.30.0   32.1   0  95 c9t27d0
  193.50.0 7455.40.0  0.0  6.20.0   32.0   0  95 c9t28d0
  189.00.0 7400.90.0  0.0  6.30.0   33.2   0  96 c9t29d0
  182.50.0 9397.00.0  0.0  7.00.0   38.3   0 100 c9t30d0
  192.50.0 9179.50.0  0.0  7.00.0   36.3   0 100 c9t31d0
  189.50.0 9431.80.0  0.0  7.00.0   36.9   0 100 c9t32d0
  187.50.0 9082.00.0  0.0  7.00.0   37.3   0 100 c9t33d0
  188.50.0 9368.80.0  0.0  7.00.0   37.1   0 100 c9t34d0
  180.50.0 9332.80.0  0.0  7.00.0   38.8   0 100 c9t35d0
  183.00.0 9690.30.0  0.0  7.00.0   38.2   0 100 c9t36d0
  186.00.0 9193.80.0  0.0  7.00.0   37.6   0 100 c9t37d0
  180.50.0 8233.40.0  0.0  7.00.0   38.8   0 100 c9t38d0
  175.50.0 9085.20.0  0.0  7.00.0   39.9   0 100 c9t39d0
  177.00.0 9340.00.0  0.0  7.00.0   39.5   0 100 c9t40d0
  175.50.0 8831.00.0  0.0  7.00.0   39.9   0 100 c9t41d0
  190.50.0 9177.80.0  0.0  7.00.0   36.7   0 100 c9t42d0
0.00.00.00.0  0.0  0.00.00.0   0   0 c9t43d0
  196.00.0 9180.50.0  0.0  7.00.0   35.7   0 100 c9t44d0
  193.50.0 9496.80.0  0.0  7.00.0   36.2   0 100 c9t45d0
  187.00.0 8699.50.0  0.0  7.00.0   37.4   0 100 c9t46d0
  198.50.0 9277.00.0  0.0  7.00.0   35.2   0 100 c9t47d0
  185.50.0 9778.30.0  0.0  7.00.0

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Tim Cook
On Sat, Oct 24, 2009 at 4:49 AM, Adam Cheal ach...@pnimedia.com wrote:

 The iostat I posted previously was from a system we had already tuned the
 zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10
 in actv per disk).

 I reset this value in /etc/system to 7, rebooted, and started a scrub.
 iostat output showed busier disks (%b is higher, which seemed odd) but a cap
 of about 7 queue items per disk, proving the tuning was effective. iostat at
 a high-water mark during the test looked like this:



 ...and sure enough about 20 minutes into it I get this (bus reset?):

 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
 /pci1000,3...@0/s...@34,0 (sd49):
   incomplete read- retrying
 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
 /pci1000,3...@0/s...@21,0 (sd30):
   incomplete read- retrying
 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
 /pci1000,3...@0/s...@1e,0 (sd27):
   incomplete read- retrying
 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   Rev. 8 LSI, Inc. 1068E found.
 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   mpt0 supports power management.
 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   mpt0: IOC Operational.

 During the bus reset, iostat output looked like this:


 During our previous testing, we had tried even setting this max_pending
 value down to 1, but we still hit the problem (albeit it took a little
 longer to hit it) and I couldn't find anything else I could set to throttle
 IO to the disk, hence the frustration.

 If you hadn't seen this output, would you say that 7 was a reasonable
 value for that max_pending queue for our architecture and should give the
 LSI controller in this situation enough breathing room to operate? If so, I
 *should* be able to scrub the disks successfully (ZFS isn't to blame) and
 therefore have to point the finger at the
 mpt-driver/LSI-firmware/disk-firmware instead.
 --


A little bit of searching google says:
http://downloadmirror.intel.com/17968/eng/ESRT2_IR_readme.txt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Tim Cook
On Sat, Oct 24, 2009 at 11:20 AM, Tim Cook t...@cook.ms wrote:



 On Sat, Oct 24, 2009 at 4:49 AM, Adam Cheal ach...@pnimedia.com wrote:

 The iostat I posted previously was from a system we had already tuned the
 zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10
 in actv per disk).

 I reset this value in /etc/system to 7, rebooted, and started a scrub.
 iostat output showed busier disks (%b is higher, which seemed odd) but a cap
 of about 7 queue items per disk, proving the tuning was effective. iostat at
 a high-water mark during the test looked like this:



 ...and sure enough about 20 minutes into it I get this (bus reset?):


 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
 /pci1000,3...@0/s...@34,0 (sd49):
   incomplete read- retrying
 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
 /pci1000,3...@0/s...@21,0 (sd30):
   incomplete read- retrying
 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4
 /pci1000,3...@0/s...@1e,0 (sd27):
   incomplete read- retrying
 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0(mpt0):
   Rev. 8 LSI, Inc. 1068E found.
 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0(mpt0):
   mpt0 supports power management.
 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0(mpt0):
   mpt0: IOC Operational.

 During the bus reset, iostat output looked like this:


 During our previous testing, we had tried even setting this max_pending
 value down to 1, but we still hit the problem (albeit it took a little
 longer to hit it) and I couldn't find anything else I could set to throttle
 IO to the disk, hence the frustration.

 If you hadn't seen this output, would you say that 7 was a reasonable
 value for that max_pending queue for our architecture and should give the
 LSI controller in this situation enough breathing room to operate? If so, I
 *should* be able to scrub the disks successfully (ZFS isn't to blame) and
 therefore have to point the finger at the
 mpt-driver/LSI-firmware/disk-firmware instead.
 --


 A little bit of searching google says:
 http://downloadmirror.intel.com/17968/eng/ESRT2_IR_readme.txt


Huh, good old keyboard shortcuts firing off emails before I'm done with
them.  Anyways, in that link, I found the following:
 3. Updated - to provide NCQ queue depth of 32 (was 8) on 1064e and 1068e
and 1078 internal-only controllers in IR and ESRT2 modes.

Then there's also this link from someone using a similar controller under
freebsd:
http://www.nabble.com/mpt-errors-QUEUE-FULL-EVENT,-freebsd-7.0-on-dell-1950-td20019090.html

It would make total sense if you're having issues and the default queue
depth for that controller is 8 per port.  Even setting it to 1 isn't going
to fix your issue if you've got 46 drives on one channel/port.
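
A back-of-the-envelope version of that point, using the 46-drive figure above 
and the zfs_vdev_max_pending values discussed earlier in the thread (just the 
multiplication, nothing more):

  awk 'BEGIN {
      n = 46;                                  # drives behind the one controller
      printf "max_pending 10 -> up to %d outstanding I/Os\n", n * 10;
      printf "max_pending  7 -> up to %d outstanding I/Os\n", n * 7;
      printf "max_pending  1 -> up to %d outstanding I/Os\n", n * 1;
  }'

The 7-per-disk case roughly lines up with the ~300 entries iostat showed in 
c9's actv column earlier, and even a cap of 1 still leaves dozens of commands 
in flight at the controller.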

Honestly I'm just taking shots in the dark though.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Richard Elling

more below...

On Oct 24, 2009, at 2:49 AM, Adam Cheal wrote:

The iostat I posted previously was from a system we had already  
tuned the zfs:zfs_vdev_max_pending depth down to 10 (as visible by  
the max of about 10 in actv per disk).


I reset this value in /etc/system to 7, rebooted, and started a  
scrub. iostat output showed busier disks (%b is higher, which seemed  
odd) but a cap of about 7 queue items per disk, proving the tuning  
was effective. iostat at a high-water mark during the test looked  
like this:


   extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   0.00.00.00.0  0.0  0.00.00.0   0   0 c8
   0.00.00.00.0  0.0  0.00.00.0   0   0 c8t0d0
   0.00.00.00.0  0.0  0.00.00.0   0   0 c8t1d0
   0.00.00.00.0  0.0  0.00.00.0   0   0 c8t2d0
   0.00.00.00.0  0.0  0.00.00.0   0   0 c8t3d0
8344.50.0 359640.40.0  0.1 300.50.0   36.0   0 4362 c9
 190.00.0 6800.40.0  0.0  6.60.0   34.8   0  99 c9t8d0
 185.00.0 6917.10.0  0.0  6.10.0   32.9   0  94 c9t9d0
 187.00.0 6640.90.0  0.0  6.50.0   34.6   0  98 c9t10d0
 186.50.0 6543.40.0  0.0  7.00.0   37.5   0 100 c9t11d0
 180.50.0 7203.10.0  0.0  6.70.0   37.2   0 100 c9t12d0
 195.50.0 7352.40.0  0.0  7.00.0   35.8   0 100 c9t13d0
 188.00.0 6884.90.0  0.0  6.60.0   35.2   0  99 c9t14d0
 204.00.0 6990.10.0  0.0  7.00.0   34.3   0 100 c9t15d0
 199.00.0 7336.70.0  0.0  7.00.0   35.2   0 100 c9t16d0
 180.50.0 6837.90.0  0.0  7.00.0   38.8   0 100 c9t17d0
 198.00.0 7668.90.0  0.0  7.00.0   35.3   0 100 c9t18d0
 203.00.0 7983.20.0  0.0  7.00.0   34.5   0 100 c9t19d0
   0.00.00.00.0  0.0  0.00.00.0   0   0 c9t20d0
 195.50.0 7096.40.0  0.0  6.70.0   34.1   0  98 c9t21d0
 189.50.0 7757.20.0  0.0  6.40.0   33.9   0  97 c9t22d0
 195.50.0 7645.90.0  0.0  6.60.0   33.8   0  99 c9t23d0
 194.50.0 7925.90.0  0.0  7.00.0   36.0   0 100 c9t24d0
 188.50.0 6725.60.0  0.0  6.20.0   32.8   0  94 c9t25d0
 188.50.0 7199.60.0  0.0  6.50.0   34.6   0  98 c9t26d0
 196.00.0 .90.0  0.0  6.30.0   32.1   0  95 c9t27d0
 193.50.0 7455.40.0  0.0  6.20.0   32.0   0  95 c9t28d0
 189.00.0 7400.90.0  0.0  6.30.0   33.2   0  96 c9t29d0
 182.50.0 9397.00.0  0.0  7.00.0   38.3   0 100 c9t30d0
 192.50.0 9179.50.0  0.0  7.00.0   36.3   0 100 c9t31d0
 189.50.0 9431.80.0  0.0  7.00.0   36.9   0 100 c9t32d0
 187.50.0 9082.00.0  0.0  7.00.0   37.3   0 100 c9t33d0
 188.50.0 9368.80.0  0.0  7.00.0   37.1   0 100 c9t34d0
 180.50.0 9332.80.0  0.0  7.00.0   38.8   0 100 c9t35d0
 183.00.0 9690.30.0  0.0  7.00.0   38.2   0 100 c9t36d0
 186.00.0 9193.80.0  0.0  7.00.0   37.6   0 100 c9t37d0
 180.50.0 8233.40.0  0.0  7.00.0   38.8   0 100 c9t38d0
 175.50.0 9085.20.0  0.0  7.00.0   39.9   0 100 c9t39d0
 177.00.0 9340.00.0  0.0  7.00.0   39.5   0 100 c9t40d0
 175.50.0 8831.00.0  0.0  7.00.0   39.9   0 100 c9t41d0
 190.50.0 9177.80.0  0.0  7.00.0   36.7   0 100 c9t42d0
   0.00.00.00.0  0.0  0.00.00.0   0   0 c9t43d0
 196.00.0 9180.50.0  0.0  7.00.0   35.7   0 100 c9t44d0
 193.50.0 9496.80.0  0.0  7.00.0   36.2   0 100 c9t45d0
 187.00.0 8699.50.0  0.0  7.00.0   37.4   0 100 c9t46d0
 198.50.0 9277.00.0  0.0  7.00.0   35.2   0 100 c9t47d0
 185.50.0 9778.30.0  0.0  7.00.0   37.7   0 100 c9t48d0
 192.00.0 8384.20.0  0.0  7.00.0   36.4   0 100 c9t49d0
 198.50.0 8864.70.0  0.0  7.00.0   35.2   0 100 c9t50d0
 192.00.0 9369.80.0  0.0  7.00.0   36.4   0 100 c9t51d0
 182.50.0 8825.70.0  0.0  7.00.0   38.3   0 100 c9t52d0
 202.00.0 7387.90.0  0.0  7.00.0   34.6   0 100 c9t55d0

...and sure enough about 20 minutes into it I get this (bus reset?):

scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/ 
pci1000,3...@0/s...@34,0 (sd49):

  incomplete read- retrying
scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/ 
pci1000,3...@0/s...@21,0 (sd30):

  incomplete read- retrying
scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/ 
pci1000,3...@0/s...@1e,0 (sd27):

  incomplete read- retrying
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0  
(mpt0):

  Rev. 8 LSI, Inc. 1068E found.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0  
(mpt0):

  mpt0 supports power management.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0  
(mpt0):

  mpt0: IOC Operational.

During the bus reset, 

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Carson Gaspar

On 10/24/09 9:43 AM, Richard Elling wrote:


OK, here we see 4 I/Os pending outside of the host. The host has
sent them on and is waiting for them to return. This means they are
getting dropped either at the disk or somewhere between the disk
and the controller.

When this happens, the sd driver will time them out, try to clear
the fault by reset, and retry. In other words, the resets you see
are when the system tries to recover.

Since there are many disks with 4 stuck I/Os, I would lean towards
a common cause. What do these disks have in common? Firmware?
Do they share a SAS expander?


I saw this with my WD 500GB SATA disks (HDS725050KLA360) and LSI firmware 
1.28.02.00 in IT mode, but I (almost?) always had exactly 1 stuck I/O. Note 
that my disks were one per channel, no expanders. I have _not_ seen it since 
replacing those disks. So my money is on a bug in the LSI firmware, the drive 
firmware, the drive controller hardware, or some combination thereof.


Note that LSI has released firmware 1.29.00.00. Sadly I cannot find any 
documentation on what has changed. Downloadable from LSI at 
http://lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/internal/sas3081e-r/index.html?remote=1locale=EN


--
Carson




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Tim Cook
On Sat, Oct 24, 2009 at 12:30 PM, Carson Gaspar car...@taltos.org wrote:


 I saw this with my WD 500GB SATA disks (HDS725050KLA360) and LSI firmware
 1.28.02.00 in IT mode, but I (almost?) always had exactly 1 stuck I/O.
 Note that my disks were one per channel, no expanders. I have _not_ seen it
 since replacing those disks. So my money is on a bug in the LSI firmware,
 the drive firmware, the drive controller hardware, or some combination
 thereof.

 Note that LSI has released firmware 1.29.00.00. Sadly I cannot find any
 documentation on what has changed. Downloadable from LSI at
 http://lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/internal/sas3081e-r/index.html?remote=1locale=EN

 --
 Carson


Here's the closest I could find from some Intel release notes.  It came
from: ESRT2_IR_readme.txt and does mention the 1068e chipset, as well as
that firmware rev.



Package Information

FW and OpROM Package for Native SAS mode, IT/IR mode and Intel(R) Embedded
Server RAID Technology II

Package version: 2009.10.06
FW Version = 01.29.00 (includes fixed firmware settings)
BIOS (non-RAID) Version = 06.28.00
BIOS (SW RAID) Version = 08.09041155

Supported RAID modes: 0, 1, 1E, 10, 10E and 5 (activation key AXXRAKSW5
required for RAID 5 support)

Supported Intel(R) Server Boards and Systems:
 - S5000PSLSASR, S5000XVNSASR, S5000VSASASR, S5000VCLSASR, S5000VSFSASR
 - SR1500ALSASR, SR1550ALSASR, SR2500ALLXR, S5000PALR (with SAS I/O Module)
 - S5000PSLROMBR (SROMBSAS18E) without HW RAID activation key AXXRAK18E
installed (native SAS or SW RAID modes only) - for HW RAID mode separate
package is available
 - NSC2U, TIGW1U

Supported Intel(R) RAID controller (adapters):
- SASMF8I, SASWT4I, SASUC8I

Intel(R) SAS Entry RAID Module AXX4SASMOD, when inserted in below Intel(R)
Server Boards and Systems:
 - S5520HC / S5520HCV, S5520SC,S5520UR,S5500WB


Known Restrictions

1. The sasflash versions within this package don't support ESRTII
controllers.
2. The sasflash utility for Windows and Linux version within this package
only support Intel(R) IT/IR RAID controllers.  The sasflash utility for
Windows and Linux version within this package don't support sasflash -o -e 6
command.
3. The sasflash utility for DOS version doesn't support the Intel(R) Server
Boards and Systems due to BIOS limitation.  The DOS version sasflash might
still be supported on 3rd party server boards which don't have the BIOS
limitation.
4. No PCI 3.0 support
5. No Foreign Configuration Resolution Support
6. No RAID migration Support
7. No mixed RAID mode support ever
8. No Stop On Error support


Known Bugs

(1)
For Intel(R) chipset S5000P/S5000V/S5000X based server systems, please use
the 32 bit, non-EBC version of sasflash which is
SASFLASH_Ph17-1.22.00.00\sasflash_efi_bios32_rel\sasflash.efi, instead of
the ebc version of sasflash which is in the top package directory and also
in
SASFLASH_Ph17-1.22.00.00\sasflash_efi_ebc_rel\sasflash.efi.  The latter one
may return a wrong sas address with a sasflash -list command in the listed
systems.

(2)
LED behavior does not match between SES and SGPIO for some conditions
(documentation in process).

(3)
When in EFI Optimized Boot mode, the task bar is not displayed in EFI_BSD
after two volumes are created.

(4)
If a system is rebooted while a volume rebuild is in progress, the rebuild
will start over from the beginning.


Fixes/Updates

Version 2009.10.06
 1. Fixed - MP2 HDD fault LED stays on after rebuild completes
 2. Fixed - System hangs if drive hot-unplugged during stress

Version 2009.07.30
 1. Fixed - SES over i2c for 106x products
 2. Fixed - FW settings updated to support SES over i2c drive lights on
FALSASMP2.

Version 2009.06.15
 1. Fixed - SES over I2C issue for 1078IR.
 2. Updated - 1068e fw to fix SES over I2C on MP2 bug.
 3. Updated - to provide NCQ queue depth of 32 (was 8) on 1064e and 1068e
and 1078 internal-only controllers in IR and ESRT2 modes.
 4. Updated - Firmware to enable SES over I2C on AXX4SASMOD.
 5. Updated - Settings to provide better LED indicators for SGPIO.

Version 2008.12.11
 1. Fixed - Media can't boot from SATA DVD in some systems in Software RAID
(ESRT2) mode.
 2. Fixed - Incorrect RAID 5 ECC error handling in Ctrl+M

Version 2008.11.07
 1. Added support for - Enable ICH10 support
 2. Added support for - Software RAID5 to support ICH10R
 3. Added support for - Single Drive RAID 0 (IS) Volume
 4. Fixed - Resolved issue where user could not create a second volume
immediately following the deletion of a second volume.
 5. Fixed - Second hot spare status not shown when first hot spare is
inactive/missing

Version 2008.09.22
 1. Fixed - SWR:During hot PD removal and then quick reboot, not updating
the DDF correctly.

Version 2008.06.16
 1. Fixed - the issue with "The LED functions are not working inside the
OSes" for SWR5
 2. Fixed - 

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Adam Cheal
The controller connects to two disk shelves (expanders), one per port on the 
card. If you look back in the thread, you'll see our zpool config has one vdev 
per shelf. All of the disks are Western Digital (model WD1002FBYS-18A6B0) 1TB 
7.2K, firmware rev. 03.00C06. Without actually matching up the disks with 
stuck IOs, I am assuming they are all on the same vdev/shelf/controller port.

I communicated with LSI support directly regarding the v1.29 firmware update, 
and here's what they wrote back:

I have checked with our development team on this one. There are no release 
notes available as the functionality of the coding itself has not changed. This 
was a minor cleanup and the firmware was assigned a new phase number for these. 
There were no defects or added functionality in going from the P16 firmware to 
the P17 firmware.

Also, regarding the NCQ depth on the drives: I used LSIUTIL in expert mode 
and used options 13/14 to dump the following settings (which are all default):

Multi-pathing:  [0=Disabled, 1=Enabled, default is 0] 
SATA Native Command Queuing:  [0=Disabled, 1=Enabled, default is 1] 
SATA Write Caching:  [0=Disabled, 1=Enabled, default is 1] 
SATA Maximum Queue Depth:  [0 to 255, default is 32] 
Device Missing Report Delay:  [0 to 2047, default is 0] 
Device Missing I/O Delay:  [0 to 255, default is 0] 
Persistence:  [0=Disabled, 1=Enabled, default is 1] 
Physical mapping:  [0=None, 1=DirectAttach, 2=EnclosureSlot, default is 0]
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Bruno Sousa

Hi Cindy,

I have a couple of questions about this issue :

  1. I have exactly the same LSI controller in another server running
 OpenSolaris snv_101b, and so far no errors like these ones were
 seen on that system
  2. Up to snv_118 I hadn't seen any problems; they only appeared with snv_125
  3. Isn't the Sun StorageTek SAS HBA an LSI OEM? If so, is it possible
 to know what firmware version that HBA is using?


Thank you,
Bruno

Cindy Swearingen wrote:

Hi Bruno,

I see some bugs associated with these messages (6694909) that point to
an LSI firmware upgrade that causes these harmless errors to be displayed.

According to the 6694909 comments, this issue is documented in the
release notes.

As they are harmless, I wouldn't worry about them.

Maybe someone from the driver group can comment further.

Cindy


On 10/22/09 05:40, Bruno Sousa wrote:

Hi all,

Recently I upgraded from snv_118 to snv_125, and suddenly I started 
to see these messages in /var/adm/messages:


Oct 22 12:54:37 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:54:37 SAN02  mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:47 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:47 SAN02  mpt_handle_event_sync: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:47 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:47 SAN02  mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:50 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:50 SAN02  mpt_handle_event_sync: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:50 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:50 SAN02  mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a



Is this a symptom of a disk error, or was some change made in the 
driver so that I now get more information that didn't appear in 
the past?


Thanks,
Bruno

I'm using an LSI Logic SAS1068E B3, and within lsiutil I see this 
behaviour:



1 MPT Port found

 Port Name      Chip Vendor/Type/Rev     MPT Rev  Firmware Rev  IOC
 1.  mpt0       LSI Logic SAS1068E B3      105        011a        0

Select a device:  [1-1 or 0 to quit] 1

1.  Identify firmware, BIOS, and/or FCode
2.  Download firmware (update the FLASH)
4.  Download/erase BIOS and/or FCode (update the FLASH)
8.  Scan for devices
10.  Change IOC settings (interrupt coalescing)
13.  Change SAS IO Unit settings
16.  Display attached devices
20.  Diagnostics
21.  RAID actions
22.  Reset bus
23.  Reset target
42.  Display operating system names for devices
45.  Concatenate SAS firmware and NVDATA files
59.  Dump PCI config space
60.  Show non-default settings
61.  Restore default settings
66.  Show SAS discovery errors
69.  Show board manufacturing information
97.  Reset SAS link, HARD RESET
98.  Reset SAS link
99.  Reset port
e   Enable expert mode in menus
p   Enable paged mode
w   Enable logging

Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 20

1.  Inquiry Test
2.  WriteBuffer/ReadBuffer/Compare Test
3.  Read Test
4.  Write/Read/Compare Test
8.  Read Capacity / Read Block Limits Test
12.  Display phy counters
13.  Clear phy counters
14.  SATA SMART Read Test
15.  SEP (SCSI Enclosure Processor) Test
18.  Report LUNs Test
19.  Drive firmware download
20.  Expander firmware download
21.  Read Logical Blocks
99.  Reset port
e   Enable expert mode in menus
p   Enable paged mode
w   Enable logging

Diagnostics menu, select an option:  [1-99 or e/p/w or 0 to quit] 12

Adapter Phy 0:  Link Down, No Errors

Adapter Phy 1:  Link Down, No Errors

Adapter Phy 2:  Link Down, No Errors

Adapter Phy 3:  Link Down, No Errors

Adapter Phy 4:  Link Up, No Errors

Adapter Phy 5:  Link Up, No Errors

Adapter Phy 6:  Link Up, No Errors

Adapter Phy 7:  Link Up, No Errors

Expander (Handle 0009) Phy 0:  Link Up
 Invalid DWord Count  79,967,229
 Running Disparity Error Count63,036,893
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 1:  Link Up
 Invalid DWord Count  79,967,207
 Running Disparity Error Count78,339,626
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 2:  Link Up
 Invalid DWord Count  76,717,646
 Running Disparity Error Count73,334,563
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 3:  

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Bruno Sousa

Hi Adam,

How many disks and zpools/zfs's do you have behind that LSI?
I have a system with 22 disks and 4 zpools with around 30 zfs's, and so 
far it works like a charm, even during heavy load. The OpenSolaris 
release is snv_101b.


Bruno
Adam Cheal wrote:

Cindy: How can I view the bug report you referenced? Standard methods show me 
the bug number is valid (6694909) but no content or notes. We are having 
similar messages appear with snv_118 with a busy LSI controller, especially 
during scrubbing, and I'd be interested to see what they mentioned in that 
report. Also, the LSI firmware updates for the LSISAS3081E (the controller we 
use) don't usually come with release notes indicating what has changed in each 
firmware revision, so I'm not sure where they got that idea from.
  





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
Our config is:
OpenSolaris snv_118 x64
1 x LSISAS3801E controller
2 x 23-disk JBOD (fully populated, 1TB 7.2k SATA drives)
Each of the two external ports on the LSI connects to a 23-disk JBOD. ZFS-wise 
we use 1 zpool with 2 x 22-disk raidz2 vdevs (1 vdev per JBOD). Each zpool has 
one ZFS filesystem containing millions of files/directories. This data is 
served up via CIFS (kernel), which is why we went with snv_118 (first release 
post-2009.06 that had stable CIFS server). Like I mentioned to James, we know 
that the server won't be a star performance-wise especially because of the wide 
vdevs but it shouldn't hiccup under load either. A guaranteed way for us to 
cause these IO errors is to load up the zpool with about 30 TB of data (90% 
full) then scrub it. Within 30 minutes we start to see the errors, which 
usually evolves into failing disks (because of excessive retry errors) which 
just makes things worse.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Jeremy f
What bug# is this under? I'm having what I believe is the same problem. Is
it possible to just take the mpt driver from a prior build in the
meantime?
The below is from the load the zpool scrub creates. This is on a Dell T7400
workstation with a 1068E OEMed LSI. I updated the firmware to the newest
available from Dell. The errors follow whichever of the 4 drives has the
highest load.

Streaming doesn't seem to trigger it, as I can push 60 MiB a second to a
mirrored rpool all day; it's only when there are a lot of metadata
operations.


Oct 23 06:25:44 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:25:44 systurbo5   Disconnected command timeout for Target 1
Oct 23 06:27:15 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:27:15 systurbo5   Disconnected command timeout for Target 1
Oct 23 06:28:26 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:28:26 systurbo5   Disconnected command timeout for Target 1
Oct 23 06:29:47 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:29:47 systurbo5   Disconnected command timeout for Target 1
Oct 23 06:30:58 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:30:58 systurbo5   Disconnected command timeout for Target 1
Oct 23 06:31:28 systurbo5 scsi: [ID 243001 kern.warning] WARNING: /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:28 systurbo5   mpt_handle_event_sync: IOCStatus=0x8000,
IOCLogInfo=0x31123000
Oct 23 06:31:28 systurbo5 scsi: [ID 243001 kern.warning] WARNING: /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:28 systurbo5   mpt_handle_event: IOCStatus=0x8000,
IOCLogInfo=0x31123000
Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:29 systurbo5   Log info 0x31123000 received for target 1.
Oct 23 06:31:29 systurbo5   scsi_status=0x0, ioc_status=0x804b,
scsi_state=0xc
Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:29 systurbo5   Log info 0x31123000 received for target 1.
Oct 23 06:31:29 systurbo5   scsi_status=0x0, ioc_status=0x804b,
scsi_state=0xc
Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:29 systurbo5   Log info 0x31123000 received for target 1.
Oct 23 06:31:29 systurbo5   scsi_status=0x0, ioc_status=0x804b,
scsi_state=0xc
Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0
,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:29 systurbo5   Log info 0x31123000 received for target 1.
Oct 23 06:31:29 systurbo5   scsi_status=0x0, ioc_status=0x804b,
scsi_state=0xc


On Fri, Oct 23, 2009 at 7:13 AM, Adam Cheal ach...@pnimedia.com wrote:

 Our config is:
 OpenSolaris snv_118 x64
 1 x LSISAS3801E controller
 2 x 23-disk JBOD (fully populated, 1TB 7.2k SATA drives)
 Each of the two external ports on the LSI connects to a 23-disk JBOD.
 ZFS-wise we use 1 zpool with 2 x 22-disk raidz2 vdevs (1 vdev per JBOD).
 Each zpool has one ZFS filesystem containing millions of files/directories.
 This data is served up via CIFS (kernel), which is why we went with snv_118
 (first release post-2009.06 that had stable CIFS server). Like I mentioned
 to James, we know that the server won't be a star performance-wise
 especially because of the wide vdevs but it shouldn't hiccup under load
 either. A guaranteed way for us to cause these IO errors is to load up the
 zpool with about 30 TB of data (90% full) then scrub it. Within 30 minutes
 we start to see the errors, which usually evolves into failing disks
 (because of excessive retry errors) which just makes things worse.
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Jeremy f
Sorry, running snv_123, indiana

On Fri, Oct 23, 2009 at 11:16 AM, Jeremy f rysh...@gmail.com wrote:

 What bug# is this under? I'm having what I believe is the same problem. Is
 it possible to just take the mpt driver from a prior build in the time
 being?
 The below is from the load the zpool scrub creates. This is on a dell t7400
 workstation with a 1068E oemed lsi. I updated the firmware to the newest
 available from dell. The errors follow whichever of the 4 drives has the
 highest load.

 Streaming doesn't seem to trigger it as I can push 60 MiB a second to a
 mirrored rpool all day, it's only when there are a lot of metadata
 operations.


 Oct 23 06:25:44 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:25:44 systurbo5   Disconnected command timeout for Target 1
 Oct 23 06:27:15 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:27:15 systurbo5   Disconnected command timeout for Target 1
 Oct 23 06:28:26 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:28:26 systurbo5   Disconnected command timeout for Target 1
 Oct 23 06:29:47 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:29:47 systurbo5   Disconnected command timeout for Target 1
 Oct 23 06:30:58 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:30:58 systurbo5   Disconnected command timeout for Target 1
 Oct 23 06:31:28 systurbo5 scsi: [ID 243001 kern.warning] WARNING: /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:31:28 systurbo5   mpt_handle_event_sync: IOCStatus=0x8000,
 IOCLogInfo=0x31123000
 Oct 23 06:31:28 systurbo5 scsi: [ID 243001 kern.warning] WARNING: /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:31:28 systurbo5   mpt_handle_event: IOCStatus=0x8000,
 IOCLogInfo=0x31123000
 Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:31:29 systurbo5   Log info 0x31123000 received for target 1.
 Oct 23 06:31:29 systurbo5   scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc
 Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:31:29 systurbo5   Log info 0x31123000 received for target 1.
 Oct 23 06:31:29 systurbo5   scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc
 Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:31:29 systurbo5   Log info 0x31123000 received for target 1.
 Oct 23 06:31:29 systurbo5   scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc
 Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0
 ,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
 Oct 23 06:31:29 systurbo5   Log info 0x31123000 received for target 1.
 Oct 23 06:31:29 systurbo5   scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc


 On Fri, Oct 23, 2009 at 7:13 AM, Adam Cheal ach...@pnimedia.com wrote:

 Our config is:
 OpenSolaris snv_118 x64
 1 x LSISAS3801E controller
 2 x 23-disk JBOD (fully populated, 1TB 7.2k SATA drives)
 Each of the two external ports on the LSI connects to a 23-disk JBOD.
 ZFS-wise we use 1 zpool with 2 x 22-disk raidz2 vdevs (1 vdev per JBOD).
 Each zpool has one ZFS filesystem containing millions of files/directories.
 This data is served up via CIFS (kernel), which is why we went with snv_118
 (first release post-2009.06 that had stable CIFS server). Like I mentioned
 to James, we know that the server won't be a star performance-wise
 especially because of the wide vdevs but it shouldn't hiccup under load
 either. A guaranteed way for us to cause these IO errors is to load up the
 zpool with about 30 TB of data (90% full) then scrub it. Within 30 minutes
 we start to see the errors, which usually evolves into failing disks
 (because of excessive retry errors) which just makes things worse.
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
Just submitted the bug yesterday, on James's advice, so I don't have a bug 
number to refer you to yet...the change request number is 6894775, if that 
helps or is directly related to the future bugid.

From what I've seen/read, this problem has been around for a while but only rears 
its ugly head under heavy IO with large filesets, probably related to the large 
metadata sets you spoke of. We are using snv_118 x64, but it seems to appear 
in snv_123 and snv_125 as well from what I read here.

We've tried installing SSDs to act as a read cache for the pool to reduce the 
metadata hits on the physical disks, and as a last-ditch effort we even tried 
switching to the latest LSI-supplied itmpt driver from 2007 (from reading 
http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/) and 
disabling the mpt driver, but we ended up with the same timeout issues. In our 
case, the drives in the JBODs are all WD (model WD1002FBYS-18A6B0) 1TB 7.2k 
SATA drives.

In revisiting our architecture, we compared it to Sun's x4540 Thumper offering, 
which uses the same controller with similar (though apparently customized) 
firmware and 48 disks. The difference is that they use 6 x LSI1068E controllers, 
each of which has to deal with only 8 disks...obviously better for performance, 
but this architecture could be hiding the real IO issue by distributing the IO 
across so many controllers.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread James C. McPherson

Adam Cheal wrote:

Just submitted the bug yesterday, under advice of James, so I don't have a number you can 
refer to you...the change request number is 6894775 if that helps or is 
directly related to the future bugid.


From what I seen/read this problem has been around for awhile but only rears 
its ugly head under heavy IO with large filesets, probably related to large 
metadata sets as you spoke of. We are using snv_118 x64 but it seems to appear 
in snv_123 and snv_125 as well from what I read here.


We've tried installing SSD's to act as a read-cache for the pool to reduce the metadata 
hits on the physical disks and as a last-ditch effort we even tried switching to the 
latest LSI-supplied itmpt driver from 2007 (from reading 
http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/) and disabling 
the mpt driver but we ended up with the same timeout issues. In our case, the drives in 
the JBODs are all WD (model WD1002FBYS-18A6B0) 1TB 7.2k SATA drives.

In revisting our architecture, we compared it to Sun's x4540 Thumper offering which uses 
the same controller with similar (though apparently customized) firmware and 48 disks. 
The difference is that they use 6 x LSI1068e controllers which each have to deal with 
only 8 disks...obviously better on performance but this architecture could be 
hiding the real IO issue by distributing the IO across so many controllers.


Hi Adam,
I was watching the incoming queues all day yesterday for the
bug, but missed seeing it, not sure why.

I've now moved the bug to the appropriate category so it will
get attention from the right people.


Thanks,
James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Bruno Sousa
Could the reason Sun's x4540 Thumper has 6 LSIs be some sort of hidden 
problem found by Sun where the HBA resets, and due to time-to-market 
pressure the quick and dirty solution was to spread the load over 
multiple HBAs instead of a software fix?


Just my 2 cents..


Bruno


Adam Cheal wrote:

Just submitted the bug yesterday, under advice of James, so I don't have a number you can 
refer to you...the change request number is 6894775 if that helps or is 
directly related to the future bugid.

From what I seen/read this problem has been around for awhile but only rears 
its ugly head under heavy IO with large filesets, probably related to large 
metadata sets as you spoke of. We are using snv_118 x64 but it seems to appear in 
snv_123 and snv_125 as well from what I read here.

We've tried installing SSD's to act as a read-cache for the pool to reduce the metadata 
hits on the physical disks and as a last-ditch effort we even tried switching to the 
latest LSI-supplied itmpt driver from 2007 (from reading 
http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/) and disabling 
the mpt driver but we ended up with the same timeout issues. In our case, the drives in 
the JBODs are all WD (model WD1002FBYS-18A6B0) 1TB 7.2k SATA drives.

In revisting our architecture, we compared it to Sun's x4540 Thumper offering which uses 
the same controller with similar (though apparently customized) firmware and 48 disks. 
The difference is that they use 6 x LSI1068e controllers which each have to deal with 
only 8 disks...obviously better on performance but this architecture could be 
hiding the real IO issue by distributing the IO across so many controllers.
  




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Bruno Sousa

Hi Cindy,

Thank you for the update, but it seems like I can't see any information 
specific to that bug.
I can only see bug numbers 6702538 and 6615564, but according to their 
history, they were fixed quite some time ago.

Can you by any chance share the information about bug 6694909?

Thank you,
Bruno


Cindy Swearingen wrote:

Hi Bruno,

I see some bugs associated with these messages (6694909) that point to
an LSI firmware upgrade that causes these harmless errors to display.

According to the 6694909 comments, this issue is documented in the
release notes.

As they are harmless, I wouldn't worry about them.

Maybe someone from the driver group can comment further.

Cindy


On 10/22/09 05:40, Bruno Sousa wrote:

Hi all,

Recently I upgraded from snv_118 to snv_125, and suddenly I started 
to see these messages in /var/adm/messages:


Oct 22 12:54:37 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:54:37 SAN02  mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:47 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:47 SAN02  mpt_handle_event_sync: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:47 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:47 SAN02  mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:50 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:50 SAN02  mpt_handle_event_sync: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:50 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:50 SAN02  mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a



Is this a symptom of a disk error, or was some change made in the 
driver so that I now get more information where in the past such 
information didn't appear?


Thanks,
Bruno

I'm using an LSI Logic SAS1068E B3, and within lsiutil I see this 
behaviour:



1 MPT Port found

Port Name Chip Vendor/Type/RevMPT Rev  Firmware Rev  IOC
1.  mpt0  LSI Logic SAS1068E B3 105  011a 0

Select a device:  [1-1 or 0 to quit] 1

1.  Identify firmware, BIOS, and/or FCode
2.  Download firmware (update the FLASH)
4.  Download/erase BIOS and/or FCode (update the FLASH)
8.  Scan for devices
10.  Change IOC settings (interrupt coalescing)
13.  Change SAS IO Unit settings
16.  Display attached devices
20.  Diagnostics
21.  RAID actions
22.  Reset bus
23.  Reset target
42.  Display operating system names for devices
45.  Concatenate SAS firmware and NVDATA files
59.  Dump PCI config space
60.  Show non-default settings
61.  Restore default settings
66.  Show SAS discovery errors
69.  Show board manufacturing information
97.  Reset SAS link, HARD RESET
98.  Reset SAS link
99.  Reset port
e   Enable expert mode in menus
p   Enable paged mode
w   Enable logging

Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 20

1.  Inquiry Test
2.  WriteBuffer/ReadBuffer/Compare Test
3.  Read Test
4.  Write/Read/Compare Test
8.  Read Capacity / Read Block Limits Test
12.  Display phy counters
13.  Clear phy counters
14.  SATA SMART Read Test
15.  SEP (SCSI Enclosure Processor) Test
18.  Report LUNs Test
19.  Drive firmware download
20.  Expander firmware download
21.  Read Logical Blocks
99.  Reset port
e   Enable expert mode in menus
p   Enable paged mode
w   Enable logging

Diagnostics menu, select an option:  [1-99 or e/p/w or 0 to quit] 12

Adapter Phy 0:  Link Down, No Errors

Adapter Phy 1:  Link Down, No Errors

Adapter Phy 2:  Link Down, No Errors

Adapter Phy 3:  Link Down, No Errors

Adapter Phy 4:  Link Up, No Errors

Adapter Phy 5:  Link Up, No Errors

Adapter Phy 6:  Link Up, No Errors

Adapter Phy 7:  Link Up, No Errors

Expander (Handle 0009) Phy 0:  Link Up
 Invalid DWord Count  79,967,229
 Running Disparity Error Count63,036,893
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 1:  Link Up
 Invalid DWord Count  79,967,207
 Running Disparity Error Count78,339,626
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 2:  Link Up
 Invalid DWord Count  76,717,646
 Running Disparity Error Count73,334,563
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 3:  Link Up
 Invalid DWord Count  79,896,409
 Running Disparity Error Count   

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Richard Elling

On Oct 23, 2009, at 1:48 PM, Bruno Sousa wrote:
Could the reason Sun's x4540 Thumper has 6 LSIs be some sort of
hidden problem found by Sun where the HBA resets, and due to
time-to-market pressure the quick and dirty solution was to spread
the load over multiple HBAs instead of a software fix?


I don't think so. X4540 has 48 disks -- 6 controllers at 8 disks/controller.
This is the same configuration as the X4500, which used a Marvell controller.

This decision leverages parts from the previous design.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 3:48 PM, Bruno Sousa bso...@epinfante.com wrote:

 Could the reason Sun's x4540 Thumper has 6 LSIs be some sort of hidden
 problem found by Sun where the HBA resets, and due to time-to-market pressure
 the quick and dirty solution was to spread the load over multiple HBAs
 instead of a software fix?

 Just my 2 cents..


 Bruno


What else were you expecting them to do?  According to LSI's website, the
1068e in an x8 configuration is an 8-port card.
http://www.lsi.com/DistributionSystem/AssetDocument/files/docs/marketing_docs/storage_stand_prod/SCG_LSISAS1068E_PB_040407.pdf

While they could've used expanders, that just creates one more component
that can fail/have issues.  Looking at the diagram, they've taken the
absolute shortest I/O path possible, which is what I would hope to
see/expect.
http://www.sun.com/servers/x64/x4540/server_architecture.pdf

One drive per channel, 6 channels total.

I also wouldn't be surprised to find out that they found this the optimal
configuration from a performance/throughput/IOPS perspective as well.  Can't
seem to find those numbers published by LSI.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
I don't think there was any intention on Sun's part to ignore the 
problem...obviously their target market wants a performance-oriented box and 
the x4540 delivers that. Each 1068E controller chip supports 8 SAS PHY channels 
= 1 channel per drive = no contention for channels. The x4540 is a monster and 
performs like a dream with snv_118 (we have a few ourselves).

My issue is that implementing an archival-type solution demands a dense, simple 
storage platform that performs at a reasonable level, nothing more. Our design 
has the same controller chip (8 SAS PHY channels) driving 46 disks, so there is 
bound to be contention there especially in high-load situations. I just need it 
to work and handle load gracefully, not timeout and cause disk failures; at 
this point I can't even scrub the zpools to verify the data we have on there is 
valid. From a hardware perspective, the 3801E card is spec'ed to handle our 
architecture; the OS just seems to fall over somewhere though and not be able 
to throttle itself in certain intensive IO situations.

That said, I don't know whether to point the finger at LSI's firmware or 
mpt-driver/ZFS. Sun obviously has a good relationship with LSI as their 1068E 
is the recommended SAS controller chip and is used in their own products. At 
least we've got a bug filed now, and we can hopefully follow this through to 
find out where the system breaks down.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 6:32 PM, Adam Cheal ach...@pnimedia.com wrote:

 I don't think there was any intention on Sun's part to ignore the
 problem...obviously their target market wants a performance-oriented box and
 the x4540 delivers that. Each 1068E controller chip supports 8 SAS PHY
 channels = 1 channel per drive = no contention for channels. The x4540 is a
 monster and performs like a dream with snv_118 (we have a few ourselves).

 My issue is that implementing an archival-type solution demands a dense,
 simple storage platform that performs at a reasonable level, nothing more.
 Our design has the same controller chip (8 SAS PHY channels) driving 46
 disks, so there is bound to be contention there especially in high-load
 situations. I just need it to work and handle load gracefully, not timeout
 and cause disk failures; at this point I can't even scrub the zpools to
 verify the data we have on there is valid. From a hardware perspective, the
 3801E card is spec'ed to handle our architecture; the OS just seems to fall
 over somewhere though and not be able to throttle itself in certain
 intensive IO situations.

 That said, I don't know whether to point the finger at LSI's firmware or
 mpt-driver/ZFS. Sun obviously has a good relationship with LSI as their
 1068E is the recommended SAS controller chip and is used in their own
 products. At least we've got a bug filed now, and we can hopefully follow
 this through to find out where the system breaks down.


Have you checked in with LSI to verify the IOPS ability of the chip?  Just
because it supports having 46 drives attached to one ASIC doesn't mean it
can actually service all 46 at once.  You're talking (VERY conservatively)
2800 IOPS.

Even ignoring that, I know for a fact that the chip can't handle raw
throughput numbers on 46 disks unless you've got some very severe raid
overhead.  That chip is good for roughly 2GB/sec each direction.  46 7200RPM
drives can fairly easily push 4x that amount in streaming IO loads.

Long story short, it appears you've got a 50lbs load in a 5lbs bag...

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
LSI's sales literature on that card specs 128 devices which I take with a few 
hearty grains of salt. I agree that with all 46 drives pumping out streamed 
data, the controller would be overworked BUT the drives will only deliver data 
as fast as the OS tells them to. Just because the speedometer says 200 mph max 
doesn't mean we should (or even can!) go that fast.

The IO intensive operations that trigger our timeout issues are a small 
percentage of the actual normal IO we do to the box. Most of the time the 
solution happily serves up archived data, but when it comes time to scrub or do 
mass operations on the entire dataset bad things happen. It seems a waste to 
architect a more expensive performance-oriented solution when you aren't going 
to use that performance the majority of the time. There is a balance between 
performance and functionality, but I still feel that we should be able to make 
this situation work.

Ideally, the OS could dynamically adapt to slower storage and throttle its IO 
requests accordingly. At the least, it could allow the user to specify some IO 
thresholds so we can cage the beast if need be. We've tried some manual 
tuning via kernel parameters to restrict max queued operations per vdev and 
also a scrub related one (specifics escape me), but it still manages to 
overload itself.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Richard Elling

On Oct 23, 2009, at 4:46 PM, Tim Cook wrote:
On Fri, Oct 23, 2009 at 6:32 PM, Adam Cheal ach...@pnimedia.com  
wrote:
I don't think there was any intention on Sun's part to ignore the  
problem...obviously their target market wants a performance-oriented  
box and the x4540 delivers that. Each 1068E controller chip supports  
8 SAS PHY channels = 1 channel per drive = no contention for  
channels. The x4540 is a monster and performs like a dream with  
snv_118 (we have a few ourselves).


My issue is that implementing an archival-type solution demands a  
dense, simple storage platform that performs at a reasonable level,  
nothing more. Our design has the same controller chip (8 SAS PHY  
channels) driving 46 disks, so there is bound to be contention there  
especially in high-load situations. I just need it to work and  
handle load gracefully, not timeout and cause disk failures; at  
this point I can't even scrub the zpools to verify the data we have  
on there is valid. From a hardware perspective, the 3801E card is  
spec'ed to handle our architecture; the OS just seems to fall over  
somewhere though and not be able to throttle itself in certain  
intensive IO situations.


That said, I don't know whether to point the finger at LSI's  
firmware or mpt-driver/ZFS. Sun obviously has a good relationship  
with LSI as their 1068E is the recommended SAS controller chip and  
is used in their own products. At least we've got a bug filed now,  
and we can hopefully follow this through to find out where the  
system breaks down.



Have you checked in with LSI to verify the IOPS ability of the  
chip?  Just because it supports having 46 drives attached to one  
ASIC doesn't mean it can actually service all 46 at once.  You're  
talking (VERY conservatively) 2800 IOPS.


Tim has a valid point. By default, ZFS will queue 35 commands per disk.
For 46 disks that is 1,610 concurrent I/Os.  Historically, it has proven to be
relatively easy to crater performance or cause problems with very, very,
very expensive arrays that are easily overrun by Solaris. As a result, it is
not uncommon to see references to setting throttles, especially in older docs.


Fortunately, this is simple to test by reducing the number of I/Os ZFS
will queue.  See the Evil Tuning Guide
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29

The mpt source is not open, so the mpt driver's reaction to 1,610 concurrent
I/Os can only be guessed from afar -- public LSI docs mention a number of 511
concurrent I/Os for SAS1068, but it is not clear to me that is an explicit
limit.  If you have success with zfs_vdev_max_pending set to 10, then the
mystery might be solved. Use iostat to observe the wait and actv columns,
which show the number of transactions in the queues.  JCMP?
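
For reference, a minimal sketch of how to run that experiment (assuming a
build where zfs_vdev_max_pending is still the live tunable; 10 is only a
starting point, not a recommendation):

   # temporary, on the live system (reverts at the next reboot)
   echo zfs_vdev_max_pending/W0t10 | mdb -kw

   # persistent alternative: put this line in /etc/system and reboot
   #   set zfs:zfs_vdev_max_pending = 10

   # then watch the wait/actv columns while the heavy I/O runs
   iostat -xn 5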

NB sometimes a driver will have the limit be configurable. For example, to get
high performance out of a high-end array attached to a qlc card, I've set
the execution-throttle in /kernel/drv/qlc.conf to be more than two orders of
magnitude greater than its default of 32. /kernel/drv/mpt*.conf does not seem
to have a similar throttle.
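
A sketch of what that qlc change looks like, for comparison (the value shown
is illustrative only, and it takes effect after a reboot):

   # /kernel/drv/qlc.conf
   execution-throttle=256;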
 -- richard

Even ignoring that, I know for a fact that the chip can't handle raw  
throughput numbers on 46 disks unless you've got some very severe  
raid overhead.  That chip is good for roughly 2GB/sec each  
direction.  46 7200RPM drives can fairly easily push 4x that amount  
in streaming IO loads.


Long story short, it appears you've got a 5lbs bag a 50lbs load...

--Tim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 7:17 PM, Adam Cheal ach...@pnimedia.com wrote:

 LSI's sales literature on that card specs 128 devices which I take with a
 few hearty grains of salt. I agree that with all 46 drives pumping out
 streamed data, the controller would be overworked BUT the drives will only
 deliver data as fast as the OS tells them to. Just because the speedometer
 says 200 mph max doesn't mean we should (or even can!) go that fast.

 The IO intensive operations that trigger our timeout issues are a small
 percentage of the actual normal IO we do to the box. Most of the time the
 solution happily serves up archived data, but when it comes time to scrub or
 do mass operations on the entire dataset bad things happen. It seems a waste
 to architect a more expensive performance-oriented solution when you aren't
 going to use that performance the majority of the time. There is a balance
 between performance and functionality, but I still feel that we should be
 able to make this situation work.

 Ideally, the OS could dynamically adapt to slower storage and throttle its
 IO requests accordingly. At the least, it could allow the user to specify
 some IO thresholds so we can cage the beast if need be. We've tried some
 manual tuning via kernel parameters to restrict max queued operations per
 vdev and also a scrub related one (specifics escape me), but it still
 manages to overload itself.
 --


Where are you planning on queueing up those requests?  The scrub, I can
understand wanting throttling, but what about your user workload?  Unless
you're talking about EXTREMELY  short bursts of I/O, what do you suggest the
OS do?  If you're sending 3000 IOPS at the box from a workstation, where is
that workload going to sit if you're only dumping 500 IOPS to disk?  The
only thing that will change is that your client will time out instead of your
disks.

I don't recall seeing what generates the I/O, but I do recall that it's
backup.  My assumption would be it's something coming in over the network,
in which case I'd say you're far, far better off throttling at the network
stack.
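
On a Crossbow-era build (snv_105 or later) that can be a one-liner; a sketch,
with the link name as a placeholder for whatever NIC carries the traffic:

   # cap the data link's bandwidth (value and link name are placeholders;
   # maxbw takes bits/sec with K/M/G suffixes)
   dladm set-linkprop -p maxbw=500M e1000g0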

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Tim Cook
On Fri, Oct 23, 2009 at 7:17 PM, Richard Elling richard.ell...@gmail.comwrote:


 Tim has a valid point. By default, ZFS will queue 35 commands per disk.
 For 46 disks that is 1,610 concurrent I/Os.  Historically, it has proven to
 be
 relatively easy to crater performance or cause problems with very, very,
 very expensive arrays that are easily overrun by Solaris. As a result, it
 is
 not uncommon to see references to setting throttles, especially in older
 docs.

 Fortunately, this is  simple to test by reducing the number of I/Os ZFS
 will queue.  See the Evil Tuning Guide

 http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29

 The mpt source is not open, so the mpt driver's reaction to 1,610
 concurrent
 I/Os can only be guessed from afar -- public LSI docs mention a number of
 511
 concurrent I/Os for SAS1068, but it is not clear to me that is an explicit
 limit.  If
 you have success with zfs_vdev_max_pending set to 10, then the mystery
 might be solved. Use iostat to observe the wait and actv columns, which
 show the number of transactions in the queues.  JCMP?

 NB sometimes a driver will have the limit be configurable. For example, to
 get
 high performance out of a high-end array attached to a qlc card, I've set
 the execution-throttle in /kernel/drv/qlc.conf to be more than two orders
 of
 magnitude greater than its default of 32. /kernel/drv/mpt*.conf does not
 seem
 to have a similar throttle.
  -- richard



I believe there's a caveat here though.  That really only helps if the total
I/O load is actually within what the controller can handle.  If the sustained
I/O workload is still 1600 concurrent I/Os, lowering the batch won't
actually make any difference in the timeouts, will it?  It would obviously
eliminate burstiness (yes, I made that word up), but if the total sustained
I/O load is greater than the ASIC can handle, it's still going to fall over
and die with a queue of 10, correct?

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
And therein lies the issue. The excessive load that causes the IO issues is 
almost always generated locally from a scrub or a local recursive ls used to 
warm up the SSD-based zpool cache with metadata. The regular network IO to the 
box is minimal and is very read-centric; once we load the box up with archived 
data (which generally happens in a short amount of time), we simply serve it 
out as needed.

As far as queueing goes, I would expect the system to queue bursts of IO in 
memory with appropriate timeouts, as required. These timeouts could either be 
manually or auto-magically adjusted to deal with the slower storage hardware. 
Obviously sustained intense IO requests would eventually blow up the queue so 
the goal here is to avoid creating those situations in the first place. We can 
throttle the network IO, if needed; I need the OS to know its own local IO 
boundaries though and not attempt to overwork itself during scrubs etc.
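
One knob that may help here, if it exists on this build (I haven't verified it
on snv_118), is the zfs_scrub_limit tunable, which caps in-flight scrub I/Os
per top-level vdev; a hypothetical /etc/system sketch, value illustrative only:

   * assumes the zfs_scrub_limit tunable is present; requires a reboot
   set zfs:zfs_scrub_limit = 4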
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Richard Elling

On Oct 23, 2009, at 5:32 PM, Tim Cook wrote:
On Fri, Oct 23, 2009 at 7:17 PM, Richard Elling richard.ell...@gmail.com 
 wrote:


Tim has a valid point. By default, ZFS will queue 35 commands per  
disk.
For 46 disks that is 1,610 concurrent I/Os.  Historically, it has  
proven to be
relatively easy to crater performance or cause problems with very,  
very,
very expensive arrays that are easily overrun by Solaris. As a  
result, it is
not uncommon to see references to setting throttles, especially in  
older docs.


Fortunately, this is  simple to test by reducing the number of I/Os  
ZFS

will queue.  See the Evil Tuning Guide
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29

The mpt source is not open, so the mpt driver's reaction to 1,610  
concurrent
I/Os can only be guessed from afar -- public LSI docs mention a  
number of 511
concurrent I/Os for SAS1068, but it is not clear to me that is an  
explicit limit.  If

you have success with zfs_vdev_max_pending set to 10, then the mystery
might be solved. Use iostat to observe the wait and actv columns,  
which

show the number of transactions in the queues.  JCMP?

NB sometimes a driver will have the limit be configurable. For  
example, to get
high performance out of a high-end array attached to a qlc card,  
I've set
the execution-throttle in /kernel/drv/qlc.conf to be more than two  
orders of
magnitude greater than its default of 32. /kernel/drv/mpt*.conf does  
not seem

to have a similar throttle.
 -- richard



I believe there's a caveat here though.  That really only helps if  
the total I/O load is actually enough for the controller to handle.   
If the sustained I/O workload is still 1600 concurrent I/O's,  
lowering the batch won't actually cause any difference in the  
timeouts, will it?  It would obviously eliminate burstiness (yes, I  
made that word up), but if the total sustained I/O load is greater  
than the ASIC can handle, it's still going to fall over and die with  
a queue of 10, correct?


Yes, but since they are disks, and I'm assuming HDDs here, there is no
chance the disks will be faster than the host's ability to send I/Os ;-)
iostat will show what the queues look like.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
Here is an example of the pool config we use:

# zpool status
  pool: pool002
 state: ONLINE
 scrub: scrub stopped after 0h1m with 0 errors on Fri Oct 23 23:07:52 2009
config:

NAME STATE READ WRITE CKSUM
pool002  ONLINE   0 0 0
  raidz2 ONLINE   0 0 0
c9t18d0  ONLINE   0 0 0
c9t17d0  ONLINE   0 0 0
c9t55d0  ONLINE   0 0 0
c9t13d0  ONLINE   0 0 0
c9t15d0  ONLINE   0 0 0
c9t16d0  ONLINE   0 0 0
c9t11d0  ONLINE   0 0 0
c9t12d0  ONLINE   0 0 0
c9t14d0  ONLINE   0 0 0
c9t9d0   ONLINE   0 0 0
c9t8d0   ONLINE   0 0 0
c9t10d0  ONLINE   0 0 0
c9t29d0  ONLINE   0 0 0
c9t28d0  ONLINE   0 0 0
c9t27d0  ONLINE   0 0 0
c9t23d0  ONLINE   0 0 0
c9t25d0  ONLINE   0 0 0
c9t26d0  ONLINE   0 0 0
c9t21d0  ONLINE   0 0 0
c9t22d0  ONLINE   0 0 0
c9t24d0  ONLINE   0 0 0
c9t19d0  ONLINE   0 0 0
  raidz2 ONLINE   0 0 0
c9t30d0  ONLINE   0 0 0
c9t31d0  ONLINE   0 0 0
c9t32d0  ONLINE   0 0 0
c9t33d0  ONLINE   0 0 0
c9t34d0  ONLINE   0 0 0
c9t35d0  ONLINE   0 0 0
c9t36d0  ONLINE   0 0 0
c9t37d0  ONLINE   0 0 0
c9t38d0  ONLINE   0 0 0
c9t39d0  ONLINE   0 0 0
c9t40d0  ONLINE   0 0 0
c9t41d0  ONLINE   0 0 0
c9t42d0  ONLINE   0 0 0
c9t44d0  ONLINE   0 0 0
c9t45d0  ONLINE   0 0 0
c9t46d0  ONLINE   0 0 0
c9t47d0  ONLINE   0 0 0
c9t48d0  ONLINE   0 0 0
c9t49d0  ONLINE   0 0 0
c9t50d0  ONLINE   0 0 0
c9t51d0  ONLINE   0 0 0
c9t52d0  ONLINE   0 0 0
cache
  c8t2d0 ONLINE   0 0 0
  c8t3d0 ONLINE   0 0 0
spares
  c9t20d0AVAIL   
  c9t43d0AVAIL   

errors: No known data errors

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
rpool ONLINE   0 0 0
  mirror  ONLINE   0 0 0
c8t0d0s0  ONLINE   0 0 0
c8t1d0s0  ONLINE   0 0 0

errors: No known data errors

...and here is a snapshot of the system using iostat -indexC 5 during a scrub 
of pool002 (c8 is onboard AHCI controller, c9 is LSI SAS 3801E):

                     extended device statistics                                  ---- errors ----
    r/s    w/s     kr/s   kw/s  wait   actv  wsvc_t  asvc_t  %w    %b  s/w  h/w  trn  tot  device
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8t0d0
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8t1d0
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8t2d0
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8t3d0
 8738.7    0.0 555346.1    0.0   0.1  345.0     0.0    39.5   0  3875    0    1    1    2  c9
  194.8    0.0  11936.9    0.0   0.0    7.9     0.0    40.3   0    87    0    0    0    0  c9t8d0
  194.6    0.0  12927.9    0.0   0.0    7.6     0.0    38.9   0    86    0    0    0    0  c9t9d0
  194.6    0.0  12622.6    0.0   0.0    8.1     0.0    41.7   0    90    0    0    0    0  c9t10d0
  201.6    0.0  13350.9    0.0   0.0    8.0     0.0    39.5   0    90    0    0    0    0  c9t11d0
  194.4    0.0  12902.3    0.0   0.0    7.8     0.0    40.1   0    88    0    0    0    0  c9t12d0
  194.6    0.0  12902.3    0.0   0.0    7.7     0.0    39.3   0    88    0    0    0    0  c9t13d0
  195.4    0.0  12479.0    0.0   0.0    8.5     0.0    43.4   0    92    0    0    0    0  c9t14d0
  197.6    0.0  13107.4    0.0   0.0    8.1     0.0    41.0   0    92    0    0    0    0  c9t15d0
  198.8    0.0  12918.1    0.0   0.0    8.2     0.0    41.4   0    92    0    0    0    0  c9t16d0
  201.0    0.0  13350.3    0.0   0.0    8.1     0.0    40.4   0    91    0    0    0    0  c9t17d0
  201.2    0.0  13325.0    0.0   0.0    7.8     0.0    38.5   0    88    0    0    0    0  c9t18d0
  200.6    0.0  13021.5    0.0   0.0    8.2     0.0    40.7   0    91    0    0    0    0  c9t19d0
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c9t20d0
  196.6    0.0  12991.9

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Richard Elling

ok, see below...

On Oct 23, 2009, at 8:14 PM, Adam Cheal wrote:


Here is example of the pool config we use:

# zpool status
 pool: pool002
state: ONLINE
scrub: scrub stopped after 0h1m with 0 errors on Fri Oct 23 23:07:52  
2009

config:

   NAME STATE READ WRITE CKSUM
   pool002  ONLINE   0 0 0
 raidz2 ONLINE   0 0 0
   c9t18d0  ONLINE   0 0 0
   c9t17d0  ONLINE   0 0 0
   c9t55d0  ONLINE   0 0 0
   c9t13d0  ONLINE   0 0 0
   c9t15d0  ONLINE   0 0 0
   c9t16d0  ONLINE   0 0 0
   c9t11d0  ONLINE   0 0 0
   c9t12d0  ONLINE   0 0 0
   c9t14d0  ONLINE   0 0 0
   c9t9d0   ONLINE   0 0 0
   c9t8d0   ONLINE   0 0 0
   c9t10d0  ONLINE   0 0 0
   c9t29d0  ONLINE   0 0 0
   c9t28d0  ONLINE   0 0 0
   c9t27d0  ONLINE   0 0 0
   c9t23d0  ONLINE   0 0 0
   c9t25d0  ONLINE   0 0 0
   c9t26d0  ONLINE   0 0 0
   c9t21d0  ONLINE   0 0 0
   c9t22d0  ONLINE   0 0 0
   c9t24d0  ONLINE   0 0 0
   c9t19d0  ONLINE   0 0 0
 raidz2 ONLINE   0 0 0
   c9t30d0  ONLINE   0 0 0
   c9t31d0  ONLINE   0 0 0
   c9t32d0  ONLINE   0 0 0
   c9t33d0  ONLINE   0 0 0
   c9t34d0  ONLINE   0 0 0
   c9t35d0  ONLINE   0 0 0
   c9t36d0  ONLINE   0 0 0
   c9t37d0  ONLINE   0 0 0
   c9t38d0  ONLINE   0 0 0
   c9t39d0  ONLINE   0 0 0
   c9t40d0  ONLINE   0 0 0
   c9t41d0  ONLINE   0 0 0
   c9t42d0  ONLINE   0 0 0
   c9t44d0  ONLINE   0 0 0
   c9t45d0  ONLINE   0 0 0
   c9t46d0  ONLINE   0 0 0
   c9t47d0  ONLINE   0 0 0
   c9t48d0  ONLINE   0 0 0
   c9t49d0  ONLINE   0 0 0
   c9t50d0  ONLINE   0 0 0
   c9t51d0  ONLINE   0 0 0
   c9t52d0  ONLINE   0 0 0
   cache
 c8t2d0 ONLINE   0 0 0
 c8t3d0 ONLINE   0 0 0
   spares
 c9t20d0AVAIL
 c9t43d0AVAIL

errors: No known data errors

 pool: rpool
state: ONLINE
scrub: none requested
config:

   NAME  STATE READ WRITE CKSUM
   rpool ONLINE   0 0 0
 mirror  ONLINE   0 0 0
   c8t0d0s0  ONLINE   0 0 0
   c8t1d0s0  ONLINE   0 0 0

errors: No known data errors

...and here is a snapshot of the system using iostat -indexC 5  
during a scrub of pool002 (c8 is onboard AHCI controller, c9 is  
LSI SAS 3801E):


                     extended device statistics                                  ---- errors ----
    r/s    w/s     kr/s   kw/s  wait   actv  wsvc_t  asvc_t  %w    %b  s/w  h/w  trn  tot  device
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8t0d0
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8t1d0
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8t2d0
    0.0    0.0      0.0    0.0   0.0    0.0     0.0     0.0   0     0    0    0    0    0  c8t3d0
 8738.7    0.0 555346.1    0.0   0.1  345.0     0.0    39.5   0  3875    0    1    1    2  c9


You see 345 entries in the active queue. If the controller rolls over at
511 active entries, then it would explain why it would soon begin to
have difficulty.

Meanwhile, it is providing 8,738 IOPS and 555 MB/sec, which is quite
respectable.

  194.8    0.0  11936.9    0.0   0.0    7.9     0.0    40.3   0    87    0    0    0    0  c9t8d0


These disks are doing almost 200 read IOPS, but are not 100% busy.
Average I/O size is 66 KB, which is not bad, lots of little I/Os could be
worse, but at only 11.9 MB/s, you are not near the media bandwidth.
Average service time is 40.3 milliseconds, which is not super, but may
be reflective of contention in the channel.
So there is more capacity to accept I/O commands, but...

  194.6    0.0  12927.9    0.0   0.0    7.6     0.0    38.9   0    86    0    0    0    0  c9t9d0
  194.6    0.0  12622.6    0.0   0.0    8.1     0.0    41.7   0    90    0    0    0    0  c9t10d0
  201.6    0.0  13350.9    0.0   0.0    8.0     0.0    39.5   0    90    0    0    0    0  c9t11d0
  194.4    0.0  12902.3    0.0   0.0    7.8     0.0    40.1   0    88    0    0    0    0  c9t12d0
  194.6    0.0  12902.3    0.0   0.0    7.7     0.0    39.3   0    88    0    0    0    0  c9t13d0

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Cindy Swearingen

Hi Bruno,

I see some bugs associated with these messages (6694909) that point to
an LSI firmware upgrade that causes these harmless errors to display.

According to the 6694909 comments, this issue is documented in the
release notes.

As they are harmless, I wouldn't worry about them.

Maybe someone from the driver group can comment further.

Cindy


On 10/22/09 05:40, Bruno Sousa wrote:

Hi all,

Recently I upgraded from snv_118 to snv_125, and suddenly I started to 
see these messages in /var/adm/messages:


Oct 22 12:54:37 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:54:37 SAN02  mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:47 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:47 SAN02  mpt_handle_event_sync: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:47 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:47 SAN02  mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:50 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:50 SAN02  mpt_handle_event_sync: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a
Oct 22 12:56:50 SAN02 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Oct 22 12:56:50 SAN02  mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x3112011a



Is this a symptom of a disk error, or was some change made in the 
driver so that I now get more information where in the past such 
information didn't appear?


Thanks,
Bruno

I'm using an LSI Logic SAS1068E B3, and within lsiutil I see this 
behaviour:



1 MPT Port found

Port Name Chip Vendor/Type/RevMPT Rev  Firmware Rev  IOC
1.  mpt0  LSI Logic SAS1068E B3 105  011a 0

Select a device:  [1-1 or 0 to quit] 1

1.  Identify firmware, BIOS, and/or FCode
2.  Download firmware (update the FLASH)
4.  Download/erase BIOS and/or FCode (update the FLASH)
8.  Scan for devices
10.  Change IOC settings (interrupt coalescing)
13.  Change SAS IO Unit settings
16.  Display attached devices
20.  Diagnostics
21.  RAID actions
22.  Reset bus
23.  Reset target
42.  Display operating system names for devices
45.  Concatenate SAS firmware and NVDATA files
59.  Dump PCI config space
60.  Show non-default settings
61.  Restore default settings
66.  Show SAS discovery errors
69.  Show board manufacturing information
97.  Reset SAS link, HARD RESET
98.  Reset SAS link
99.  Reset port
e   Enable expert mode in menus
p   Enable paged mode
w   Enable logging

Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 20

1.  Inquiry Test
2.  WriteBuffer/ReadBuffer/Compare Test
3.  Read Test
4.  Write/Read/Compare Test
8.  Read Capacity / Read Block Limits Test
12.  Display phy counters
13.  Clear phy counters
14.  SATA SMART Read Test
15.  SEP (SCSI Enclosure Processor) Test
18.  Report LUNs Test
19.  Drive firmware download
20.  Expander firmware download
21.  Read Logical Blocks
99.  Reset port
e   Enable expert mode in menus
p   Enable paged mode
w   Enable logging

Diagnostics menu, select an option:  [1-99 or e/p/w or 0 to quit] 12

Adapter Phy 0:  Link Down, No Errors

Adapter Phy 1:  Link Down, No Errors

Adapter Phy 2:  Link Down, No Errors

Adapter Phy 3:  Link Down, No Errors

Adapter Phy 4:  Link Up, No Errors

Adapter Phy 5:  Link Up, No Errors

Adapter Phy 6:  Link Up, No Errors

Adapter Phy 7:  Link Up, No Errors

Expander (Handle 0009) Phy 0:  Link Up
 Invalid DWord Count  79,967,229
 Running Disparity Error Count63,036,893
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 1:  Link Up
 Invalid DWord Count  79,967,207
 Running Disparity Error Count78,339,626
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 2:  Link Up
 Invalid DWord Count  76,717,646
 Running Disparity Error Count73,334,563
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 3:  Link Up
 Invalid DWord Count  79,896,409
 Running Disparity Error Count76,199,329
 Loss of DWord Synch Count   113
 Phy Reset Problem Count   0

Expander (Handle 0009) Phy 4:  Link Up, No Errors

Expander (Handle 0009) Phy 5:  Link Up, No Errors

Expander (Handle 0009) Phy 6:  Link Up, No Errors

Expander (Handle 0009) Phy 7:  Link Up, No 

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Adam Cheal
Cindy: How can I view the bug report you referenced? Standard methods show me 
that the bug number is valid (6694909) but no content or notes. We are having 
similar messages appear with snv_118 with a busy LSI controller, especially 
during scrubbing, and I'd be interested to see what they mentioned in that 
report. Also, the LSI firmware updates for the LSISAS3081E (the controller we 
use) don't usually come with release notes indicating what has changed in each 
firmware revision, so I'm not sure where they got that idea from.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread James C. McPherson

Adam Cheal wrote:

Cindy: How can I view the bug report you referenced? Standard methods
show my the bug number is valid (6694909) but no content or notes. We are
having similar messages appear with snv_118 with a busy LSI controller,
especially during scrubbing, and I'd be interested to see what they
mentioned in that report. Also, the LSI firmware updates for the
LSISAS3081E (the controller we use) don't usually come with release notes
indicating what has changed in each firmware revision, so I'm not sure
where they got that idea from.



Hi Adam,
unfortunately, you can't see that bug from outside.

The evaluation from LSI is very clear that this is a firmware issue
rather than a driver issue, and is claimed to be fixed in

LSI BIOS v6.26.00 FW 1.27.02
(aka Phase 15)


cheers,
James
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Adam Cheal
James: We are running Phase 16 on our LSISAS3801E's, and have also tried the 
recently released Phase 17 but it didn't help. All firmware NVRAM settings are 
default. Basically, when we put the disks behind this controller under load 
(e.g. scrubbing, recursive ls on large ZFS filesystem) we get this series of 
log entries that appear at random intervals:

scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@34,0 (sd49):
   incomplete read- retrying
scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0 
(mpt0):
   mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31110b00
scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0 
(mpt0):
   mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31110b00
scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0 
(mpt0):
   mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31112000
scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0 
(mpt0):
   mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31112000
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   Log info 0x31110b00 received for target 40.
   scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   Log info 0x31110b00 received for target 40.
   scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   Log info 0x31110b00 received for target 40.
   scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   Log info 0x31110b00 received for target 40.
   scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@2d,0 (sd42):
   incomplete read- retrying
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   Rev. 8 LSI, Inc. 1068E found.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   mpt0 supports power management.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
   mpt0: IOC Operational.

It seems to be timing out accessing a disk, retrying, giving up and then doing 
a bus reset?

This is happening with random disks behind the controller and on multiple 
systems with the same hardware config. We are running snv_118 right now and were 
hoping this was some magic mpt-related bug that was going to be fixed in 
snv_125, but it doesn't look like it. The LSI3801E is driving 2 x 23-disk JBODs 
which, albeit a dense solution, it should be able to handle. We are also using 
wide raidz2 vdevs (22 disks each, one per JBOD), which admittedly is slower 
performance-wise, but the goal here is density, not performance. I would have 
hoped that the system would just slow down if there was IO contention, but 
not experience things like bus resets.

Your thoughts?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread James C. McPherson

Adam Cheal wrote:

James: We are running Phase 16 on our LSISAS3801E's, and have also tried
the recently released Phase 17 but it didn't help. All firmware NVRAM
settings are default. Basically, when we put the disks behind this
controller under load (e.g. scrubbing, recursive ls on large ZFS
filesystem) we get this series of log entries that appear at random
intervals:

...

It seems to be timing out accessing a disk, retrying, giving up and then
doing a bus reset?

This is happening with random disks behind the controller and on multiple
systems with the same hardware config. We are running snv_118 right now
and was hoping this was some magic mpt-related bug that was going to be
fixed in snv_125 but it doesn't look like it. The LSI3801E is driving 2 x
23-disk JBOD's which, albeit a dense solution, it should be able to
handle. We are also using wide raidz2 vdevs (22 disks each, one per JBOD)
which agreeably is slower performance-wise, but the goal here is density
not performance. I would have hoped that the system would just slow
down if there was IO contention, but not experience things like bus
resets.

Your thoughts?


ugh. New bug time - bugs.opensolaris.org, please select
Solaris / kernel / driver-mpt. In addition to the error
messages and description of when you see it, please provide
output from

cfgadm -lav
prtconf -v

I'll see that it gets moved to the correct group asap.


Cheers,
James
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Adam Cheal
I've filed the bug, but was unable to include the prtconf -v output as the 
comments field only accepted 15000 chars total. Let me know if there is 
anything else I can provide/do to help figure this problem out as it is 
essentially preventing us from doing any kind of heavy IO to these pools, 
including scrubbing.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Carson Gaspar

On 10/22/09 4:07 PM, James C. McPherson wrote:

Adam Cheal wrote:

It seems to be timing out accessing a disk, retrying, giving up and then
doing a bus reset?

...

ugh. New bug time - bugs.opensolaris.org, please select
Solaris / kernel / driver-mpt. In addition to the error
messages and description of when you see it, please provide
output from

cfgadm -lav
prtconf -v

I'll see that it gets moved to the correct group asap.


FYI this is very similar to the behaviour I was seeing with my directly attached 
SATA disks on snv_118 (see the list archives for my original messages). I have 
not yet seen the error since I replaced my Hitachi 500 GB disks with Seagate 
1.5TB disks, so it could very well have been some unfortunate LSI firmware / 
Hitachi drive firmware interaction.


carson:gandalf 0 $ gzcat /var/adm/messages.2.gz  | ggrep -4 mpt | tail -9
Oct  8 00:44:17 gandalf.taltos.org scsi: [ID 365881 kern.notice] 
/p...@0,0/pci8086,2...@1c/pci1000,3...@0 (mpt0):

Oct  8 00:44:17 gandalf.taltos.org  Log info 0x3113 received for target 
1.
Oct  8 00:44:17 gandalf.taltos.org  scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Oct  8 00:44:17 gandalf.taltos.org scsi: [ID 365881 kern.notice] 
/p...@0,0/pci8086,2...@1c/pci1000,3...@0 (mpt0):

Oct  8 00:44:17 gandalf.taltos.org  Log info 0x3113 received for target 
1.
Oct  8 00:44:17 gandalf.taltos.org  scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Oct  8 00:44:17 gandalf.taltos.org scsi: [ID 365881 kern.notice] 
/p...@0,0/pci8086,2...@1c/pci1000,3...@0 (mpt0):

Oct  8 00:44:17 gandalf.taltos.org  Log info 0x3113 received for target 
1.
Oct  8 00:44:17 gandalf.taltos.org  scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc


carson:gandalf 1 $ gzcat /var/adm/messages.2.gz  | sed -ne 's,^.*\(Log 
info\),\1,p' | sort -u

Log info 0x31110b00 received for target 7.
Log info 0x3113 received for target 0.
Log info 0x3113 received for target 1.
Log info 0x3113 received for target 2.
Log info 0x3113 received for target 3.
Log info 0x3113 received for target 4.
Log info 0x3113 received for target 6.
Log info 0x3113 received for target 7.
Log info 0x3114 received for target 0.
Log info 0x3114 received for target 1.
Log info 0x3114 received for target 2.
Log info 0x3114 received for target 3.
Log info 0x3114 received for target 4.
Log info 0x3114 received for target 6.
Log info 0x3114 received for target 7.

carson:gandalf 0 $ gzcat /var/adm/messages.2.gz  | sed -ne 
's,^.*\(scsi_status\),\1,p' | sort -u

scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc

--
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss