On Wed, Aug 29, 2012 at 09:06:28AM -0400, Joe Landman wrote:
> We've found modern LSI
> HBA and RAID gear have had issues with occasional "events" that seem
> to be more firmware bugs or driver bugs than anything else.  The
> gear is stable for very light usage, but when pushed hard (without
> driver/fw updates), it does crash, hard, often with corruption.

That's what I was afraid of :-(

Last week I set about reproducing this problem again on some test boxes, and
most annoyingly, I have been unable to.  The test ran for about 5 days
before one of the (Seagate) hard drives had an I/O error over the weekend,
and XFS shut down as you said it would.

I've just moved the remaining drives to another box, but after an hour it
hasn't failed either.  These boxes have identical specs to the production
boxes.

The production ones may get their filesystems wiped soon anyway, in which
case I can try reproducing it on the very same boxes.

> xfs is a parallel IO file system, ext4 is not.  There is a very good
> chance you are tickling a bug lower in the stack.  Which LSI HBA or
> RAID are you using?

HBAs, one 8 port and one 16 port.

root@dev-storage2:~# ./sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 12.00.00.00 (2011.11.08) 
Copyright (c) 2008-2011 LSI Corporation. All rights reserved 

        Adapter Selected is a LSI SAS: SAS2116_1(B1) 

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2116_1(B1)   12.00.00.00    0c.00.00.01    07.23.01.00     00:02:00:00
1  SAS2008(B2)     12.00.00.00    0c.00.00.05    07.23.01.00     00:03:00:00

        Finished Processing Commands Successfully.
        Exiting SAS2Flash.

> How have you set this up?

mdadm --create /dev/md/huge -n 24 -c 1024 -l raid0 /dev/sd{b..y}
mkfs.xfs -f -n size=16384 /dev/md/huge
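
As a worked check of the geometry those two commands imply (a sketch, not
part of the original setup; the variable names are illustrative): -c 1024
sets a 1024 KiB chunk and -n 24 gives 24 raid0 members, so one full stripe
spans 24 MiB.  Expressed in the 512-byte units that mkfs.xfs takes for its
-d sunit= and swidth= options, that works out as:

```shell
# Values taken from the mdadm command above.
chunk_kib=1024   # -c 1024: chunk size in KiB
ndisks=24        # -n 24: member devices in the raid0 set

stripe_kib=$((chunk_kib * ndisks))   # one full stripe across all members, in KiB
sunit=$((chunk_kib * 1024 / 512))    # stripe unit in 512-byte sectors
swidth=$((sunit * ndisks))           # stripe width in 512-byte sectors

echo "full stripe: ${stripe_kib} KiB, sunit=${sunit}, swidth=${swidth}"
```

No su/sw options were passed to mkfs.xfs here; it normally picks the
geometry up from the md device itself, so these numbers are only a reference
point for comparison.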

> What kernel rev

Ubuntu 12.04, stock kernel 3.2.0-26 (a bit behind on updates; 3.2.0-29 is
the latest)

> and whats the
> 
>       modinfo mpt2sas
>       lspci
>       uname -a
> 
> output?

root@dev-storage2:~# modinfo mpt2sas
filename:       /lib/modules/3.2.0-26-generic/kernel/drivers/scsi/mpt2sas/mpt2sas.ko
version:        10.100.00.00
license:        GPL
description:    LSI MPT Fusion SAS 2.0 Device Driver
author:         LSI Corporation <dl-mptfusionli...@lsi.com>
srcversion:     44529298D89618E1BA4A0EC
alias:          pci:v00001000d0000007Esv*sd*bc*sc*i*
alias:          pci:v00001000d0000006Esv*sd*bc*sc*i*
alias:          pci:v00001000d00000087sv*sd*bc*sc*i*
alias:          pci:v00001000d00000086sv*sd*bc*sc*i*
alias:          pci:v00001000d00000085sv*sd*bc*sc*i*
alias:          pci:v00001000d00000084sv*sd*bc*sc*i*
alias:          pci:v00001000d00000083sv*sd*bc*sc*i*
alias:          pci:v00001000d00000082sv*sd*bc*sc*i*
alias:          pci:v00001000d00000081sv*sd*bc*sc*i*
alias:          pci:v00001000d00000080sv*sd*bc*sc*i*
alias:          pci:v00001000d00000065sv*sd*bc*sc*i*
alias:          pci:v00001000d00000064sv*sd*bc*sc*i*
alias:          pci:v00001000d00000077sv*sd*bc*sc*i*
alias:          pci:v00001000d00000076sv*sd*bc*sc*i*
alias:          pci:v00001000d00000074sv*sd*bc*sc*i*
alias:          pci:v00001000d00000072sv*sd*bc*sc*i*
alias:          pci:v00001000d00000070sv*sd*bc*sc*i*
depends:        scsi_transport_sas,raid_class
intree:         Y
vermagic:       3.2.0-26-generic SMP mod_unload modversions 
parm:           logging_level: bits for enabling additional logging info (default=0)
parm:           max_sectors:max sectors, range 64 to 8192  default=8192 (ushort)
parm:           max_lun: max lun, default=16895  (int)
parm:           max_queue_depth: max controller queue depth  (int)
parm:           max_sgl_entries: max sg entries  (int)
parm:           msix_disable: disable msix routed interrupts (default=0) (int)
parm:           missing_delay: device missing delay , io missing delay (array of int)
parm:           mpt2sas_fwfault_debug: enable detection of firmware fault and halt firmware - (default=0)
parm:           disable_discovery: disable discovery  (int)
parm:           diag_buffer_enable: post diag buffers (TRACE=1/SNAPSHOT=2/EXTENDED=4/default=0) (int)

root@dev-storage2:~# lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:06.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 2 (rev b5)
00:1c.2 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 (rev b5)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b5)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation C204 Chipset Family LPC Controller (rev 05)
00:1f.2 IDE interface: Intel Corporation 6 Series/C200 Series Chipset Family 4 port SATA IDE Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
00:1f.5 IDE interface: Intel Corporation 6 Series/C200 Series Chipset Family 2 port SATA IDE Controller (rev 05)
01:00.0 Ethernet controller: Intel Corporation 82599EB 10 Gigabit TN Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82599EB 10 Gigabit TN Network Connection (rev 01)
02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
04:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 02)
05:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 10)
06:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
07:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
08:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

root@dev-storage2:~# uname -a
Linux dev-storage2.example.com 3.2.0-26-generic #41-Ubuntu SMP Thu Jun 14 17:49:24 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Anyway, many thanks for sharing your experience.  This was definitely
reproducible before; I'll come back when I can reproduce it again :-(

Regards,

Brian.
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
