Hmm... the problem appears to have resolved itself. After a few hours the new drive seems to have gone back into the array, and the original hot-spare drive has been put back into hot-spare state.

So I'm interpreting state 0x0020 to mean something like "hang on while I use this new drive to automatically put everything back as it was before the failure". Is this correct?

Thanks,
Charles

[r...@bsvr ~]# mfiutil show drives
mfi0 Physical Drives:
(  149G) ONLINE<ST9160511NS SN04 serial=9SM236JR>  SATA enclosure 1, slot 0
(  149G) ONLINE<ST9160511NS SN04 serial=9SM237KF>  SATA enclosure 1, slot 1
(  149G) ONLINE<ST9160511NS SN04 serial=9SM236N8>  SATA enclosure 1, slot 2
(  149G) HOT SPARE<ST9160511NS SN04 serial=9SM237EK>  SATA enclosure 1, slot 3
(  149G) ONLINE<ST9160511NS SN04 serial=9SM238AG>  SATA enclosure 1, slot 4



On 10/15/10 3:05 PM, Charles Owens wrote:
 Hello,

We have an mfi-based RAID array with a failed drive. After replacing the failed drive with a brand new one, 'mfiutil' reports the new drive as having a status of "PSTATE 0x0020". Attempts to make the drive a hot spare are unsuccessful (e.g. using the "good" and/or "add" subcommands of mfiutil). We've tested procedures for replacing failed drives in the past and haven't run into this.

Looking at the code for mfiutil, it appears that this is happening because the mfi controller is reporting a drive status code that mfiutil doesn't know about. The system is remote and in production, so booting into the LSI in-BIOS RAID management tool is not an attractive option.
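
To illustrate what I mean, here is a minimal sketch of the kind of lookup that would produce a "PSTATE 0x0020" label. It is not the mfiutil source: the enum, its values, and the function name pd_state_name are hypothetical and chosen only for illustration. The point is the default branch, which prints the raw code whenever the controller reports a state the table doesn't list.

```c
#include <stdio.h>

/*
 * Hypothetical subset of physical-drive states. The names and numeric
 * values are assumptions for this sketch, not taken from mfiutil.
 */
enum pd_state {
	PD_STATE_UNCONFIGURED_GOOD = 0x00,
	PD_STATE_HOT_SPARE = 0x02,
	PD_STATE_FAILED = 0x11,
	PD_STATE_REBUILD = 0x14,
	PD_STATE_ONLINE = 0x18
};

/*
 * Map a state code to a printable label. Codes missing from the table
 * fall through to the default case and are shown as raw hex.
 * The returned pointer may refer to a static buffer, so the result
 * must be used before the next call.
 */
const char *
pd_state_name(unsigned int state)
{
	static char buf[16];

	switch (state) {
	case PD_STATE_UNCONFIGURED_GOOD:
		return ("UNCONFIGURED GOOD");
	case PD_STATE_HOT_SPARE:
		return ("HOT SPARE");
	case PD_STATE_FAILED:
		return ("FAILED");
	case PD_STATE_REBUILD:
		return ("REBUILD");
	case PD_STATE_ONLINE:
		return ("ONLINE");
	default:
		/* Unknown code: report it numerically instead of guessing. */
		snprintf(buf, sizeof(buf), "PSTATE 0x%04x", state);
		return (buf);
	}
}
```

With this sketch, a call such as pd_state_name(0x20) takes the default branch, while pd_state_name(0x18) returns the "ONLINE" label. If the real tool works along these lines, the odd label would only mean the utility lacks a name for the state, not that the drive itself is in trouble.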

Any help with understanding the situation and potential next steps would be greatly appreciated. More background information follows below.

Thanks,

Charles


Storage configuration:  4-drive RAID 10 array plus one hot spare

[r...@svr ~]# mfiutil show config
mfi0 Configuration: 2 arrays, 1 volumes, 0 spares
    array 0 of 2 drives:
        drive 0 ( 149G) ONLINE<ST9160511NS SN04 serial=9SM236JR> SATA enclosure 1, slot 0
        drive 1 ( 149G) ONLINE<ST9160511NS SN04 serial=9SM237KF> SATA enclosure 1, slot 1
    array 1 of 2 drives:
        drive 4 ( 149G) ONLINE<ST9160511NS SN04 serial=9SM237EK> SATA enclosure 1, slot 3
        drive 3 ( 149G) ONLINE<ST9160511NS SN04 serial=9SM236N8> SATA enclosure 1, slot 2
    volume mfid0 (296G) RAID-1 256K OPTIMAL spans:
        array 0
        array 1

[r...@svr ~]# mfiutil show drives
mfi0 Physical Drives:
( 149G) ONLINE<ST9160511NS SN04 serial=9SM236JR> SATA enclosure 1, slot 0
( 149G) ONLINE<ST9160511NS SN04 serial=9SM237KF> SATA enclosure 1, slot 1
( 149G) ONLINE<ST9160511NS SN04 serial=9SM236N8> SATA enclosure 1, slot 2
( 149G) ONLINE<ST9160511NS SN04 serial=9SM237EK> SATA enclosure 1, slot 3
( 149G) PSTATE 0x0020<ST9160511NS SN04 serial=9SM238AG> SATA enclosure 1, slot 4


Partial system boot log:

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-RELEASE-p2 #4: Thu Mar  4 04:21:04 UTC 2010
    [email protected]:/usr/obj/usr/src/sys/BEACON
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (2261.27-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x106a5  Stepping = 5
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x9ce3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA,SSE4.1,SSE4.2,POPCNT>
  AMD Features=0x28100000<NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 6442450944 (6144 MB)
avail memory = 6202064896 (5914 MB)
ACPI APIC Table:<INTEL  S5520UR>
FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs
FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 SMT threads

...

mfi0:<LSI MegaSAS 1078> port 0x1000-0x10ff mem 0xb1900000-0xb193ffff,0xb1940000-0xb197ffff irq 16 at device 0.0 on pci6
mfi0: Megaraid SAS driver Ver 3.00
mfi0: [ITHREAD]

...

AcpiOsExecute: failed to enqueue task, consider increasing the debug.acpi.max_tasks tunable
ACPI Error (psparse-0633): Method parse/execution failed [\\_SB_.PCI0.HEC2.HSCI] (Node 0xccbff740)
mfid0:<MFI Logical Disk> on mfi0
mfid0: 303268MB (621092864 sectors) RAID volume '' is optimal





_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hardware
To unsubscribe, send any mail to "[email protected]"
