Bottom line: I think this is a hardware issue with the new mobo I'm using. I'm hoping for confirmation of that before submitting the mobo for an RMA.

Background: I recently upgraded my SAN, the relevant changes being a new motherboard and new CPUs. The disks sit on three AOC-SAT2-MV8 controllers:
  AOC-SAT2-MV8 card1: c3t[0-7]d0
  AOC-SAT2-MV8 card2: c4t[0-7]d0
  AOC-SAT2-MV8 card3: c5t[0-7]d0

And I created a zpool as follows (a sketch of the create command is in the P.S. at the end):

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
            c5t0d0  ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0
          raidz2-1  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c4t5d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
            c5t5d0  ONLINE       0     0     0

What triggers the kernel panic (it happens immediately):

  root@san:~# cp one-gig-file.dat /tank/

##### BEGIN KERNEL PANIC #####
SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
EVENT-TIME: 0x51146c6a.0x6ecf192 (0x2bc40a83a3)
PLATFORM: i86pc, CSN: -, HOSTNAME: san
SOURCE: SunOS, REV: 5.11 omnios-33fdde4
DESC: Errors have been detected that require a reboot to ensure system integrity.
      See http://illumos.org/msg/SUNOS-8000-0G for more information.
AUTO-RESPONSE: Solaris will attempt to save and diagnose the error telemetry
IMPACT: The system will sync files, save a crash dump if needed, and reboot
REC-ACTION: Save the error summary below in case telemetry cannot be saved

panic[cpu0]/thread=ffffff000f4cbc40: pcieb-0: PCI(-X) Express Fatal Error. (0x45)

ffffff000f4cbb70 pcieb:pcieb_intr_handler+1c9 ()
ffffff000f4cbbe0 unix:av_dispatch_autovect+95 ()
ffffff000f4cbc20 unix:dispatch_hardint+36 ()
ffffff000f405a60 unix:switch_sp_and_call+13 ()
ffffff000f405ac0 unix:do_interrupt+a8 ()
ffffff000f405ad0 unix:cmnint+ba ()
ffffff000f405bc0 unix:mach_cpu_idle+6 ()
ffffff000f405bf0 unix:cpu_idle+11a ()
ffffff000f405c00 unix:cpu_idle_adaptive+13 ()
ffffff000f405c20 unix:idle+a7 ()
ffffff000f405c30 unix:thread_start+8 ()

syncing file systems... done

ereport.io.pci.fabric ena=2bc408a03a00001
  detector=[ version=0 scheme="dev" device-path="/pci@0,0/pci10de,376@a" ]
  bdf=50 device_id=376 vendor_id=10de rev_id=a3 dev_type=40
  pcie_off=80 pcix_off=0 aer_off=160 ecc_ver=0
  pci_status=10 pci_command=47 pci_bdg_sec_status=6000 pci_bdg_ctrl=3
  pcie_status=0 pcie_command=2037 pcie_dev_cap=8001 pcie_adv_ctl=a0
  pcie_ue_status=0 pcie_ue_mask=180000 pcie_ue_sev=62011
  pcie_ue_hdr0=0 pcie_ue_hdr1=0 pcie_ue_hdr2=0 pcie_ue_hdr3=0
  pcie_ce_status=0 pcie_ce_mask=0
  pcie_rp_status=0 pcie_rp_control=0 pcie_adv_rp_status=800007c
  pcie_adv_rp_command=7 pcie_adv_rp_ce_src_id=0 pcie_adv_rp_ue_src_id=201
  remainder=5 severity=1

ereport.io.pci.fabric ena=2bc409115e00001
  detector=[ version=0 scheme="dev" device-path="/pci@0,0/pci10de,376@a/pci1033,125@0" ]
  bdf=200 device_id=125 vendor_id=1033 rev_id=8 dev_type=70
  pcie_off=40 pcix_off=54 aer_off=100 ecc_ver=0
  pci_status=10 pci_command=47 pci_bdg_sec_status=2420 pci_bdg_ctrl=7
  pcix_bdg_status=200 pcix_bdg_sec_status=83
  pcie_status=20 pcie_command=2027 pcie_dev_cap=1 pcie_adv_ctl=a0
  pcie_ue_status=0 pcie_ue_mask=180000 pcie_ue_sev=62010
  pcie_ue_hdr0=4000001 pcie_ue_hdr1=50000f pcie_ue_hdr2=2020000 pcie_ue_hdr3=0
  pcie_ce_status=0 pcie_ce_mask=0
  pcie_sue_adv_ctl=0 pcie_sue_status=0 pcie_sue_mask=8 pcie_sue_sev=1340
  pcie_sue_hdr0=20003 pcie_sue_hdr1=a0 pcie_sue_hdr2=10000 pcie_sue_hdr3=0
  remainder=4 severity=5

ereport.io.pci.fabric ena=2bc4096d5c00001
  detector=[ version=0 scheme="dev" device-path="/pci@0,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4" ]
  bdf=320 device_id=6081 vendor_id=11ab rev_id=9 dev_type=101
  pcie_off=0 pcix_off=60 aer_off=0 ecc_ver=0
  pci_status=2b0 pci_command=157 pcix_status=1830320 pcix_command=30
  remainder=3 severity=1

ereport.io.pci.fabric ena=2bc4098c0500001
  detector=[ version=0 scheme="dev" device-path="/pci@0,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@6" ]
  bdf=330 device_id=6081 vendor_id=11ab rev_id=9 dev_type=101
  pcie_off=0 pcix_off=60 aer_off=0 ecc_ver=0
  pci_status=2b0 pci_command=157 pcix_status=1830330 pcix_command=30
  remainder=2 severity=1

ereport.io.pci.fabric ena=2bc409a74200001
  detector=[ version=0 scheme="dev" device-path="/pci@0,0/pci10de,376@a/pci1033,125@0,1" ]
  bdf=201 device_id=125 vendor_id=1033 rev_id=8 dev_type=70
  pcie_off=40 pcix_off=54 aer_off=100 ecc_ver=0
  pci_status=10 pci_command=47 pci_bdg_sec_status=6420 pci_bdg_ctrl=7
  pcix_bdg_status=201 pcix_bdg_sec_status=c3
  pcie_status=6 pcie_command=2027 pcie_dev_cap=1 pcie_adv_ctl=a0
  pcie_ue_status=0 pcie_ue_mask=180000 pcie_ue_sev=62010
  pcie_ue_hdr0=0 pcie_ue_hdr1=0 pcie_ue_hdr2=0 pcie_ue_hdr3=0
  pcie_ce_status=0 pcie_ce_mask=0
  pcie_sue_adv_ctl=c pcie_sue_status=1800 pcie_sue_mask=8 pcie_sue_sev=1340
  pcie_sue_hdr0=20104 pcie_sue_hdr1=a0 pcie_sue_hdr2=10000 pcie_sue_hdr3=0
  pcie_sue_tgt_trans=0 pcie_sue_tgt_addr=0 pcie_sue_tgt_bdf=ffff
  remainder=1 severity=45

ereport.io.pci.fabric ena=2bc40a0e5600001
  detector=[ version=0 scheme="dev" device-path="/pci@0,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6" ]
  bdf=430 device_id=6081 vendor_id=11ab rev_id=9 dev_type=101
  pcie_off=0 pcix_off=60 aer_off=0 ecc_ver=0
  pci_status=c3b8 pci_command=157 pcix_status=1830430 pcix_command=30
  remainder=0 severity=40

dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
##### END KERNEL PANIC #####

If I'm decoding the device paths right, the errors walk straight down one branch of the fabric: the chipset's PCIe port (vendor 10de = NVIDIA), then a NEC PCIe-to-PCI-X bridge (1033,125), then the Marvell 88SX6081s (11ab,6081) on the MV8 cards. Since the MV8s are plain PCI-X cards, that bridge is presumably on the new mobo, which is part of why I suspect it.

Thoughts?
 - is it clearly the mobo?
 - could it possibly be the [new] CPUs?
 - anything else to try?

Thanks,
Kent
