Hi all,

I get the following panic when scrubbing a pool. After the panic, the scrub resumes and may panic again, sometimes several times. However, if left to run, the scrub eventually finishes and the pool is reported clean.

The pool is an 8-way raidz2, spread across several AHCI controllers (Intel and Marvell) on one of those ASRock Avoton C2750D4I Mini-ITX boards that were all the rage a couple of years ago for home servers. I'm only just getting around to putting it into full service, but it's been running fine (with fewer drives) until now. It's running the current platform image (PI).

I'm assuming there's some load-related locking issue, and hoping it's a solvable software issue rather than bad hardware. I haven't yet really looked at the BIOS to see if there are interrupt-mapping options that might move the issue around somehow. I can try that, but I'd prefer a deliberate set of tests to random shuffling.

There's also a flaky SSD in the zones pool (on sata0/0, c3t0d0). It works fine most of the time, but occasionally goes offline until I power off, fiddle and retry it. I suspect a cable problem, and will be pulling it out to test separately. It doesn't share a controller with the pool, and has been offline through a full panic cycle, so I'm hoping it is unrelated. At least, a failed drive shouldn't be able to cause this, so I'm posting before trying that anyway.

So, some crash and config details below. Suggestions and requests for further info welcome (I presume info on driver and interrupt status would be useful, but I don't have the mdb incantations).

[root@d0-50-99-46-c2-00 /var/crash/volatile]# mdb -e '::status;$C' vmcore.5
debugging crash dump vmcore.5 (64-bit) from d0-50-99-46-c2-00
operating system: 5.11 joyent_20160906T181054Z (i86pc)
image uuid: (not set)
panic message: I/O to pool 'titan' appears to be hung.
dump content: kernel pages only
ffffff003d54f9d0 vpanic()
ffffff003d54fa20 vdev_deadman+0x10b(ffffff0d08289380)
ffffff003d54fa70 vdev_deadman+0x4a(ffffff0d10a28640)
ffffff003d54fac0 vdev_deadman+0x4a(ffffff0d10bbd040)
ffffff003d54faf0 spa_deadman+0xad(ffffff0d11210000)
ffffff003d54fb90 cyclic_softint+0xfd(ffffff0d07de9a80, 0)
ffffff003d54fba0 cbe_low_level+0x14()
ffffff003d54fbf0 av_dispatch_softvect+0x78(2)
ffffff003d54fc20 dispatch_softint+0x39(0, 0)
ffffff003d4e8a20 switch_sp_and_call+0x13()
ffffff003d4e8a60 dosoftint+0x44(ffffff003d4e8ad0)
ffffff003d4e8ac0 do_interrupt+0xba(ffffff003d4e8ad0, 0)
ffffff003d4e8ad0 _interrupt+0xba()
ffffff003d4e8bc0 i86_mwait+0xd()
ffffff003d4e8c00 cpu_idle_mwait+0x109()
ffffff003d4e8c20 idle+0xa7()
ffffff003d4e8c30 thread_start+8()

[root@d0-50-99-46-c2-00 /var/crash/volatile]# zpool status -v titan
  pool: titan
 state: ONLINE
  scan: scrub repaired 0 in 3h53m with 0 errors on Sat Sep 10 13:59:42 2016
config:

        NAME        STATE     READ WRITE CKSUM
        titan       ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0

[root@d0-50-99-46-c2-00 /var/crash/volatile]# cfgadm -lv
Ap_Id                          Receptacle   Occupant     Condition  Information
When         Type         Busy     Phys_Id
sata0/0                        connected    unconfigured unknown    Mod:  FRev:  SN:
unavailable  unknown      n        /devices/pci@0,0/pci1849,1f22@17:0
sata0/1::dsk/c3t1d0            connected    configured   ok         Mod: INTEL SSDSC2BW240A4 FRev: DC32 SN: CVDA44520AGG2403GN
unavailable  disk         n        /devices/pci@0,0/pci1849,1f22@17:1
sata0/2::dsk/c3t2d0            connected    configured   ok         Mod: KINGSTON SVP200S37A240G FRev: 502ABBF0 SN: 50026B722C0629B9
unavailable  disk         n        /devices/pci@0,0/pci1849,1f22@17:2
sata0/3                        empty        unconfigured ok
unavailable  sata-port    n        /devices/pci@0,0/pci1849,1f22@17:3
sata1/0::dsk/c4t0d0            connected    configured   ok         Mod: WDC WD80EFZX-68UW8N0 FRev: 83.H0A83 SN: VKJA71SX
unavailable  disk         n        /devices/pci@0,0/pci1849,1f32@18:0
sata1/1::dsk/c4t1d0            connected    configured   ok         Mod: WDC WD80EFZX-68UW8N0 FRev: 83.H0A83 SN: VLGUM69Z
unavailable  disk         n        /devices/pci@0,0/pci1849,1f32@18:1
sata2/0::dsk/c0t0d0            connected    configured   ok         Mod: WDC WD80EFZX-68UW8N0 FRev: 83.H0A83 SN: VKJ9UZ6X
unavailable  disk         n        /devices/pci@0,0/pci8086,1f12@3/pci10b5,8608@0/pci10b5,8608@1/pci1849,9172@0:0
sata2/1::dsk/c0t1d0            connected    configured   ok         Mod: WDC WD80EFZX-68UW8N0 FRev: 83.H0A83 SN: VKJAVGNX
unavailable  disk         n        /devices/pci@0,0/pci8086,1f12@3/pci10b5,8608@0/pci10b5,8608@1/pci1849,9172@0:1
sata3/0::dsk/c1t0d0            connected    configured   ok         Mod: WDC WD80EFZX-68UW8N0 FRev: 83.H0A83 SN: VKJYRS2Y
unavailable  disk         n        /devices/pci@0,0/pci8086,1f13@4/pci1849,9230@0:0
sata3/1::dsk/c1t1d0            connected    configured   ok         Mod: WDC WD80EFZX-68UW8N0 FRev: 83.H0A83 SN: VLGUJSDZ
unavailable  disk         n        /devices/pci@0,0/pci8086,1f13@4/pci1849,9230@0:1
sata3/2::dsk/c1t2d0            connected    configured   ok         Mod: WDC WD80EFZX-68UW8N0 FRev: 83.H0A83 SN: VLGUNDZZ
unavailable  disk         n        /devices/pci@0,0/pci8086,1f13@4/pci1849,9230@0:2
sata3/3::dsk/c1t3d0            connected    configured   ok         Mod: WDC WD80EFZX-68UW8N0 FRev: 83.H0A83 SN: VLGUJNXZ
unavailable  disk         n        /devices/pci@0,0/pci8086,1f13@4/pci1849,9230@0:3
sata3/4                        empty        unconfigured ok
unavailable  sata-port    n        /devices/pci@0,0/pci8086,1f13@4/pci1849,9230@0:4
sata3/5                        empty        unconfigured ok
unavailable  sata-port    n        /devices/pci@0,0/pci8086,1f13@4/pci1849,9230@0:5
sata3/6                        empty        unconfigured ok
unavailable  sata-port    n        /devices/pci@0,0/pci8086,1f13@4/pci1849,9230@0:6
sata3/7                        connected    unconfigured ok         Mod: MARVELL VIRTUALL FRev: 1.09 SN:
unavailable  processor    n        /devices/pci@0,0/pci8086,1f13@4/pci1849,9230@0:7
usb0/1                         connected    configured   ok         Mfg: <undef> Product: <undef> NConfigs: 1 Config: 0 <no cfg str descr>
# usb stuff below here trimmed

--
Dan.