Just to be clear, I was running filebench directly on the x4500, not from
an initiator, so this is probably not a COMSTAR thing.

Ceri

On Wed, Jul 09, 2008 at 03:42:04PM +0100, Ceri Davies wrote:
> We've got an x4500 running SXCE build 91, with stmf configured to
> share out a (currently) small number (9) of LUs to a (currently)
> small number (4) of hosts.
> 
> The x4500 is configured with a mirrored ZFS root, six RAIDZ sets
> spread across all six controllers, some hot spares in the gaps, and a
> RAID10 set to use up everything else.
> 
> Since this is an investigative setup, I have been running filebench
> locally on the x4500 to get some stats before moving on to do the same
> on the initiators against the x4500 and our current storage.
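> 
> For reference, the run itself is more or less the following (paths
> and run length from memory, so treat them as approximate):
> 
>   # /usr/benchmarks/filebench/bin/go_filebench
>   filebench> load oltp
>   filebench> set $dir=/tank1/fb
>   filebench> set $filesize=5g
>   filebench> run 600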
> 
> While running the filebench OLTP workload with $filesize=5g on one of
> the RAIDZ pools, the x4500 seemed to hang while creating the fileset.
> On further investigation, a lot of things actually still worked:
> logging in via SSH was fine and /usr/bin/ps worked OK, but
> /usr/ucb/ps, any of the /usr/proc ptools, man, and so on just hung.
> "savecore -L" managed to write a dump but couldn't seem to exit.
> 
> So I did a hard reset; the system came up fine, and I actually do
> have the dump from "savecore -L".  I'm kind of out of my depth with
> mdb, but it looks pretty clear to me that all of the "hung" processes
> were blocked somewhere in ZFS:
> 
> # mdb -k unix.0 vmcore.0 
> mdb: failed to read panicbuf and panic_reg -- current register set will
> be unavailable
> Loading modules: [ unix genunix specfs dtrace cpu.generic
> cpu_ms.AuthenticAMD.15 uppc pcplusmp scsi_vhci zfs sd ip hook neti sctp
> arp usba fctl nca lofs md cpc random crypto nfs fcip logindmux ptm nsctl
> ufs sppp ipc ]
> > ::memstat
> Page Summary                Pages                MB  %Tot
> ------------     ----------------  ----------------  ----
> Kernel                    3085149             12051   74%
> Anon                        20123                78    0%
> Exec and libs                3565                13    0%
> Page cache                 200779               784    5%
> Free (cachelist)           193955               757    5%
> Free (freelist)            663990              2593   16%
> 
> Total                     4167561             16279
> Physical                  4167560             16279
> > ::pgrep ptree
> S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
> R   1825   1820   1825   1803      0 0x4a004000 ffffff04f5096c80 ptree
> R   1798   1607   1798   1607  15000 0x4a004900 ffffff04f7b72930 ptree
> R   1795   1302   1795   1294      0 0x4a004900 ffffff05179f7de0 ptree
> > ::pgrep ptree | ::walk thread | ::findstack
> stack pointer for thread ffffff04ea2ca440: ffffff00201777d0
> [ ffffff00201777d0 _resume_from_idle+0xf1() ]
>   ffffff0020177810 swtch+0x17f()
>   ffffff00201778b0 turnstile_block+0x752()
>   ffffff0020177920 rw_enter_sleep+0x1b0()
>   ffffff00201779f0 zfs_getpage+0x10e()
>   ffffff0020177aa0 fop_getpage+0x9f()
>   ffffff0020177c60 segvn_fault+0x9ef()
>   ffffff0020177d70 as_fault+0x5ae()
>   ffffff0020177df0 pagefault+0x95()
>   ffffff0020177f00 trap+0xbd3()
>   ffffff0020177f10 0xfffffffffb8001d9()
> stack pointer for thread ffffff04e8752400: ffffff001f9307d0
> [ ffffff001f9307d0 _resume_from_idle+0xf1() ]
>   ffffff001f930810 swtch+0x17f()
>   ffffff001f9308b0 turnstile_block+0x752()
>   ffffff001f930920 rw_enter_sleep+0x1b0()
>   ffffff001f9309f0 zfs_getpage+0x10e()
>   ffffff001f930aa0 fop_getpage+0x9f()
>   ffffff001f930c60 segvn_fault+0x9ef()
>   ffffff001f930d70 as_fault+0x5ae()
>   ffffff001f930df0 pagefault+0x95()
>   ffffff001f930f00 trap+0xbd3()
>   ffffff001f930f10 0xfffffffffb8001d9()
> stack pointer for thread ffffff066fbc6a80: ffffff001f27de90
> [ ffffff001f27de90 _resume_from_idle+0xf1() ]
>   ffffff001f27ded0 swtch+0x17f()
>   ffffff001f27df00 cv_wait+0x61()
>   ffffff001f27e040 vmem_xalloc+0x602()
>   ffffff001f27e0b0 vmem_alloc+0x159()
>   ffffff001f27e140 segkmem_xalloc+0x8c()
>   ffffff001f27e1a0 segkmem_alloc_vn+0xcd()
>   ffffff001f27e1d0 segkmem_zio_alloc+0x20()
>   ffffff001f27e310 vmem_xalloc+0x4fc()
>   ffffff001f27e380 vmem_alloc+0x159()
>   ffffff001f27e410 kmem_slab_create+0x7d()
>   ffffff001f27e450 kmem_slab_alloc+0x57()
>   ffffff001f27e4b0 kmem_cache_alloc+0x136()
>   ffffff001f27e4d0 zio_data_buf_alloc+0x28()
>   ffffff001f27e510 arc_get_data_buf+0x175()
>   ffffff001f27e560 arc_buf_alloc+0x9a()
>   ffffff001f27e610 arc_read+0x122()
>   ffffff001f27e6b0 dbuf_read_impl+0x129()
>   ffffff001f27e710 dbuf_read+0xc5()
>   ffffff001f27e7c0 dmu_buf_hold_array_by_dnode+0x1c4()
>   ffffff001f27e860 dmu_read+0xd4()
>   ffffff001f27e910 zfs_fillpage+0x15e()
>   ffffff001f27e9f0 zfs_getpage+0x187()
>   ffffff001f27eaa0 fop_getpage+0x9f()
>   ffffff001f27ec60 segvn_fault+0x9ef()
>   ffffff001f27ed70 as_fault+0x5ae()
>   ffffff001f27edf0 pagefault+0x95()
>   ffffff001f27ef00 trap+0xbd3()
>   ffffff001f27ef10 0xfffffffffb8001d9()
> > ::pgrep go_filebench | ::walk thread | ::findstack
> stack pointer for thread ffffff055ee097e0: ffffff001f2394f0
> [ ffffff001f2394f0 _resume_from_idle+0xf1() ]
>   ffffff001f239530 swtch+0x17f()
>   ffffff001f239560 cv_wait+0x61()
>   ffffff001f2396a0 vmem_xalloc+0x602()
>   ffffff001f239710 vmem_alloc+0x159()
>   ffffff001f2397a0 segkmem_xalloc+0x8c()
>   ffffff001f239800 segkmem_alloc_vn+0xcd()
>   ffffff001f239830 segkmem_zio_alloc+0x20()
>   ffffff001f239970 vmem_xalloc+0x4fc()
>   ffffff001f2399e0 vmem_alloc+0x159()
>   ffffff001f239a70 kmem_slab_create+0x7d()
>   ffffff001f239ab0 kmem_slab_alloc+0x57()
>   ffffff001f239b10 kmem_cache_alloc+0x136()
>   ffffff001f239b30 zio_data_buf_alloc+0x28()
>   ffffff001f239b70 arc_get_data_buf+0x175()
>   ffffff001f239bc0 arc_buf_alloc+0x9a()
>   ffffff001f239c00 dbuf_noread+0x9b()
>   ffffff001f239c30 dmu_buf_will_fill+0x1f()
>   ffffff001f239cd0 dmu_write_uio+0xd3()
>   ffffff001f239dd0 zfs_write+0x468()
>   ffffff001f239e40 fop_write+0x69()
>   ffffff001f239f00 write+0x2af()
>   ffffff001f239f10 sys_syscall+0x17b()
> stack pointer for thread ffffff052eeebea0: ffffff001fbdcd00
> [ ffffff001fbdcd00 _resume_from_idle+0xf1() ]
>   ffffff001fbdcd40 swtch+0x17f()
>   ffffff001fbdcd70 cv_wait+0x61()
>   ffffff001fbdcdb0 exitlwps+0x1cb()
>   ffffff001fbdce30 psig+0x4b1()
>   ffffff001fbdcf00 post_syscall+0x446()
>   ffffff001fbdcf10 0xfffffffffb800cad()
> >
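> 
> Given that ::memstat shows 74% of memory in the kernel, and that the
> two threads actually trying to allocate (the third ptree thread and
> go_filebench itself) are both asleep in vmem_xalloc(), my
> (uneducated) guess is that the ARC has grown to fill RAM and those
> allocations are stuck waiting for memory to come back; the other
> ptree threads just look queued up behind them on a ZFS rwlock.  If it
> would help, I'm happy to run something like the following against the
> dump and post the output:
> 
> > ::arc
> > ::kmastat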
> 
> If I can put this dump to some use, please just let me know.
> 
> Ceri
> -- 
> That must be wonderful!  I don't understand it at all.
>                                                   -- Moliere



-- 
That must be wonderful!  I don't understand it at all.
                                                  -- Moliere
