Michael,

Below is the preliminary analysis I received. I have been told I will get another update in a couple of days. Sadly, I am only dealing with the first-tier SE, as Sun Support doesn't like to let customers talk to PTS. :-( For what it's worth, the case ID is 66201589. Strange that they think it's a UFS issue, as that doesn't really explain the symptoms I was seeing.

I believe the following threads are involved as both were accessing the same vnode at about the same time:

SolarisCAT(vmcore.0/10U)> tlist -l findarg 0x600347d0600
  found in %i5 (sp:0x2a102e2d1b0, pc:genunix:pvn_vplist_dirty+0x2f8)
==== user (LWP_SYS) thread: 0x30002d70240 PID: 5882 ====
cmd: db2sysc
t_wchan: 0x7000ee6b944 sobj: condition var (from unix:page_lock_es+0x1d8)
t_procp: 0x600283f70f0
  p_as: 0x60029016d40 size: 1186988032 RSS: 40566784
  hat: 0x30004b8e8c0 cnum: CPU3:33/1371 CPU0:24/2513 CPU1:31/980 CPU2:22/1573
  cpusran: 0,1,2,3
  zone: global
t_stk: 0x2a102e2dae0 sp: 0x2a102e2c851 t_stkbase: 0x2a102e28000
t_pri: 60(TS) pctcpu: 0.000000
t_lwp: 0x6004641fa10 machpcb: 0x2a102e2dae0
  mstate: LMS_SLEEP ms_prev: LMS_SYSTEM
  ms_state_start: 2 days 11 hours 37 minutes 58.9535277 seconds earlier
  ms_start: 2 days 12 hours 19 minutes 22.0699800 seconds earlier
psrset: 0 last CPU: 0
idle: 21467354 ticks (2 days 11 hours 37 minutes 53.54 seconds)
start: Fri Mar 20 21:20:15 2009
age: 217146 seconds (2 days 12 hours 19 minutes 6 seconds)
syscall: #62 fcntl(, 0xffffffff7fffcc61) (sysent: genunix:fcntl+0x0)
tstate: TS_SLEEP - awaiting an event
tflg:   T_DONTBLOCK - for lockfs
        T_DFLTSTK - stack is default size
tpflg:  TP_TWAIT - wait to be freed by lwp_wait
        TP_MSACCT - collect micro-state accounting information
tsched: TS_LOAD - thread is in memory
        TS_DONT_SWAP - thread/LWP should not be swapped
pflag:  SJCTL - SIGCLD sent when children stop/continue
        SMSACCT - process is keeping micro-state accounting
        SMSFORK - child inherits micro-state accounting

pc: genunix:cv_wait+0x38: call unix:swtch

genunix:cv_wait+0x38(, 0x18495e8, 0x1, 0x80000001, 0x80000000)
unix:page_lock_es+0x1d8(, , 0x1850688, 0x1)
unix:page_lock(0x7000ee6b900, 0x1, 0x1850688, 0x1) - frame recycled
genunix:pvn_vplist_dirty+0x2f8(0x600347d0600, 0x0, 0x12a0e50, 0x110000)
ufs:ufs_itrunc+0x41c(0x600347d42e0, 0x0, , 0x60028f9f358, , 0x0)
ufs:ufs_trans_itrunc+0x19c(0x600347d42e0, 0x0, 0x0, 0x60028f9f358, , 0x1)
ufs:ufs_freesp+0x150(0x600347d0600, 0x2a102e2da88, 0x2003, 0x60028f9f358)
ufs:ufs_space+0x70(0x600347d0600, 0xb, 0x2a102e2da88, 0x2003, 0x0)
genunix:fop_space+0x28(0x600347d0600, 0xb, 0x2a102e2da88, 0x2003, 0x0, 0x60028f9f358, 0x0)
genunix:fcntl+0x9c8(, , 0x2003?)
unix:syscall_trap+0xac()
-- switch to user thread's user stack --

  found in %i0 (sp:0x2a10054f7a0, pc:ufs:ufs_putpages+0x90)
==== kernel thread: 0x3000133d3a0 PID: 3 ====
cmd: fsflush
t_wchan: 0x600347d43c0 sobj: reader/writer lock
  WARNING: rwlock not locked
t_procp: 0x60021a7f848(proc_fsflush)
  p_as: 0x183a4d0(kas)
  zone: global
t_stk: 0x2a10054fae0 sp: 0x2a10054ee41 t_stkbase: 0x2a10054a000
t_pri: 60(SYS) pctcpu: 0.000000
t_lwp: 0x60021a871d8 machpcb: 0x2a10054fae0
  mstate: LMS_SLEEP ms_prev: LMS_SYSTEM
  ms_state_start: 2 days 11 hours 37 minutes 58.0468787 seconds earlier
  ms_start: 2 days 12 hours 29 minutes 41.4590920 seconds earlier
psrset: 0 last CPU: 1
idle: 21467263 ticks (2 days 11 hours 37 minutes 52.63 seconds)
start: Fri Mar 20 21:09:55 2009
age: 217766 seconds (2 days 12 hours 29 minutes 26 seconds)
tstate: TS_SLEEP - awaiting an event
tflg:   T_DFLTSTK - stack is default size
tpflg:  TP_MSACCT - collect micro-state accounting information
tsched: TS_LOAD - thread is in memory
        TS_DONT_SWAP - thread/LWP should not be swapped
        TS_SIGNALLED - thread was awakened by cv_signal()
pflag:  SSYS - system resident process
        SNOWAIT - children never become zombies

pc: genunix:turnstile_block+0x600: call unix:swtch

genunix:turnstile_block+0x600(0x0, 0x1, 0x600347d43c0, 0x1815a60, 0x0, 0x0)
unix:rw_enter_sleep+0x19c(0x600347d43c0?)
unix:rw_enter(0x600347d43c0, 0x1) - frame recycled
ufs:ufs_putpages+0x90(0x600347d0600)
ufs:ufs_putpage(0x600347d0600, 0xc6000, 0x2000, 0x400, 0x60021803e48) - frame recycled
genunix:fop_putpage+0x1c(0x600347d0600, 0xc6000, 0x2000, 0x400, 0x60021803e48)
genunix:fsflush_do_pages+0x364(, , , , 0x186d000, 0x2cb5)
genunix:fsflush+0x3e0(0x0, 0x0)
unix:thread_start+0x4()
-- end of kernel thread's stack --

I am reassigning this to Bhaskar as he is the ufs expert.

Thanks

--Brett

On Tue, Mar 24, 2009 at 8:35 AM, Michael Schuster <Michael.Schuster at sun.com> wrote:

> On 03/24/09 08:29, Brett Monroe wrote:
>
>> Bill,
>>
>> Interesting. We don't use NIS or automounter in our enterprise. I ended
>> up forcing a panic and sent the core file up to Sun for analysis. The
>> support engineer said a preliminary examination of the dump seemed to
>> indicate that we were hitting a bug that should have been fixed in the
>> kernel rev prior to the one we are running. They are sending it to
>> back-line (PTS I assume) for a more thorough investigation.
>>
>
> in this case, please make sure that the engineer is aware of this
> discussion - it'd be a waste of everybody's time if they missed out on
> relevant information.
>
>
> Michael
> --
> Michael Schuster http://blogs.sun.com/recursion
> Recursion, n.: see 'Recursion'
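
For anyone who wants to repeat the "same vnode" check on saved stack text without the dump at hand, here is a minimal sketch in Python. It only mimics the idea of the `findarg` search shown in the analysis; it is not how SolarisCAT does it. It assumes one frame per line in the `module:function+offset(args)` form seen in the output, and the name `frames_with_arg` is made up for illustration. The sample frames are copied from the two stacks above.

```python
import re

# One stack frame, e.g. "ufs:ufs_putpages+0x90(0x600347d0600)".
# This layout is assumed from the pasted dump, not from SolarisCAT docs.
FRAME_RE = re.compile(
    r"^\s*(?P<func>[\w.]+:[\w.]+)(?:\+0x[0-9a-f]+)?\((?P<args>[^)]*)\)"
)

def frames_with_arg(stack_text, address):
    """Return the functions whose printed arguments include `address`."""
    target = int(address, 16)
    hits = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line)
        if not match:
            continue
        for raw in match.group("args").split(","):
            # Arguments may be blank or carry a trailing "?" in the dump.
            token = raw.strip().rstrip("?")
            if token.startswith("0x") and int(token, 16) == target:
                hits.append(match.group("func"))
                break
    return hits

# Frames copied from the db2sysc thread (PID 5882).
db2sysc_stack = """\
genunix:pvn_vplist_dirty+0x2f8(0x600347d0600, 0x0, 0x12a0e50, 0x110000)
ufs:ufs_itrunc+0x41c(0x600347d42e0, 0x0, , 0x60028f9f358, , 0x0)
ufs:ufs_freesp+0x150(0x600347d0600, 0x2a102e2da88, 0x2003, 0x60028f9f358)
"""

# Frames copied from the fsflush thread (PID 3).
fsflush_stack = """\
unix:rw_enter(0x600347d43c0, 0x1) - frame recycled
ufs:ufs_putpages+0x90(0x600347d0600)
genunix:fop_putpage+0x1c(0x600347d0600, 0xc6000, 0x2000, 0x400, 0x60021803e48)
"""

vnode = "0x600347d0600"
print(frames_with_arg(db2sysc_stack, vnode))
print(frames_with_arg(fsflush_stack, vnode))
```

Both calls return a non-empty list, which matches the observation that each thread has 0x600347d0600 in its stack. Keep in mind that SPARC stack arguments can be stale (note the "frame recycled" and "?" markers), so a hit is a hint rather than proof.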