Michael,

Below is the preliminary analysis I received.  I have been told I will get
another update in a couple of days.  Sadly, I am only dealing with the
first-tier SE, as Sun Support doesn't like to let customers talk to PTS. :-(
For what it's worth, the case ID is 66201589.  It's strange that they think
it's a UFS issue, as that doesn't really explain the symptoms I was seeing.

I believe the following threads are involved as both were accessing the
same vnode at about the same time:

SolarisCAT(vmcore.0/10U)> tlist -l findarg 0x600347d0600
found in %i5 (sp:0x2a102e2d1b0, pc:genunix:pvn_vplist_dirty+0x2f8)
==== user (LWP_SYS) thread: 0x30002d70240  PID: 5882 ====
cmd: db2sysc
t_wchan: 0x7000ee6b944  sobj: condition var (from unix:page_lock_es+0x1d8)
t_procp: 0x600283f70f0
  p_as: 0x60029016d40  size: 1186988032  RSS: 40566784
  hat: 0x30004b8e8c0  cnum: CPU3:33/1371 CPU0:24/2513 CPU1:31/980 CPU2:22/1573
    cpusran: 0,1,2,3
  zone: global
t_stk: 0x2a102e2dae0  sp: 0x2a102e2c851  t_stkbase: 0x2a102e28000
t_pri: 60(TS)  pctcpu: 0.000000
t_lwp: 0x6004641fa10  machpcb: 0x2a102e2dae0
  mstate: LMS_SLEEP  ms_prev: LMS_SYSTEM
  ms_state_start: 2 days 11 hours 37 minutes 58.9535277 seconds earlier
  ms_start: 2 days 12 hours 19 minutes 22.0699800 seconds earlier
psrset: 0  last CPU: 0
idle: 21467354 ticks (2 days 11 hours 37 minutes 53.54 seconds)
start: Fri Mar 20 21:20:15 2009
age: 217146 seconds (2 days 12 hours 19 minutes 6 seconds)
syscall: #62 fcntl(, 0xffffffff7fffcc61) (sysent: genunix:fcntl+0x0)
tstate: TS_SLEEP - awaiting an event
tflg:   T_DONTBLOCK - for lockfs
        T_DFLTSTK - stack is default size
tpflg:  TP_TWAIT - wait to be freed by lwp_wait
        TP_MSACCT - collect micro-state accounting information
tsched: TS_LOAD - thread is in memory
        TS_DONT_SWAP - thread/LWP should not be swapped
pflag:  SJCTL - SIGCLD sent when children stop/continue
        SMSACCT - process is keeping micro-state accounting
        SMSFORK - child inherits micro-state accounting

pc:      genunix:cv_wait+0x38:   call   unix:swtch

genunix:cv_wait+0x38(, 0x18495e8, 0x1, 0x80000001, 0x80000000)
unix:page_lock_es+0x1d8(, , 0x1850688, 0x1)
unix:page_lock(0x7000ee6b900, 0x1, 0x1850688, 0x1) - frame recycled
genunix:pvn_vplist_dirty+0x2f8(0x600347d0600, 0x0, 0x12a0e50, 0x110000)
ufs:ufs_itrunc+0x41c(0x600347d42e0, 0x0, , 0x60028f9f358, , 0x0)
ufs:ufs_trans_itrunc+0x19c(0x600347d42e0, 0x0, 0x0, 0x60028f9f358, , 0x1)
ufs:ufs_freesp+0x150(0x600347d0600, 0x2a102e2da88, 0x2003, 0x60028f9f358)
ufs:ufs_space+0x70(0x600347d0600, 0xb, 0x2a102e2da88, 0x2003, 0x0)
genunix:fop_space+0x28(0x600347d0600, 0xb, 0x2a102e2da88, 0x2003, 0x0, 0x60028f9f358, 0x0)
genunix:fcntl+0x9c8(, , 0x2003?)
unix:syscall_trap+0xac()
-- switch to user thread's user stack --

found in %i0 (sp:0x2a10054f7a0, pc:ufs:ufs_putpages+0x90)
==== kernel thread: 0x3000133d3a0  PID: 3 ====
cmd: fsflush
t_wchan: 0x600347d43c0  sobj: reader/writer lock  WARNING: rwlock not locked
t_procp: 0x60021a7f848(proc_fsflush)
  p_as: 0x183a4d0(kas)
  zone: global
t_stk: 0x2a10054fae0  sp: 0x2a10054ee41  t_stkbase: 0x2a10054a000
t_pri: 60(SYS)  pctcpu: 0.000000
t_lwp: 0x60021a871d8  machpcb: 0x2a10054fae0
  mstate: LMS_SLEEP  ms_prev: LMS_SYSTEM
  ms_state_start: 2 days 11 hours 37 minutes 58.0468787 seconds earlier
  ms_start: 2 days 12 hours 29 minutes 41.4590920 seconds earlier
psrset: 0  last CPU: 1
idle: 21467263 ticks (2 days 11 hours 37 minutes 52.63 seconds)
start: Fri Mar 20 21:09:55 2009
age: 217766 seconds (2 days 12 hours 29 minutes 26 seconds)
tstate: TS_SLEEP - awaiting an event
tflg:   T_DFLTSTK - stack is default size
tpflg:  TP_MSACCT - collect micro-state accounting information
tsched: TS_LOAD - thread is in memory
        TS_DONT_SWAP - thread/LWP should not be swapped
        TS_SIGNALLED - thread was awakened by cv_signal()
pflag:  SSYS - system resident process
        SNOWAIT - children never become zombies

pc:      genunix:turnstile_block+0x600:   call  unix:swtch

genunix:turnstile_block+0x600(0x0, 0x1, 0x600347d43c0, 0x1815a60, 0x0, 0x0)
unix:rw_enter_sleep+0x19c(0x600347d43c0?)
unix:rw_enter(0x600347d43c0, 0x1) - frame recycled
ufs:ufs_putpages+0x90(0x600347d0600)
ufs:ufs_putpage(0x600347d0600, 0xc6000, 0x2000, 0x400, 0x60021803e48) - frame recycled
genunix:fop_putpage+0x1c(0x600347d0600, 0xc6000, 0x2000, 0x400, 0x60021803e48)
genunix:fsflush_do_pages+0x364(, , , , 0x186d000, 0x2cb5)
genunix:fsflush+0x3e0(0x0, 0x0)
unix:thread_start+0x4()
-- end of kernel thread's stack --

I am reassigning this to Bhaskar as he is the ufs expert.

Thanks
--Brett

On Tue, Mar 24, 2009 at 8:35 AM, Michael Schuster
<Michael.Schuster at sun.com> wrote:

> On 03/24/09 08:29, Brett Monroe wrote:
>
>> Bill,
>>
>> Interesting.  We don't use NIS or automounter in our enterprise.  I ended
>> up forcing a panic and sent the core file up to Sun for analysis. The
>> support engineer said a preliminary examination of the dump seemed to
>> indicate that we were hitting a bug that should have been fixed in the
>> kernel rev prior to the one we are running.  They are sending it to
>> back-line (PTS I assume) for a more thorough investigation.
>>
>
> In this case, please make sure that the engineer is aware of this
> discussion - it'd be a waste of everybody's time if they missed out on
> relevant information.
>
>
> Michael
> --
> Michael Schuster        http://blogs.sun.com/recursion
> Recursion, n.: see 'Recursion'
>