Jim,
I'll run "iostat -zxnd 1 20" tomorrow when the application starts. The application recovered by itself after about 40 seconds, once the disk I/O dropped.

Thanks,

James Yang
Global Unix Support, IES, GTO
Deutsche Bank US
Phone: 201-593-1360
Email: jianhua.y...@db.com
Pager: 1-800-946-4646 PIN# 6105618
CR: NYC_UNIX_ES_US_UNIX_SUPPORT
http://dcsupport.ies.gto.intranet.db.com/

---
This communication may contain confidential and/or privileged information. If you are not the intended recipient (or have received this communication in error) please notify the sender immediately and destroy this communication. Any unauthorized copying, disclosure or distribution of the material in this communication is strictly forbidden. Deutsche Bank does not render legal or tax advice, and the information contained in this communication should not be regarded as such.


Jim Mauro <james.ma...@sun.com> wrote on 12/17/08 07:02 PM:
To: Jianhua Yang/db/db...@dbamericas
Cc: dtrace-discuss@opensolaris.org
Subject: Re: [dtrace-discuss] disk utilization is over 200%

This is all very odd. iostat is historically extremely reliable. I've never observed stats like that before: zero reads and writes with a non-zero value in the wait queue (forget utilization when it comes to disk; it's a useless metric).

I/O rates per process are best measured at the VOP layer. Depending on which version of Solaris you're running, you can use the fsinfo provider (fsinfo::fop_read:entry, fsinfo::fop_write:entry). If you don't have the fsinfo provider, instrument the syscall layer to track reads and writes.

Can we get another sample, using "iostat -zxnd 1 20"? Does the application recover from the hang, or does it remain hung and require a kill/restart?
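A minimal sketch of the fsinfo approach, assuming the fsinfo provider is present on this Solaris release (its read/write probes fire at the VOP layer with args[0] as a fileinfo_t and arg1 as the byte count; verify the probe names with "dtrace -l -P fsinfo" on the target box):

  # dtrace -n 'fsinfo:::read, fsinfo:::write
      { @bytes[execname, probename, args[0]->fi_pathname] = sum(arg1); }
      tick-10sec { exit(0); }'

If fsinfo is not available, a syscall-layer fallback can sum the bytes returned by read(2)/write(2) instead:

  # dtrace -n 'syscall::read:return, syscall::write:return
      /arg0 > 0/ { @bytes[execname, probefunc] = sum(arg0); }
      tick-10sec { exit(0); }'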
Thanks,
/jim

Jianhua Yang wrote:
>
> Hello,
>
> I use Brendan's sysperfstat script to see the overall system
> performance and found that the disk utilization is over 100:
>
>            ------ Utilisation ------    ------ Saturation ------
>  Time      %CPU   %Mem  *%Disk*   %Net   CPU   Mem  *Disk*   Net
>  15:51:38  14.52  15.01  200.00   24.42  0.00  0.00   83.53  0.00
>  15:51:42  11.37  15.01  200.00   25.48  0.00  0.00   88.43  0.00
>  15:51:45  11.01  15.01 *200.00*  12.02  0.00  0.00  *95.03* 0.00
>  15:51:48  13.80  15.01 *200.00*  24.87  0.00  0.00  *98.86* 0.00
>  15:51:51   9.44  15.01 *200.00*  17.02  0.00  0.00 *102.64* 0.00
>  15:51:54   9.49  15.01 *164.59*   9.10  0.00  0.00  *83.75* 0.00
>  15:51:57  16.58  15.01   *2.83*  20.46  0.00  0.00    0.00  0.00
>
> how can I fix this? is there a new version of this script?
>
> my system is an X4600-M1 with hardware RAID of:
> 0+1 = OS disk = 72 GB = d0
> 2+3 = apps data disk = 146 GB = d2, SVM soft partition with one UFS
> file system active
>
> at that time, iostat showed strange output:
>      cpu
>  us sy wt id
>  13  9  0 78
>                     extended device statistics
>  r/s  w/s  kr/s  kw/s  wait   actv  wsvc_t  asvc_t  %w  %b  device
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d30
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d40
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d52
>  0.0  0.0   0.0   0.0 334.0    1.0     0.0     0.0 100 100  c3t2d0
>      cpu
>  us sy wt id
>  10  5  0 85
>                     extended device statistics
>  r/s  w/s  kr/s  kw/s  wait   actv  wsvc_t  asvc_t  %w  %b  device
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d30
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d40
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d52
>  0.0  0.0   0.0   0.0 334.0    1.0     0.0     0.0 100 100  c3t2d0
>
> kr/s & kw/s show 0, but wait is 334,
>
> at this point, the application always hangs.
>
> # dtrace -n 'io:::start { @files[pid, execname, args[2]->fi_pathname]
>     = sum(args[0]->b_bcount); } tick-5sec { exit(0); }'
> dtrace: description 'io:::start ' matched 7 probes
> CPU     ID                    FUNCTION:NAME
>   8  49675                       :tick-5sec
>
>  16189  nTrade    /export/data/dbxpt3/logs/ledgers/arinapt3.NTRPT3-MOCA.trans_outmsg.ledger    32768
>  25456  pt_chmod  /export/data/dbxpt3/logs/NTRPT3-MOCA.log    32768
>      3  fsflush   <none>                                      38912
>  25418  pt_chmod  /export/data/dbxpt3/logs/NTRPT3-MOCA.log    49152
>  21372  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log    65536
>  16189  nTrade    /export/data/dbxpt3/logs/ledgers/arinapt3.NTRPT3-MOCA.trans_exerep.ledger    81920
>  16189  nTrade    /export/data/dbxpt3/logs/ntrade.imbalances.log  114688
>  25419  iostat    /export/data/dbxpt3/logs/NTRPT3-MOCA.log   114688
>   8018  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log   131072
>  24915  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log   147456
>  16189  nTrade    <none>                                     207872
>  20900  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log   270336
>      0  sched     <none>                                     782336
>  16189  nTrade    /export/data/dbxpt3/logs/NTRPT3-MOCA.log  2162688
>
> the write rate is about 10 MB/s; did the above dtrace one-liner show the real
> I/O going on at that time?
> is there a way to find how much I/O is generated by each process, and how many
> I/Os are in the I/O wait queue?
> is there a way to find out the disk RPM besides checking the physical
> drive?
>
> Thanks,
>
> James Yang
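One rough way to approach the wait-queue question above is to count io:::start events that have not yet seen a matching io:::done; a sketch, assuming the io provider is available, with the caveat that pending is a simple global variable and therefore only approximate on a busy multi-CPU system:

  # dtrace -n 'io:::start { pending++; @maxq["max outstanding I/Os"] = max(pending); }
      io:::done /pending > 0/ { pending--; }
      tick-1sec { printf("outstanding: %d\n", pending); }'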
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss@opensolaris.org