Jim,
I'll run "iostat -zxnd 1 20" tomorrow when the application starts. The application recovered by itself after about 40 seconds, once the disk I/O dropped.

Thanks,

James Yang
Global Unix Support, IES, GTO
Deutsche Bank US
Phone: 201-593-1360
Email: jianhua.y...@db.com
Pager: 1-800-946-4646 PIN# 6105618
CR: NYC_UNIX_ES_US_UNIX_SUPPORT
http://dcsupport.ies.gto.intranet.db.com/

---
This communication may contain confidential and/or privileged information. If you are not the intended recipient (or have received this communication in error) please notify the sender immediately and destroy this communication. Any unauthorized copying, disclosure or distribution of the material in this communication is strictly forbidden. Deutsche Bank does not render legal or tax advice, and the information contained in this communication should not be regarded as such.


Jim Mauro <james.ma...@sun.com> wrote on 12/17/08 07:02 PM:
To: Jianhua Yang/db/db...@dbamericas
Cc: dtrace-discuss@opensolaris.org
Subject: Re: [dtrace-discuss] disk utilization is over 200%

This is all very odd. iostat is historically extremely reliable. I've never observed stats like that before: zero reads and writes with a non-zero value in the wait queue (forget utilization when it comes to disk; it's a useless metric).

I/O rates per process are best measured at the VOP layer. Depending on which version of Solaris you're running, you can use the fsinfo provider (fsinfo::fop_read:entry, fsinfo::fop_write:entry). If you don't have the fsinfo provider, instrument the syscall layer to track reads and writes.

Can we get another sample, using "iostat -zxnd 1 20"? Does the application recover from the hang, or does it remain hung and require a kill/restart?
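A minimal sketch of the fsinfo approach, assuming the fsinfo provider is present on this Solaris release (its read/write probes fire at the VOP layer with args[0] as a fileinfo_t and arg1 as the byte count; verify the probe names with "dtrace -l -P fsinfo" on the target box):

  # dtrace -n 'fsinfo:::read, fsinfo:::write
      { @bytes[execname, probename, args[0]->fi_pathname] = sum(arg1); }
      tick-10sec { exit(0); }'

If fsinfo is not available, a syscall-layer fallback can sum the bytes returned by read(2)/write(2) instead:

  # dtrace -n 'syscall::read:return, syscall::write:return
      /arg0 > 0/ { @bytes[execname, probefunc] = sum(arg0); }
      tick-10sec { exit(0); }'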
Thanks,
/jim

Jianhua Yang wrote:
>
> Hello,
>
> I use Brendan's sysperfstat script to see the overall system
> performance and found that the disk utilization is over 100:
>
>            ------ Utilisation ------    ------ Saturation ------
>  Time      %CPU   %Mem  *%Disk*   %Net   CPU   Mem  *Disk*   Net
>  15:51:38  14.52  15.01  200.00   24.42  0.00  0.00   83.53  0.00
>  15:51:42  11.37  15.01  200.00   25.48  0.00  0.00   88.43  0.00
>  15:51:45  11.01  15.01 *200.00*  12.02  0.00  0.00  *95.03* 0.00
>  15:51:48  13.80  15.01 *200.00*  24.87  0.00  0.00  *98.86* 0.00
>  15:51:51   9.44  15.01 *200.00*  17.02  0.00  0.00 *102.64* 0.00
>  15:51:54   9.49  15.01 *164.59*   9.10  0.00  0.00  *83.75* 0.00
>  15:51:57  16.58  15.01   *2.83*  20.46  0.00  0.00    0.00  0.00
>
> how can I fix this? is there a new version of this script?
>
> my system is an X4600-M1 with hardware RAID of:
> 0+1 = OS disk = 72 GB = d0
> 2+3 = apps data disk = 146 GB = d2, SVM soft partition with one UFS
> file system active
>
> at that time, iostat showed strange output:
>      cpu
>  us sy wt id
>  13  9  0 78
>                     extended device statistics
>  r/s  w/s  kr/s  kw/s  wait   actv  wsvc_t  asvc_t  %w  %b  device
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d30
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d40
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d52
>  0.0  0.0   0.0   0.0 334.0    1.0     0.0     0.0 100 100  c3t2d0
>      cpu
>  us sy wt id
>  10  5  0 85
>                     extended device statistics
>  r/s  w/s  kr/s  kw/s  wait   actv  wsvc_t  asvc_t  %w  %b  device
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d30
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d40
>  0.0  0.0   0.0   0.0   0.0  335.0     0.0     0.0   0 100  md/d52
>  0.0  0.0   0.0   0.0 334.0    1.0     0.0     0.0 100 100  c3t2d0
>
> kr/s & kw/s show 0, but wait is 334,
>
> at this point, the application always hangs.
>
> # dtrace -n 'io:::start { @files[pid, execname, args[2]->fi_pathname]
>     = sum(args[0]->b_bcount); } tick-5sec { exit(0); }'
> dtrace: description 'io:::start ' matched 7 probes
> CPU     ID                    FUNCTION:NAME
>   8  49675                       :tick-5sec
>
>  16189  nTrade    /export/data/dbxpt3/logs/ledgers/arinapt3.NTRPT3-MOCA.trans_outmsg.ledger    32768
>  25456  pt_chmod  /export/data/dbxpt3/logs/NTRPT3-MOCA.log    32768
>      3  fsflush   <none>                                      38912
>  25418  pt_chmod  /export/data/dbxpt3/logs/NTRPT3-MOCA.log    49152
>  21372  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log    65536
>  16189  nTrade    /export/data/dbxpt3/logs/ledgers/arinapt3.NTRPT3-MOCA.trans_exerep.ledger    81920
>  16189  nTrade    /export/data/dbxpt3/logs/ntrade.imbalances.log  114688
>  25419  iostat    /export/data/dbxpt3/logs/NTRPT3-MOCA.log   114688
>   8018  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log   131072
>  24915  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log   147456
>  16189  nTrade    <none>                                     207872
>  20900  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log   270336
>      0  sched     <none>                                     782336
>  16189  nTrade    /export/data/dbxpt3/logs/NTRPT3-MOCA.log  2162688
>
> the write rate is about 10 MB/s; did the above dtrace one-liner show the real
> I/O going on at that time?
> is there a way to find how much I/O is generated by each process, and how many
> I/Os are in the I/O wait queue?
> is there a way to find out the disk RPM besides checking the physical
> drive?
>
> Thanks,
>
> James Yang
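One rough way to approach the wait-queue question above is to count io:::start events that have not yet seen a matching io:::done; a sketch, assuming the io provider is available, with the caveat that pending is a simple global variable and therefore only approximate on a busy multi-CPU system:

  # dtrace -n 'io:::start { pending++; @maxq["max outstanding I/Os"] = max(pending); }
      io:::done /pending > 0/ { pending--; }
      tick-1sec { printf("outstanding: %d\n", pending); }'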
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss@opensolaris.org