Re: [dtrace-discuss] How to dig deeper

2008-12-08 Thread Hans-Peter
The buffer cache was already huge. So I decided not to increase it. There is a KEEP pool of 5G which is hardly used. If needed I will sacrifice this cache and add it to the DEFAULT cache. Until now it looks promising. The average log file sync wait time has dropped from about 70ms to 7ms. But we

Re: [dtrace-discuss] How to dig deeper

2008-12-05 Thread przemolicc
On Fri, Dec 05, 2008 at 05:40:19AM -0800, Hans-Peter wrote: "Ok thanks a lot so far. We have planned downtime for remounting the filesystems this weekend. We will see what happens." You can remount it online: mount -o remount,forcedirectio /mount point mount -o remount,noforcedirectio /mount

Re: [dtrace-discuss] How to dig deeper

2008-12-04 Thread Jim Mauro
Do you have directio enabled on UFS? Especially for the redo logs? With directio enabled, UFS writes to the log do not serialize on the RW lock for the log file(s). directio will also bypass the memory cache, so you need to increase the Oracle db_block_buffers when enabling UFS directio.
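
One way to answer the question above from the mount table is sketched here. It is only an illustration under stated assumptions: the function name `directio_mounts` is made up, and the filter assumes the Solaris `mount` output layout of `mountpoint on device options on date`, with options separated by slashes. It reports mount-time `forcedirectio` only, not directio requested by the application.

```shell
# Hypothetical helper: list mount points that carry the forcedirectio option.
# Assumes Solaris-style `mount` output on stdin:
#   <mountpoint> on <device> <opt>/<opt>/... on <date>
# On Solaris, use nawk or /usr/xpg4/bin/awk if the default awk rejects this.
directio_mounts() {
    # Match forcedirectio only as a whole option, so that
    # "noforcedirectio" is not reported as a hit.
    awk '$2 == "on" && $4 ~ /(^|\/)forcedirectio(\/|$)/ { print $1 }'
}

# Usage (on the database host):
#   mount | directio_mounts
```

An empty result would suggest that none of the listed file systems were mounted with forcedirectio.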

Re: [dtrace-discuss] How to dig deeper

2008-12-04 Thread przemolicc
On Thu, Dec 04, 2008 at 04:55:04AM -0800, Hans-Peter wrote: Do you have directio enabled on UFS? Especially for the redo logs? That is the strange thing. filesystemio_options has been set to setall which is asynch and directio. But when I dtrace the fbt calls I see only directio calls

Re: [dtrace-discuss] How to dig deeper

2008-12-04 Thread David Miller
Hi Hans-Peter, There is an Oracle bug that is not fixed until 10.2.0.4 (if I remember right) where filesystemio_options is broken and so it won't do the correct thing with directio and ufs. The locks you are seeing are classic single-writer-lock issues and indicate that you aren't getting

Re: [dtrace-discuss] How to dig deeper

2008-12-04 Thread Jim Mauro
For the record, my friend Phil Harman reminded me that it's not the log files we care about for directio in terms of single-writer lock break-up. We care about directio for redo logs to avoid read-modify-write, which happens when the write is not memory-page aligned. Sorry about that.

Re: [dtrace-discuss] How to dig deeper

2008-12-03 Thread Jim Mauro
%busy is meaningless unless you're looking at a single disk that can only have 1 outstanding IO in its active queue, which is to say %busy is a useless metric for any disk that's been designed and built in the last decade. Ignore %busy. Focus on queue depths and queue service times, both of

Re: [dtrace-discuss] How to dig deeper

2008-12-03 Thread Zhu, Lejun
-discuss@opensolaris.org Subject: Re: [dtrace-discuss] How to dig deeper Hello all, I added a clause to my script. sysinfo::: /self->traceme == 1 && pid == $1/ { trace(execname); printf("sysinfo: timestamp : %d ", timestamp); } A subsequent trace created a file of about 19000 lines. I

Re: [dtrace-discuss] How to dig deeper

2008-12-03 Thread Jim Mauro
The sysinfo provider isn't the best choice for measuring disk IO times. Run: # dtrace -s /usr/demo/dtrace/iotime.d /jim Hans-Peter wrote: Hello all, I added a clause to my script. sysinfo::: /self->traceme == 1 && pid == $1/ { trace(execname); printf("sysinfo: timestamp : %d

Re: [dtrace-discuss] How to dig deeper

2008-12-03 Thread Jim Mauro
Also - since this is Oracle, are the Oracle files on a file system, or raw devices? If a file system, which one? /jim Jim Mauro wrote: The sysinfo provider isn't the best choice for measuring disk IO times. Run: # dtrace -s /usr/demo/dtrace/iotime.d /jim Hans-Peter wrote: Hello

Re: [dtrace-discuss] How to dig deeper

2008-12-03 Thread Hans-Peter
: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hans-Peter Sent: 2008-12-03 21:41 To: dtrace-discuss@opensolaris.org Subject: Re: [dtrace-discuss] How to dig deeper Hello all, I added a clause to my script. sysinfo::: /self->traceme == 1 && pid == $1/ { trace(execname

Re: [dtrace-discuss] How to dig deeper

2008-12-03 Thread Hans-Peter
Hi Mauro, Yes, I understand why sysinfo is not the best to measure IO. But I just wanted to see when in the whole trace the actual physical write was being done. So it seems to me that, because the sysinfo:::pwrite is right at the end, the performance bottleneck is more because of the locking

Re: [dtrace-discuss] How to dig deeper

2008-12-02 Thread Jim Mauro
Start with iostat. It's simple, and provides an average of service times for disk IOs (iostat -xnz 1, the asvc_t column is average service times in milliseconds). As Jim Litchfield pointed out in a previous thread, keep in mind it is an average, so you won't see nasty peaks, but if the average is
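
The asvc_t check described above can be sketched as a small filter over iostat output. This is a rough illustration, not part of the original advice: the function name `slow_disks` and the 20 ms threshold in the usage line are arbitrary assumptions, and the column positions assume the Solaris `iostat -xn` layout (`r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device`).

```shell
# Hypothetical helper: print devices whose average service time exceeds
# a threshold given in milliseconds.
# Assumes Solaris `iostat -xn` columns: asvc_t is field 8, device is field 11.
# On Solaris, use nawk or /usr/xpg4/bin/awk if the default awk lacks -v.
slow_disks() {
    threshold_ms="$1"
    awk -v limit="$threshold_ms" '
        # Data rows begin with a numeric r/s field; header rows do not.
        $1 ~ /^[0-9.]+$/ && NF >= 11 && $8 + 0 > limit + 0 {
            printf "%s asvc_t=%s ms\n", $11, $8
        }'
}

# Usage (threshold chosen only as an example):
#   iostat -xnz 1 | slow_disks 20
```

Because asvc_t is itself an average over each sampling interval, this only flags intervals where the average is high; short latency peaks inside an interval remain invisible.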