Mike,

Thanks for the reply.

Well, the problem I am actually dealing with cannot be fought with AWR,
since I am dealing with problems on an Oracle standby server (a Sun
T2000) with 6 standbys running on it.
The problem I am faced with is that 4 of the 6 standbys apply logs at
reasonable rates, while on the 5th or 6th standby the apply time goes
from a quick 3 minutes to upwards of 60 minutes per log.
Now, it is possible that the transactions in the redo log are so
repetitive (deletes to the same blocks, etc.) that the apply rate is
shriekingly bad.

We are using UFS with the forcedirectio mount option, we probably do
not have enough RAM to do what we want, and the T2000 has a very weird
multi-threading and virtual CPU model.

So the question I posed, I guess, really wasn't as simple as it sounded.
I am taking a multi-front approach:
1) calculating total box I/Os per second;
2) calculating total bandwidth per second (including log file
   transfers, etc.);
3) drilling down to the largest consumers (mainly the ora_p00 parallel
   recovery processes and the MRP processes);
4) calculating their overall combined throughput on the entire box
   (across all 6 standbys);
5) tracking processor switches between the processes (since I fear that
   the T2000 is actually starving certain processes while keeping
   certain processes on a given CPU; lots of reading about the T2000
   shows this is definitely possible).

And others.
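
Items 1 and 2 above can be approximated without DTrace by summing the
per-device lines of `iostat -xn`. The following is only a minimal
sketch: the function name sum_iostat is hypothetical, and it assumes the
usual Solaris column order (r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w
%b device) and that -C is left off so controller totals are not counted
twice.

```shell
# Sum the per-device lines of `iostat -xn` output (read from stdin) into
# box-wide totals, printing one line per sample interval.
# Assumed columns: r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
sum_iostat() {
    awk '
        /extended device statistics/ {     # banner marks a new interval
            if (seen) printf "iops=%.1f kB/s=%.1f\n", iops, kb
            iops = 0; kb = 0; seen = 0; next
        }
        NF == 11 && $1 ~ /^[0-9.]+$/ {     # per-device data line
            iops += $1 + $2                # reads/s + writes/s
            kb   += $3 + $4                # kB read/s + kB written/s
            seen = 1
        }
        END { if (seen) printf "iops=%.1f kB/s=%.1f\n", iops, kb }
    '
}
```

It would be used as `iostat -xn 1 | sum_iostat` (substituting nawk for
awk on Solaris if preferred). The first interval iostat reports is an
average since boot, so it is best ignored.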

This problem isn't proving as simple as I would like, and I guess I am
trying to use DTrace to solve all my problems (yep, looking for that
silver bullet).

Thanks for the help.

 
Brian
 
----------------------
Brian P Michael
Technical Management Consultant
Rolta TUSC, Inc.
micha...@tusc.com
630-960-2909 x1181
http://www.tusc.com
 

-----Original Message-----
From: Mike Gerdts [mailto:mger...@gmail.com] 
Sent: Thursday, December 17, 2009 6:27 PM
To: Michael Brian - IL
Cc: dtrace-discuss@opensolaris.org
Subject: Re: [dtrace-discuss] Looking for help on 2 items...

On Thu, Dec 17, 2009 at 5:51 PM, Michael Brian - IL <micha...@tusc.com>
wrote:
> I am pretty new to DTrace but use the DTrace Toolkit when trying to
> troubleshoot I/O issues on Oracle.
>
> I am looking for help on how to do the following:
> I am trying to answer whether adding more HBA Cards/ports would be 
> effective.
>
> To do this, I need to know the I/Os per second as well as total
> bandwidth per second.
>
> Has anyone done this before?

Sure - and dtrace isn't needed

# iostat -xCn 1 | nawk '$0 ~ /device|c.$/'

> Does anyone have any other ideas on how to attack this problem?

Your DBAs should be able to give you more detailed data, such as how
individual tables and files are performing.  Ask for an AWR report
(e.g. http://users.telenet.be/oraguy.be/awr2.htm).  It could be that
there is just a hot LUN due to multiple hot tables being on the same
file system.  You can see the relative performance of the disks with
iostat (get rid of the nawk at the end).

Adding more paths to storage can sometimes confuse the storage array,
causing it to be less efficient, thereby making your problem worse.
Depending on I/O patterns, striping, array type, etc., you could end up
making it so that the array no longer recognizes sequential reads
(thereby missing out on readahead) or you could end up with more copies
of the most active data in the array's cache while slightly less active
data is evicted.  I've generally found that if you are at the point of
thinking you need more paths you are best off making any LUN available
only to two HBAs.

If you are using Veritas file systems, you should be using ODM.  If you
think you are using ODM, verify that you really are by running odmstat
on some of the database files to be sure you don't see just a bunch of
zeros.  If you are using UFS or NFS, be sure that you are using
directio.  If you are using ASM or raw disks you should be good on this
front.  Typically problems in this area will result in very high %sys
(vmstat) while not a lot of real work is getting done.

> I have been tuning Oracle for quite some time now, and I am
> continually asked to prove what I tend to know naturally, that the
> classic 1 HBA, 2 port card isn't cutting it.
>
> I also have similar discussions on whether I am saturating the BUS on 
> a particular box.

What kind of bus?  What speed are the HBAs?  If you are on an x8 PCIe
slot connected to a dual port 2 Gb HBA, you are going to max out the
HBA while using no more than 25% of the PCIe bandwidth.  On the other
hand, a dual 4 Gb HBA in a PCI or PCI-X slot could certainly be
problematic.  You may want to take a look at busstat if you feel like
you are likely overwhelming a bus.
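
The 25% figure can be sanity-checked with back-of-the-envelope
arithmetic.  This is only a sketch under assumptions not stated in the
message: a first-generation PCIe slot (2.5 GT/s per lane with 8b/10b
encoding), one direction only, and the HBA ports taken at their nominal
2 Gbit/s.

```shell
# Rough check of the x8 PCIe vs. dual port 2 Gb HBA comparison.
# Assumptions: PCIe 1.x (2.5 GT/s per lane, 8b/10b encoding), one
# direction only, HBA ports at their nominal 2 Gbit/s line rate.
lanes=8
pcie_gbit=$(( lanes * 25 * 8 / 10 / 10 ))  # 8 lanes * 2.5 GT/s * 0.8
hba_gbit=$(( 2 * 2 ))                      # two ports * 2 Gbit/s
echo "PCIe x${lanes}: ${pcie_gbit} Gbit/s, HBA: ${hba_gbit} Gbit/s"
echo "HBA peak is $(( 100 * hba_gbit / pcie_gbit ))% of the slot"
```

Real Fibre Channel payload rates are somewhat below the nominal figure,
so under these assumptions the true share of the slot would be lower
still.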

Also, if prstat -mL is showing that you have queries where USR + SYS add
up to 100 for extended periods of time, you may have problems with
queries not having fast enough CPUs.  If you see LAT (time waiting to
get on a CPU) at more than a few percent, you could probably benefit
from having more CPUs or faster CPUs.

Hmmm... I forgot to mention dtrace...

One thing I have found with the dtrace toolkit scripts on multi-terabyte
OLTP databases is that the default values in dtrace and/or the scripts
are not large enough to store the amount of data required.  As such
there are more drops than data points, severely limiting the usefulness
of the scripts unless you tune them.

I love dtrace, but to date I haven't found it to be more useful for
database analysis than a lot of the tools that have existed in Solaris
for a decade or so.

--
Mike Gerdts
http://mgerdts.blogspot.com/