On Aug 7, 2009, at 10:39 AM, Stewart Smith wrote:
On Wed, Aug 05, 2009 at 03:13:34PM +1000, Arjen Lentz wrote:
The issue in MySQL has been overhead of such instrumentation,
particularly also when not used. Some cause 5-20% perf loss which is
unacceptable.
110% agree.
If you're not doing analysis of anything, it shouldn't cost you.
You also shouldn't have to restart, rebuild or anything like that.
I think I know how to do this too.
I have this inkling that it's the "if(profiling_enabled)" inserted
everywhere that kills us.
This is pretty easy to check. Say we have some function f() that is
going to do some counting for us (e.g. number of rows fetched, number of
times mutex X was taken). If profiling is disabled, we want this to use
0 CPU.
Calling an empty function int f(int) a billion times in a loop is
roughly equivalent to just running through the loop (yes, I built with
gcc -O0 and checked the produced code). By roughly I do mean next to
impossible to measure.
If you add a simple "if(x) something;" to the function f(), it is
noticeably slower! (roughly 20% in my tests).
So we really don't want to do that compare.
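A rough sketch of how that comparison could be reproduced is below. It is
not the original test program: the names f_empty, f_checked and
time_calls are made up for illustration, and profiling_enabled is a
hypothetical global flag. To match the test described above, build with
gcc -O0 so the calls are not inlined away.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-ins for the server's profiling flag and one counter.
// volatile forces a real load of the flag on every call.
volatile int profiling_enabled = 0;
volatile std::uint64_t events = 0;

// Variant 1: the do-nothing hook.
int f_empty(int x) { return x; }

// Variant 2: the same hook with the "if(profiling_enabled)" compare added.
int f_checked(int x) {
    if (profiling_enabled) events = events + 1;
    return x;
}

// Calls fn(i) for i in [0, iterations), adds the results into sink so the
// calls have a visible effect, and returns the elapsed wall time in seconds.
template <typename Fn>
double time_calls(Fn fn, std::uint64_t iterations, std::uint64_t &sink) {
    auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i)
        sink += static_cast<std::uint64_t>(fn(static_cast<int>(i)));
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Timing time_calls(f_empty, ...) against time_calls(f_checked, ...) with
a large iteration count gives the two numbers to compare. The size of
the gap will depend on the compiler, flags and CPU.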
Now... about this time somebody is going to jump up and suggest using
DTrace to insert code at runtime. It's not on Linux, so it is worse than
useless here.
But we can do some cool self modifying code tricks.
The same do-nothing f() does not take any longer to run if we insert a
few no-ops. (I tried inserting 4 NOP instructions, which are single
byte... I do wonder if the multi-byte NOP instruction could help here
too).
So... when a profile hook is enabled, we just modify f() to call the
real profiling function. This can either be done with an atomic
instruction writing out the appropriate CALL instruction, or we can put
in a small JMP around the NOPs as we fill it out.
And there are a number of tricks to do this pretty easily for all the
possible points to hook in profiling stuff.
Modifying code is an option, but at the same time it is quite a hack.
A major disadvantage is that it has to be done for each type of
hardware supported.
I have another suggestion, which I have found works well for PBXT
(http://pbxt.blogspot.com/2008/12/xtstat-tells-you-exactly-what-pbxt-is.html).
A simple increment is a very cheap operation, as long as it can be
done without requiring a lock.
(And, if you are just doing an increment, then you don't have to
bother with an if(profiling_enabled), you just do the increment all the
time.)
To avoid locking, each thread needs a complete set of tracking
variables (counters) as part of its THD structure.
You also need a list of all THD structures.
The profiling code pays the price for this. In order to get the
current state of all counters it goes through the list of THDs and
accumulates the THD related counters.
But, this is OK, because this price is only paid when you are actually
profiling.
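Here is a minimal sketch of the per-thread counter idea. It is not the
PBXT code: ThreadCounters, CounterRegistry, bump and the two counter
names are hypothetical, and in a real server the counter set would be a
member of THD. The counters are std::atomic with relaxed ordering only
so that the profiler may read them from another thread. The increment
itself is a plain load, add and store, with no lock and no locked
read-modify-write, which is safe because each counter has exactly one
writer.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical counter set; one instance per thread (part of THD).
struct ThreadCounters {
    std::atomic<std::uint64_t> rows_fetched{0};
    std::atomic<std::uint64_t> bytes_written{0};
};

// Single-writer increment: only the owning thread ever calls this for a
// given counter, so a relaxed load followed by a relaxed store is enough.
inline void bump(std::atomic<std::uint64_t> &c, std::uint64_t n = 1) {
    c.store(c.load(std::memory_order_relaxed) + n,
            std::memory_order_relaxed);
}

// Accumulated view handed to the profiling code.
struct Totals {
    std::uint64_t rows_fetched = 0;
    std::uint64_t bytes_written = 0;
};

// The "list of all THD structures". The mutex protects the list only;
// it is never taken on the increment path.
class CounterRegistry {
public:
    void add(ThreadCounters *tc) {
        std::lock_guard<std::mutex> guard(mutex_);
        list_.push_back(tc);
    }

    // Walks every registered thread and sums its counters. This is the
    // price that is paid only while profiling.
    Totals snapshot() {
        std::lock_guard<std::mutex> guard(mutex_);
        Totals t;
        for (const ThreadCounters *tc : list_) {
            t.rows_fetched += tc->rows_fetched.load(std::memory_order_relaxed);
            t.bytes_written += tc->bytes_written.load(std::memory_order_relaxed);
        }
        return t;
    }

private:
    std::mutex mutex_;
    std::vector<ThreadCounters *> list_;
};
```

A worker thread would call bump(my_counters.rows_fetched) on its hot
path. A real implementation would also have to decide what happens to
the counts of a thread that exits, for example by folding them into a
global total before the structure is removed from the list.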
This method not only works for things like "number of bytes written",
but can also be used to measure time. There is a little trick involved
here, but the result is that you can see, for example, if the server
is hanging in a fsync() call in realtime.
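The trick is not spelled out above, so the following is only one
possible way to do it, not necessarily what PBXT does. The thread
records a start timestamp in its own structure before the call and
clears it afterwards. A reader that finds a start timestamp set adds
the time elapsed so far. TimedCounter, begin_op, end_op and read_ns are
hypothetical names. Timestamps are passed in as nanosecond values and
are assumed to be non-zero, because 0 is used to mean "not in the call".

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical per-thread timer slot for one operation, e.g. fsync().
struct TimedCounter {
    std::atomic<std::uint64_t> total_ns{0};    // time spent in finished calls
    std::atomic<std::uint64_t> started_ns{0};  // 0 = not inside the call
};

// Called by the owning thread just before the operation.
inline void begin_op(TimedCounter &t, std::uint64_t now_ns) {
    t.started_ns.store(now_ns, std::memory_order_release);
}

// Called by the owning thread just after the operation.
inline void end_op(TimedCounter &t, std::uint64_t now_ns) {
    std::uint64_t start = t.started_ns.load(std::memory_order_relaxed);
    t.total_ns.store(t.total_ns.load(std::memory_order_relaxed) +
                         (now_ns - start),
                     std::memory_order_relaxed);
    t.started_ns.store(0, std::memory_order_release);
}

// Called by the profiler. Includes the time of a call that is still in
// progress, which is what makes a hanging fsync() visible immediately.
// A read that races with end_op() can briefly count one call twice;
// that is accepted here as good enough for a statistics display.
inline std::uint64_t read_ns(const TimedCounter &t, std::uint64_t now_ns) {
    std::uint64_t start = t.started_ns.load(std::memory_order_acquire);
    std::uint64_t total = t.total_ns.load(std::memory_order_relaxed);
    if (start != 0 && now_ns > start)
        total += now_ns - start;
    return total;
}
```

With this, a thread stuck in the call shows a value that keeps growing
on every sample, instead of a value that only jumps once the call
returns.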
Then we should create a kind of "drizzlestat" program which SELECTs
the current counter values, and displays the statistics in columns.
This is much better than dumping loads of performance schema tables on
a user and saying, the data is there if you need it.
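As a sketch of the display side only: such a program would take a
sample of the accumulated counters at a fixed interval and print the
difference from the previous sample, one row per interval. How the
samples are fetched (the SELECT) is left out. Sample, header_line and
delta_line, and the three column names, are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical snapshot of the accumulated counters at one point in time.
struct Sample {
    std::uint64_t rows_fetched;
    std::uint64_t bytes_written;
    std::uint64_t fsync_ms;
};

// Column titles, right-aligned in 10-character columns.
std::string header_line() {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%10s%10s%10s", "rows", "bytes", "fsync_ms");
    return buf;
}

// One output row: how much each counter grew between two samples.
std::string delta_line(const Sample &prev, const Sample &cur) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%10llu%10llu%10llu",
                  static_cast<unsigned long long>(cur.rows_fetched - prev.rows_fetched),
                  static_cast<unsigned long long>(cur.bytes_written - prev.bytes_written),
                  static_cast<unsigned long long>(cur.fsync_ms - prev.fsync_ms));
    return buf;
}
```

Printing header_line() once and then delta_line(previous, current)
after every sample gives a vmstat-style scrolling view.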
I am also not a believer in gathering statistics on everything (for
example, every semaphore), and letting the user figure out what is
important.
As the developers we need to decide what are the performance critical
parameters, and just provide those statistics. Of course, statistics
can be added later if we see we have missed something. But rather that
than a whole bunch of irrelevant values that make finding a problem
like looking for a needle in a haystack.
--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com