On Aug 7, 2009, at 3:10 PM, Jay Pipes wrote:
To avoid locking, each thread needs a complete set of tracking
variables (counters) as part of its THD structure.
s/THD/Session
Oops, sorry, how very old school of me! ;)
Also, you must understand that there is no one-to-one thread-to-
Session guarantee.
Because Sessions may be executed in a thread pool, there must be a
way of either:
Yes, true! I did not mention the need to "merge" statistics when a
session is closed.
The current value of the statistics is derived from the global sum,
plus the sum of all running sessions.
a) Merging Session-local stats into the global system variables
structure upon Session destruction or rescheduling via a scheduling
thread. Currently this operation does not acquire a lock around the
global system variables in the Session destructor:
Session::~Session()
{
  ...
  add_to_status(&global_status_var, &status_var);
  ...
}

void add_to_status(STATUS_VAR *to_var, STATUS_VAR *from_var)
{
  ulong *end= (ulong*) ((unsigned char*) to_var +
                        offsetof(STATUS_VAR, last_system_status_var) +
                        sizeof(ulong));
  ulong *to= (ulong*) to_var, *from= (ulong*) from_var;

  while (to != end)
    *(to++)+= *(from++);
}
I don't know if this critical section was deliberately left
unprotected by LOCK_status or not...still looking into this. Also,
MontyT is completely redesigning the system variables system, so the
above "bookmarking" code will not likely look the same in a few weeks.
Either a lock or atomic op would be required here. In fact a spinlock
would probably be the best because the lock is only held for a short
time.
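
A minimal sketch of the spinlock variant, assuming a simplified
counter block in place of STATUS_VAR. The names StatusCounters,
SpinLock and merge_into_global are made up for illustration and are
not taken from the Drizzle source:

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for STATUS_VAR: a flat array of counters.
constexpr std::size_t kNumCounters = 4;

struct StatusCounters {
  std::uint64_t value[kNumCounters] = {};
};

// Minimal test-and-set spinlock. Spinning is only reasonable here because
// the critical section below is a handful of additions.
class SpinLock {
 public:
  void lock() {
    while (flag_.test_and_set(std::memory_order_acquire)) {
      // busy-wait until the holder clears the flag
    }
  }
  void unlock() { flag_.clear(std::memory_order_release); }

 private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

StatusCounters global_counters;
SpinLock global_counters_lock;

// Called when a session is destroyed (or rescheduled): add its local
// counters to the global totals while holding the spinlock.
void merge_into_global(const StatusCounters &session) {
  global_counters_lock.lock();
  for (std::size_t i = 0; i < kNumCounters; ++i)
    global_counters.value[i] += session.value[i];
  global_counters_lock.unlock();
}
```

The lock is taken once per session teardown, not once per counter
update, so the cost stays off the hot path.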
Either way, you incur locking and instruction costs. These costs
have been deemed too high by MySQL engineering for the hundreds
(thousands?) of metrics that the MySQL performance schema monitors
(or is able to monitor). This is likely because the frequency of
certain events in the performance schema is quite high?
I agree that if you have thousands of metrics, this method becomes
too expensive. But I think what is missing is a little thought about
which metrics make sense, and which do not.
The profiling code pays the price for this. In order to get the
current state of all counters it goes through the list of THDs and
accumulates the THD related counters.
But, this is OK, because this price is only paid when you are
actually profiling.
Agreed in principle, yes.
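
Roughly, the read side described above could look like the following
sketch. It assumes a single counter and a registry of running
sessions; SessionStats, StatsRegistry and their members are
hypothetical names, and the relaxed atomics are my assumption about
how to keep the per-session update lock-free:

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <set>

// Hypothetical per-session counter block. Only the owning thread writes it,
// so a relaxed load/store pair suffices and no lock is taken on the hot path.
struct SessionStats {
  std::atomic<std::uint64_t> bytes_written{0};

  void add_bytes(std::uint64_t n) {
    bytes_written.store(bytes_written.load(std::memory_order_relaxed) + n,
                        std::memory_order_relaxed);
  }
};

class StatsRegistry {
 public:
  void session_started(SessionStats *s) {
    std::lock_guard<std::mutex> guard(mutex_);
    running_.insert(s);
  }

  // Fold the finished session into the global sum, then forget it.
  void session_closed(SessionStats *s) {
    std::lock_guard<std::mutex> guard(mutex_);
    merged_bytes_written_ += s->bytes_written.load(std::memory_order_relaxed);
    running_.erase(s);
  }

  // Current value = global sum + sum over all running sessions.
  // This walk is the price paid only when somebody is actually profiling.
  std::uint64_t current_bytes_written() {
    std::lock_guard<std::mutex> guard(mutex_);
    std::uint64_t total = merged_bytes_written_;
    for (const SessionStats *s : running_)
      total += s->bytes_written.load(std::memory_order_relaxed);
    return total;
  }

 private:
  std::mutex mutex_;
  std::set<SessionStats *> running_;
  std::uint64_t merged_bytes_written_ = 0;
};
```

Closing a session moves its contribution from the running set into
the merged sum, so the reported total does not change at that moment.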
This method not only works for things like "number of bytes
written", but can also be used to measure time. There is a little
trick involved here, but the result is that you can see, for
example, if the server is hanging in a fsync() call in realtime.
Then we should create a kind of "drizzlestat" program which SELECTs
the current counter values, and displays the statistics in columns.
Before this is possible, an API into the performance data counters
must be written. I don't want programs willy-nilly accessing
internal kernel and storage engine data without going through a
proper interface...we're trying to move away from that sort of
thing :)
I'm not sure what you mean here, but why not use an information schema
table? It returns one row for each counter. The row has an ID (which
identifies the counter) and a value.
So the performance counters are never written to a table, the current
value of each counter is just returned dynamically when a select is
done on the table.
This is much better than dumping loads of performance schema tables
on a user and saying, the data is there if you need it.
Agreed.
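
To illustrate the "one row per counter, computed at SELECT time"
idea, here is a small sketch. It does not use the real information
schema plugin interface; PerfCounters, CounterRow and select_counters
are hypothetical names, and the two counters are just examples:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory counters maintained by the server.
struct PerfCounters {
  std::uint64_t bytes_written = 0;
  std::uint64_t fsync_calls = 0;
};

// One row of the virtual table: an ID naming the counter, and its value.
struct CounterRow {
  std::string id;
  std::uint64_t value;
};

// Called when the table is selected from. Nothing is stored anywhere:
// the rows are built from the live counter values on each call.
std::vector<CounterRow> select_counters(const PerfCounters &c) {
  return {
      {"bytes_written", c.bytes_written},
      {"fsync_calls", c.fsync_calls},
  };
}
```

A drizzlestat-style tool would then only need to run the SELECT
periodically and print the differences between successive snapshots.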
I am also not a believer in gathering statistics on everything (for
example, every semaphore), and letting the user figure out what is
important.
OK, sure, but what if you don't already know the cause of your
slowdown is a mutex or semaphore and want to find this out?
Yup, good question!
For me this is a matter of whether the tool is created for DBAs/
Consultants or for the developers of the database.
I think such a tool should be useful to DBAs/Consultants and a
valuable _support_ tool for the developers.
So in the case you mention, the main thing is that we notice that the
relevant counter is missing from the statistics.
We would notice this when the transactions per second go down, but
none of the counters we have go up.
Then it is time to pull out other tools to look for the bottleneck
(such as http://mituzas.lt/2009/02/15/poor-mans-contention-profiling).
The funny thing is: the goal will then be to remove this bottleneck,
which means removing the semaphore (at least in the current form),
which will mean removing the statistic.
So in a correctly optimized server you only have statistics on
semaphores that are _not_ bottlenecks. Which means you don't need the
statistics!
So my thinking is that, in the long run, we should only have
statistics on things that tend to come and go as bottlenecks,
depending on your hardware setup etc.
A drizzlestat tool is then a help to developers in the sense that it
quickly enables us to eliminate the mundane reasons for bad
performance. And also to monitor the general performance
characteristics at runtime.
As the developers we need to decide what are the performance
critical parameters, and just provide those statistics. Of course,
statistics can be added later if we see we have missed something.
But rather that than a whole bunch of irrelevant values that make
finding a problem like looking for a needle in a haystack.
Agreed, but see point above...
Marc Alff took an approach that causes almost no overhead if the
performance schema is not *compiled in*. There is an overhead if
the performance schema is compiled in and the DBA is not careful to
specify only those things she is interested in.
One problem with "compiled in" statistics is that they are often not
there when you need them.
I'd love to find a perfect medium between Marc's approach (which
nicely NOOPs the performance schema code behind #define templates
when it is not compiled in) and your discussion above of non-storage
of all data pieces automatically.
Does Marc Alff's approach use the same method that Stewart proposed,
i.e. "if(profiling_enabled)" when compiled in, and NOOPs where possible
to remove this overhead when not required?
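
For reference, the combination I have in mind looks roughly like the
sketch below. This is only an illustration of the general technique;
WITH_PROFILING, PROFILE_ADD and write_block are hypothetical names,
and Marc's actual macros may look quite different:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical build switch: 1 compiles the hooks in, 0 compiles them out.
#ifndef WITH_PROFILING
#define WITH_PROFILING 1
#endif

std::atomic<bool> profiling_enabled{false};  // runtime switch
std::uint64_t bytes_written_counter = 0;     // example counter

#if WITH_PROFILING
// Compiled in: each hook costs one flag test, even when profiling is off.
#define PROFILE_ADD(counter, n)                            \
  do {                                                     \
    if (profiling_enabled.load(std::memory_order_relaxed)) \
      (counter) += (n);                                    \
  } while (0)
#else
// Compiled out: the hook expands to an empty statement, so no overhead.
#define PROFILE_ADD(counter, n) \
  do {                          \
  } while (0)
#endif

void write_block(std::uint64_t nbytes) {
  // ... the actual write would happen here ...
  PROFILE_ADD(bytes_written_counter, nbytes);
}
```

With the hooks compiled in, the statistics are available on demand at
the cost of one branch per hook; with them compiled out, they are not
there when you need them.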
--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com
_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss