My thoughts after skimming the blueprint online:
1 -- too much focus on counters. Counters are eye candy. So what if
you had 99 of these and 2 of those? When you're talking about
performance, you care about measuring two things: time and tasks.
Both are necessary. To the user of a database, a counter isn't a task.
A query is a task. If you're digging into the internals to
optimize individual queries, you might care about sub-tasks like sorts
and disk I/Os and such, but without knowing the time each execution of
the sub-task took, the knowledge of how many of them happened is
basically useless. It's a distraction.
2 -- following in MySQL's footsteps too much. One example: Slowlog
and global query log are redundant. There needs to be one log, and it
needs to be much richer. Look at the Percona patches to MySQL for
quick wins -- add things like thread/session ID, for example. Add
timing information on tasks within the query. Add information on
whether the query did a sort, what its plan was, etc. That said, the
Percona patches are not a model to copy wholesale -- they are
directionally correct ("add more info") but you need to recognize that
they are an ugly compromise -- they add some can't-live-without
information without perturbing the server much. A greenfield design
should be much more ambitious; it should say not only "there was a
sort" (that's a counter, which I just finished mocking) but how big it
was, and how much time was spent in it. Postgres's log is another
example to look at.
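
To make the "one richer log" idea a bit more concrete, here is a rough
Python sketch of what a single log record might carry. Every name in it
(QueryLogRecord, SubTask, the field names) is made up for illustration;
none of it comes from Drizzle, MySQL, or the Percona patches. The point
is only that each sub-task carries its own time and size, so a sort is
reported as "this big, this long" rather than as a bare count.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubTask:
    """One timed unit of work inside a query (hypothetical structure)."""
    kind: str            # e.g. "sort" or "disk_read"
    seconds: float       # wall-clock time spent in this sub-task
    size_bytes: int = 0  # how big it was, e.g. bytes sorted or read


@dataclass
class QueryLogRecord:
    """One entry in a single unified query log (all fields are assumptions)."""
    session_id: int
    sql: str
    plan: str
    total_seconds: float
    subtasks: List[SubTask] = field(default_factory=list)

    def time_by_kind(self) -> Dict[str, float]:
        """Sum sub-task time per kind, so a sort shows up as time, not a count."""
        totals: Dict[str, float] = {}
        for task in self.subtasks:
            totals[task.kind] = totals.get(task.kind, 0.0) + task.seconds
        return totals

    def unaccounted_seconds(self) -> float:
        """Query time that no recorded sub-task explains."""
        return self.total_seconds - sum(t.seconds for t in self.subtasks)


# Invented sample record, not real server output.
record = QueryLogRecord(
    session_id=42,
    sql="SELECT * FROM t ORDER BY c",
    plan="full scan + filesort",
    total_seconds=2.0,
    subtasks=[
        SubTask("disk_read", 0.5, size_bytes=4096),
        SubTask("sort", 1.25, size_bytes=1048576),
    ],
)
```

With a record like this, record.time_by_kind() tells you where the
query's time went, and unaccounted_seconds() tells you how much of it
the instrumentation failed to explain.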
IMO everything should begin or end with queries, queries, queries.
That is the unit of work that a database server does. Everything else
-- I/O, sorts, blah blah -- should be a drill-down from (or drill-up
to) queries, queries, queries. Any of that stuff that isn't possible
to correlate to specific queries is just a curiosity. Again --
suppose I'm silly and I decide to figure out what I/O my server is
doing so much of -- so what? If I don't know how to blame the I/O
operations on queries, I can't reduce the I/O. (If I can start with
I/O operations and correlate them with queries, it is legitimate to
optimize bottom-up instead of top-down, so "drill-down" shouldn't be
the only approach.)
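
Here is a small Python sketch of the bottom-up direction, assuming each
I/O event can be tagged with the query that caused it. The event format
and the blame_io name are hypothetical, and the numbers are invented.
It also shows why the count alone misleads: the query with the most
I/Os is not the one that spent the most time doing I/O.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def blame_io(events: Iterable[Tuple[str, float]]) -> List[Tuple[str, int, float]]:
    """Group I/O events by query and rank queries by total I/O time.

    Each event is (query_id, seconds). Returns (query_id, io_count,
    total_seconds) tuples, slowest first.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for query_id, seconds in events:
        totals[query_id][0] += 1        # the counter
        totals[query_id][1] += seconds  # the time, which is what matters
    rows = [(q, count, secs) for q, (count, secs) in totals.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Invented sample data: q_users does more I/Os, q_orders costs more time.
events = [
    ("q_users", 0.125),
    ("q_orders", 0.5),
    ("q_users", 0.125),
    ("q_orders", 1.5),
    ("q_users", 0.125),
]
ranked = blame_io(events)
```

Ranking by count would point at q_users; ranking by time points at
q_orders, which is the one worth fixing.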
I will now step aside and gesture towards Cary Millsap instead of
blathering on. Basically, Drizzle has the chance here to support
methodical investigation into performance -- realize that there is a
logical time+tasks approach to doing that, and build to support a
smart user, instead of throwing counters at the user and asking
him/her to guess what they really mean :-)
Thanks
Baron