G'Day Martin, Folks,

On Fri, Jan 25, 2008 at 08:50:33AM +0000, Martin MC Brown wrote:
[...]
> You should find, however, that the latest version on mysql.bkbits.net  
> of the 6.0 tree has the code. You'll need to use --enable-dtrace  
> during configure to enable it during build.

Thanks for the link, and I can see the 26 probes listed in sql/probes.h.

I'm glad this provider is now public and we can discuss it - it's going
to be of great use, and I'm really looking forward to writing more scripts
for it.  I tested a prototype about six months ago and emailed demos of
scripts and suggestions for probe and argument additions; by the looks of
the current probe list, these demos and suggestions may still be relevant,
so I've attached them below.

Note, the demos that follow are not possible with the probes I see in
the 6.0 tree, since they require arguments other than just the thread id.

The 6.0 tree does look like it has a well thought out set of basic probes,
but not interesting arguments such as query strings and database names.
Adding these arguments shouldn't be that difficult (I still have the
mysql patch from when I tried it earlier), the difficult part is choosing
consistent arguments across probes (something worth discussing here).
Writing DTrace scripts will help confirm that the argument choice is
working.


*** Demos ***

The following are fairly simple but useful.  The scripts are in the tar
file attached.

mysqld_qsnoop.d traces queries live and prints the user, host and query
string.

# ./mysqld_qsnoop.d 
 [EMAIL PROTECTED]          QUERY
 [EMAIL PROTECTED]     select * from months
 [EMAIL PROTECTED]     select * from months where num > 6
 [EMAIL PROTECTED]     select * from months where num > 6 and num < 11
 [EMAIL PROTECTED]     show tables
 [EMAIL PROTECTED]     show databases
 [EMAIL PROTECTED]     select * from words where name > 'z'
^C

In the above example, several simple queries were performed.  This script
can be handy for use when troubleshooting to see what is happening.  For busy
servers the output can scroll very fast, and mysqld_qcount.d may be more
appropriate.

...

mysqld_qcount.d counts the queries that are requested from mysqld. 

# ./mysqld_qcount.d 
Tracing... Hit Ctrl-C to end.
^C
 [EMAIL PROTECTED]          COUNT QUERY
 [EMAIL PROTECTED]         1 select * from months where num > 3
 [EMAIL PROTECTED]         1 select * from months where num > 3 and num < 9
 [EMAIL PROTECTED]         1 select num, name from months
 [EMAIL PROTECTED]         1 show databases
 [EMAIL PROTECTED]         3 select * from months

In the above example output, the "select * from months" query was perfomed 3
times, where the others were all performed once.

...

mysqld_qchit.d traces mysqld query cache activity, and prints a report
of hit and miss counts by query string, and totals.

# ./mysqld_qchit.d 
Tracing... Hit Ctrl-C to end.
^C
   QUERY                                                        RESULT  COUNT
   select * from months                                           miss      1
   select * from months where num > 3                              hit      1
   select * from months where num > 3                             miss      1
   select * from months where num > 3 and num < 9                 miss      1
   select * from words where name > 'z'                           miss      1
   show databases                                                 miss      1
   show tables                                                    miss      1
   select * from months where num > 3 and num < 9                  hit      3
   select * from words where name > 'z'                            hit      4
   select * from months                                            hit      9

Hits     : 17
Misses   : 6
Hit Rate : 73%

...

mysqld_qestat.d traces mysqld query execute time and prints a summary line
for each query.  Query execution is the last stage in processing a query 
(after parsing and planning), and only occurs for queries that miss the
query cache.

# ./mysqld_qestat.d 
 DATABASE         [EMAIL PROTECTED]             ms RET QUERY
 BrendanDatabase  [EMAIL PROTECTED]         0   0 show databases                
  
 BrendanDatabase  [EMAIL PROTECTED]         0   0 show tables                   
  
 BrendanDatabase  [EMAIL PROTECTED]         0   0 select * from months          
  
 BrendanDatabase  [EMAIL PROTECTED]         1   0 select * from months          
  
 BrendanDatabase  [EMAIL PROTECTED]        14   0 select * from words where 
nam...
 BrendanDatabase  [EMAIL PROTECTED]       158   0 select * from words           
  
 BrendanDatabase  [EMAIL PROTECTED]        51   0 select * from words           
  
 BrendanDatabase  [EMAIL PROTECTED]        41   0 select * from words           
  
 BrendanDatabase  [EMAIL PROTECTED]        35   0 select * from words           
  
 BrendanDatabase  [EMAIL PROTECTED]        35   0 select * from words           
  
 BrendanDatabase  [EMAIL PROTECTED]        22  -1 select * from bogus     
^C

In the example output above, the execution time can be seen for each
query execution, in milliseconds.  The RET column shows the return
value for that query, with -1 meaning an error; the "select * from bogus"
query returned an error as that table does not exist.

...

mysqld_qetime.d traces mysqld query execute time and prints distribution
plots of nanosecond times for each query.  Query execution is the last stage
in processing a query (after parsing and planning), and only occurs for
queries that miss the query cache.

# ./mysqld_qetime.d 
Tracing... Hit Ctrl-C to end.
^C
   [EMAIL PROTECTED]      show tables

           value  ------------- Distribution ------------- count    
          131072 |                                         0        
          262144 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1        
          524288 |                                         0        

   [EMAIL PROTECTED]      show databases

           value  ------------- Distribution ------------- count    
          524288 |                                         0        
         1048576 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1        
         2097152 |                                         0        

   [EMAIL PROTECTED]      select * from months

           value  ------------- Distribution ------------- count    
           65536 |                                         0        
          131072 |@@@@@@@@@@@@@@@@@                        9        
          262144 |                                         0        
          524288 |@@@@@@@@@@@@@@@@@@@@@                    11       
         1048576 |                                         0        
         2097152 |                                         0        
         4194304 |                                         0        
         8388608 |@@                                       1        
        16777216 |                                         0        

   [EMAIL PROTECTED]      select * from words

           value  ------------- Distribution ------------- count    
        16777216 |                                         0        
        33554432 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 5        
        67108864 |                                         0        

   [EMAIL PROTECTED]      select * from words where name < 'fish'

           value  ------------- Distribution ------------- count    
         8388608 |                                         0        
        16777216 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    11       
        33554432 |@@@                                      1        
        67108864 |                                         0        

In the above output, the "select * from months" query had 1 execution
take between 8 and 16 milliseconds, 11 executions take between 0.5 and 1.0
milliseconds, and 9 executions take between 131 and 262 microseconds.
Presumably the slow execution (8 to 16 ms) was the first (which DTrace
can confirm), which put the data in memory.

...

Anyhow, I want to write more scripts - but I'll need more arguments
and probes available first (see suggestions below).  Trying to create
useful scripts is a good exercise to generate ideas for what else
this provider needs to support; which is similar in idea to what Mike
and Jon suggested - writing a real world case study would also generate
such ideas.


*** Suggestions ***

here are my suggestions:

1) Probes

There are many probes to pick from, which is good, but some are missing -
such as probes for the query cache (I have added query cache probes in
my patch).  A quick way to check what might be useful, is to search for
the DBUG_PRINT calls in the code - if it is instrumented for debugging,
it may be useful to instrument it for DTrace.

I'd suggest probes to observe:

- query cache hits / misses
- connection logins (successful and failed)
- database startup / shutdown
- sorts
- replication events

Of course, we need to focus on probes that are actually useful at 
solving real issues, and not create too many unneeded probes.  Having a
small sufficient set of probes will make it easier to keep the provider
as a stable interface.

Also, the "-end" probes I think would be better renamed "-done" probes,
as is the convention in other DTrace providers (such as "io").

2) Probe Arguments

The provider really needs many more arguments, and there are plenty to
pick from in the THD class and what it inherits.  Here is what I got
going in my suggested patch:

Probes,
        query-parse-start(thr, database, user, host, query)
        query-parse-done(thr, database, user, host, query, result)
        query-execute-start(thr, database, user, host, query)
        query-execute-done(thr, database, user, host, query, result)
        query-cache-hit(thr, database, user, host, query)
        query-cache-miss(thr, database, user, host, query)

Arguments,
        (void *)thr             THR Class pointer (for raw debugging)
        (char *)database        Database name string
        (char *)user            Username string
        (char *)host            Hostname or IP address string
        (char *)query           SQL Query string
        (int)result             Result (0 == success, -1 == failure)

I didn't find thread_id useful (DTrace already has thread IDs), but the
thr pointer may be useful (both as a unique ID and for raw debugging - where
pretty much any data can be fetch if you know the offset).

The order of arguments is from most to least common.  Most probes should
have thr and database, so they go first, followed by user and host,
then specific arguments - such as query string and result.

The result argument is invaluable.  I picked variables that seemed reasonable
for that patch - but please check them more carefully (they look fine,
but I didn't read all the code).

There is a lot of very cool info in the THD class, but we need to consider
stability (and not expose private implementation details).

For connection/login probes, the Security_context class has some very
interesting fields - check them out.  Although, they aren't always populated
(I used "user" and "host_or_ip" in my suggested patch - as they seem most
frequently populated).

cheers,

Brendan

-- 
Brendan
[CA, USA]

Attachment: mysqld-demos-1.0.tar
Description: Unix tar archive

_______________________________________________
dtrace-discuss mailing list
[email protected]

Reply via email to