On Aug 19, 2011, at 12:56 AM, dormando wrote:
>>>> So there's no need to hesitate if you can already do (1) today. Let's
>>>> face it, you have been very successful and there are rather a lot of
>>>> users who have already gotten past (2) :)
>>>
>>> Okay, I'm kinda tired of that argument. Just because you say something
>>> isn't possible, doesn't mean we can't make it work anyway. If you believe
>>> they're divergent, stop saying that they're divergent and prove it with
>>> examples. However I'd rather spend my time writing features than
>>> pretending to know if a theoretical patch will work or not.
>>>
>>> We want to work towards a system that can encompass a replacement for
>>> "stats cachedump". If we can design something which generates sflow as a
>>> subset, that'll be totally amazing! We can even use your patches as
>>> reference for creating a core shipped plugin.
>>>
>>> If people want to use sflow today, they can apply your patches and use it.
>>> As is the way with open source.
>>>
>>> -Dormando
>>
>>
>> I didn't say it wasn't possible.... but never mind all that. A
>> core-shipped plugin would be great. Let me know if there's anything I
>> can do to help.
>
> That was more strongly worded than I intended, I apologize; I don't agree
> that it's worth rushing. Not "rushing" is why we haven't already settled
> on TOPKEYS the way it is. I don't really intend to throw something else in
> there immediately.

No worries. I apologize for my impatience. You are right. There is no rush.

But you did ask for more specific examples, so for what it's worth, here are
some reasons why I think features for (1) in-production cluster-wide sampling
and (2) testing and troubleshooting should be kept as separate as possible:
A. They will rarely be used at the same time on the same node.
B. If they are used concurrently (e.g. troubleshooting a production node),
then using (2) should have no effect on (1).
C. The cluster-wide configuration used for (1) is likely to be very different
from the interactive configuration for (2).
D. Getting a feed of randomly-sampled transactions is probably the only thing
they will have in common. After that, (1) will simply send the sample over
UDP, while (2) might apply regex filtering, value-field analysis, various
tests on the expiration times and slab allocation, and finally stream results
out on a TCP connection - probably using some ASCII format instead of XDR.
E. Even for the part they do have in common, (1) is likely to want only a
handful of samples per second per node (e.g. 1-in-10000), while (2) is much
more likely to want an aggressive feed such as 1-in-10, or even 1-in-1. It
seems likely that this difference will affect the implementation. For
example, a per-transaction time-duration measurement would be unthinkable at
1-in-1, but could be quite acceptable at 1-in-50000.
F. Even if (2) is considered the higher priority, I think it's easier to see
how (1) can be completed and tied up in a bow. I should stress here that I'm
not expecting anyone to use my code! I just think you guys could knock (1)
out pretty easily and reap immediate benefits, while (2) could take a while
to crystallize.

Getting unnecessarily detailed, let's say you implemented the plugin sampling
something like this:

possibly_sample_transaction(connection, protocol, operation, key, value,
                            status)
{
    /* One random draw per transaction, tested against each registered
       consumer's own sampling threshold. */
    r = next_random(connection->thread);
    for (i = 0; i < num_sampling_plugins; i++) {
        consumer = sampling_plugins[i];
        if (r <= consumer->probability_threshold) {
            (*consumer->sample_callback)(connection, protocol, operation,
                                         key, value, status);
        }
    }
}
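
(Imagining, just for illustration, a consumer record along these lines - the
struct and the callback signature are invented here, not from my patches:)

struct sampling_plugin {
    /* take a sample when the per-transaction random draw r <= this */
    double probability_threshold;
    /* receives one sampled transaction */
    void (*sample_callback)(void *connection, int protocol, int operation,
                            const char *key, const char *value, int status);
};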

Compare that with the number of instructions and branches involved here:

possibly_sample_transaction(connection, protocol, operation, key, value,
                            status)
{
    /* Single consumer: one random draw, one compare, one branch. */
    if (next_random(connection->thread) <= probability_threshold) {
        take_sample(connection, protocol, operation, key, value, status);
    }
}

Or, if you allow one sampling probability to be treated specially and turned
into a countdown-to-next-sample, then you can do it this way and save even
more:

possibly_sample_transaction(connection, protocol, operation, key, value,
                            status)
{
    /* Decrement the per-thread countdown; take a sample only when it
       reaches zero, then reset it. */
    if (unlikely(--connection->thread->countdown == 0)) {
        connection->thread->countdown = compute_next_countdown();
        take_sample(connection, protocol, operation, key, value, status);
    }
}
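
In case it's useful, here's one way compute_next_countdown() might work (a
minimal sketch - the 1-in-10000 rate and the use of rand() are placeholders,
not a proposal): draw the skip from a geometric distribution, so that on
average 1-in-N transactions get sampled:

#include <math.h>
#include <stdlib.h>

/* Sketch: how many transactions to let pass before the next sample,
   geometrically distributed with mean 1/p (here p = 1/10000). */
static unsigned int compute_next_countdown(void)
{
    const double p = 1.0 / 10000.0;
    /* u is uniform in (0, 1]; the +1.0 keeps log(u) finite */
    double u = (rand() + 1.0) / ((double)RAND_MAX + 1.0);
    return (unsigned int)(log(u) / log(1.0 - p)) + 1;
}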

At this point you could easily turn it into a macro, so that there is no
extra function call in the critical path, just a decrement-and-test on
thread->countdown.
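
For illustration only (the macro name is invented), something like:

/* Inline the decrement-and-test: the common case costs one decrement and
   one well-predicted branch, with no function call. */
#define POSSIBLY_SAMPLE_TRANSACTION(conn, proto, op, key, value, status)  \
    do {                                                                  \
        if (unlikely(--(conn)->thread->countdown == 0)) {                 \
            (conn)->thread->countdown = compute_next_countdown();         \
            take_sample((conn), (proto), (op), (key), (value), (status)); \
        }                                                                 \
    } while (0)
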
I don't know if it matters so much to shave a few dozen cycles off the critical
path, but my point was just to illustrate that even in the small area of
overlap between (1) and (2) you might still be grateful someday if you kept
them entirely separate.

Thoughts?

Neil