On Aug 18, 2011, at 5:19 PM, dormando wrote:

>>>> On a positive note, it does seem like there is some consensus on the
>>>> value of random-transaction-sampling here. But do we have agreement
>>>> that this feed should be made available for external consumption (i.e.
>>>> the whole cluster sends to one place that is not itself a memcached
>>>> node), and that UDP should be used as the transport? I'd like to
>>>> understand if we are on the same page when it comes to these broader
>>>> architectural questions.
>>>
>>> I think I do agree with that. The question is whether we do that by
>>> making an sFlow interface or a sample interface?
>>
>> Do you mean a hook that can be used by a plugin to receive randomly
>> sampled transactions? That would allow you to inline the
>> random-sampling and eliminate most of the overhead. An sFlow plugin
>> would then just have to register for the feed; possibly sub-sample if
>> the internal 1-in-N rate was more aggressive than the requested sFlow
>> sampling-rate; marshall the samples into UDP datagrams, and send them
>> to the configured destinations. I like this solution because it means
>> the performance-critical part would be baked in by the experts and fully
>> tested with every new release.
>>
>> But if you've already done the hard work, and everyone is going to want
>> the UDP feed, then why not offer that too? I probably made it look hard
>> with my bad coding, but all you have to do is XDR-encode it and call
>> sendto().
>
> We can ship plugins with the core codebase, so sflow would still work "out
> of the box", it just wouldn't be what the system was based off of.
>
> On that note, how critical is it for sflow packets to contain timing data?
> Benchmarking will show for sure, but history tells me that this should be
> optional.
Not critical at all. The duration_uS field can be set to -1 in the XDR
output to indicate that it is not implemented. I added this measurement
when porting to the 1.6 branch, where it makes more sense. I left it in
when I updated the 1.4 branch because the overhead seemed negligible and
the numbers still looked like they might be revealing something (though I
wasn't sure what exactly).

The start-time field is currently used as the "we're going to sample this
one" flag, but that could easily be changed to just set a bit instead,
saving two system calls per sample.

The practice of marking a transaction for sampling at the beginning and
then actually taking the sample at the end, once the status is known,
could also be replaced by the old scheme from last year where both steps
happened at the same time. However, the two-step approach was actually
easier to implement, because there are only two or three ways that a
transaction can start but a whole myriad of ways that it can end. The
first step (the coin-tossing) only has to happen in those two or three
places, so it is easier to be sure that everything is counted exactly
once. Breaking it up like this also gives you the choice of accumulating
details incrementally (the key, the status-code etc.) in whatever is the
easiest place.

> What would be pretty awesome is sflow-ish from libmemcached, since the
> only place it *really* matters how long something took is from the
> perspective of a client. Profiling the server is only going to tell me if
> the box is swapping, as it's extremely uncommon to nail the locks.

Yes, a client might well offer sFlow-MEMCACHE transaction samples (as well
as enclosing sFlow-HTTP transaction samples, if applicable). However, you
would probably still want to instrument at the server end to make sure you
are getting the full picture: there might be a whole menagerie of
different C, Python, Perl and Java clients in use.

>> Finally, I accept that the engine-pu branch is the focus of future
>> development, but... any thoughts on what to do for the 1.4.* versions?
>
> I'm kicking out one release of 1.4.* monthly until 1.6 supersedes it. That
> said I have a backlog of bugs and higher priority changes that will likely
> keep me busy for a few months. Unless of course someone sponsors me to
> spend more time on it :)

In the meantime I could strip down the current patch and reduce its code
footprint considerably - but would that help?

Neil

> -Dormando
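
A minimal sketch of the two-step scheme described above (not the actual
patch): a cheap coin-toss marks a transaction for sampling where it starts,
and the sample is only emitted where it ends, once the status is known. The
names (sample_hook_t, conn, maybe_mark_for_sampling, maybe_take_sample) are
placeholders, not memcached's real internals:

#include <stdlib.h>
#include <stdint.h>
#include <stddef.h>

/* Callback registered by a plugin (e.g. an sFlow plugin). */
typedef void (*sample_hook_t)(const char *key, size_t nkey,
                              int status, int64_t duration_us);

static sample_hook_t sample_hook;        /* NULL if no plugin registered */
static unsigned int sample_rate = 1000;  /* 1-in-N */

typedef struct conn {
    int sampling;        /* set by the coin-toss at transaction start */
    const char *key;
    size_t nkey;
} conn;

/* Called in the two or three places a transaction can start. */
static void maybe_mark_for_sampling(conn *c) {
    c->sampling = (sample_hook != NULL) && (rand() % sample_rate == 0);
}

/* Called wherever a transaction can end; details such as the key and
   status can be filled in incrementally, wherever is easiest, before
   this point. */
static void maybe_take_sample(conn *c, int status) {
    if (c->sampling) {
        sample_hook(c->key, c->nkey, status, -1 /* duration not measured */);
        c->sampling = 0;
    }
}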

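And a rough sketch of the plugin side mentioned earlier in the thread:
sub-sample if the internal 1-in-N rate is more aggressive than the rate the
collector asked for, marshal a few fields XDR-style (big-endian, 4-byte
aligned) and send the datagram with sendto(). The field layout here is
purely illustrative - it is not the real sFlow-MEMCACHE record format - and
collector_fd/collector_addr are assumed to be set up elsewhere:

#include <string.h>
#include <stdint.h>
#include <stddef.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

static int collector_fd;                   /* UDP socket, created elsewhere */
static struct sockaddr_in collector_addr;  /* configured destination        */
static unsigned int internal_rate = 100;   /* memcached's built-in 1-in-N   */
static unsigned int requested_rate = 1000; /* rate requested by collector   */
static unsigned int skip;

static size_t xdr_put_u32(uint8_t *buf, size_t off, uint32_t v) {
    uint32_t be = htonl(v);
    memcpy(buf + off, &be, 4);
    return off + 4;
}

static size_t xdr_put_opaque(uint8_t *buf, size_t off,
                             const void *p, uint32_t len) {
    uint32_t padded = (len + 3u) & ~3u;        /* XDR pads to 4 bytes */
    off = xdr_put_u32(buf, off, len);          /* length word first   */
    memcpy(buf + off, p, len);
    memset(buf + off + len, 0, padded - len);  /* zero the padding    */
    return off + padded;
}

/* Registered as the sample hook; field order is made up for illustration. */
void sflow_sample_hook(const char *key, size_t nkey,
                       int status, int64_t duration_us) {
    /* Sub-sample if the internal rate is more aggressive than requested
       (e.g. keep 1 in 10 if memcached samples 1-in-100 but 1-in-1000 was
       asked for). */
    if (requested_rate > internal_rate) {
        if (++skip < requested_rate / internal_rate)
            return;
        skip = 0;
    }
    if (nkey > 250)                    /* memcached keys are <= 250 bytes */
        return;

    uint8_t buf[512];
    size_t off = 0;
    off = xdr_put_u32(buf, off, (uint32_t)status);
    off = xdr_put_u32(buf, off, (uint32_t)duration_us); /* -1 = unmeasured,
                                                           truncated to 32
                                                           bits for brevity */
    off = xdr_put_opaque(buf, off, key, (uint32_t)nkey);

    sendto(collector_fd, buf, off, 0,
           (const struct sockaddr *)&collector_addr, sizeof(collector_addr));
}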