I just noticed that I didn't answer this question from Dormando: "Is the only reason to keep it exactly the way it is because it's already done and you have customers who rely on it?"
I should clarify: in sFlow, only the architectural framework is fixed. The actual fields that are sent can always be revised, and I was hoping to get your input on that. Specifically, the sFlow standard defines the random 1-in-N transaction-sampling and periodic counter-push mechanisms, so that the critical path is not impacted and all other processing is moved to the central collector (such as filtering, thresholding, top-keys, missed-keys... and anything else that you might think of later).

Is that already too constraining? It's not easy to see what else you could do without impacting the critical path and cluster-wide scalability. I guess it's true that you could write every transaction into a ring-buffer and still implement sFlow as a consumer of that 1-in-1 feed, but wouldn't that already be adding too many cycles to the critical path? Especially if most of the time the ring-buffer feature was not being used for anything else (plus there are measurements, like the transaction duration, that only make sense on a sampled basis because they require two extra system calls).

Naturally I agree that it would be nice to find just one open standard that would satisfy both the "tcpdump"-style investigation and the operational monitoring requirements, but I'm not aware of anything like that. My experience has been that it is helpful, often vital, to keep the two separate.

So please comment: in the current implementation, each randomly-sampled transaction record contains the following fields:

    enum:    protocol;        /* ASCII/BINARY */
    enum:    command;         /* e.g. GET or INCR */
    string:  key;
    integer: number_of_keys;  /* e.g. if the GET was for N keys at once */
    integer: value_bytes;
    integer: duration_uS;     /* should we try for nanoseconds? */
    enum:    status;          /* e.g. NOT_FOUND */

and the socket info is included too:

    enum: IP_protocol;
    local_IP;
    remote_IP;
    local_port;
    remote_port;

Nothing is cast in stone at this stage.
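To make the proposal concrete, here is one way the record above might look as a C struct. This is purely illustrative - the type names, enum values, and IPv4-only socket fields are my own choices for the sketch, not part of any issued sFlow tag, and the real on-the-wire layout would be fixed by the XDR encoding once the fields are agreed:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C rendering of the proposed sampled-transaction record.
 * Names and enum values are illustrative only; the actual wire format
 * would be defined by the agreed sFlow XDR structure. */
typedef enum { MC_PROTO_ASCII, MC_PROTO_BINARY } mc_protocol_t;
typedef enum { MC_CMD_GET, MC_CMD_SET, MC_CMD_INCR /* ... */ } mc_command_t;
typedef enum { MC_STATUS_OK, MC_STATUS_NOT_FOUND /* ... */ } mc_status_t;

typedef struct {
    mc_protocol_t protocol;
    mc_command_t  command;
    char          key[251];        /* memcached keys are at most 250 bytes */
    uint32_t      number_of_keys;  /* e.g. a multi-key GET */
    uint32_t      value_bytes;
    uint32_t      duration_uS;     /* microseconds, per the field list */
    mc_status_t   status;
    /* layer-4 socket info (IPv4 only, for simplicity of the sketch) */
    uint32_t      local_IP, remote_IP;
    uint16_t      local_port, remote_port;
} mc_sample_t;
```

One design point worth noting: keeping the key as a bounded string in the record means the collector can compute top-keys, missed-keys, or any other function of the keys without the cache node doing any aggregation itself.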
If there is something else that should be included that is important for operational monitoring, then please suggest(!) The idea is to agree on the fields, and then a standard sFlow tag for that particular structure can be issued, so that a whole ecosystem of collector software can know exactly what to expect when it sees that XDR tag - just as the same tools can be receiving data from switches, routers, firewalls, web-servers, load-balancers, servers and hypervisors that all implement their own profile of sFlow. It's always possible to send a different structure (with a different tag) in the future, and anyone can make up their own tag if they want to send something experimental, but the real value here is in agreeing on a set of fields that really capture what is going on, allowing the memcached function to be fitted neatly into a larger measurement framework.

Neil

On May 19, 7:43 pm, [email protected] wrote:
> Comment #8 on issue 202 by [email protected]: TOP_KEYS feature fixes
> http://code.google.com/p/memcached/issues/detail?id=202
>
> OK! When you have a moment please try this:
>
> https://github.com/sflow-nhm/memcached
>
> ./configure --enable-sflow
>
> I forked from the "engine" branch, and added sFlow support. There is no
> additional locking in the critical path, so this should have almost no
> impact on performance provided the 1-in-N sampling rate is chosen sensibly.
> (Please test and confirm!)
>
> In addition to cluster-wide top-keys, missed-keys etc., you also get
> microsecond-resolution response-time measurements, the value-size in bytes
> for each sampled operation, and the layer-4 socket. So the sFlow collector
> may choose to correlate results by client IP/subnet/country as well as by
> cluster node or any function of the sampled keys.
>
> The logic is best described by the daemon/sflow_mc.h file, where the steps
> are captured as macros so that they can be inserted in the right places in
> memcached.c with minimal source-code footprint.
> The sflow_sample_test() fn is called at the beginning of each ascii or
> binary operation, and it tosses a coin to decide whether to sample that
> operation. If so, it just records the start time. At the end of the
> transaction, if the start_time was set, then the sFlow sample is encoded
> and submitted to be sent out.
>
> To configure for 1-in-5000, edit /etc/hsflowd.auto to look like this:
>
> rev_start=1
> polling=30
> sampling.memcache=5000
> agentIP=10.211.55.4
> collector=127.0.0.1 6343
> rev_end=1
>
> inserting the correct IP address for agentIP.
>
> If you compile and run "sflowtool" from the sources, you should see the
> ascii output: http://www.inmon.com/technology/sflowTools.php
>
> For more background and a simple example, see here:
> http://blog.sflow.com/2010/10/memcached-missed-keys.html
>
> The periodic sFlow counter-export is not working yet (that's what the
> polling=30 setting is for). I think the default-engine needs to implement
> the .get_stats_block API call before that will work. Let me know if you
> want me to try adding that.
>
> Best Regards,
> Neil
>
> P.S. I did try to do this as an engine-shim, but the engine protocol is
> really a different, internal, protocol. There was not a 1:1
> correspondence with the standard memcached operations.
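For anyone following along without the source to hand, the coin-toss described in the quoted message can be sketched roughly as follows. This is a simplification under my own names, not the actual daemon/sflow_mc.h code: the usual sFlow trick is to pre-draw a random skip count whose mean is N, so the per-transaction cost on the critical path is a single decrement-and-test, with no locking needed if each worker thread keeps its own countdown.

```c
#include <assert.h>
#include <stdlib.h>

/* Rough sketch (not the actual memcached patch) of a 1-in-N sampling test.
 * Rather than one random draw per transaction, draw a random gap with
 * mean == sampling_rate; the hot path then only decrements a counter. */
static int sampling_rate  = 5000;  /* 1-in-5000, as in the example config */
static int skip_countdown = 0;

static int sample_this_transaction(void) {
    if (--skip_countdown <= 0) {
        /* next gap uniform in [1, 2N-1], so the mean gap is N */
        skip_countdown = 1 + (rand() % (2 * sampling_rate - 1));
        return 1;  /* sample: this is where the start time would be recorded */
    }
    return 0;
}

/* demonstration helper: count samples taken over n simulated transactions */
static int run_trials(int n) {
    int i, samples = 0;
    for (i = 0; i < n; i++)
        samples += sample_this_transaction();
    return samples;
}
```

Over a million simulated transactions at 1-in-5000 this yields on the order of 200 samples, and the only work added to the unsampled path is the counter decrement - which is why the impact on memcached's critical path should be negligible when the rate is chosen sensibly.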
