I just noticed that I didn't answer this question from Dormando:  "Is
the only reason to keep it exactly the way it is because it's already
done and you have customers who rely on it?"

I should clarify:  In sFlow, only the architectural framework is
fixed.  The actual fields that are sent can always be revised,  and I
was hoping to get your input on that.

Specifically, the sFlow standard defines the random 1-in-N transaction-
sampling and periodic counter-push mechanisms,  so that the critical
path is not impacted and all other processing is moved to the central
collector (such as filtering, thresholding, top-keys, missed-keys...
and anything else that you might think of later).  Is that already too
constraining?  It's not easy to see what else you could do without
impacting the critical path and cluster-wide scalability.  I guess
it's true that you could write every transaction into a ring-buffer
and still implement sFlow as a consumer of that 1-in-1 feed,  but
wouldn't that already be adding too many cycles to the critical path?
Especially if most of the time the ring-buffer feature was not being
used for anything else (plus there are measurements like the
transaction-time-duration that only make sense on a sampled basis
because they require two extra system calls).  Naturally I agree that
it would be nice to find just one open standard that would satisfy
both the "tcpdump" style investigation and still meet the operational
monitoring requirements,  but I'm not aware of anything like that.  My
experience has been that it is helpful, often vital,  to keep the two
separate.
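
For concreteness, here is a rough sketch (in Python, purely
illustrative -- the class name and parameters are made up, this is not
the actual patch) of why 1-in-N sampling stays off the critical path:
each transaction costs one counter decrement and a branch, and only
the rare sampled transaction does any extra work:

```python
import random

class Sampler:
    """Illustrative 1-in-N sampler (not the actual memcached code).

    The per-transaction cost is one decrement and one branch; the
    random draw only happens on the rare sampled transaction."""

    def __init__(self, n, seed=42):
        self.n = n
        self.rng = random.Random(seed)
        # next sample happens after a random skip, mean ~n
        self.skip = self.rng.randrange(2 * n) + 1
        self.samples = 0

    def transaction(self):
        # critical path: decrement and test
        self.skip -= 1
        if self.skip > 0:
            return False
        # sampled: do the (rare) extra work, draw the next skip
        self.samples += 1
        self.skip = self.rng.randrange(2 * self.n) + 1
        return True

s = Sampler(n=100)
total = 200_000
hits = sum(s.transaction() for _ in range(total))
print(hits)  # roughly total / n
```

With n=100 over 200,000 simulated transactions you get roughly 2,000
samples, and everything downstream (top-keys, thresholds, etc.) is
the collector's problem, not the server's.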

So please comment:  In the current implementation,  each randomly-
sampled transaction record contains the following fields:

  enum: protocol; /* ASCII/BINARY */
  enum: command;  /* e.g. GET or INCR */
  string: key;
  integer: number_of_keys; /* e.g. if the GET was for N keys at once */
  integer: value_bytes;
  integer: duration_uS;  /* should we try for nanoseconds? */
  enum: status;  /* e.g. NOT_FOUND */

and the socket info is included too:

  enum: IP_protocol;
  local_IP;
  remote_IP;
  local_port;
  remote_port;
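
To make the proposal concrete, here is one way those fields might be
sketched as a single structure (a Python dataclass, just for
illustration -- the field names are from the lists above, but the enum
values and the example data are made up):

```python
from dataclasses import dataclass
from enum import Enum

class Protocol(Enum):     # enum values here are illustrative only
    ASCII = 1
    BINARY = 2

class Status(Enum):
    OK = 0
    NOT_FOUND = 1         # further codes would mirror memcached statuses

@dataclass
class MemcacheSample:
    """One randomly-sampled transaction, fields as listed above."""
    protocol: Protocol
    command: str          # e.g. "GET" or "INCR"
    key: str
    number_of_keys: int   # e.g. N if the GET was for N keys at once
    value_bytes: int
    duration_uS: int      # microseconds (nanoseconds still a question)
    status: Status
    # layer-4 socket info
    ip_protocol: int      # e.g. 6 for TCP
    local_ip: str
    remote_ip: str
    local_port: int
    remote_port: int

sample = MemcacheSample(Protocol.ASCII, "GET", "user:1001", 1, 512,
                        230, Status.NOT_FOUND, 6,
                        "10.0.0.5", "10.0.0.9", 11211, 41234)
print(sample.command, sample.status.name)
```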

Nothing is cast in stone at this stage.  If there is anything else
important for operational monitoring that should be included,  then
please suggest it!

The idea is to agree on the fields and then a standard sFlow tag for
that particular structure can be issued,  so that a whole ecosystem of
collector software can know exactly what to expect if it sees that XDR
tag - just as the same tools can be receiving data from switches,
routers, firewalls, web-servers, load-balancers, servers and
hypervisors that all implement their own profile of sFlow.   It's
always possible to send a different structure (with a different tag)
in the future,  and anyone can make up their own tag if they want to
send something experimental,  but the real value here is in agreeing
on a set of fields that really capture what is going on,  and allow
the memcached function to be fitted neatly into a larger measurement
framework.
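
To illustrate the tag mechanism:  every sFlow structure travels on the
wire as XDR with a 32-bit tag and a 32-bit length in front,  so a
collector can decode the structures it knows and skip the ones it
doesn't.  A rough Python sketch (the tag number 2200 and the payload
layout here are made up for illustration, not the assigned values):

```python
import struct

def xdr_string(s: bytes) -> bytes:
    # XDR string/opaque: 4-byte big-endian length, then the data,
    # zero-padded out to the next 4-byte boundary
    pad = (4 - len(s) % 4) % 4
    return struct.pack(">I", len(s)) + s + b"\x00" * pad

def tagged_record(tag: int, payload: bytes) -> bytes:
    # every structure is preceded by its 32-bit tag and length, so a
    # collector that doesn't recognize the tag can simply skip it
    return struct.pack(">II", tag, len(payload)) + payload

# made-up payload: a command enum, a key, and value_bytes
payload = (struct.pack(">I", 1)           # command = 1 (illustrative)
           + xdr_string(b"user:1001")     # key
           + struct.pack(">I", 512))      # value_bytes
rec = tagged_record(2200, payload)        # tag 2200 chosen arbitrarily
print(len(rec))
```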

Neil


On May 19, 7:43 pm, [email protected] wrote:
> Comment #8 on issue 202 by [email protected]: TOP_KEYS feature fixes
> http://code.google.com/p/memcached/issues/detail?id=202
>
> OK!  When you have a moment please try this:
>
> https://github.com/sflow-nhm/memcached
>
> ./configure --enable-sflow
>
> I forked from the "engine" branch,  and added sFlow support.  There is no  
> additional locking in the critical path, so this should have almost no  
> impact on performance provided the sampling 1-in-N is chosen sensibly.    
> (Please test and confirm!)
>
> In addition to cluster-wide top-keys, missed-keys etc. you also get  
> microsecond-resolution response-time measurements,  the value-size in bytes  
> for each sampled operation, and the layer-4 socket info.   So the sFlow collector  
> may choose to correlate results by client IP/subnet/country as well as by  
> cluster node or any function of the sampled keys.
>
> The logic is best described by the daemon/sflow_mc.h file where the steps  
> are captured as macros so that they can be inserted in the right places in  
> memcached.c with minimal source-code footprint.  The sflow_sample_test() fn  
> is called at the beginning of each ascii or binary operation,  and it  
> tosses a coin to decide whether to sample that operation.  If so,  it just  
> records the start time.  At the end of the transaction,  if the start_time  
> was set then the sFlow sample is encoded and submitted to be sent out.
>
> To configure for 1-in-5000,  edit /etc/hsflowd.auto to look like this:
>
> rev_start=1
> polling=30
> sampling.memcache=5000
> agentIP=10.211.55.4
> collector=127.0.0.1 6343
> rev_end=1
>
> Insert the correct IP address for agentIP.
>
> If you compile and run "sflowtool" from the sources,  you should see the  
> ascii output:
> http://www.inmon.com/technology/sflowTools.php
>
> For more background and a simple example,  see here:
> http://blog.sflow.com/2010/10/memcached-missed-keys.html
>
> The periodic sFlow counter-export is not working yet (that's what the  
> polling=30 setting is for).  I think the default-engine needs to implement  
> the .get_stats_block API call before that will work.  Let me know if you  
> want me to try adding that.
>
> Best Regards,
> Neil
>
> P.S.  I did try to do this as an engine-shim,  but the engine protocol is  
> really a different, internal,  protocol.  There was not a 1:1  
> correspondence with the standard memcached operations.
