is there somewhere i can copy edit this document?

a bit nitpicky, i know, but i found a few mistakes just while browsing it...
section 2.1 both "suites" should be "suits," section 3.4 "it's" should be
"its," etc.

awl
On Apr 11, 2011 3:05 PM, "Trond Norbye" <[email protected]> wrote:
> What's new in memcached
> =======================
>
> (part two - new feature proposals)
>
> Table of Contents
> =================
> 1 Protocol
> 1.1 Virtual buckets!
> 1.2 TAP
> 1.3 New commands
> 1.3.1 VERBOSITY
> 1.3.2 TOUCH, GAT and GATQ
> 1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
> 1.3.4 TAP_CONNECT
> 1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
> 1.3.6 TAP_OPAQUE
> 1.3.7 TAP_VBUCKET_SET
> 1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
> 2 Modularity
> 2.1 Engines
> 2.2 Extensions
> 2.2.1 Logger
> 2.2.2 Daemon
> 2.2.3 ASCII commands
> 3 New stats
> 3.1 Stats returned by the default stats command
> 3.1.1 libevent
> 3.1.2 rejected_conns
> 3.1.3 stats related to TAP
> 3.2 topkeys
> 3.3 aggregate
> 3.4 settings
> 3.4.1 extension
> 3.4.2 topkeys
>
>
> 1 Protocol
> ~~~~~~~~~~~
>
> Intentionally, there is no significant difference in protocol over
> 1.4.x. There is one minor change, but it should be transparent to
> most users.
>
> 1.1 Virtual buckets!
> =====================
>
> We don't know who originally came up with the idea, but we've heard
> rumors that it might be Anatoly Vorobey or Brad Fitzpatrick. In lieu
> of a full explanation on this, the concept is that instead of mapping
> each key to a server we map it to a virtual bucket. These virtual
> buckets are then distributed across all of the servers. To ease the
> introduction of this we've assigned the two reserved bytes in the
> binary protocol for specifying the vbucket id, which allowed us to
> avoid protocol extensions.
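
A minimal sketch of the two-step lookup described above (key to vbucket, then vbucket to server). The CRC32 hash, the vbucket count of 1024, the round-robin layout and the server names are all assumptions made for illustration; the text above does not specify any of them.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed count; it only has to fit in the 16-bit vbucket id


def vbucket_for_key(key: bytes, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Map a key to a vbucket id. CRC32 is an assumption; any stable hash works."""
    return zlib.crc32(key) % num_vbuckets


def build_vbucket_map(servers, num_vbuckets=NUM_VBUCKETS):
    """Spread the vbuckets across the servers round-robin (hypothetical layout)."""
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def server_for_key(key: bytes, vbucket_map) -> str:
    """Second step of the lookup: vbucket id to the server that owns it."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]


servers = ["cache-a:11211", "cache-b:11211"]  # hypothetical server names
vbucket_map = build_vbucket_map(servers)

# Moving one vbucket changes a single map entry; all other keys keep their server.
moved = vbucket_for_key(b"user:42")
vbucket_map[moved] = "cache-c:11211"
print(server_for_key(b"user:42", vbucket_map))  # prints "cache-c:11211"
```

The point of the indirection is that growing the cluster means reassigning map entries rather than rehashing every key.
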
>
> Note that this change should allow for complete compatibility if the
> clients and the server are not aware of vbuckets. The reserved bytes
> should have been set to 0 according to the original binary protocol
> specification, which means that such clients will always use vbucket 0.
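
To illustrate why this is compatible, here is a rough sketch that packs a binary protocol GET request header with Python's struct module. The field layout follows the 24-byte request header of the binary protocol as I understand it; the helper name build_get_request is made up for this example, and protocol_binary.h remains the authority.

```python
import struct

REQUEST_MAGIC = 0x80
OPCODE_GET = 0x00
# magic, opcode, key length, extras length, data type,
# vbucket id (formerly reserved), total body length, opaque, CAS
HEADER_FORMAT = "!BBHBBHIIQ"


def build_get_request(key: bytes, vbucket_id: int = 0, opaque: int = 0) -> bytes:
    """Build a GET request; a vbucket-unaware client simply leaves vbucket_id at 0."""
    header = struct.pack(
        HEADER_FORMAT,
        REQUEST_MAGIC,
        OPCODE_GET,
        len(key),    # key length
        0,           # extras length (GET has none)
        0,           # data type
        vbucket_id,  # the two bytes that used to be reserved
        len(key),    # total body length (key only)
        opaque,
        0,           # CAS
    )
    return header + key


legacy = build_get_request(b"foo")                  # old-style client
aware = build_get_request(b"foo", vbucket_id=513)   # vbucket-aware client

print(len(legacy), legacy[6:8], aware[6:8])
```

Apart from bytes 6 and 7, the two packets are identical, which is why an unaware client always lands in vbucket 0.
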
>
> The idea is that we can move these vbuckets between servers such that
> you can "grow" or "shrink" your cluster without losing data in your
> cache. The classic memcached caching engine does _not_ implement
> support for multiple vbuckets right now, but it is on the roadmap to
> create a version of the engine in memcached to support this (it is a
> question of memory efficiency, and there are currently not many
> clients that support them).
>
> Defining this now will allow us to start moving down the path to
> vbuckets in the default_engine and allow other engine implementors to
> consider vbuckets in their design.
>
> You can read more about the mechanics of it here:
> [http://dustin.github.com/2010/06/29/memcached-vbuckets.html]
>
> However, you _cannot_ use a mix of clients that are vbucket aware and
> clients that don't use vbuckets. Then again, it doesn't make sense
> to use a vbucket aware backend if your clients don't know how to
> access them. This is why we believe a protocol change isn't
> warranted.
>
> 1.2 TAP
> ========
>
> In order to facilitate vbucket transfers, among other use cases where
> people want to see what's inside the server, we added to the binary
> protocol a set of commands collectively called TAP. The intention is
> to allow "clients" to receive a stream of notifications whenever data
> changes in the server. It is solely up to the backing store to
> implement this, so it can make decisions about what resources are used
> to implement TAP. This functionality is needed commonly enough, though,
> that the core is aware of it, leaving the specific implementation to
> engines.
>
> 1.3 New commands
> =================
>
> There are a few new commands available. The following sections
> provide a brief description of them. Please check protocol_binary.h
> for the implementation details.
>
> 1.3.1 VERBOSITY
> ----------------
>
> The binary protocol did not have an equivalent of the textual
> protocol's verbosity command. This command allows you to change the
> verbosity level on a running server by using the binary protocol. Why
> do we need this? There is a command line option you may use to disable
> the ASCII protocol, so we need this command in order to change the
> logging level in those configurations.
>
> 1.3.2 TOUCH, GAT and GATQ
> --------------------------
>
> One of the problems with the existing commands in memcached is that
> you can't tell the memcached server that an object is still valid
> and that you just want a longer expiration. Normally you want to put an
> expiry time on the objects, so that you can tell whether your cache is
> big enough by watching the eviction stats: if your memcached server
> has a high eviction rate, your cache isn't big enough for what you
> want to have in there. The idea is that the items you're normally
> using are bumped to the front of your LRU (and hence are not kicked
> out immediately).
>
> The touch command lets you set the expiry time for an object without
> retrieving the object. In most cases, you will not want to do this
> unless you provide a CAS value to ensure that you're touching the
> correct version of the object.
>
> GAT means "get and touch" and returns the object in addition to
> setting a new expiration time. This allows you to have a rolling
> window of expiry that has a TTL in addition to the access time. For
> example, you can instruct memcached to allow an object to live no
> longer than five minutes after the last time it was accessed (but as
> always, it may expire sooner).
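
The rolling expiry window can be modelled with a toy in-memory cache. This is only a sketch of the semantics described above, not memcached's implementation; the ToyCache class and its explicit now argument are invented for the example so that it runs without sleeping.

```python
class ToyCache:
    """In-memory model of TOUCH/GAT semantics; not memcached's implementation."""

    def __init__(self):
        self._items = {}  # key -> (value, absolute expiry time)

    def set(self, key, value, ttl, now):
        self._items[key] = (value, now + ttl)

    def _live(self, key, now):
        """Return the stored item if it has not expired, else drop it."""
        item = self._items.get(key)
        if item is None or item[1] <= now:
            self._items.pop(key, None)
            return None
        return item

    def touch(self, key, ttl, now):
        """Reset the expiry without returning the value."""
        item = self._live(key, now)
        if item is None:
            return False
        self._items[key] = (item[0], now + ttl)
        return True

    def gat(self, key, ttl, now):
        """Get and touch: return the value and reset the expiry."""
        item = self._live(key, now)
        if item is None:
            return None
        self._items[key] = (item[0], now + ttl)
        return item[0]


cache = ToyCache()
cache.set("session", "alice", ttl=300, now=0)
print(cache.gat("session", ttl=300, now=200))  # alice (expiry moves to 500)
print(cache.gat("session", ttl=300, now=450))  # alice (expiry moves to 750)
print(cache.gat("session", ttl=300, now=800))  # None (idle past the window)
```
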
>
> 1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
> --------------------------------------------
>
> These commands are used to set, get or delete a vbucket on the server.
>
> 1.3.4 TAP_CONNECT
> ------------------
>
> Connect and request that the server initialize a TAP stream.
>
> The point of this command is to allow clients to connect and specify a
> few things about the data they wish to receive. Specifically, the
> client will typically specify a date either in the past or in the
> future along with specifying a vbucket. The server will then stream
> data mutated since that given date or, if a future date is specified,
> only stream new mutations as they arrive. The specific details about
> which mutations to send may vary by implementation.
>
> 1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
> ------------------------------------------
>
> TAP_MUTATION is a notification that an item changed value in the
> server.
>
> The mutation typically comes with the new value.
>
> TAP_DELETE is a notification that a key was deleted on the server.
>
> Finally, to avoid having to send a complete list of all the keys in
> the server when the user issues a flush, we can send a single message
> (TAP_FLUSH) representing the flush. Please note that the FLUSH
> message means _ALL_ vbuckets, and not just a single vbucket.
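
As a rough illustration of how a consumer might react to these three notifications, the sketch below applies them to a plain dict acting as a replica. The (kind, key, value) tuples are a made-up stand-in for the real TAP packets, whose layout is defined in protocol_binary.h.

```python
def apply_tap_event(replica, event):
    """Apply one (kind, key, value) tuple to a dict acting as a replica."""
    kind, key, value = event
    if kind == "mutation":
        replica[key] = value      # mutations typically carry the new value
    elif kind == "delete":
        replica.pop(key, None)    # deleting an unknown key is harmless
    elif kind == "flush":
        replica.clear()           # one message stands in for every key
    else:
        raise ValueError(f"unknown TAP event: {kind}")


replica = {}
stream = [
    ("mutation", "a", 1),
    ("mutation", "b", 2),
    ("delete", "a", None),
    ("flush", None, None),
    ("mutation", "c", 3),
]
for event in stream:
    apply_tap_event(replica, event)

print(replica)  # {'c': 3}
```
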
>
> 1.3.6 TAP_OPAQUE
> -----------------
>
> To allow storage engines to send their own messages over the tap
> stream between each other, a tap opaque message is defined. It is
> completely up to the storage engine to specify the internal layout of
> the packet.
>
> 1.3.7 TAP_VBUCKET_SET
> ----------------------
>
> This is a message requesting a vbucket set. It is similar to the
> set_vbucket command, with the difference that this message comes over
> a tap connection (with the extra info a tap message contains).
>
> 1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
> --------------------------------------------------
>
> The checkpoint start and end messages may be used by engines that want
> to use checkpoints. Checkpoints are an optional feature that may be
> used by some engines to allow clients to start at a checkpoint
> position. By doing so, the client need not do a full "backfill" even
> if it is revisiting a server after having been gone for a while. The
> TAP_CHECKPOINT_START tells a client that it's the start of a new
> checkpoint, and the TAP_CHECKPOINT_END tells the client when it's
> received everything for that given checkpoint.
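
One way a client could use these markers is to remember the last checkpoint it received in full, so that it has a position to resume from after a disconnect. The sketch below is a toy model under that assumption; the tuple format and the integer checkpoint ids are invented for the example.

```python
def consume(stream):
    """Read (kind, payload) tuples; return (applied items, last complete checkpoint)."""
    applied = []
    open_checkpoint = None
    last_complete = None
    for kind, payload in stream:
        if kind == "checkpoint_start":
            open_checkpoint = payload
        elif kind == "checkpoint_end":
            # Only a matching end marker means the checkpoint arrived in full.
            if payload == open_checkpoint:
                last_complete = payload
            open_checkpoint = None
        elif kind == "mutation":
            applied.append(payload)
    return applied, last_complete


stream = [
    ("checkpoint_start", 7),
    ("mutation", "a"),
    ("mutation", "b"),
    ("checkpoint_end", 7),
    ("checkpoint_start", 8),
    ("mutation", "c"),  # the connection drops before the end marker arrives
]
applied, last_complete = consume(stream)
print(applied, last_complete)  # ['a', 'b', 'c'] 7
```

On reconnect, such a client could ask to start after checkpoint 7 instead of requesting a full backfill.
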
>
> 2 Modularity
> ~~~~~~~~~~~~~
>
> As we mentioned in the first email on changes, one big difference with
> this new work is that we've tried to refactor memcached into being a
> modular application instead of being monolithic. In the future, we'd
> like to make the command parser a separate module, so that we may
> load the parsers separately.
>
> 2.1 Engines
> ============
>
> We've done a lot of work trying to refactor the code in memcached to
> avoid the tight coupling between the command protocol parser and the
> actual item storage.
>
> The idea with the engine interface is that the memcached process loads
> a dynamically loadable object and calls a well known function to get a
> set of function pointers. All communication between the memcached
> process and the engine is performed through these function
> pointers. The memcached process provides a set of services to the
> engine as well through another set of function pointers.
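
The shape of that handshake can be sketched in a few lines. This is an analogy in Python rather than the C interface itself: dicts of callables stand in for the tables of function pointers, and the names create_instance, store, get, remove and log are illustrative assumptions, not the engine API.

```python
def create_instance(server_api):
    """Well-known entry point (illustrative name): returns the engine's function table."""
    items = {}

    def store(key, value):
        items[key] = value
        server_api["log"](f"stored {key}")  # engine calling back into the server

    def get(key):
        return items.get(key)

    def remove(key):
        return items.pop(key, None) is not None

    # The core only ever talks to the engine through this table.
    return {"store": store, "get": get, "remove": remove}


# The "server" side: services it offers to the engine, then the engine it loads.
log_lines = []
server_api = {"log": log_lines.append}
engine = create_instance(server_api)

engine["store"]("k", "v")
print(engine["get"]("k"), log_lines)  # v ['stored k']
```

Swapping in a different backend then means loading a different object that exposes the same entry point.
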
>
> The beauty of this is that the user may choose between a set of
> different storage engines to find one that suits their runtime
> environment. People have different requirements for their server. Some
> people need ACID, others may prefer ecstasy ;-) The storage interface
> lets them design their app around the memcached protocol, and
> they can just swap in the backend that suits their needs (be it
> persistence, replication (sync or lazy), etc.).
>
> 2.2 Extensions
> ===============
>
> The item storage isn't the only place we've tried to create a level of
> modularity. People run memcached in different environments with
> different requirements. You specify the extensions you want to use by
> adding the -X command line argument.
>
> 2.2.1 Logger
> -------------
>
> We've seen a lot of different requests when it comes to logging. Some
> want it to a file, some to syslog (or Windows event log) and some want
> it to standard out. By default memcached will print to stderr, but
> you may specify a different logger by loading the appropriate module
> with the -X command line argument.
>
> 2.2.2 Daemon
> -------------
>
> You might want to have some daemons providing extra services inside
> your memcached server. Examples would be things like a doors server
> to provide additional access to your server (Trond's favorite), or
> perhaps a "dispatcher" offering a threadpool for your engines to
> use.
>
> 2.2.3 ASCII commands
> ---------------------
>
> If you really need to extend the ASCII protocol, you may now load
> additional ASCII commands as loadable modules. This isn't necessarily
> encouraged, but sometimes it is required to get something done
> quickly. We don't need a separate module for binary commands, because
> those are already handled inside memcached due to the fixed semantics
> of the protocol.
>
> 3 New stats
> ~~~~~~~~~~~~
>
> There are a number of new stats introduced. The key supplied in the
> stats command is passed to the storage engine to allow the storage
> engine to add extra information to the existing stats commands, and to
> create its own stats commands.
>
> 3.1 Stats returned by the default stats command
> ================================================
>
> 3.1.1 libevent
> ---------------
>
> Over time we've seen a lot of bugs around people using an old
> version of libevent. That's part of the reason why we bundle a well
> known version of libevent in the release distribution. Memcached
> checks the libevent version during startup, and will refuse to start
> if the one used is too old. Since most operating systems use shared
> libraries these days, you might be using another version than the one
> you originally used when you first built memcached. In order for us to
> see which library people are using we decided to put it into the stats
> as well.
>
> 3.1.2 rejected_conns
> ---------------------
>
> The number of times a connection attempt was refused because we had
> hit the maximum number of connections.
>
> 3.1.3 stats related to TAP
> ---------------------------
>
> There are a number of stats related to the packets used in the TAP
> protocol. These stats will only appear if they are non-zero:
>
> tap_checkpoint_start_received tap_checkpoint_start_sent
> tap_checkpoint_end_received tap_checkpoint_end_sent
> tap_connect_received tap_delete_received tap_delete_sent
> tap_flush_received tap_flush_sent tap_mutation_received
> tap_mutation_sent tap_opaque_received tap_opaque_sent
> tap_vbucket_set_received tap_vbucket_set_sent
>
> 3.2 topkeys
> ============
>
> You may get information about the most popular keys in memcached by
> setting the environment variable MEMCACHED_TOP_KEYS to the number of
> keys you want memcached to keep track of. There is no such thing
> as a free lunch, so enabling this can have a small memory and speed
> impact. We've decided to _disable_ this by default, so you need to
> export this variable to enable the feature. Ex:
>
> me@localhost:> MEMCACHED_TOP_KEYS=10 ./memcached
>
> Running "stats topkeys" would return something like:
>
> STAT my-key2 get_hits=0,get_misses=1,cmd_set=0,incr_hits=0,
>      incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
>      delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
>      cas_misses=0,ctime=2,atime=2
> STAT my-key1 get_hits=1,get_misses=0,cmd_set=1,incr_hits=0,
>      incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
>      delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
>      cas_misses=0,ctime=12,atime=12
>
> (Line breaks and indentation added to make it more readable in this
> document.)
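
For anyone consuming this output from a script, here is a small sketch that parses one such line, assuming the real response puts each key on a single line in the "STAT <key> name=value,..." form shown above. The function name parse_topkeys_line is made up for the example.

```python
def parse_topkeys_line(line):
    """Parse 'STAT <key> name=value,name=value,...' into (key, dict of ints)."""
    prefix, key, fields = line.split(" ", 2)
    if prefix != "STAT":
        raise ValueError(f"not a STAT line: {line!r}")
    counters = {}
    for field in fields.split(","):
        name, _, value = field.partition("=")
        counters[name.strip()] = int(value)
    return key, counters


# The my-key1 sample from above, joined back into a single line.
line = (
    "STAT my-key1 get_hits=1,get_misses=0,cmd_set=1,incr_hits=0,"
    "incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,"
    "delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,"
    "cas_misses=0,ctime=12,atime=12"
)
key, counters = parse_topkeys_line(line)
print(key, counters["get_hits"], counters["ctime"])  # my-key1 1 12
```
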
>
> 3.3 aggregate
> ==============
>
> The combination of the storage engine interface and SASL auth
> allows for connection-based stats. The aggregate
> subcommand is used to aggregate the stats from all of the connections
> on the server. The stats returned from the aggregate subcommand are the
> same as those from the normal stats command.
>
> 3.4 settings
> =============
>
> There are times an engine may want to share details about its
> configuration through stats. This argument to stats will get you
> there.
>
> Just to show a couple of examples...
>
> 3.4.1 extension
> ----------------
>
> Displays one of the extensions loaded (may appear multiple times).
>
> ex:
>
> STAT logger syslog
> STAT ascii_extension scrub
> STAT ascii_extension noop
> STAT ascii_extension echo
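
Because the same stat name can appear more than once here, a client that stores stats in a plain dict would keep only the last value. A minimal sketch of collecting them into lists instead (parse_settings is a hypothetical helper, and the sample lines are the ones shown above):

```python
from collections import defaultdict


def parse_settings(lines):
    """Collect 'STAT <name> <value>' lines, keeping every value for repeated names."""
    settings = defaultdict(list)
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "STAT":
            settings[parts[1]].append(parts[2])
    return dict(settings)


sample = [
    "STAT logger syslog",
    "STAT ascii_extension scrub",
    "STAT ascii_extension noop",
    "STAT ascii_extension echo",
]
settings = parse_settings(sample)
print(settings["logger"], settings["ascii_extension"])
```
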
>
> 3.4.2 topkeys
> --------------
>
> The number of keys we are monitoring.
>
> There may be many other settings exposed, depending on the engine's
> configuration.
>
