ps. folks please look this over and evaluate. Do you understand everything? Does anything suck? Need more clarification? Whatever?
http://code.google.com/p/memcached/downloads/detail?name=memcached-1.6.0_beta1.tar.gz

^ easy-bake oven form beta release. Passes tests on a bunch of platforms,
but possibly not OpenBSD.

Make evaluating! Give major feedback.

-Dormando

On Mon, 11 Apr 2011, Trond Norbye wrote:

> What's new in memcached
> =======================
>
> (part two - new feature proposals)
>
> Table of Contents
> =================
> 1 Protocol
>     1.1 Virtual buckets!
>     1.2 TAP
>     1.3 New commands
>         1.3.1 VERBOSITY
>         1.3.2 TOUCH, GAT and GATQ
>         1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
>         1.3.4 TAP_CONNECT
>         1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
>         1.3.6 TAP_OPAQUE
>         1.3.7 TAP_VBUCKET_SET
>         1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
> 2 Modularity
>     2.1 Engines
>     2.2 Extensions
>         2.2.1 Logger
>         2.2.2 Daemon
>         2.2.3 ASCII commands
> 3 New stats
>     3.1 Stats returned by the default stats command
>         3.1.1 libevent
>         3.1.2 rejected_conns
>         3.1.3 stats related to TAP
>     3.2 topkeys
>     3.3 aggregate
>     3.4 settings
>         3.4.1 extension
>         3.4.2 topkeys
>
>
> 1 Protocol
> ~~~~~~~~~~~
>
> Intentionally, there is no significant difference in protocol over
> 1.4.x. There is one minor change, but it should be transparent to
> most users.
>
> 1.1 Virtual buckets!
> =====================
>
> We don't know who originally came up with the idea, but we've heard
> rumors that it might be Anatoly Vorobey or Brad Fitzpatrick. In lieu
> of a full explanation on this, the concept is that instead of mapping
> each key to a server we map it to a virtual bucket. These virtual
> buckets are then distributed across all of the servers. To ease the
> introduction of this we've assigned the two reserved bytes in the
> binary protocol for specifying the vbucket id, which allowed us to
> avoid protocol extensions.
>
> Note that this change should allow for complete compatibility if the
> clients and the server are not aware of vbuckets.
> The reserved bytes should have been set to 0 according to the
> original binary protocol specification, which means that such clients
> will always use vbucket 0.
>
> The idea is that we can move these vbuckets between servers such that
> you can "grow" or "shrink" your cluster without losing data in your
> cache. The classic memcached caching engine does _not_ implement
> support for multiple vbuckets right now, but it is on the roadmap to
> create a version of the engine in memcached to support this (it is a
> question of memory efficiency, and there are currently not many
> clients that support them).
>
> Defining this now will allow us to start moving down the path to
> vbuckets in the default_engine and allow other engine implementors to
> consider vbuckets in their design.
>
> You can read more about the mechanics of it here:
> [http://dustin.github.com/2010/06/29/memcached-vbuckets.html]
>
> However, you _cannot_ use a mix of clients that are vbucket aware and
> clients that don't use vbuckets, but then again it doesn't make sense
> to use a vbucket aware backend if your clients don't know how to
> access them. This is why we believe a protocol change isn't
> warranted.
>
> 1.2 TAP
> ========
>
> In order to facilitate vbucket transfers, among other use cases where
> people want to see what's inside the server, we added to the binary
> protocol a set of commands collectively called TAP. The intention is
> to allow "clients" to receive a stream of notifications whenever data
> change in the server. It is solely up to the backing store to
> implement this, so it can make decisions about what resources are used
> to implement TAP. This functionality is needed commonly enough,
> though, that the core is aware of it, leaving the specific
> implementation to engines.
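
The vbucket idea from section 1.1 may be easier to see in code. What follows is only a minimal sketch under stated assumptions: the CRC32 hash, the count of 1024 vbuckets, the round-robin layout, the server names and every function name are hypothetical choices for illustration, not something the proposal specifies.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed count; it only has to fit in the 2-byte vbucket id


def vbucket_for_key(key: bytes, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Map a key to a vbucket id (hypothetical hash choice: CRC32)."""
    return zlib.crc32(key) % num_vbuckets


def build_vbucket_map(servers, num_vbuckets=NUM_VBUCKETS):
    """Spread vbuckets round-robin over the servers (illustrative only)."""
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def server_for_key(key: bytes, vbucket_map) -> str:
    """Look up the server that currently owns the key's vbucket."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]


servers = ["cache-a:11211", "cache-b:11211"]  # made-up server names
vbucket_map = build_vbucket_map(servers)

key = b"user:42"
vb = vbucket_for_key(key)
before = server_for_key(key, vbucket_map)

# "Growing" the cluster: hand one vbucket to a new server. The key still
# hashes to the same vbucket; only the vbucket-to-server table changes.
vbucket_map[vb] = "cache-c:11211"
after = server_for_key(key, vbucket_map)
```

The point of the indirection is that the key-to-vbucket step never changes, so moving data means moving whole vbuckets and updating one table entry.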
>
> 1.3 New commands
> =================
>
> There are a few new commands available. The following sections
> provide a brief description of them. Please check protocol_binary.h
> for the implementation details.
>
> 1.3.1 VERBOSITY
> ----------------
>
> The binary protocol did not have an equivalent of the verbosity
> command in the textual protocol. This command allows the user to
> change the verbosity level on a running server by using the binary
> protocol. Why do we need this? There is a command line option you may
> use to disable the ascii protocol, so we need this command in order to
> change the logging level in those configurations.
>
> 1.3.2 TOUCH, GAT and GATQ
> --------------------------
>
> One of the problems with the existing commands in memcached is that
> you couldn't tell the memcached server that the object is still valid
> and that you just want a longer expiration. Normally you want to put
> an expiry time on the objects, so that you can get an indication if
> your cache is big enough (by watching the eviction stats: if your
> memcached server has a high eviction rate your cache isn't big enough
> for what you want to have in there). The normal idea is that the items
> you're using regularly would be bumped to the front of your LRU (and
> hence not be kicked out immediately).
>
> The touch command lets you set the expiry time for an object without
> retrieving the object. In most cases, you will not want to do this
> unless you provide a CAS value to ensure that you're touching the
> correct version of the object.
>
> GAT means "get and touch" and returns the object in addition to
> setting a new expiration time. This allows you to have a rolling
> window of expiry that has a TTL in addition to the access time. For
> example, you can instruct memcached to allow an object to live no
> later than five minutes after the last time it was accessed (but as
> always, it may expire sooner).
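
To make the wire format of TOUCH and GAT concrete, here is a hedged sketch of how a client might assemble such a request. It assumes the usual 24-byte binary request header with the vbucket id in the two formerly reserved bytes, and a 4-byte expiration as the only extras field. The opcode values are as I recall them from protocol_binary.h and should be checked against that header; the function name is made up.

```python
import struct

# Opcodes as recalled from protocol_binary.h; verify against the header.
CMD_TOUCH = 0x1C
CMD_GAT = 0x1D
CMD_GATQ = 0x1E

REQUEST_MAGIC = 0x80
# Fields: magic, opcode, key length, extras length, data type, vbucket id,
# total body length, opaque, cas (all in network byte order).
HEADER = struct.Struct(">BBHBBHIIQ")


def build_touch_like_request(opcode, key, expiration, vbucket=0, opaque=0, cas=0):
    """Build a TOUCH/GAT/GATQ request: 24-byte header, 4-byte expiry, key."""
    extras = struct.pack(">I", expiration)
    header = HEADER.pack(
        REQUEST_MAGIC,
        opcode,
        len(key),
        len(extras),
        0,  # data type: raw bytes
        vbucket,
        len(extras) + len(key),
        opaque,
        cas,
    )
    return header + extras + key


# "Get and touch": fetch the item and push its expiry five minutes out.
packet = build_touch_like_request(CMD_GAT, b"session:abc", expiration=300)
```

A client that is not vbucket aware simply leaves `vbucket` at 0, which matches the compatibility argument in section 1.1.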
>
> 1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
> --------------------------------------------
>
> These commands are used to set, get or delete a vbucket on the server.
>
> 1.3.4 TAP_CONNECT
> ------------------
>
> Connect and request that the server initialize a TAP stream.
>
> The point of this command is to allow clients to connect and specify a
> few things about the data they wish to receive. Specifically, the
> client will typically specify a date either in the past or in the
> future, along with a vbucket. The server will then stream data mutated
> since that given date or, if a future date is specified, only stream
> new mutations as they arrive. The specific details about which
> mutations to send may vary by implementation.
>
> 1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
> ------------------------------------------
>
> TAP_MUTATION is a notification that an item changed value in the
> server.
>
> The mutation typically comes with the new value.
>
> TAP_DELETE is a notification that a key was deleted on the server.
>
> Finally, to avoid having to send a complete list of all the keys in
> the server when the user issues a flush, we can send a single message
> (TAP_FLUSH) representing the flush. Please note that the FLUSH
> message means _ALL_ vbuckets, and not just a single vbucket.
>
> 1.3.6 TAP_OPAQUE
> -----------------
>
> To allow storage engines to send their own messages over the tap
> stream between each other, a tap opaque message is defined. It is
> completely up to the storage engine to specify the internal layout of
> the packet.
>
> 1.3.7 TAP_VBUCKET_SET
> ----------------------
>
> This is a message requesting a vbucket set.
> It is similar to the set_vbucket command, with the difference that
> this message comes over a tap connection (with the extra info a tap
> message contains).
>
> 1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
> --------------------------------------------------
>
> The checkpoint start and end messages may be used by engines that want
> to use checkpoints. Checkpoints are an optional feature that may be
> used by some engines to allow clients to start at a checkpoint
> position. By doing so, the client need not do a full "backfill" even
> if it is revisiting a server after having been gone for a while. The
> TAP_CHECKPOINT_START tells a client that it's the start of a new
> checkpoint, and the TAP_CHECKPOINT_END tells the client when it's
> received everything for that given checkpoint.
>
> 2 Modularity
> ~~~~~~~~~~~~~
>
> As we mentioned in the first email on changes, one big difference with
> this new work is that we've tried to refactor memcached into being a
> modular application instead of being monolithic. In the future, we'd
> like to make the command parser a separate module, so that we may
> load the parsers separately.
>
> 2.1 Engines
> ============
>
> We've done a lot of work trying to refactor the code in memcached to
> avoid the tight coupling between the command protocol parser and the
> actual item storage.
>
> The idea with the engine interface is that the memcached process loads
> a dynamically loadable object and calls a well known function to get a
> set of function pointers. All communication between the memcached
> process and the engine is performed through these function
> pointers. The memcached process provides a set of services to the
> engine as well through another set of function pointers.
>
> The beauty of this is that the user may choose between a set of
> different storage engines that suits their runtime
> environment. People have different requirements for their server.
> Some people need ACID, others may prefer ecstasy ;-) The storage
> interface may let them design their app by using the memcached
> protocol, and they can just swap in the backend that suits their needs
> (be it persistence, replication (sync or lazy), etc.).
>
> 2.2 Extensions
> ===============
>
> The item storage isn't the only place we've tried to create a level of
> modularity. People run memcached in different environments with
> different requirements. You specify the extensions you want to use by
> adding the -X command line argument.
>
> 2.2.1 Logger
> -------------
>
> We've seen a lot of different requests when it comes to logging. Some
> want it to a file, some to syslog (or Windows event log) and some want
> it to standard out. By default memcached will print to stderr, but
> you may specify a different logger by loading the appropriate module
> with the -X command line argument.
>
> 2.2.2 Daemon
> -------------
>
> You might want to have some daemons providing extra services inside
> your memcached server. Examples would be things like a doors server
> to provide additional access to your server (Trond's favorite), or
> perhaps a "dispatcher" offering a threadpool for your engines to
> use.
>
> 2.2.3 ASCII commands
> ---------------------
>
> If you really need to extend the ASCII protocol, you may now load
> additional ASCII commands as loadable modules. We don't need a
> separate module for binary commands, because those are already handled
> inside memcached due to the fixed semantics of the protocol. This
> isn't necessarily encouraged, but sometimes it is required to get
> something done quickly.
>
> 3 New stats
> ~~~~~~~~~~~~
>
> There are a number of new stats introduced. The key supplied in the
> stats command is passed to the storage engine to allow the storage
> engine to add extra information to the existing stats commands, and to
> create its own stats commands.
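
The way a stats key flows from the core to the engine can be illustrated roughly as follows. This is a loose sketch only: `EchoEngine`, `handle_stats`, the `add_stat` callback and the stat values are all hypothetical names invented here, and Python callables stand in for the function pointers described in section 2.1.

```python
class EchoEngine:
    """Hypothetical engine that reports stats through a callback."""

    def get_stats(self, key, add_stat):
        """Add engine stats for `key`; return False if the key is unknown."""
        if key == "":
            add_stat("curr_items", 0)  # extra info for the plain stats command
            return True
        if key == "settings":
            add_stat("engine_name", "echo")  # an engine-defined stat
            return True
        return False


def handle_stats(key, engine):
    """Core-side handler: emit core stats, then let the engine add its own."""
    lines = []

    def add_stat(name, value):
        lines.append(f"STAT {name} {value}")

    if key == "":
        add_stat("rejected_conns", 0)  # a core-owned stat (section 3.1.2)

    known = engine.get_stats(key, add_stat)
    if not known and not lines:
        return ["ERROR"]
    return lines + ["END"]


default_stats = handle_stats("", EchoEngine())
settings_stats = handle_stats("settings", EchoEngine())
unknown_stats = handle_stats("bogus", EchoEngine())
```

The core never needs to know which keys an engine understands; it just forwards the key and collects whatever the engine reports.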
>
> 3.1 Stats returned by the default stats command
> ================================================
>
> 3.1.1 libevent
> ---------------
>
> Over time we've seen a lot of bugs around people using an old
> version of libevent. That's part of the reason why we bundle a well
> known version of libevent in the release distribution. Memcached
> checks the libevent version during startup, and will refuse to start
> if the one used is too old. Since most operating systems use shared
> libraries these days, you might be using another version than the one
> you originally used when you first built memcached. In order for us to
> see which library people are using we decided to put it into the stats
> as well.
>
> 3.1.2 rejected_conns
> ---------------------
>
> The number of times a connection attempt was refused because the
> maximum number of connections had been reached.
>
> 3.1.3 stats related to TAP
> ---------------------------
>
> There are a number of stats related to the packets used in the TAP
> protocol. These stats will only appear if they are non-zero:
>
>   tap_checkpoint_start_received  tap_checkpoint_start_sent
>   tap_checkpoint_end_received    tap_checkpoint_end_sent
>   tap_connect_received
>   tap_delete_received            tap_delete_sent
>   tap_flush_received             tap_flush_sent
>   tap_mutation_received          tap_mutation_sent
>   tap_opaque_received            tap_opaque_sent
>   tap_vbucket_set_received       tap_vbucket_set_sent
>
> 3.2 topkeys
> ============
>
> You may get information about the most popular keys in memcached by
> exporting the environment variable MEMCACHED_TOP_KEYS to the number of
> keys you would want memcached to keep track of. There is no such thing
> as a free lunch, so enabling this can have a small memory and speed
> impact. We've decided to _disable_ this by default, so you need to
> export this variable to enable the feature.
> Ex:
>
>   me@localhost:> MEMCACHED_TOP_KEYS=10 ./memcached
>
> Running "stats topkeys" would return something like
>
>   STAT my-key2
>        get_hits=0,get_misses=1,cmd_set=0,incr_hits=0,
>        incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
>        delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
>        cas_misses=0,ctime=2,atime=2
>   STAT my-key1
>        get_hits=1,get_misses=0,cmd_set=1,incr_hits=0,
>        incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
>        delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
>        cas_misses=0,ctime=12,atime=12
>
> (Line breaks and indentation added to make it more readable in this
> document.)
>
> 3.3 aggregate
> ==============
>
> The combination of the storage engine interface and the SASL auth
> makes connection-based stats possible. The aggregate subcommand is
> used to aggregate the stats from all of the connections on the
> server. The stats returned from the aggregate subcommand are the same
> as those returned by the normal stats command.
>
> 3.4 settings
> =============
>
> There are times an engine may want to share details about its
> configuration through stats. This argument to stats will get you
> there.
>
> Just to show a couple of examples...
>
> 3.4.1 extension
> ----------------
>
> Displays one of the extensions loaded (may appear multiple times).
>
> ex:
>
>   STAT logger syslog
>   STAT ascii_extension scrub
>   STAT ascii_extension noop
>   STAT ascii_extension echo
>
> 3.4.2 topkeys
> --------------
>
> The number of keys we are monitoring.
>
> There may be many other settings exposed, depending on the engine's
> configuration.
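
For anyone who wants to consume the "stats topkeys" output from section 3.2 in a script, a small parser might look like the sketch below. It assumes each entry arrives on a single line in the form "STAT <key> name=value,..." (the example above was wrapped only for readability) and that keys contain no spaces. `parse_topkeys_line` is a made-up helper, not part of memcached.

```python
def parse_topkeys_line(line: str):
    """Split one 'STAT <key> a=1,b=2,...' line into (key, {name: int})."""
    prefix, key, payload = line.split(" ", 2)
    if prefix != "STAT":
        raise ValueError(f"not a STAT line: {line!r}")
    counters = {}
    for pair in payload.split(","):
        name, value = pair.split("=", 1)
        counters[name.strip()] = int(value)
    return key, counters


# The my-key1 entry from the example, joined back into one line.
line = (
    "STAT my-key1 get_hits=1,get_misses=0,cmd_set=1,incr_hits=0,"
    "incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,"
    "delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,"
    "cas_misses=0,ctime=12,atime=12"
)
key, counters = parse_topkeys_line(line)
```

From here it is a one-liner to, say, sort keys by `get_hits` and spot the hot ones.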
