What is the difference between GAT and GATQ?

- Nelz

On Tue, Apr 12, 2011 at 02:52, Trond Norbye <[email protected]> wrote:
>
> On 12. apr. 2011, at 05.55, dormando wrote:
>
>> ps. folks please look this over and evaluate. Do you understand
>> everything? Does anything suck? Need more clarification? Whatever?
>>
>> http://code.google.com/p/memcached/downloads/detail?name=memcached-1.6.0_beta1.tar.gz
>> ^ easy-bake oven form beta release. Passes tests on a bunch of platforms,
>> but possibly not OpenBSD.
>>
>
> I've fixed that bug. See: 
> https://github.com/memcached/memcached/commit/cc3941084188195fc8b43fcdc05cec3dab5a4bd4
>
> Cheers,
>
> Trond
>
>
>> Make evaluating! Give major feedback.
>>
>> -Dormando
>>
>> On Mon, 11 Apr 2011, Trond Norbye wrote:
>>
>>>                       What's new in memcached
>>>                       =======================
>>>
>>> (part two - new feature proposals)
>>>
>>> Table of Contents
>>> =================
>>> 1 Protocol
>>>    1.1 Virtual buckets!
>>>    1.2 TAP
>>>    1.3 New commands
>>>        1.3.1 VERBOSITY
>>>        1.3.2 TOUCH, GAT and GATQ
>>>        1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
>>>        1.3.4 TAP_CONNECT
>>>        1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
>>>        1.3.6 TAP_OPAQUE
>>>        1.3.7 TAP_VBUCKET_SET
>>>        1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
>>> 2 Modularity
>>>    2.1 Engines
>>>    2.2 Extensions
>>>        2.2.1 Logger
>>>        2.2.2 Daemon
>>>        2.2.3 ASCII commands
>>> 3 New stats
>>>    3.1 Stats returned by the default stats command
>>>        3.1.1 libevent
>>>        3.1.2 rejected_conns
>>>        3.1.3 stats related to TAP
>>>    3.2 topkeys
>>>    3.3 aggregate
>>>    3.4 settings
>>>        3.4.1 extension
>>>        3.4.2 topkeys
>>>
>>>
>>> 1 Protocol
>>> ~~~~~~~~~~~
>>>
>>> Intentionally, there is no significant difference in protocol over
>>> 1.4.x.  There is one minor change, but it should be transparent to
>>> most users.
>>>
>>> 1.1 Virtual buckets!
>>> =====================
>>>
>>> We don't know who originally came up with the idea, but we've heard
>>> rumors that it might be Anatoly Vorobey or Brad Fitzpatrick.  In lieu
>>> of a full explanation on this, the concept is that instead of mapping
>>> each key to a server we map it to a virtual bucket.  These virtual
>>> buckets are then distributed across all of the servers.  To ease the
>>> introduction of this we've assigned the two reserved bytes in the
>>> binary protocol for specifying the vbucket id, which allowed us to
>>> avoid protocol extensions.
>>>
>>> Note that this change should allow for complete compatibility if the
>>> clients and the server are not aware of vbuckets.  The reserved bytes
>>> should have been set to 0 according to the original binary protocol
>>> specification, which means that such clients will always use vbucket 0.
>>>
>>> The idea is that we can move these vbuckets between servers such that
>>> you can "grow" or "shrink" your cluster without losing data in your
>>> cache. The classic memcached caching engine does _not_ implement
>>> support for multiple vbuckets right now, but it is on the roadmap to
>>> create a version of the engine in memcached to support this (it is a
>>> question of memory efficiency, and there are currently not many
>>> clients that support them).
>>>
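A minimal sketch of the key -> vbucket -> server mapping described above. The hash function (CRC32), the vbucket count and the server names are assumptions made for illustration only; the announcement does not prescribe any of them.

```python
import zlib

NUM_VBUCKETS = 8  # hypothetical count; the id must fit the 16-bit header field
SERVERS = ["cache-a:11211", "cache-b:11211"]  # hypothetical server names

# vbucket id -> index into SERVERS. This table is the only thing that changes
# when a vbucket moves between servers.
vbucket_map = [vb % len(SERVERS) for vb in range(NUM_VBUCKETS)]


def vbucket_for(key: bytes) -> int:
    """Map a key to a vbucket id (CRC32 is an assumed hash, not a mandated one)."""
    return zlib.crc32(key) % NUM_VBUCKETS


def server_for(key: bytes) -> str:
    """Resolve a key to the server that currently owns its vbucket."""
    return SERVERS[vbucket_map[vbucket_for(key)]]


def move_vbucket(vb: int, server_index: int) -> None:
    """Reassign one vbucket, e.g. after its data has been transferred."""
    vbucket_map[vb] = server_index
```

Because a key always hashes to the same vbucket, growing or shrinking the cluster only means editing vbucket_map; the key-to-vbucket step never changes.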
>>> Defining this now will allow us to start moving down the path to
>>> vbuckets in the default_engine and allow other engine implementors to
>>> consider vbuckets in their design.
>>>
>>> You can read more about the mechanics of it here:
>>> [http://dustin.github.com/2010/06/29/memcached-vbuckets.html]
>>>
>>> However, you _cannot_ use a mix of clients that are vbucket aware and
>>> clients that don't use vbuckets, but then again it doesn't make sense
>>> to use a vbucket aware backend if your clients don't know how to
>>> access them.  This is why we believe a protocol change isn't
>>> warranted.
>>>
>>> 1.2 TAP
>>> ========
>>>
>>> In order to facilitate vbucket transfers, among other use cases where
>>> people want to see what's inside the server, we added to the binary
>>> protocol a set of commands collectively called TAP.  The intention is
>>> to allow "clients" to receive a stream of notifications whenever data
>>> change in the server.  It is solely up to the backing store to
>>> implement this, so it can make decisions about what resources are used
>>> to implement TAP.  This functionality is needed commonly enough,
>>> though, that the core is aware of it, leaving the specific
>>> implementation to engines.
>>>
>>> 1.3 New commands
>>> =================
>>>
>>> There are a few new commands available.  The following sections
>>> provide a brief description of them.  Please check protocol_binary.h
>>> for the implementation details.
>>>
>>> 1.3.1 VERBOSITY
>>> ----------------
>>>
>>> The binary protocol did not have an equivalent of the textual
>>> protocol's verbosity command.  This command allows you to change the
>>> verbosity level on your running server by using the binary protocol.
>>> Why do we need
>>> this? There is a command line option you may use to disable the ascii
>>> protocol, so we need this command in order to change the logging level
>>> in those configurations.
>>>
>>> 1.3.2 TOUCH, GAT and GATQ
>>> --------------------------
>>>
>>> One of the problems with the existing commands in memcached is that
>>> you couldn't tell the memcached server that the object is still valid
>>> and we just want a longer expiration.  Normally you want to put an
>>> expiry time on the objects, so that you can get an indication if your
>>> cache is big enough (by watching the eviction stats.. if your
>>> memcached server has a high eviction rate your cache isn't big enough
>>> for what you want to have in there).  The normal idea is that the items
>>> you're normally using would be bumped to the front of your LRU (and
>>> hence not be kicked out immediately).
>>>
>>> The touch command lets you set the expiry time for an object without
>>> retrieving the object.  In most cases, you will not want to do this
>>> unless you provide a CAS value to ensure that you're touching the
>>> correct version of the object.
>>>
>>> GAT means "get and touch" and returns the object in addition to
>>> setting a new expiration time.  This allows you to have a rolling
>>> window of expiry that has a TTL in addition to the access time.  For
>>> example, you can instruct memcached to allow an object to live no
>>> later than five minutes after the last time it was accessed (but as
>>> always, it may expire sooner).
>>>
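A hedged sketch of what a TOUCH/GAT request could look like on the wire. It assumes the standard 24-byte binary protocol request header with a 4-byte expiration in the extras; the opcode values are as I recall them from protocol_binary.h and should be verified there. GATQ is not described above; by analogy with GETQ it is presumably the "quiet" variant of GAT, but that is an assumption. The function name build_touch_like is made up for this example.

```python
import struct

# Opcode values as recalled from protocol_binary.h; verify before relying on them.
OP_TOUCH, OP_GAT, OP_GATQ = 0x1C, 0x1D, 0x1E
REQUEST_MAGIC = 0x80

# magic, opcode, key length, extras length, data type, vbucket id,
# total body length, opaque, cas (network byte order, 24 bytes in total)
HEADER = struct.Struct("!BBHBBHIIQ")


def build_touch_like(opcode: int, key: bytes, expiry: int,
                     vbucket: int = 0, opaque: int = 0, cas: int = 0) -> bytes:
    """Build a request whose extras hold the new expiration time."""
    extras = struct.pack("!I", expiry)
    body_len = len(extras) + len(key)
    header = HEADER.pack(REQUEST_MAGIC, opcode, len(key), len(extras),
                         0, vbucket, body_len, opaque, cas)
    return header + extras + key


# Rolling five-minute window: fetch the item and push its expiry out by 300s.
packet = build_touch_like(OP_GAT, b"session:1", 300)
```

The vbucket argument fills the two formerly reserved bytes mentioned in section 1.1; leaving it at 0 matches what vbucket-unaware clients send.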
>>> 1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
>>> --------------------------------------------
>>>
>>> These commands are used to set, get or delete a vbucket on the server.
>>>
>>> 1.3.4 TAP_CONNECT
>>> ------------------
>>>
>>> Connect and request that the server initialize a TAP stream.
>>>
>>> The point of this command is to allow clients to connect and specify a
>>> few things about the data they wish to receive.  Specifically, the
>>> client will typically specify a date either in the past or in the
>>> future along with specifying a vbucket.  The server will then stream
>>> data mutated since that given date or if a future date is specified,
>>> only stream new mutations as they arrive.  The specific details about
>>> which mutations to send may vary on implementation.
>>>
>>> 1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
>>> ------------------------------------------
>>>
>>> TAP_MUTATION is a notification that an item changed value in the
>>> server.
>>>
>>> The mutation typically comes with the new value.
>>>
>>> TAP_DELETE is a notification that a key was deleted on the server.
>>>
>>> Finally, to avoid having to send a complete list of all the keys in
>>> the server when the user issues a flush, we can send a single message
>>> (TAP_FLUSH) representing the flush.  Please note that the FLUSH
>>> message means _ALL_ vbuckets, and not just a single vbucket.
>>>
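To illustrate the semantics of the three messages, here is a small sketch of a TAP consumer that keeps an in-memory copy of the data. It deliberately ignores the wire format: the event names are passed as plain strings, and the idea that each mutation and delete carries a vbucket id is an assumption for this example. The names replica and apply_tap_event are hypothetical.

```python
from collections import defaultdict

# vbucket id -> {key: value}; a hypothetical in-memory replica
replica = defaultdict(dict)


def apply_tap_event(event, vbucket=None, key=None, value=None):
    """Apply one already-decoded TAP notification to the replica."""
    if event == "TAP_MUTATION":
        # A mutation typically carries the new value.
        replica[vbucket][key] = value
    elif event == "TAP_DELETE":
        replica[vbucket].pop(key, None)
    elif event == "TAP_FLUSH":
        # A flush applies to ALL vbuckets, not just one.
        replica.clear()
    else:
        raise ValueError(f"unhandled TAP event: {event}")
```

Note that TAP_FLUSH takes no vbucket here, which mirrors the point above that a single message stands in for deleting every key.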
>>> 1.3.6 TAP_OPAQUE
>>> -----------------
>>>
>>> To allow storage engines to send their own messages over the tap
>>> stream between each other, a tap opaque message is defined.  It is
>>> completely up to the storage engine to specify the internal layout of
>>> the package.
>>>
>>> 1.3.7 TAP_VBUCKET_SET
>>> ----------------------
>>>
>>> This is a message requesting a vbucket set. It is similar to the
>>> set_vbucket command, with the difference that this message comes over
>>> a tap connection (with the extra info a tap message contains).
>>>
>>> 1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
>>> --------------------------------------------------
>>>
>>> The checkpoint start and end messages may be used by engines that want
>>> to use checkpoints.  Checkpoints are an optional feature that may be
>>> used by some engines to allow clients to start at a checkpoint
>>> position.  By doing so, the client need not do a full "backfill" even
>>> if it is revisiting a server after having been gone for a while.  The
>>> TAP_CHECKPOINT_START tells a client that it's the start of a new
>>> checkpoint, and the TAP_CHECKPOINT_END tells the client when it's
>>> received everything for that given checkpoint.
>>>
>>> 2 Modularity
>>> ~~~~~~~~~~~~~
>>>
>>> As we mentioned in the first email on changes, one big difference with
>>> this new work is that we've tried to refactor memcached into being a
>>> modular application instead of being monolithic.  In the future, we'd
>>> like to make the command parser a separate module, so that we may
>>> load the parsers separately.
>>>
>>> 2.1 Engines
>>> ============
>>>
>>> We've done a lot of work trying to refactor the code in memcached to
>>> avoid the tight coupling between the command protocol parser and the
>>> actual item storage.
>>>
>>> The idea with the engine interface is that the memcached process loads
>>> a dynamically loadable object and calls a well known function to get a
>>> set of function pointers.  All communication between the memcached
>>> process and the engine is performed through these function
>>> pointers.  The memcached process provides a set of services to the
>>> engine as well through another set of function pointers.
>>>
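The shape of that interface can be sketched as follows. This is only an analogy in Python: the real mechanism is a dynamically loaded object exposing C function pointers, and every name below (EngineInterface, ServerServices, create_instance and their fields) is hypothetical rather than taken from the actual headers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ServerServices:
    """Services the memcached process offers to the engine."""
    log: Callable[[str], None]


@dataclass
class EngineInterface:
    """The set of 'function pointers' the engine hands back to the core."""
    get: Callable[[bytes], Optional[bytes]]
    store: Callable[[bytes, bytes], None]
    remove: Callable[[bytes], bool]


def create_instance(services: ServerServices) -> EngineInterface:
    """The single well-known entry point the core calls after loading."""
    items: Dict[bytes, bytes] = {}

    def store(key: bytes, value: bytes) -> None:
        items[key] = value
        services.log(f"stored {key!r}")

    def remove(key: bytes) -> bool:
        return items.pop(key, None) is not None

    return EngineInterface(get=items.get, store=store, remove=remove)
```

The core never touches the items dictionary directly; swapping in a different backend only means providing another create_instance that returns the same set of callables.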
>>> The beauty of this is that the user may choose between a set of
>>> different storage engines that suit their runtime
>>> environment.  People have different requirements for their server. Some
>>> people need ACID, others may prefer ecstasy ;-) The storage interface
>>> may let them design their app by using the memcached protocol, and
>>> they can just swap in the backend that suits their needs (be it
>>> persistence, replication (sync or lazy), etc.).
>>>
>>> 2.2 Extensions
>>> ===============
>>>
>>> The item storage isn't the only place we've tried to create a level of
>>> modularity.  People run memcached in different environments with
>>> different requirements. You specify the extensions you want to use by
>>> adding the -X command line argument.
>>>
>>> 2.2.1 Logger
>>> -------------
>>>
>>> We've seen a lot of different requests when it comes to logging. Some
>>> want it to a file, some to syslog (or Windows event log) and some want
>>> it to standard out.  By default memcached will print to stderr, but
>>> you may specify a different logger by loading the appropriate module
>>> with the -X command line argument.
>>>
>>> 2.2.2 Daemon
>>> -------------
>>>
>>> You might want to have some daemons providing extra services inside
>>> your memcached server.  Examples would be things like a doors server
>>> to provide additional access to your server (Trond's favorite), or
>>> perhaps a "dispatcher" offering a threadpool for your engines to
>>> use.
>>>
>>> 2.2.3 ASCII commands
>>> ---------------------
>>>
>>> If you really need to extend the ASCII protocol, you may now load
>>> additional ASCII commands as loadable modules.  We don't need a
>>> separate module for binary commands, because those are already handled
>>> inside memcached due to the fixed semantics on the protocol.  This
>>> isn't necessarily encouraged, but sometimes it is required to get
>>> something done quickly.
>>>
>>> 3 New stats
>>> ~~~~~~~~~~~~
>>>
>>> There are a number of new stats introduced.  The key supplied in the
>>> stats command is passed to the storage engine to allow the storage
>>> engine to add extra information to the existing stats commands, and to
>>> create its own stat commands.
>>>
>>> 3.1 Stats returned by the default stats command
>>> ================================================
>>>
>>> 3.1.1 libevent
>>> ---------------
>>>
>>> Over time we've seen a lot of bugs around people using an old
>>> version of libevent.  That's part of the reason why we bundle a well
>>> known version of libevent in the release distribution.  Memcached
>>> checks the libevent version during startup, and will refuse to start
>>> if the one used is too old.  Since most operating systems use shared
>>> libraries these days, you might be using another version than the one
>>> you originally used when you first built memcached.  In order for us to
>>> see which library people are using we decided to put it into the stats
>>> as well.
>>>
>>> 3.1.2 rejected_conns
>>> ---------------------
>>>
>>> The number of times a connection attempt was refused because we were
>>> hitting the maximum number of connections.
>>>
>>> 3.1.3 stats related to TAP
>>> ---------------------------
>>>
>>> There are a number of stats related to the packages used in the TAP
>>> protocol.  These stats will only appear if they are non-zero:
>>>
>>> tap_checkpoint_start_received tap_checkpoint_start_sent
>>> tap_checkpoint_end_received tap_checkpoint_end_sent
>>> tap_connect_received tap_delete_received tap_delete_sent
>>> tap_flush_received tap_flush_sent tap_mutation_received
>>> tap_mutation_sent tap_opaque_received tap_opaque_sent
>>> tap_vbucket_set_received tap_vbucket_set_sent
>>>
>>> 3.2 topkeys
>>> ============
>>>
>>> You may get information about the most popular keys in memcached by
>>> setting the environment variable MEMCACHED_TOP_KEYS to the number of
>>> keys you want memcached to keep track of.  There is no such thing
>>> as a free lunch, so enabling this can have a small memory and speed
>>> impact.  We've decided to _disable_ this by default, so you need to
>>> export this variable to enable the feature. Ex:
>>>
>>> me@localhost:> MEMCACHED_TOP_KEYS=10 ./memcached
>>>
>>> Running "stats topkeys" would return something like
>>>
>>> STAT my-key2 get_hits=0,get_misses=1,cmd_set=0,incr_hits=0,
>>>     incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
>>>     delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
>>>     cas_misses=0,ctime=2,atime=2
>>> STAT my-key1 get_hits=1,get_misses=0,cmd_set=1,incr_hits=0,
>>>     incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
>>>     delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
>>>     cas_misses=0,ctime=12,atime=12
>>>
>>> (Line breaks and indentation were added to make the output more
>>> readable in this document.)
>>>
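Since each key's counters arrive as one comma-separated STAT line, they are easy to turn into a dictionary. A small sketch, assuming the line format is exactly as in the sample output above (with the added line breaks removed); parse_topkeys_line is a hypothetical helper name.

```python
def parse_topkeys_line(line: str):
    """Split 'STAT <key> name=value,...' into (key, {name: int})."""
    tag, key, fields = line.strip().split(" ", 2)
    if tag != "STAT":
        raise ValueError(f"not a STAT line: {line!r}")
    counters = {}
    for pair in fields.split(","):
        name, _, value = pair.partition("=")
        counters[name.strip()] = int(value)
    return key, counters


key, counters = parse_topkeys_line(
    "STAT my-key1 get_hits=1,get_misses=0,cmd_set=1,ctime=12,atime=12"
)
```

From here, sorting keys by counters["get_hits"] or similar is a one-liner on the client side.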
>>> 3.3 aggregate
>>> ==============
>>>
>>> The combination of the storage engine interface and SASL auth
>>> allows for connection-based stats.  The aggregate
>>> subcommand is used to aggregate the stats from all of the connections
>>> on the server.  The stats returned from the aggregate subcommand are
>>> the same as those of the normal stats command.
>>>
>>> 3.4 settings
>>> =============
>>>
>>> There are times an engine may want to share details about its
>>> configuration through stats.  This argument to stats will get you
>>> there.
>>>
>>> Just to show a couple of examples...
>>>
>>> 3.4.1 extension
>>> ----------------
>>>
>>> Displays one of the extensions loaded (may appear multiple times).
>>>
>>> ex:
>>>
>>> STAT logger syslog
>>> STAT ascii_extension scrub
>>> STAT ascii_extension noop
>>> STAT ascii_extension echo
>>>
>>> 3.4.2 topkeys
>>> --------------
>>>
>>> The number of keys we are monitoring.
>>>
>>> There may be many other settings exposed, depending on the engine's
>>> configuration.
>>>
>>>
>
>
