On Mon, Feb 23, 2009 at 10:02 AM, dormando <[email protected]> wrote:
>
> Yo,
>
> I'm a little confused by this thread... It appears that the point is to reduce pain or reduce the time required in a full restart of a memcached cluster.
>
> This request looks like it would encourage folks to get themselves into positions where a full restart of a memcached instance is too much pain to bear. Now you can't upgrade the server, upgrade memcached, or tolerate a hardware failure. I've seen too many shops get themselves into this position, and it's frustrating since it stunts our ability to get bugfixes and features deployed.
>
> It feels excessive if the only real benefit is being able to do a full data flush in less time? Is there anything I'm missing?
>
> -Dormando
>
> On Sat, 21 Feb 2009, Jean-Charles Redoutey wrote:
>
> > ok, if you put the future flush in the same basket, I am not *offended* ;-)
> >
> > imho, the main bone of contention we have is that we don't consider the age of an item the same way.
> >
> > As I understand it, for you this is somehow the "content age", i.e. the time the oldest part used to construct the item was put in cache.
> >
> > For me, this is more a "technical age", i.e. the time the item was physically put in the cache, whatever the status of what was used to build it.
> >
> > imho also, both are valid approaches:
> >
> > - the first one ensures the recentness of the actual content, which is really useful if data put in cache is built from other data already in cache
> >
> > - the second one ensures the technical recentness, which is important for at least 2 reasons:
> >
> >   - if the way you put data in cache evolves, at some point the current version may not be compatible with very old ones, and you want to ensure you don't have this kind of very old item still in the cache
> >
> >   - if the distribution of the items on the nodes changes (e.g. a change in the number of servers), you originally have the item on one node, then it goes to another one, and then, after another distribution change, it can go back where it was before (typical case: a configuration fallback); if neither the LRU nor the TTL has deleted this old item, you will end up not having a cache miss but a cache hit on deprecated data, something we definitely want to avoid. Basically, to prevent that, we need to ensure no data in the cache is older than the last configuration change.
> >
> > In the end, the first one can only be ensured by the instantaneous flush.
> >
> > The second admittedly also, but this can also be done with a nearly null impact on the DB with the negative delay feature, and imho this is worth a dedicated feature. If this has to go through a new command, I don't mind at all; I proposed the negative delay since it means less than 10 lines of code and no impact on the existing features, and, even if this may be misleading to people reading the feature description inattentively, the exact behaviour can be precisely described in less than 20 words...
> >
> > ---
> > Jean-Charles
> >
> >
> > On Fri, Feb 20, 2009 at 23:08, Dustin <[email protected]> wrote:
> >
> > > On Feb 20, 10:29 am, Jean-Charles Redoutey <[email protected]> wrote:
> > >
> > > > If we go for 2, the *right* way to use the delayed flush would be something like flush +10 on server a and flush +20 on server b.
> > >
> > > I've also argued for the removal of flush with delay. It was semantically confusing with delete with reserve (which was removed), and is really easy to do as a client feature. I don't think it makes sense to exist as a server function at all.
> > >
> > > > The only way to have the global consistency you are describing is to flush all the nodes with the exact same delay, which is simply inapplicable to a production cache.
> > >
> > > Well, flush them at the same time. If you issue a flush on my client, it does all of them as concurrently as possible.
> > >
> > > It's about reducing the window of error. As you've pointed out, the larger the window is, the more of a chance there is.
> > >
> > > > Since you can't ensure the *real* age of a piece of data, basically the first time any part used to build it was put in any of the nodes within the cluster, why not focus on something you can ensure, the time this particular data was put in cache? In which case, whatever the sign of the flush delay, we have the same semantics.
> > >
> > > I suppose the difference of opinion regarding the semantics is that in the non-negative case (existing code), *all* records are invalidated within memcached.
> > >
> > > Does anyone actually use a "future" flush that can't be done client-side?
> > >
> > > --
> > > "Be excellent to each other"
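[Editor's note: the client-side concurrent flush Dustin describes above can be sketched roughly as follows. This is a hedged illustration, not the client he refers to; `nodes` and `send_command` are hypothetical stand-ins for a real client's connection layer. `flush_all` and its `OK` reply are the standard memcached ASCII protocol.]

```python
from concurrent.futures import ThreadPoolExecutor

def flush_all_nodes(nodes, send_command):
    """Fire the ASCII 'flush_all' command at every node as concurrently
    as possible, shrinking the window in which some nodes are flushed
    and others are not.

    nodes:        opaque node identifiers (e.g. "host:port" strings).
    send_command: callable(node, bytes) -> bytes reply; a stand-in for
                  a real client's socket write/read. A successful flush
                  replies b"OK\r\n".
    """
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        # Submit every flush before collecting any reply, so the spread
        # between the first and last flush is roughly one dispatch loop
        # plus the slowest round trip, not N sequential round trips.
        futures = {node: pool.submit(send_command, node, b"flush_all\r\n")
                   for node in nodes}
        return {node: fut.result() for node, fut in futures.items()}
```

On this view, a staggered or delayed flush stays a client-side convenience: the client decides when to call this, and the server never needs to carry a delay (or negative-delay) feature at all.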
