> Hey Dormando, thanks again for some comments... appreciate the help.
>
> Maybe I wasn't clear enough. I need only 1 minute of persistence, and I can
> lose data sometimes; I just can't keep losing data every minute due to
> constant evictions caused by the LRU. I actually just wrote that in my
> previous post. We're losing about 1 minute of non-meaningful data every
> week because of the restart we do when memory starts to fill up (even with
> our patch that reclaims using a linked list; we limit reclaiming to keep
> speed up)... so the memory fills up after a week, not 30 minutes...
Can you explain what you're seeing in more detail? Your data only needs to
persist for 1 minute, but it's being evicted before 1 minute is up?
You made it sound like you had some data which never expired? Is this
true?
If your instance is 16GB and takes a week to fill up, but data that only
needs to persist for a minute is being evicted sooner, something else is
very broken? Or am I still misunderstanding you?
> Now I'm working on a better solution, to limit locking as the linked list
> gets bigger.
>
> I explained the worst implications of unwanted evictions (or losing all
> data in the cache) for my use case:
> 1. losing ~1 minute of non-significant data that's about to be stored in SQL
> 2. "flat" distribution of load to workers (not taking response times into
> account, because the stats reset)
> 3. falling back to an alternative targeting algorithm (with global, not
> local, statistics)
>
> I never, ever said I'm going to write data that has to be persistent
> permanently. It's actually the same idea as a delayed write. If power fails
> you lose 5s of data, but you can do 100x more writes. So you need the data
> to be persistent in memory; between writes the data **can't be lost**.
> However, you can lose it sometimes; that's a tradeoff that some people can
> make and some can't. Obviously I can't keep losing this data every minute,
> because if I lose too much of it, the loss becomes meaningful.
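>
> A minimal sketch of that delayed-write pattern (mc_incr / mc_get_counter /
> mc_delete here are hypothetical stand-ins for whatever memcached client
> API is in use; this is not our actual code):
>
>     #include <stdio.h>
>     #include <time.h>
>
>     /* Writers bump a per-minute counter key in memcached instead of
>        hitting SQL directly. */
>     void record_event(const char *worker) {
>         char key[64];
>         snprintf(key, sizeof(key), "stats:%s:%ld", worker,
>                  (long)(time(NULL) / 60));
>         mc_incr(key, 1);               /* hypothetical atomic increment */
>     }
>
>     /* A flush job runs once a minute and persists the previous minute's
>        counter as a single SQL write. If the cache dies, at most ~1 minute
>        of these counters is lost. */
>     void flush_to_sql(const char *worker) {
>         char key[64];
>         snprintf(key, sizeof(key), "stats:%s:%ld", worker,
>                  (long)(time(NULL) / 60 - 1));
>         long n = mc_get_counter(key);  /* hypothetical read */
>         sql_bulk_insert(worker, n);    /* hypothetical; one row per flush */
>         mc_delete(key);
>     }
>
> If one of those counter items gets evicted before the flush, the row is
> simply gone, which is why evictions (rather than expiry) are the problem.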
>
> Maybe I wasn't clear on that matter. I can lose all the data even 20 times
> a day. Sensitive data is stored using bulk updates or transactions,
> bypassing that "delayed write" layer. "0 evictions": that's the kind of
> "persistence" I'm going for. So items are persistent for some very short
> period of time (1-5 minutes) without being killed. It's just a different
> use case. It's been running in production for 2 years, based on 1.4.13,
> tested for correctness, and monitored so that we have enough memory and 0
> evictions (just reclaims).
>
> When I came here with the same idea ~2 years ago you just said it's very
> stupid; now you've even made me look like a moron :) And I can understand
> why you don't want features that aren't perfectly ~O(1), but please don't
> get so personal about different ideas for doing things and different use
> cases, just because these won't work for you.
>
> On Thursday, April 10, 2014 at 20:53:12 UTC+2, Dormando wrote:
> You really really really really really *must* not put data in memcached
> which you can't lose.
>
> Seriously, really don't do it. If you need persistence, try using a redis
> instance for the persistent stuff, and use memcached for your cache stuff.
> I don't see why you feel like you need to write your own thing; there're a
> lot of persistent key/value stores (kyotocabinet/etc?). They have a much
> lower request ceiling and don't handle the LRU/cache pattern as well, but
> that's why you can use both.
>
> Again, please please don't do it. You are damaging your company. You are a
> *danger* to your company.
>
> On Thu, 10 Apr 2014, Slawomir Pryczek wrote:
>
> > Hi Dormando, thanks for the suggestions, a background thread would be
> > nice... The idea is actually that with 2-3GB I get plenty of evictions
> > of items that need to be fetched later. And with 16GB I still get
> > evictions; actually I could probably throw even more memory than 16G at
> > it and it'd only result in more expired items sitting in the middle of
> > slabs, forever... Now I'm going for persistence. It probably sounds
> > crazy, but we have some data that we can't lose:
> > 1. statistics: we aggregate writes to the DB using memcached (+ a list
> > implementation). If these items get evicted we're losing rows in the DB.
> > Losing data sometimes isn't a big problem; e.g. we restart memcached
> > once a week, so we're losing 1 minute of data every week. But if we have
> > evictions we're losing data constantly (which we can't have).
> > 2. we drive a load balancer using statistics data in memcached; again,
> > it's not nice to lose data often, because workers can get an incorrect
> > amount of traffic.
> > 3. we're doing some adserving optimizations, e.g. computing per-domain
> > ad priority. For one domain it takes about 10 seconds to analyze all the
> > data and create a list of ads, so it can't be done online... we put the
> > result of this in memcached, and if we lose too much of it the system
> > will start to serve suboptimal ads (because it'll need to switch to more
> > general data or a much simpler algorithm that can run instantly).
> >
> > It would probably be best to rewrite all this in C or golang, and use
> > memcached just for caching, but it'd take too much time, which we don't
> > have currently...
> >
> > I have seen the twitter and nk implementations that seem to do what I
> > need, but they seem old (based on old code), so I prefer to modify the
> > code of the recent "official" memcached, to avoid being stuck with old
> > code or abandonware. Actually there are many threads about the
> > limitations of the current eviction algo, and an option to enable some
> > background thread that does scraping based on statistics of the
> > most-filled slabs (with some parameter to specify whether it should take
> > a light or aggressive approach) would be nice...
> >
> > As for the code... is that the slab_rebalance_move function in slabs.c?
> > It seems a little difficult to grasp without some docs on how things
> > work... can you please write a very short description of how this
> > "angry birds" mode works?
>
> Look at doc/protocol.txt for explanations of the slab move options. The
> names are greppable back to the source.
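>
> For reference, the relevant commands there look roughly like this (going
> from memory of 1.4.x; check protocol.txt in your tree for exact syntax):
>
>     slabs reassign <source class> <dest class>  - manually move a page
>                                                   between slab classes
>     slabs automove <0|1|2>                      - 0 = off, 1 = conservative
>                                                   background mover, 2 = the
>                                                   aggressive "angry birds"
>                                                   mode mentioned below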
>
> > I have a quick question about this... ITEM_LINKED is an item that's
> > placed on the linked list, but what do the other flags mean, and why are
> > the last two marked temporary?
> > #define ITEM_LINKED 1   /* item is linked into the hash table + LRU */
> > #define ITEM_CAS 2      /* item carries a CAS (compare-and-swap) id */
> >
> > /* temp */
> > #define ITEM_SLABBED 4  /* chunk sits on the slab freelist, not in use */
> > #define ITEM_FETCHED 8  /* item has been read since it was last stored */
> >
> > This from slab_rebalance_move seems interesting:
> > refcount = refcount_incr(&it->refcount);
> > ...
> > if (refcount == 1) { /* item is unlinked, unused */
> > ...
> > } else if (refcount == 2) { /* item is linked but not busy */
> >
> > Is there some doc about refcounts, locks and item states? Basically, why
> > is an item with refcount 2 not busy? You increase the refcount by 1 on
> > select, then again when reading data? Can the refcount ever be higher
> > than 2 (3 in the above case), meaning 2 threads can access the same
> > item?
>
> The comment on the same line is explaining exactly what it means.
>
> Unfortunately it's a bit of a crap shoot. I think I wrote a threads
> explanation somewhere (some release notes, or in a file in there, I can't
> quite remember offhand). Since scaling the thread code it got a lot more
> complicated. You have to be extremely careful under what circumstances you
> access items (you must hold an item lock + the refcount must be 2 if you
> want to unlink it).
>
> You'll just have to study it a bit, sorry. Grep around to see where the
> flags are used.
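>
> Roughly, the convention looks like this (names from the 1.4.x source, but
> treat the exact signatures as approximate):
>
>     /* A linked item holds one reference; each thread using it holds one
>        more. So refcount == 2 means "linked + only us", i.e. not busy. */
>     static void try_unlink(item *it) {
>         uint32_t hv = hash(ITEM_key(it), it->nkey);
>         item_lock(hv);                  /* per-bucket item lock */
>         unsigned short refcount = refcount_incr(&it->refcount);
>         if (refcount == 2) {
>             do_item_unlink(it, hv);     /* safe: no other thread has it */
>         }
>         do_item_remove(it);             /* drop our reference again */
>         item_unlock(hv);
>     }
>
> And yes, the refcount can go above 2: every concurrent reader bumps it, so
> two threads fetching the same item at once would see 3.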
>
> > Thanks.
> >
> > On Thursday, April 10, 2014 at 06:05:30 UTC+2, Dormando wrote:
> > > Hi Guys,
> > > I'm running a specific case where I don't want (actually can't have)
> > > evicted items (evictions = 0, ideally)... now I have created a simple
> > > algo that locks the cache, goes through the linked list and evicts
> > > items... it causes some problems, like 10-20ms cache locks in some
> > > cases.
> > >
> > > Now I'm thinking about going through each slab's memory (slabs keep a
> > > list of allocated memory regions)... looking for items, and if an
> > > expired item is found, evicting it... this way I can go e.g. 10k items
> > > or 1MB of memory at a time + pick slabs with high utilization and run
> > > this "additional" eviction only on them... so it'll prevent allocating
> > > memory just because unneeded data with a short TTL is occupying the
> > > HEAD of the list.
> > >
> > > With this linked list eviction I'm able to run on 2-3GB of memory...
> > > without it, 16GB of memory is exhausted in 1-2h and then memcached
> > > starts to kill "good" items (leaving expired ones wasting memory)...
> > >
> > > Any comments?
> > > Thanks.
> >
> > you're going a bit against the base algorithm. if stuff is falling out
> > of 16GB of memory without ever being utilized again, why is that
> > critical? Sounds like you're optimizing the numbers instead of actually
> > tuning anything useful.
> >
> > That said, you can probably just extend the slab rebalance code. There's
> > a hook in there (which I called "Angry birds mode") that drives a slab
> > rebalance when it'd otherwise run an eviction. That code already safely
> > walks the slab page for unlocked memory and frees it; you could edit it
> > slightly to check for expiration and then freelist it into the slab
> > class instead.
> >
> > Since it's already a background thread you could further modify it to
> > just wake up and walk pages for stuff to evict.
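> >
> > Very roughly, the walk could look like this (a sketch modeled on
> > slab_rebalance_move, not a drop-in patch; a real version needs the same
> > item-lock/refcount dance described above before touching anything):
> >
> >     /* Walk one page of slab class p, reclaiming expired items. */
> >     static void reclaim_page(slabclass_t *p, char *page_start) {
> >         char *ptr = page_start;            /* a page from p->slab_list */
> >         for (unsigned int i = 0; i < p->perslab; i++, ptr += p->size) {
> >             item *it = (item *)ptr;
> >             if (it->it_flags & ITEM_SLABBED)   /* chunk already free */
> >                 continue;
> >             if ((it->it_flags & ITEM_LINKED) &&
> >                 it->exptime != 0 && it->exptime <= current_time) {
> >                 item_unlink(it);   /* locking wrapper; frees the item
> >                                       back into the slab class instead
> >                                       of evicting something live */
> >             }
> >         }
> >     }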