Hi there,

One thing we've done to mitigate this kind of risk is keeping two copies of 
every shard in different availability zones at our cloud provider. Also, we 
run on Kubernetes, so for us nodes leaving the cluster is a relatively 
frequent event... we are experimenting with a small process that speeds up 
the warmup of new nodes.

Since we have more than one copy of the data, we run a warmup process. Our 
cache nodes are MUCH, MUCH smaller, so this approach might not be 
reasonable for your use case.

This is how our process works. Whenever a new node is restarted, or in any 
other situation where an empty memcached process comes up, our warmup 
process:

1. locates the warm node for the shard;
2. gets all the keys and TTLs from the warm node with lru_crawler metadump all;
3. traverses the list of keys in reverse (lru_crawler dumps starting from 
the least recently used; for warmup it's better to start from the most 
recently used);
4. for each key, gets the value from the warm node and adds (not sets) it 
to the cold node, including the TTL.
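
For illustration, here is a rough Python sketch of that loop against the 
plain memcached text protocol. This is not our production code: warm_host 
and cold_host are placeholders for however you map a shard to its warm and 
cold nodes.

#!/usr/bin/env python3
# Rough sketch of the warmup loop described above -- not production code.
# Assumes the plain memcached text protocol on both nodes.
import socket
import time
from urllib.parse import unquote

def connect(host, port=11211):
    sock = socket.create_connection((host, port))
    return sock, sock.makefile("rb")

def metadump(host):
    # Ask the warm node for every key and its expiry. Each line looks like
    # "key=foo%2Fbar exp=1713999999 la=... cas=... fetch=... cls=... size=..."
    sock, rd = connect(host)
    sock.sendall(b"lru_crawler metadump all\r\n")
    entries = []
    for raw in rd:
        line = raw.rstrip(b"\r\n").decode()
        if line == "END":
            break
        fields = dict(f.split("=", 1) for f in line.split(" "))
        # keys in the dump are URL-encoded
        entries.append((unquote(fields["key"]), int(fields["exp"])))
    sock.close()
    return entries

def warmup(warm_host, cold_host):
    warm_sock, warm_rd = connect(warm_host)
    cold_sock, cold_rd = connect(cold_host)
    # The crawler dumps least recently used keys first, so walk the list
    # backwards to copy the hottest keys first.
    for key, exp in reversed(metadump(warm_host)):
        if exp != -1 and exp <= time.time():
            continue  # already expired
        warm_sock.sendall(f"get {key}\r\n".encode())
        header = warm_rd.readline().rstrip(b"\r\n").decode()
        if not header.startswith("VALUE"):
            continue  # evicted between the dump and now (bare "END" line)
        _, _, flags, nbytes = header.split(" ")
        value = warm_rd.read(int(nbytes) + 2)[:-2]  # strip trailing \r\n
        warm_rd.readline()  # consume the closing END
        # exptimes over 30 days are treated as absolute unix timestamps,
        # so the dumped expiry passes straight through; -1 means "never".
        exptime = 0 if exp == -1 else exp
        cold_sock.sendall(
            f"add {key} {flags} {exptime} {len(value)}\r\n".encode()
            + value + b"\r\n"
        )
        cold_rd.readline()  # STORED, or NOT_STORED if the key beat us there

Using add instead of set also means that if live traffic already wrote a 
fresher value to the cold node while the warmup was running, the copy 
doesn't clobber it.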

This process might lead to some small data inconsistencies (a value can 
change on the warm node after it has been copied); how important that is 
will depend on your use case.

Since our access patterns are very skewed (a small percentage of keys gets 
the biggest share of traffic, at least for a while), going through the LRU 
dump in reverse makes the warmup much more effective: the hottest keys are 
usable on the cold node first.
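
Given that skew, you could even cap the copy at the N most recently used 
keys and still recover most of the hit rate. Reusing the metadump helper 
from the sketch above (n is a knob you'd tune for your own workload):

from itertools import islice

def hottest_keys(host, n):
    # the dump is LRU-first, so the head of the reversed list is the hot set
    return list(islice(reversed(metadump(host)), n))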

Best
Javier Arias

On Sunday, April 23, 2023 at 7:24:28 PM UTC+2 dormando wrote:

> Hey,
>
> Thanks for reaching out!
>
> There is no crash safety in memcached or extstore; it does look like the
> data is on disk but it is actually spread across memory and disk, with
> recent or heavily accessed data staying in RAM. Best case you only recover
> your cold data. Further, keys can appear multiple times in the extstore
> datafile and we rely on the RAM index to know which one is current.
>
> I've never heard of anyone losing an entire cluster, but people do try to
> mitigate this by replicating cache across availability zones/regions.
> This can be done with a few methods, like our new proxy code. I'd be happy
> to go over a few scenarios if you'd like.
>
> -Dormando
>
> On Sun, 23 Apr 2023, 'Danny Kopping' via memcached wrote:
>
> > First off, thanks for the amazing work @dormando & others!
> > Context:
> > I work at Grafana Labs, and we are very interested in trying out
> > extstore for some very large (>50TB) caches. We plan to split this 50TB
> > cache into about 35 different nodes, each with 1.5TB of NVMe & a small
> > memcached instance. Losing any given node will result in losing ~3% of
> > the overall cache, which is acceptable; however, if we lose all nodes
> > at once somehow, losing all of our cache will be pretty bad and will
> > put severe pressure on our backend.
> >
> > Ask:
> > Having looked at the file that extstore writes on disk, it looks like
> > it has both keys & values contained in it. Would it be possible to
> > "re-warm" the cache on startup by scanning this data and resubmitting
> > it to itself? We could then add some condition to our readiness check
> > in k8s to wait until the data is all re-warmed and then allow traffic
> > to flow to those instances. Is this feature planned for anytime soon?
> >
> > Thanks!
