Re: Extstore revival after crash

2023-04-23 Thread dormando
Hey,

Thanks for reaching out!

There is no crash safety in memcached or extstore. It does look like the
data is on disk, but it is actually spread across memory and disk, with
recent or heavily accessed data staying in RAM. Best case, you only recover
your cold data. Further, keys can appear multiple times in the extstore
datafile, and we rely on the RAM index to know which one is current.
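
To make the point above concrete, here is a rough toy model in Python. It
is only a sketch: the class, the method names and the list-based "datafile"
are all made up for illustration and are not extstore's real format or API.
It also assumes file order matches write order, which a real datafile need
not guarantee. It shows a key appearing more than once on disk, and a
disk-only scan bringing back a stale value while losing the hot data.

```python
# Toy model only: NOT extstore's real on-disk format or API.
# All names here (ToyTieredCache, flush_to_disk, ...) are hypothetical.
class ToyTieredCache:
    def __init__(self):
        self.disk_log = []   # append-only (key, value) records, stands in for the datafile
        self.ram_items = {}  # hot items that live only in RAM
        self.ram_index = {}  # key -> offset of the current record in disk_log

    def set(self, key, value):
        # New writes land in RAM; any older disk copy stops being current.
        self.ram_items[key] = value
        self.ram_index.pop(key, None)

    def flush_to_disk(self, key):
        # Move a cold item to disk and point the RAM index at the new record.
        value = self.ram_items.pop(key)
        self.disk_log.append((key, value))
        self.ram_index[key] = len(self.disk_log) - 1

    def get(self, key):
        if key in self.ram_items:
            return self.ram_items[key]
        offset = self.ram_index.get(key)
        return None if offset is None else self.disk_log[offset][1]


def recover_from_disk_only(disk_log):
    # After a crash the RAM items and RAM index are gone. Without the index,
    # "last record per key wins" is only a guess at which copy was current.
    recovered = {}
    for key, value in disk_log:
        recovered[key] = value
    return recovered


cache = ToyTieredCache()
cache.set("a", "v1")
cache.flush_to_disk("a")   # record 0: ("a", "v1")
cache.set("a", "v2")
cache.flush_to_disk("a")   # record 1: ("a", "v2"), key now appears twice on disk
cache.set("a", "v3")       # newest value is hot and lives only in RAM
cache.set("b", "hot")      # never reached disk at all

before_crash = cache.get("a")                         # served from RAM
after_crash = recover_from_disk_only(cache.disk_log)  # what a rescan could see
```

In this toy, the running cache answers "v3" for key "a", but a rescan of the
datafile can only offer "v2", and key "b" is gone entirely. A real rescan
would have the same kind of problem, with no index to say which copy to trust.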

I've never heard of anyone losing an entire cluster, but people do try to
mitigate this by replicating cache across availability zones/regions.
This can be done with a few methods, like our new proxy code. I'd be happy
to go over a few scenarios if you'd like.

-Dormando

On Sun, 23 Apr 2023, 'Danny Kopping' via memcached wrote:

> First off, thanks for the amazing work @dormando & others!
> Context:
> I work at Grafana Labs, and we are very interested in trying out extstore for 
> some very large (>50TB) caches. We plan to split this 50TB cache into about
> 35 different nodes, each with 1.5TB of NVMe & a small memcached instance. 
> Losing any given node will result in losing ~3% of the overall cache which is
> acceptable, however if we lose all nodes at once somehow, losing all of our 
> cache will be pretty bad and will put severe pressure on our backend.
>
> Ask:
> Having looked at the file that extstore writes on disk, it looks like it has 
> both keys & values contained in it. Would it be possible to "re-warm" the
> cache on startup by scanning this data and resubmitting it to itself? We 
> could then add some condition to our readiness check in k8s to wait until
> the data is all re-warmed and then allow traffic to flow to those instances. 
> Is this feature planned for anytime soon?
>
> Thanks!
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups 
> "memcached" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to memcached+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/memcached/cc45382b-eee7-4e37-a841-d210bf18ff4bn%40googlegroups.com.
>
>



Extstore revival after crash

2023-04-23 Thread 'Danny Kopping' via memcached
First off, thanks for the amazing work @dormando & others!

Context:
I work at Grafana Labs, and we are very interested in trying out extstore
for some very large (>50TB) caches. We plan to split this 50TB cache into
about 35 different nodes, each with 1.5TB of NVMe & a small memcached
instance. Losing any given node will result in losing ~3% of the overall
cache which is acceptable, however if we lose all nodes at once somehow,
losing all of our cache will be pretty bad and will put severe pressure on
our backend.
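
As a quick sanity check on the sizing above, the arithmetic can be sketched
in a few lines of Python. The variable names are just for illustration, and
it assumes keys are spread evenly across nodes.

```python
# Back-of-the-envelope check of the figures in the message above.
# Assumes an even key distribution across nodes (an assumption, not a given).
nodes = 35
nvme_per_node_tb = 1.5

total_tb = nodes * nvme_per_node_tb   # total NVMe capacity across the cluster
loss_per_node_pct = 100 / nodes       # share of the cache lost with one node

print(f"total capacity: {total_tb} TB")           # total capacity: 52.5 TB
print(f"loss per node: {loss_per_node_pct:.1f}%")  # loss per node: 2.9%
```

So 35 nodes at 1.5TB each gives 52.5TB, slightly above the 50TB target, and
one lost node costs about 2.9% of the cache, in line with the ~3% estimate.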

Ask:
Having looked at the file that extstore writes on disk, it looks like it 
has both keys & values contained in it. Would it be possible to "re-warm" 
the cache on startup by scanning this data and resubmitting it to itself? 
We could then add some condition to our readiness check in k8s to wait 
until the data is all re-warmed and then allow traffic to flow to those 
instances. Is this feature planned for anytime soon?

Thanks!
