I recently configured something similar for a backup cluster, with these
settings:

ceph osd pool set cache_test hit_set_type bloom
ceph osd pool set cache_test hit_set_count 1
ceph osd pool set cache_test hit_set_period 7200
ceph osd pool set cache_test target_max_bytes 1000000000000
ceph osd pool set cache_test min_read_recency_for_promote 1
ceph osd pool set cache_test min_write_recency_for_promote 0
ceph osd pool set cache_test cache_target_dirty_ratio 0.00001
ceph osd pool set cache_test cache_target_dirty_high_ratio 0.33
ceph osd pool set cache_test cache_target_full_ratio 0.8
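
For completeness: those settings assume cache_test is already attached as a
writeback cache tier in front of a backing pool. A minimal sketch of that
setup ("backing_pool" is just a placeholder name here):

# attach cache_test as a cache tier in front of the backing pool
ceph osd tier add backing_pool cache_test
# writeback mode so writes land in the cache tier first
ceph osd tier cache-mode cache_test writeback
# redirect client IO to the cache tier
ceph osd tier set-overlay backing_pool cache_test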


The goal here was just to handle bad IO patterns generated by bad
backup software (why do they love to run with a stupidly low queue
depth and small IOs?).
It's not ideal, and it doesn't really match your use case, since the
data in question isn't read back here.

But yeah, I've also thought about building a specialized cache mode that
just acts as a write buffer; there are quite a few applications that
would benefit from that.
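
Until something like that exists, the closest approximation I can think of
is roughly what you describe: push min_read_recency_for_promote above
hit_set_count so reads should never promote, and use the age-based knobs to
hold data for ~30 days. An untested sketch; whether a recency value above
hit_set_count is accepted or errors out (as you suspect it might) is worth
checking first:

# with only one hit set, a recency of 2 should never be satisfied by reads
ceph osd pool set cache_test min_read_recency_for_promote 2
# writes promote immediately
ceph osd pool set cache_test min_write_recency_for_promote 0
# keep objects in the cache tier for ~30 days (in seconds) before flush/evict
ceph osd pool set cache_test cache_min_flush_age 2592000
ceph osd pool set cache_test cache_min_evict_age 2592000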

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Mon, Dec 2, 2019 at 11:40 PM Robert LeBlanc <rob...@leblancnet.us> wrote:
>
> I'd like to configure a cache tier to act as a write buffer, so that incoming 
> writes promote objects but reads never do. We have a lot of cold data, so we 
> would like to tier down to an EC pool (CephFS) after a period of about 30 days 
> to save space. The storage tier and the 'cache' tier would be on the same 
> spindles, so the only performance improvement would come from the faster 
> replicated writes. So we don't really want to move data between tiers.
>
> The idea would be to not promote on read, since EC read performance is good 
> enough, and to have writes go to the cache tier, where the data may be 'hot' 
> for a week or so and then go cold.
>
> It seems that we would only need one hit_set, and if -1 can't be set for 
> min_read_recency_for_promote, I could probably use 2, which would never hit 
> because there is only one set, but that may error too. The follow-up question 
> is how big a set should be: it only really tells whether an object "may" be 
> in the cache and does not determine when things are flushed, so all that 
> really matters is how out of date we are okay with the bloom filter being, 
> right? So we could have it be a day long if we are okay with that stale rate? 
> Is there any advantage to having a longer period for a bloom filter? Now I'm 
> starting to wonder if I even need a bloom filter for this use case: can I get 
> tiering to work without it, using only cache_min_flush_age/cache_min_evict_age, 
> since I don't care about promoting when there are X hits in Y time?
>
> Thanks
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
