Hello Cephers.
Can someone help me with my cache tier configuration? I have 4 identical
176 GB SSD drives (184196208K each) in the SSD pool. How should I determine
target_max_bytes? I assumed it should be (4 drives * 188616916992 bytes) /
3 replicas = 251489222656 bytes, then * 85% (to stay below the full-disk
warning), which gives 213765839257 bytes, ~200 GB. I set it a bit lower
(160 GB), but after some time the whole cluster stopped with a full-disk
error; one of the SSD drives was full. I can see that space usage across
the OSDs is not equal:
ID WEIGHT  REWEIGHT SIZE USE  AVAIL  %USE  VAR  PGS
32 0.17099 1.00000  175G 127G 49514M 72.47 1.77  95
42 0.17099 1.00000  175G 120G 56154M 68.78 1.68  90
37 0.17099 1.00000  175G 136G 39670M 77.95 1.90 102
47 0.17099 1.00000  175G 130G 46599M 74.09 1.80  97
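For reference, the arithmetic again and the command used to apply the limit;
the pool name "cache-ssd" below is only a placeholder for the real name of my
cache pool:

# one drive:          184196208 KiB * 1024   = 188616916992 bytes
# raw pool capacity:  4 * 188616916992       = 754467667968 bytes
# usable with size=3: 754467667968 / 3       = 251489222656 bytes
# minus 15% headroom: 251489222656 * 0.85    = 213765839257 bytes (~199 GiB)
# I actually set it even lower, to 160 GiB:
ceph osd pool set cache-ssd target_max_bytes 171798691840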
My cache-related settings (dumped from an OSD admin socket):
ceph --admin-daemon /var/run/ceph/ceph-osd.32.asok config show | grep cache
"debug_objectcacher": "0\/5",
"mon_osd_cache_size": "10",
"mon_cache_target_full_warn_ratio": "0.66",
"mon_warn_on_cache_pools_without_hit_sets": "true",
"client_cache_size": "16384",
"client_cache_mid": "0.75",
"mds_cache_size": "100000",
"mds_cache_mid": "0.7",
"mds_dump_cache_on_map": "false",
"mds_dump_cache_after_rejoin": "false",
"osd_pool_default_cache_target_dirty_ratio": "0.4",
"osd_pool_default_cache_target_dirty_high_ratio": "0.6",
"osd_pool_default_cache_target_full_ratio": "0.8",
"osd_pool_default_cache_min_flush_age": "0",
"osd_pool_default_cache_min_evict_age": "0",
"osd_tier_default_cache_mode": "writeback",
"osd_tier_default_cache_hit_set_count": "4",
"osd_tier_default_cache_hit_set_period": "1200",
"osd_tier_default_cache_hit_set_type": "bloom",
"osd_tier_default_cache_min_read_recency_for_promote": "3",
"osd_tier_default_cache_min_write_recency_for_promote": "3",
"osd_map_cache_size": "200",
"osd_pg_object_context_cache_count": "64",
"leveldb_cache_size": "134217728",
"filestore_omap_header_cache_size": "1024",
"filestore_fd_cache_size": "128",
"filestore_fd_cache_shards": "16",
"keyvaluestore_header_cache_size": "4096",
"rbd_cache": "true",
"rbd_cache_writethrough_until_flush": "true",
"rbd_cache_size": "33554432",
"rbd_cache_max_dirty": "25165824",
"rbd_cache_target_dirty": "16777216",
"rbd_cache_max_dirty_age": "1",
"rbd_cache_max_dirty_object": "0",
"rbd_cache_block_writes_upfront": "false",
"rgw_cache_enabled": "true",
"rgw_cache_lru_size": "10000",
"rgw_keystone_token_cache_size": "10000",
"rgw_bucket_quota_cache_size": "10000",
CRUSH rule for the SSD pool:
rule ssd {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step choose firstn 2 type rack
        step chooseleaf firstn 2 type host
        step emit
        step take ssd
        step chooseleaf firstn -2 type osd
        step emit
}
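In case it helps with the analysis, a sketch of how the placement of this rule
can be checked offline (the file names are arbitrary; --rule 1 refers to the
ruleset above):

ceph osd getcrushmap -o crush.bin     # grab the compiled CRUSH map
crushtool -d crush.bin -o crush.txt   # decompile it to read the rules as text
crushtool -i crush.bin --test --rule 1 --num-rep 3 --show-utilization   # how mappings spread over the SSD OSDs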
OSD tree (SSD part):
ID  WEIGHT  TYPE NAME               UP/DOWN REWEIGHT PRIMARY-AFFINITY
 -8 0.68597 root ssd
 -9 0.34299     rack skwer-ssd
-16 0.17099         host ceph40-ssd
 32 0.17099             osd.32           up 1.00000          1.00000
-19 0.17099         host ceph50-ssd
 42 0.17099             osd.42           up 1.00000          1.00000
-11 0.34299     rack nzoz-ssd
-17 0.17099         host ceph45-ssd
 37 0.17099             osd.37           up 1.00000          1.00000
-22 0.17099         host ceph55-ssd
 47 0.17099             osd.47           up 1.00000          1.00000
Can someone help? Any ideas? Is it normal that the whole cluster stops on a
disk-full error in the cache tier? I thought only the affected pool would
stop, and the other pools without a cache tier would keep working.
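For reference, the usage can be inspected with the standard status commands
(nothing non-standard assumed here):

ceph health detail   # shows which OSD is flagged full / near full
ceph df              # per-pool usage, including the cache pool
ceph osd df          # per-OSD usage, where the imbalance shows up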
Best regards,
--
Mateusz Skała
[email protected]