Hi,
I have a space problem on a production cluster: it looks as if unused
data is not being freed. "ceph df" and "rados df" report 613GB of data, but
disk usage is 2640GB (with 3 replicas). It should be near 1839GB.
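To make the arithmetic explicit, here is a minimal sketch of how I get the expected figure. The function name is only illustrative, and it assumes raw usage is simply logical size times replica count, ignoring journal and filesystem overhead:

```python
def expected_raw_gb(logical_gb, replicas):
    """Raw space a replicated pool should occupy, ignoring overhead (assumption)."""
    return logical_gb * replicas

logical_gb = 613        # reported by "ceph df" / "rados df"
replicas = 3            # pool "rep size"
observed_raw_gb = 2640  # what the disks actually show

expected = expected_raw_gb(logical_gb, replicas)
excess = observed_raw_gb - expected

print("expected raw GB:", expected)
print("unexplained GB :", excess)
print("effective copies:", round(observed_raw_gb / logical_gb, 2))
```

So roughly 800GB is unaccounted for, as if more than four copies were stored.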
I have 5 hosts, 3 with SAS storage and 2 with SSD storage. I use crush
rules to put pools on SAS or on SSD.
My pools :
# ceph osd dump | grep ^pool
pool 0 'data' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 576 pgp_num 576 last_change 68315 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 3 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 576 pgp_num 576 last_change 68317 owner 0
pool 2 'rbd' rep size 3 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 576 pgp_num 576 last_change 68321 owner 0
pool 3 'hdd3copies' rep size 3 min_size 1 crush_ruleset 4 object_hash rjenkins pg_num 200 pgp_num 200 last_change 172933 owner 0
pool 6 'ssd3copies' rep size 3 min_size 1 crush_ruleset 7 object_hash rjenkins pg_num 800 pgp_num 800 last_change 172929 owner 0
pool 9 'sas3copies' rep size 3 min_size 1 crush_ruleset 4 object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 172935 owner 0
Only hdd3copies, sas3copies and ssd3copies are actually used:
# ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    76498G     51849G     24648G       32.22
POOLS:
    NAME           ID     USED      %USED     OBJECTS
    data           0      46753     0         72
    metadata       1      0         0         0
    rbd            2      8         0         1
    hdd3copies     3      2724G     3.56      5190954
    ssd3copies     6      613G      0.80      347668
    sas3copies     9      3692G     4.83      764394
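The same check can be done cluster-wide from the "ceph df" figures above. This is only a rough sketch: it assumes every pool keeps 3 replicas, it ignores the three nearly empty pools, and it ignores journal and filesystem overhead, which may explain part of the gap:

```python
# USED per pool in GB, taken from the "ceph df" output above.
pool_used_gb = {
    "hdd3copies": 2724,
    "ssd3copies": 613,
    "sas3copies": 3692,
}
replicas = 3             # "rep size 3" on every pool
raw_used_gb = 24648      # GLOBAL "RAW USED"

logical_total = sum(pool_used_gb.values())
expected_raw = logical_total * replicas
gap = raw_used_gb - expected_raw

print("logical total GB:", logical_total)
print("expected raw GB :", expected_raw)
print("gap GB          :", gap)
```

So the cluster as a whole also reports more raw usage than three copies of the pool data would need.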
My CRUSH rules were:
rule SASperHost {
        ruleset 4
        type replicated
        min_size 1
        max_size 10
        step take SASroot
        step chooseleaf firstn 0 type host
        step emit
}
and :
rule SSDperOSD {
        ruleset 3
        type replicated
        min_size 1
        max_size 10
        step take SSDroot
        step choose firstn 0 type osd
        step emit
}
but, since the cluster was full because of that space problem, I switched to a
different rule:
rule SSDperOSDfirst {
        ruleset 7
        type replicated
        min_size 1
        max_size 10
        step take SSDroot
        step choose firstn 1 type osd
        step emit
        step take SASroot
        step chooseleaf firstn -1 type net
        step emit
}
So with that last rule, I should have only one replica on my SSD OSDs, which
means 613GB of space used. But when I check the OSDs, I see 1212GB actually used.
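Here is a small sketch of how I read the "firstn" values in that rule. The helper name is hypothetical, and the interpretation of "firstn" (a positive N selects N items, 0 selects as many as the pool size, a negative N selects pool size minus |N|) is my understanding of CRUSH, not something I have verified against this cluster:

```python
def replicas_chosen(firstn, pool_size):
    """Number of replicas one CRUSH 'choose' step selects (my understanding)."""
    if firstn > 0:
        return firstn
    # 0 means "pool size"; a negative value means "pool size minus |firstn|".
    return pool_size + firstn

pool_size = 3            # "rep size 3" on ssd3copies
logical_gb = 613         # USED for ssd3copies in "ceph df"
observed_ssd_gb = 1212   # what the SSD OSDs actually hold

ssd_copies = replicas_chosen(1, pool_size)    # step choose firstn 1 type osd
sas_copies = replicas_chosen(-1, pool_size)   # step chooseleaf firstn -1 type net

expected_ssd_gb = logical_gb * ssd_copies
print("copies on SSD / SAS:", ssd_copies, "/", sas_copies)
print("expected GB on SSD :", expected_ssd_gb)
print("observed / expected:", round(observed_ssd_gb / expected_ssd_gb, 2))
```

The SSD OSDs hold almost exactly twice what one replica should take.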
I also use snapshots; maybe snapshots are ignored by "ceph df" and "rados df"?
Thanks for any help.
Olivier