I'll start by saying that I am very new to Ceph and am trying to teach myself
the ins and outs. While doing this I have been creating and destroying pools
as I experiment on some test hardware. Something I noticed is that when a
pool is deleted, the space is not always 100% freed. This is true even
after days of idle time.
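For context, my create/destroy cycle looks roughly like this (the pool name
and pg count are just examples; on Luminous, deleting a pool also requires
mon_allow_pool_delete=true):
------------------
ceph df                        # note RAW USED before
ceph osd pool create testpool 64
rados bench -p testpool 30 write --no-cleanup
ceph osd pool delete testpool testpool --yes-i-really-really-mean-it
ceph df                        # RAW USED doesn't always drop back to the baseline
------------------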
Right now, with 7 OSDs and a few empty pools, I have 70GB of raw space used.
Now, I am not sure if this is normal, but I did migrate my OSDs to
BlueStore and have been adding OSDs. So maybe some space is just overhead
for each OSD? I lost one of my disks and the usage dropped to 70GB.
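If it is per-OSD overhead, I figure it should show up roughly evenly across
the OSDs even when they hold no data, so I can check with something like this
(osd.0 is just an example id, and the daemon command has to run on the host
that holds that OSD):
------------------
ceph osd df                                         # per-OSD raw usage
ceph daemon osd.0 perf dump | grep -A 6 '"bluefs"'  # BlueFS/RocksDB space on one OSD
------------------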
Though when I had that failure I got some REALLY odd results from ceph -s.
Note the number of data objects (242 total) vs. the number of degraded
objects (101 of 726); I wonder if that denominator counts object copies
rather than objects, since 242 × 3 replicas = 726:
------------------
root@MediaServer:~# ceph -s
  cluster:
    id:     26c81563-ee27-4967-a950-afffb795f29e
    health: HEALTH_WARN
            1 filesystem is degraded
            insufficient standby MDS daemons available
            1 osds down
            Degraded data redundancy: 101/726 objects degraded (13.912%), 92 pgs unclean, 92 pgs degraded, 92 pgs undersized

  services:
    mon: 2 daemons, quorum TheMonolith,MediaServer
    mgr: MediaServer.domain(active), standbys: TheMonolith.domain
    mds: MediaStoreFS-1/1/1 up {0=MediaMDS=up:reconnect(laggy or crashed)}
    osd: 8 osds: 7 up, 8 in
    rgw: 2 daemons active

  data:
    pools:   8 pools, 176 pgs
    objects: 242 objects, 3568 bytes
    usage:   80463 MB used, 10633 GB / 10712 GB avail
    pgs:     101/726 objects degraded (13.912%)
             92 active+undersized+degraded
             84 active+clean
------------------
After reweighting the failed OSD out:
------------------
root@MediaServer:/var/log/ceph# ceph -s
  cluster:
    id:     26c81563-ee27-4967-a950-afffb795f29e
    health: HEALTH_WARN
            1 filesystem is degraded
            insufficient standby MDS daemons available

  services:
    mon: 2 daemons, quorum TheMonolith,MediaServer
    mgr: MediaServer.domain(active), standbys: TheMonolith.domain
    mds: MediaStoreFS-1/1/1 up {0=MediaMDS=up:reconnect(laggy or crashed)}
    osd: 8 osds: 7 up, 7 in
    rgw: 2 daemons active

  data:
    pools:   8 pools, 176 pgs
    objects: 242 objects, 3568 bytes
    usage:   71189 MB used, 8779 GB / 8849 GB avail
    pgs:     176 active+clean
------------------
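(For the record, the "reweighting out" above was just marking the dead OSD
out, something like the following, with the id as a placeholder:)
------------------
ceph osd out <osd-id>    # PGs remap and the cluster recovers to active+clean
------------------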
My pools:
------------------
root@MediaServer:/var/log/ceph# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    8849G     8779G       71189M          0.79
POOLS:
    NAME                        ID     USED     %USED     MAX AVAIL     OBJECTS
    .rgw.root                    6     1322         0         3316G           3
    default.rgw.control          7        0         0         3316G          11
    default.rgw.meta             8        0         0         3316G           0
    default.rgw.log              9        0         0         3316G         207
    MediaStorePool              19        0         0         5970G           0
    MediaStorePool-Meta         20     2246         0         3316G          21
    MediaStorePool-WriteCache   21        0         0         3316G           0
    rbd                         22        0         0         4975G           0
------------------
Am I looking at some sort of filesystem leak, or is this normal?
Also, before I deleted (or rather broke) my last pool, I marked OSDs in and
out and tracked the space. The data pool was erasure coded with 4 data and 1
parity chunks, and all data had been cleared from the cache pool. Sizes below
are in GB:
Obj    Used   Total Size   Data   Expected Usage   Difference   Notes
       639    10712        417    521.25           -117.75      8 OSDs
337k   636    10246        417    521.25           -114.75      7 OSDs (complete removal, osd 0, 500GB)
337k   629    10712        417    521.25           -107.75      8 OSDs (Wiped and re-added as osd.51002)
337k   631    9780         417    521.25           -109.75      7 OSDs (out, crush removed, osd 5, 1TB)
337k   649    10712        417    521.25           -127.75      8 OSDs (crush add, osd in)
337k   643    9780         417    521.25           -121.75      7 OSDs (out, osd 5, 1TB)
337k   625    9780         417    521.25           -103.75      7 OSDs (crush reweight 0, osd 5, 1TB)
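(The "Expected Usage" column above is just Data × (k+m)/k for the 4+1 EC
profile, e.g.:)
------------------
awk 'BEGIN { print 417 * (4+1)/4 }'    # 521.25 GB of raw space for 417GB of data
------------------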
There was enough difference between the in and out of OSDs that I kinda
think something is up. Even with the 80GB removed from the difference when
I have no data at all, that still leaves me with upwards of 40GB of
unaccounted-for usage...
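In case anyone wants to reproduce the table, each row came from a step
roughly like the following (per the notes column), then waiting for recovery
to settle before recording ceph df:
------------------
ceph osd out 5     # or: ceph osd in 5 / ceph osd crush reweight osd.5 0
ceph -s            # wait until all PGs are active+clean
ceph df            # record used/total
------------------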
Debian 9 \ Kernel: 4.4.0-104-generic
ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
Thanks for your input! It's appreciated!