I'll start by saying that I am very new to Ceph and am trying to teach myself
the ins and outs. While doing this I have been creating and destroying pools
as I experiment on some test hardware. Something I noticed is that when a
pool is deleted, the space is not always 100% freed. This is true even
after days of idle time.
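For context, my create/destroy cycle looks roughly like this (the pool name
and pg count are just examples; on Luminous, deleting a pool also requires
mon_allow_pool_delete=true):
------------------
ceph df                        # note RAW USED before
ceph osd pool create testpool 64
rados bench -p testpool 30 write --no-cleanup
ceph osd pool delete testpool testpool --yes-i-really-really-mean-it
ceph df                        # RAW USED doesn't always drop back to the baseline
------------------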
Right now, with 7 OSDs and a few empty pools, I have 70GB of raw space used.
Now, I am not sure if this is normal, but I did migrate my OSDs to
BlueStore and have been adding OSDs. So maybe some space is just overhead
for each OSD? I lost one of my disks and the usage dropped to 70GB.
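If it is per-OSD overhead, I figure it should show up roughly evenly across
the OSDs even when they hold no data, so I can check with something like this
(osd.0 is just an example id, and the daemon command has to run on the host
that holds that OSD):
------------------
ceph osd df                                         # per-OSD raw usage
ceph daemon osd.0 perf dump | grep -A 6 '"bluefs"'  # BlueFS/RocksDB space on one OSD
------------------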
Though when I had that failure I got some REALLY odd results from ceph -s.
Note the number of data objects (242 total) vs. the number of degraded
objects (101 of 726); I wonder if that denominator counts object copies
rather than objects, since 242 × 3 replicas = 726:
------------------
root@MediaServer:~# ceph -s
  cluster:
    id:     26c81563-ee27-4967-a950-afffb795f29e
    health: HEALTH_WARN
            1 filesystem is degraded
            insufficient standby MDS daemons available
            1 osds down
            Degraded data redundancy: 101/726 objects degraded (13.912%), 92 pgs unclean, 92 pgs degraded, 92 pgs undersized

  services:
    mon: 2 daemons, quorum TheMonolith,MediaServer
    mgr: MediaServer.domain(active), standbys: TheMonolith.domain
    mds: MediaStoreFS-1/1/1 up {0=MediaMDS=up:reconnect(laggy or crashed)}
    osd: 8 osds: 7 up, 8 in
    rgw: 2 daemons active

  data:
    pools:   8 pools, 176 pgs
    objects: 242 objects, 3568 bytes
    usage:   80463 MB used, 10633 GB / 10712 GB avail
    pgs:     101/726 objects degraded (13.912%)
             92 active+undersized+degraded
             84 active+clean
------------------
After reweighting the failed OSD out:
------------------
root@MediaServer:/var/log/ceph# ceph -s
  cluster:
    id:     26c81563-ee27-4967-a950-afffb795f29e
    health: HEALTH_WARN
            1 filesystem is degraded
            insufficient standby MDS daemons available

  services:
    mon: 2 daemons, quorum TheMonolith,MediaServer
    mgr: MediaServer.domain(active), standbys: TheMonolith.domain
    mds: MediaStoreFS-1/1/1 up {0=MediaMDS=up:reconnect(laggy or crashed)}
    osd: 8 osds: 7 up, 7 in
    rgw: 2 daemons active

  data:
    pools:   8 pools, 176 pgs
    objects: 242 objects, 3568 bytes
    usage:   71189 MB used, 8779 GB / 8849 GB avail
    pgs:     176 active+clean
------------------
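(For the record, the "reweighting out" above was just marking the dead OSD
out, something like the following, with the id as a placeholder:)
------------------
ceph osd out <osd-id>    # PGs remap and the cluster recovers to active+clean
------------------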
My pools:
------------------
root@MediaServer:/var/log/ceph# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    8849G     8779G       71189M          0.79
POOLS:
    NAME                        ID     USED     %USED     MAX AVAIL     OBJECTS
    .rgw.root                    6     1322         0         3316G           3
    default.rgw.control          7        0         0         3316G          11
    default.rgw.meta             8        0         0         3316G           0
    default.rgw.log              9        0         0         3316G         207
    MediaStorePool              19        0         0         5970G           0
    MediaStorePool-Meta         20     2246         0         3316G          21
    MediaStorePool-WriteCache   21        0         0         3316G           0
    rbd                         22        0         0         4975G           0
------------------
Am I looking at some sort of filesystem leak, or is this normal?
Also, before I deleted (or rather broke) my last pool, I marked OSDs in and
out and tracked the space. The data pool was erasure coded with 4 data and 1
parity chunks, and all data had been cleared from the cache pool. Sizes below
are in GB:
Obj    Used   Total Size   Data   Expected Usage   Difference   Notes
       639    10712        417    521.25           -117.75      8 OSDs
337k   636    10246        417    521.25           -114.75      7 OSDs (complete removal, osd 0, 500GB)
337k   629    10712        417    521.25           -107.75      8 OSDs (Wiped and re-added as osd.51002)
337k   631    9780         417    521.25           -109.75      7 OSDs (out, crush removed, osd 5, 1TB)
337k   649    10712        417    521.25           -127.75      8 OSDs (crush add, osd in)
337k   643    9780         417    521.25           -121.75      7 OSDs (out, osd 5, 1TB)
337k   625    9780         417    521.25           -103.75      7 OSDs (crush reweight 0, osd 5, 1TB)
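(The "Expected Usage" column above is just Data × (k+m)/k for the 4+1 EC
profile, e.g.:)
------------------
awk 'BEGIN { print 417 * (4+1)/4 }'    # 521.25 GB of raw space for 417GB of data
------------------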
There was enough difference between the in and out of OSDs that I kinda
think something is up. Even with the 80GB removed from the difference when
I have no data at all, that still leaves me with upwards of 40GB of
unaccounted-for usage...
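In case anyone wants to reproduce the table, each row came from a step
roughly like the following (per the notes column), then waiting for recovery
to settle before recording ceph df:
------------------
ceph osd out 5     # or: ceph osd in 5 / ceph osd crush reweight osd.5 0
ceph -s            # wait until all PGs are active+clean
ceph df            # record used/total
------------------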
Debian 9 \ Kernel: 4.4.0-104-generic
ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
Thanks for your input! It's appreciated!