Re: [ceph-users] Lost space or expected?

2018-03-23 Thread David Turner
The first thing I looked at was whether you had any snapshots/clones in your
pools, but that count is 0 for you.  Second, I would check whether you
have orphaned objects left over from deleted RBDs.  You can check that by
comparing the 'block_name_prefix' of every RBD in the pool with the prefixes
of the object names in that pool (my pool is rbd-replica-ssd; substitute
your own).

rados -p rbd-replica-ssd ls | cut -d . -f1,2 | grep '^rbd_data' | sort -u
for rbd in $(rbd ls --pool rbd-replica-ssd); do rbd info --pool rbd-replica-ssd "$rbd" | awk '/block_name_prefix/ {print $2}'; done | sort
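
If you would rather save the two listings to files first and compare them offline, here is a minimal sketch of that comparison. The function name orphan_prefixes and the two input files are my own invention, not anything Ceph ships; it only assumes one object name per line in the first file and one block_name_prefix per line in the second.

```shell
# Hypothetical helper: print rbd_data prefixes that have objects but no image.
# $1: file with object names, one per line (saved `rados ls` output)
# $2: file with block_name_prefix values, one per line (from `rbd info`)
orphan_prefixes() {
    objs=$(mktemp) || return 1
    known=$(mktemp) || return 1
    # Reduce object names to their prefix, e.g. rbd_data.<id>.<n> -> rbd_data.<id>
    cut -d . -f1,2 "$1" | grep '^rbd_data' | sort -u > "$objs"
    sort -u "$2" > "$known"
    # comm -23 keeps lines that appear only in the first (object) list
    comm -23 "$objs" "$known"
    rm -f "$objs" "$known"
}
```

Call it as `orphan_prefixes objects.txt prefixes.txt`; an empty result means every rbd_data prefix belongs to an existing RBD.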

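Alternatively you can let bash do the work for you by diff'ing the output
of the two commands directly.  Lines that diff marks with '<' appear only in
the first list, i.e. they are prefixes that have objects but no matching RBD.

diff <(rados -p rbd-replica-ssd ls | cut -d . -f1,2 | grep '^rbd_data' | sort -u) <(for rbd in $(rbd ls --pool rbd-replica-ssd); do rbd info --pool rbd-replica-ssd "$rbd" | awk '/block_name_prefix/ {print $2}'; done | sort) | awk '/^</ {print $2}'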

Anything listed is an rbd prefix that has objects but belongs to no existing
RBD.  If any do show up here, you would want to triple check that the RBD
really doesn't exist, and then find the objects with that prefix and delete
them with something like `rados -p rbd-replica-ssd ls | grep "^$prefix" |
xargs -n 1 rados -p rbd-replica-ssd rm` (rados rm takes object names as
arguments, not on stdin, hence the xargs).
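
Since deleting the wrong objects is not something you can undo, here is a rough sketch of a more cautious version. Both function names and the DRY_RUN variable are made up for illustration; by default it only prints the rados commands it would run, and it matches the prefix followed by a dot so that rbd_data.abc does not also catch rbd_data.abcd.

```shell
# Hypothetical helper: keep only object names (stdin) that start with "<prefix>."
objects_with_prefix() {
    awk -v p="$1." 'index($0, p) == 1'
}

# Hypothetical helper: remove the objects named on stdin from pool $1.
# Only prints the commands unless DRY_RUN=0 is set in the environment.
remove_objects() {
    pool=$1
    while IFS= read -r obj; do
        if [ "${DRY_RUN:-1}" = "0" ]; then
            rados -p "$pool" rm "$obj"
        else
            printf 'rados -p %s rm %s\n' "$pool" "$obj"
        fi
    done
}
```

Something like `rados -p rbd-replica-ssd ls | objects_with_prefix "$prefix" | remove_objects rbd-replica-ssd` lets you review the list first; rerun it with DRY_RUN=0 once you are sure.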

Also note that rbd_data is not the only kind of object that belongs to an
RBD: there are also rbd_header, rbd_object_map, and perhaps other objects
that will need to be cleaned up if you find orphans.  Hopefully you
don't... but hopefully you do, so that you get an answer to your question
and a direction to go.
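
To find those leftovers for one orphaned prefix, a sketch like the following might help. The function name is hypothetical, and it rests on my assumption that these objects are named rbd_header.<id> and rbd_object_map.<id> (possibly with a further .<suffix> on the object map), where <id> is whatever follows rbd_data. in the prefix. Check the names against your own `rados ls` output before relying on it.

```shell
# Hypothetical helper: from the object names on stdin, print the header and
# object-map objects that share the image id of the rbd_data prefix in $1.
# The naming scheme is an assumption; verify it on your cluster.
related_metadata_objects() {
    id=${1#rbd_data.}
    awk -v id="$id" '
        $0 == ("rbd_header." id)     { print; next }
        $0 == ("rbd_object_map." id) { print; next }
        index($0, "rbd_object_map." id ".") == 1 { print }
    '
}
```

Usage would be along the lines of `rados -p rbd-replica-ssd ls | related_metadata_objects rbd_data.abc123`.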

On Tue, Mar 20, 2018 at 9:54 AM Caspar Smit wrote:

> Hi all,
>
> Here's the output of 'rados df' for one of our clusters (Luminous 12.2.2):
>
> POOL_NAME USED   OBJECTS  CLONES COPIES    MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS    RD     WR_OPS    WR
> ec_pool   75563G 19450232 0      116701392 0                  0       0        385351922 27322G 800335856 294T
> rbd       42969M 10881    0      32643     0                  0       0        615060980 14767G 970301192 207T
> rbdssd    252G   65446    0      196338    0                  0       0        29392480  1581G  211205402 2601G
>
> total_objects 19526559
> total_used 148T
> total_avail 111T
> total_space 259T
>
>
> ec_pool (k=4, m=2)
> rbd (size = 3/2)
> rbdssd (size = 3/2)
>
> If I calculate the space I should be using:
>
> ec_pool = 75 TB x 1.5 = 112.5 TB  (4+2 means raw usage is 1.5 times the
> stored data, right?)
> rbd = 42 GB x 3 = 126 GB
> rbdssd = 252 GB x 3 = 756 GB
>
> Let's say 114TB in total.
>
> Why is there 148TB of used space? (That's a difference of more than 30TB)
> Is this expected behaviour? A bug? (If so, how can I reclaim this space?)
>
> kind regards,
> Caspar
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


[ceph-users] Lost space or expected?

2018-03-20 Thread Caspar Smit
Hi all,

Here's the output of 'rados df' for one of our clusters (Luminous 12.2.2):

POOL_NAME USED   OBJECTS  CLONES COPIES    MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS    RD     WR_OPS    WR
ec_pool   75563G 19450232 0      116701392 0                  0       0        385351922 27322G 800335856 294T
rbd       42969M 10881    0      32643     0                  0       0        615060980 14767G 970301192 207T
rbdssd    252G   65446    0      196338    0                  0       0        29392480  1581G  211205402 2601G

total_objects 19526559
total_used 148T
total_avail 111T
total_space 259T


ec_pool (k=4, m=2)
rbd (size = 3/2)
rbdssd (size = 3/2)

If I calculate the space I should be using:

ec_pool = 75 TB x 1.5 = 112.5 TB  (4+2 means raw usage is 1.5 times the
stored data, right?)
rbd = 42 GB x 3 = 126 GB
rbdssd = 252 GB x 3 = 756 GB

Let's say 114TB in total.

Why is there 148TB of used space? (That's a difference of more than 30TB)
Is this expected behaviour? A bug? (If so, how can I reclaim this space?)
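
For anyone who wants to redo the sum with the exact figures from 'rados df', here is a small sketch. The function name expected_raw_usage and its input format are just something made up for this mail; the multipliers are (k+m)/k = 1.5 for the 4+2 pool and 3 for the size=3 pools, and 42969M is entered as roughly 42.0G.

```shell
# Hypothetical helper.
# stdin: one pool per line as "<name> <stored G> <raw-space multiplier>"
# Prints the expected raw usage per pool and the total, in the same unit (G).
expected_raw_usage() {
    awk '{ raw = $2 * $3; total += raw; printf "%s %.1f\n", $1, raw }
         END { printf "total %.1f\n", total }'
}

printf '%s\n' 'ec_pool 75563 1.5' 'rbd 42.0 3' 'rbdssd 252 3' | expected_raw_usage
```

With these inputs the total comes to 114226.5G, which is about 111.5T when divided by 1024, so the gap to the reported 148T is if anything slightly larger than the rounded estimate.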

kind regards,
Caspar