Yep, this looks fine.
Hmm... sorry, but I'm out of ideas about what's happening.
Anyway, I think the ceph reports are more trustworthy than the rgw ones. It looks
like some issue with rgw reporting, or maybe some object leakage.
Regards,
Igor
On 7/3/2019 6:34 PM, Andrei Mikhailovsky wrote:
Hi Igor.
It seems the numbers are identical:
.rgw.buckets 19 15 TiB 78.22 4.3 TiB *8786934*
# cat /root/ceph-rgw.buckets-rados-ls-all |wc -l
*8786934*
Cheers
------------------------------------------------------------------------
*From: *"Igor Fedotov" <[email protected]>
*To: *"andrei" <[email protected]>
*Cc: *"ceph-users" <[email protected]>
*Sent: *Wednesday, 3 July, 2019 13:49:02
*Subject: *Re: [ceph-users] troubleshooting space usage
Looks fine - comparing bluestore_allocated vs. bluestore_stored
shows little difference, so that's not allocation overhead.
What about comparing the object counts reported by the ceph and radosgw
tools?
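For reference, these two counters can be pulled straight from the perf dump -
just a sketch, assuming jq is available and using osd.0 as an example:
# ceph daemon osd.0 perf dump | jq '.bluestore | {bluestore_allocated, bluestore_stored}'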
Igor.
On 7/3/2019 3:25 PM, Andrei Mikhailovsky wrote:
Thanks Igor. Here is a link to the ceph perf data for several OSDs:
https://paste.ee/p/IzDMy
In terms of the object sizes: we use rgw to back up data from various
workstations and servers, so the sizes range from a few KB to a few GB
per individual file.
Cheers
------------------------------------------------------------------------
*From: *"Igor Fedotov" <[email protected]>
*To: *"andrei" <[email protected]>
*Cc: *"ceph-users" <[email protected]>
*Sent: *Wednesday, 3 July, 2019 12:29:33
*Subject: *Re: [ceph-users] troubleshooting space usage
Hi Andrei,
Additionally I'd like to see performance counters dump for
a couple of HDD OSDs (obtained through 'ceph daemon osd.N
perf dump' command).
W.r.t. the average object size - I was thinking that you might
know what objects had been uploaded... If not, then you
might want to estimate it by using the "rados get" command on
the pool: retrieve a random set of objects and check their
sizes. But let's check the performance counters first - most
probably they will show the losses caused by allocation.
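If you want to sample sizes without downloading the data, "rados stat" prints the
object size, so a rough, untested sketch like this (100 random objects from the
pool) would give a ballpark average:
# rados -p .rgw.buckets ls | shuf -n 100 | while IFS= read -r obj; do
    rados -p .rgw.buckets stat "$obj"
  done | awk '{ sum += $NF; n++ } END { print sum/n " bytes on average" }'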
Also, I've just found a similar issue (still unresolved) in
our internal tracker - but its root cause is definitely
different from allocation overhead; it looks like some
orphaned objects in the pool. Could you please compare and
share the object counts for the pool reported by "ceph
(or rados) df detail" and by the radosgw tools?
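Something along these lines should do for the comparison - just a sketch,
reusing the same jq/awk approach as for size_kb but summing num_objects:
# ceph df detail | grep '\.rgw\.buckets '
# rados -p .rgw.buckets ls | wc -l
# for i in $(radosgw-admin bucket list | jq -r '.[]'); do
    radosgw-admin bucket stats --bucket=$i | jq '.usage | ."rgw.main" | .num_objects'
  done | awk '{ SUM += $1 } END { print SUM }'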
Thanks,
Igor
On 7/3/2019 12:56 PM, Andrei Mikhailovsky wrote:
Hi Igor,
Many thanks for your reply. Here are the details about
the cluster:
1. Ceph version - 13.2.5-1xenial (installed from the Ceph
repository for Ubuntu 16.04)
2. Main devices for the radosgw pool - HDD. We do use a
few SSDs for the other pool, but it is not used by radosgw.
3. We use BlueStore.
4. Average rgw object size - I have no idea how to
check that, and couldn't find a simple answer on Google
either. Could you please let me know how to check that?
5. Ceph osd df tree:
6. Other useful info on the cluster:
# ceph osd df tree
ID  CLASS WEIGHT    REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS TYPE NAME
-1        112.17979 -        113 TiB 90 TiB  23 TiB  79.25 1.00 -   root uk
-5        112.17979 -        113 TiB 90 TiB  23 TiB  79.25 1.00 -   datacenter ldex
-11       112.17979 -        113 TiB 90 TiB  23 TiB  79.25 1.00 -   room ldex-dc3
-13       112.17979 -        113 TiB 90 TiB  23 TiB  79.25 1.00 -   row row-a
-4        112.17979 -        113 TiB 90 TiB  23 TiB  79.25 1.00 -   rack ldex-rack-a5
-2        28.04495  -        28 TiB  22 TiB  6.2 TiB 77.96 0.98 -   host arh-ibstorage1-ib
0   hdd   2.73000   0.79999  2.8 TiB 2.3 TiB 519 GiB 81.61 1.03 145 osd.0
1   hdd   2.73000   1.00000  2.8 TiB 1.9 TiB 847 GiB 70.00 0.88 130 osd.1
2   hdd   2.73000   1.00000  2.8 TiB 2.2 TiB 561 GiB 80.12 1.01 152 osd.2
3   hdd   2.73000   1.00000  2.8 TiB 2.3 TiB 469 GiB 83.41 1.05 160 osd.3
4   hdd   2.73000   1.00000  2.8 TiB 1.8 TiB 983 GiB 65.18 0.82 141 osd.4
32  hdd   5.45999   1.00000  5.5 TiB 4.4 TiB 1.1 TiB 80.68 1.02 306 osd.32
35  hdd   2.73000   1.00000  2.8 TiB 1.7 TiB 1.0 TiB 62.89 0.79 126 osd.35
36  hdd   2.73000   1.00000  2.8 TiB 2.3 TiB 464 GiB 83.58 1.05 175 osd.36
37  hdd   2.73000   0.89999  2.8 TiB 2.5 TiB 301 GiB 89.34 1.13 160 osd.37
5   ssd   0.74500   1.00000  745 GiB 642 GiB 103 GiB 86.15 1.09 65  osd.5
-3        28.04495  -        28 TiB  24 TiB  4.5 TiB 84.03 1.06 -   host arh-ibstorage2-ib
9   hdd   2.73000   0.95000  2.8 TiB 2.4 TiB 405 GiB 85.65 1.08 158 osd.9
10  hdd   2.73000   0.89999  2.8 TiB 2.4 TiB 352 GiB 87.52 1.10 169 osd.10
11  hdd   2.73000   1.00000  2.8 TiB 2.0 TiB 783 GiB 72.28 0.91 160 osd.11
12  hdd   2.73000   0.84999  2.8 TiB 2.4 TiB 359 GiB 87.27 1.10 153 osd.12
13  hdd   2.73000   1.00000  2.8 TiB 2.4 TiB 348 GiB 87.69 1.11 169 osd.13
14  hdd   2.73000   1.00000  2.8 TiB 2.5 TiB 283 GiB 89.97 1.14 170 osd.14
15  hdd   2.73000   1.00000  2.8 TiB 2.2 TiB 560 GiB 80.18 1.01 155 osd.15
16  hdd   2.73000   0.95000  2.8 TiB 2.4 TiB 332 GiB 88.26 1.11 178 osd.16
26  hdd   5.45999   1.00000  5.5 TiB 4.4 TiB 1.0 TiB 81.04 1.02 324 osd.26
7   ssd   0.74500   1.00000  745 GiB 607 GiB 138 GiB 81.48 1.03 62  osd.7
-15       28.04495  -        28 TiB  22 TiB  6.4 TiB 77.40 0.98 -   host arh-ibstorage3-ib
18  hdd   2.73000   0.95000  2.8 TiB 2.5 TiB 312 GiB 88.96 1.12 156 osd.18
19  hdd   2.73000   1.00000  2.8 TiB 2.0 TiB 771 GiB 72.68 0.92 162 osd.19
20  hdd   2.73000   1.00000  2.8 TiB 2.0 TiB 733 GiB 74.04 0.93 149 osd.20
21  hdd   2.73000   1.00000  2.8 TiB 2.2 TiB 533 GiB 81.12 1.02 155 osd.21
22  hdd   2.73000   1.00000  2.8 TiB 2.1 TiB 692 GiB 75.48 0.95 144 osd.22
23  hdd   2.73000   1.00000  2.8 TiB 1.6 TiB 1.1 TiB 58.43 0.74 130 osd.23
24  hdd   2.73000   1.00000  2.8 TiB 2.2 TiB 579 GiB 79.51 1.00 146 osd.24
25  hdd   2.73000   1.00000  2.8 TiB 1.9 TiB 886 GiB 68.63 0.87 147 osd.25
31  hdd   5.45999   1.00000  5.5 TiB 4.7 TiB 758 GiB 86.50 1.09 326 osd.31
6   ssd   0.74500   0.89999  744 GiB 640 GiB 104 GiB 86.01 1.09 61  osd.6
-17       28.04494  -        28 TiB  22 TiB  6.3 TiB 77.61 0.98 -   host arh-ibstorage4-ib
8   hdd   2.73000   1.00000  2.8 TiB 1.9 TiB 909 GiB 67.80 0.86 141 osd.8
17  hdd   2.73000   1.00000  2.8 TiB 1.9 TiB 904 GiB 67.99 0.86 144 osd.17
27  hdd   2.73000   1.00000  2.8 TiB 2.1 TiB 654 GiB 76.84 0.97 152 osd.27
28  hdd   2.73000   1.00000  2.8 TiB 2.3 TiB 481 GiB 82.98 1.05 153 osd.28
29  hdd   2.73000   1.00000  2.8 TiB 1.9 TiB 829 GiB 70.65 0.89 137 osd.29
30  hdd   2.73000   1.00000  2.8 TiB 2.0 TiB 762 GiB 73.03 0.92 142 osd.30
33  hdd   2.73000   1.00000  2.8 TiB 2.3 TiB 501 GiB 82.25 1.04 166 osd.33
34  hdd   5.45998   1.00000  5.5 TiB 4.5 TiB 968 GiB 82.77 1.04 325 osd.34
39  hdd   2.73000   0.95000  2.8 TiB 2.4 TiB 402 GiB 85.77 1.08 162 osd.39
38  ssd   0.74500   1.00000  745 GiB 671 GiB 74 GiB  90.02 1.14 68  osd.38
                             TOTAL   113 TiB 90 TiB  23 TiB 79.25
MIN/MAX VAR: 0.74/1.14 STDDEV: 8.14
# for i in $(radosgw-admin bucket list | jq -r '.[]'); do
    radosgw-admin bucket stats --bucket=$i | jq '.usage | ."rgw.main" | .size_kb'
  done | awk '{ SUM += $1 } END { print SUM/1024/1024/1024 }'
6.59098
# ceph df
GLOBAL:
    SIZE     AVAIL   RAW USED   %RAW USED
    113 TiB  23 TiB  90 TiB     79.25
POOLS:
    NAME                         ID   USED      %USED   MAX AVAIL   OBJECTS
    Primary-ubuntu-1             5    27 TiB    87.56   3.9 TiB     7302534
    .users.uid                   15   6.8 KiB   0       3.9 TiB     39
    .users                       16   335 B     0       3.9 TiB     20
    .users.swift                 17   14 B      0       3.9 TiB     1
    *.rgw.buckets                19   15 TiB    79.88   3.9 TiB     8787763*
    .users.email                 22   0 B       0       3.9 TiB     0
    .log                         24   109 MiB   0       3.9 TiB     102301
    .rgw.buckets.extra           37   0 B       0       2.6 TiB     0
    .rgw.root                    44   2.9 KiB   0       2.6 TiB     16
    .rgw.meta                    45   1.7 MiB   0       2.6 TiB     6249
    .rgw.control                 46   0 B       0       2.6 TiB     8
    .rgw.gc                      47   0 B       0       2.6 TiB     32
    .usage                       52   0 B       0       2.6 TiB     0
    .intent-log                  53   0 B       0       2.6 TiB     0
    default.rgw.buckets.non-ec   54   0 B       0       2.6 TiB     0
    .rgw.buckets.index           55   0 B       0       2.6 TiB     11485
    .rgw                         56   491 KiB   0       2.6 TiB     1686
    Primary-ubuntu-1-ssd         57   1.2 TiB   92.39   105 GiB     379516
I am not too sure the issue relates to BlueStore
overhead, as I would probably have seen the same
discrepancy in my Primary-ubuntu-1 pool as well.
The data usage on the Primary-ubuntu-1 pool seems
to be consistent with my expectations (precise numbers
to be verified soon). The issue seems to be only with
the .rgw.buckets pool, where the "ceph df" output
shows 15 TB of usage while the sum of all buckets in that
pool comes to just over 6.5 TB.
Cheers
Andrei
------------------------------------------------------------------------
*From: *"Igor Fedotov" <[email protected]>
*To: *"andrei" <[email protected]>, "ceph-users"
<[email protected]>
*Sent: *Tuesday, 2 July, 2019 10:58:54
*Subject: *Re: [ceph-users] troubleshooting space
usage
Hi Andrei,
The most obvious reason is space usage overhead
caused by BlueStore allocation granularity, e.g.
if bluestore_min_alloc_size is 64K and the average
object size is 16K, one will waste 48K per object
on average. This is rather speculation so far, as
we lack key information about your cluster:
- Ceph version
- What are the main devices for OSD: hdd or ssd.
- BlueStore or FileStore.
- average RGW object size.
You might also want to collect and share
performance counter dumps (ceph daemon osd.N perf
dump) and "
" reports from a couple of your OSDs.
Thanks,
Igor
On 7/2/2019 11:43 AM, Andrei Mikhailovsky wrote:
Bump!
------------------------------------------------------------------------
*From: *"Andrei Mikhailovsky"
<[email protected]>
*To: *"ceph-users" <[email protected]>
*Sent: *Friday, 28 June, 2019 14:54:53
*Subject: *[ceph-users] troubleshooting
space usage
Hi
Could someone please explain / show how to
troubleshoot space usage in Ceph and
how to reclaim unused space?
I have a small cluster with 40 OSDs,
replica of 2, mainly used as a backend for
CloudStack as well as the S3 gateway. The
used space doesn't make any sense to me,
especially the rgw pool, so I am seeking help.
Here is what I found from the client:
"ceph -s" shows the usage: 89 TiB used, 24 TiB / 113 TiB avail
"ceph df" shows:
Primary-ubuntu-1       5    27 TiB    90.11   3.0 TiB   7201098
Primary-ubuntu-1-ssd   57   1.2 TiB   89.62   143 GiB   359260
.rgw.buckets           19   15 TiB    83.73   3.0 TiB   8742222
The usage of the Primary-ubuntu-1 and
Primary-ubuntu-1-ssd pools is in line with my
expectations. However, the .rgw.buckets
pool seems to be using way too much. The
usage of all rgw buckets adds up to 6.5 TB
(looking at the size_kb values from
"radosgw-admin bucket stats"). I am trying
to figure out why .rgw.buckets is using
15 TB of space instead of the 6.5 TB
shown by the bucket usage.
Thanks
Andrei
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com