Copying the ML, because I forgot to reply-all.

Reed

> On Apr 15, 2020, at 3:58 PM, Reed Dier <[email protected]> wrote:
> 
> The problem almost certainly stems from the unbalanced OSD distribution 
> among your hosts, assuming you are using the default 3x replication across 
> hosts CRUSH rule.
> 
> You are limited by your smallest bin size.
> 
> In this case you have a 750GB HDD as the only OSD on node1, so when Ceph 
> wants 3 copies across 3 hosts, there is only ~750GB of space that can 
> fulfill this requirement.
> 
> Having lots of different-sized OSDs and differing numbers of OSDs per host 
> in your topology is going to lead to issues of under/over-utilization.
> 
>> ID  CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
>>  -1       21.54213 root default
>>  -3        0.75679     host node1
>>  -5        5.39328     host node2
>> -10       15.39206     host node3
> 
> You either need to redistribute your OSDs across your hosts, or possibly 
> rethink your disk strategy.
> You could move osd.5 to node1, and osd.0 to node2, which would give you 
> roughly 6TiB of usable hdd space across your three nodes.
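The "smallest bin" arithmetic can be checked with a rough sketch. This is a simplified model, not how CRUSH actually computes placement: it assumes a replicated pool with size 3 and host as the failure domain, counts only the hdd OSDs, and ignores full ratios and per-OSD reweights. The weights are copied from the quoted `ceph osd tree` output; the helper names are made up for illustration.

```python
# Host -> list of (osd name, device class, CRUSH weight in TiB),
# copied from the `ceph osd tree` output quoted in this thread.
TREE = {
    "node1": [("osd.0", "hdd", 0.75679)],
    "node2": [("osd.1", "hdd", 2.66429), ("osd.3", "hdd", 2.72899)],
    "node3": [("osd.5", "hdd", 7.27739), ("osd.6", "hdd", 7.27739),
              ("osd.2", "ssd", 0.38249), ("osd.4", "ssd", 0.45479)],
}


def hdd_weight_per_host(tree):
    """Sum the hdd CRUSH weights under each host."""
    return {host: sum(w for _, cls, w in osds if cls == "hdd")
            for host, osds in tree.items()}


def usable_hdd_tib(tree):
    """Assumption: 3 replicas on 3 hosts means every host holds one full
    copy, so the smallest host caps how much data the pool can store."""
    return min(hdd_weight_per_host(tree).values())


def move_osd(tree, osd_name, dst_host):
    """Return a new tree with osd_name relocated to dst_host."""
    moved = next(o for osds in tree.values() for o in osds if o[0] == osd_name)
    new_tree = {host: [o for o in osds if o[0] != osd_name]
                for host, osds in tree.items()}
    new_tree[dst_host].append(moved)
    return new_tree


before = usable_hdd_tib(TREE)
swapped = move_osd(move_osd(TREE, "osd.5", "node1"), "osd.0", "node2")
after = usable_hdd_tib(swapped)
print(f"usable hdd space before swap: {before:.2f} TiB")
print(f"usable hdd space after swap:  {after:.2f} TiB")
```

Under these assumptions node2 becomes the smallest host after the swap (2.66429 + 2.72899 + 0.75679 TiB), which is where the "roughly 6TiB" figure comes from.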
> 
> Reed
> 
>> On Apr 15, 2020, at 10:50 AM, Simon Sutter <[email protected]> wrote:
>> 
>> Hello everybody,
>> 
>> 
>> 
>> I'm very new to Ceph and have installed a test environment (Nautilus).
>> 
>> The current goal of this cluster is to serve as a short-term backup.
>> 
>> For this goal we want to use older, mixed hardware, so for testing I 
>> decided to set up very unbalanced nodes (you learn the most from 
>> exceptional circumstances, right?).
>> 
>> I created two pools for my CephFS, one for metadata and one for data.
>> 
>> 
>> 
>> I have three nodes and the ceph osd tree looks like this:
>> ID  CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
>>  -1       21.54213 root default
>>  -3        0.75679     host node1
>>   0   hdd  0.75679         osd.0      up  0.00636 1.00000
>>  -5        5.39328     host node2
>>   1   hdd  2.66429         osd.1      up  0.65007 1.00000
>>   3   hdd  2.72899         osd.3      up  0.65007 1.00000
>> -10       15.39206     host node3
>>   5   hdd  7.27739         osd.5      up  1.00000 1.00000
>>   6   hdd  7.27739         osd.6      up  1.00000 1.00000
>>   2   ssd  0.38249         osd.2      up  1.00000 1.00000
>>   4   ssd  0.45479         osd.4      up  1.00000 1.00000
>> 
>> 
>> The PGs, and thus the data, are extremely unbalanced, as you can see in 
>> the ceph osd df output:
>> ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META     AVAIL   %USE  VAR  PGS STATUS
>>  0   hdd 0.75679  0.00636 775 GiB 651 GiB 650 GiB  88 KiB  1.5 GiB 124 GiB 84.02 7.26 112     up
>>  1   hdd 2.66429  0.65007 2.7 TiB 497 GiB 496 GiB  88 KiB  1.2 GiB 2.2 TiB 18.22 1.57  81     up
>>  3   hdd 2.72899  0.65007 2.7 TiB 505 GiB 504 GiB   8 KiB  1.3 GiB 2.2 TiB 18.07 1.56  88     up
>>  5   hdd 7.27739  1.00000 7.3 TiB 390 GiB 389 GiB   8 KiB  1.2 GiB 6.9 TiB  5.24 0.45  67     up
>>  6   hdd 7.27739  1.00000 7.3 TiB 467 GiB 465 GiB  64 KiB  1.3 GiB 6.8 TiB  6.26 0.54  78     up
>>  2   ssd 0.38249  1.00000 392 GiB  14 GiB  13 GiB  11 KiB 1024 MiB 377 GiB  3.68 0.32   2     up
>>  4   ssd 0.45479  1.00000 466 GiB  28 GiB  27 GiB   4 KiB 1024 MiB 438 GiB  6.03 0.52   4     up
>>                     TOTAL  22 TiB 2.5 TiB 2.5 TiB 273 KiB  8.4 GiB  19 TiB 11.57
>> MIN/MAX VAR: 0.32/7.26  STDDEV: 6.87
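As a reading aid for the table above: the VAR column is each OSD's %USE divided by the cluster-wide average %USE (the 11.57 in the TOTAL row). A small sketch that reproduces the column from the quoted numbers; the dictionary and function names are illustrative only.

```python
# %USE per OSD and the cluster average, copied from the quoted `ceph osd df`.
PCT_USE = {
    "osd.0": 84.02, "osd.1": 18.22, "osd.3": 18.07, "osd.5": 5.24,
    "osd.6": 6.26, "osd.2": 3.68, "osd.4": 6.03,
}
AVG_PCT_USE = 11.57  # from the TOTAL row


def variance_ratio(pct_use, avg_pct_use):
    """VAR: how full an OSD is relative to the cluster average."""
    return round(pct_use / avg_pct_use, 2)


VAR = {osd: variance_ratio(pct, AVG_PCT_USE) for osd, pct in PCT_USE.items()}
for osd, ratio in VAR.items():
    print(f"{osd}: {ratio:.2f}")
```

A VAR of 7.26 for osd.0 means it is more than seven times as full as the average OSD, even though its reweight has already been pushed down to 0.00636.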
>> 
>> To counteract this, I tried to turn on the balancer module.
>> 
>> The module keeps decreasing the reweight of osd.0, while ceph pg stat 
>> reports an increasing number of misplaced objects:
>> 
>> 144 pgs: 144 active+clean+remapped; 853 GiB data, 2.5 TiB used, 19 TiB / 22 
>> TiB avail; 30 MiB/s wr, 7 op/s; 242259/655140 objects misplaced (36.978%)
>> 
>> 
>> 
>> So my question is: is ceph supposed to do that?
>> Why are all those objects misplaced? Because of those 112 PGs on osd0?
>> Why are there 112 PGs on osd0? I did not set any pg settings except the 
>> number: 512
>> 
>> 
>> 
>> Thank you very much
>> Simon Sutter
>> _______________________________________________
>> ceph-users mailing list -- [email protected]
>> To unsubscribe send an email to [email protected]
> 
