Thanks for your swift reply.  Below is the requested information.

I understand the bit about not being able to reduce the PG count, as we've 
come across this issue once before.  That's the reason I've been hesitant to 
make any changes there without being 100% certain of getting it right and 
understanding the impact of those changes.  That, and the more I read about 
how to calculate this, the more confused I get.  As for the reweight, is that 
just a matter of "ceph osd reweight osd.3 1" once the other issues are sorted 
out (or perhaps starting with a less dramatic change and working up)?
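(Thinking out loud about the "less dramatic" option: something like the dry 
run below, which only prints the commands it would run rather than executing 
them.  The step values 0.60/0.80/1.00 are purely illustrative assumptions on 
my part, not recommendations.)

```shell
#!/bin/sh
# Dry run: print (do not execute) a gradual walk of osd.3's override
# weight back toward 1.0.  The step values are illustrative only.
for w in 0.60 0.80 1.00; do
    echo "ceph osd reweight osd.3 $w"
    # In practice: let rebalancing settle (watch "ceph -s") between steps.
done
```

The idea being that each step moves a smaller amount of data than jumping 
straight from 0.45 to 1.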

Also, presuming I need to change the pg/pgp num, would you suggest doing that 
on pool 2 based on the info below (the pool with a few large files) or on 
pool 20 (the pool with the most data but an average file size of about 
250KB)?  I'm just completely confused as to what caused this issue in the 
first place and how to go about fixing it.  On top of that, will I be able to 
increase the pg/pgp count with the cluster in a state of HEALTH_WARN?  Some 
posts I've read seem to indicate that the health state needs to be OK before 
this sort of thing can be changed (but I could be misunderstanding what I'm 
reading).
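(For reference, if the answer turns out to be pool 20, my understanding of 
the Jewel-era procedure is to raise pg_num in steps, letting each batch of 
new PGs finish creating before bumping pgp_num to match.  The sketch below 
only prints the commands rather than running them; the pool name is from the 
output further down, but the 128/256/512 steps and the 512 target are my 
assumptions, not advice.)

```shell
#!/bin/sh
# Dry run: print the stepped pg_num/pgp_num bumps for the data pool.
# The target of 512 is a placeholder -- size it for your own cluster.
pool=default.rgw.buckets.data
for n in 128 256 512; do
    echo "ceph osd pool set $pool pg_num $n"
    # Wait for the new PGs to finish creating, then match pgp_num:
    echo "ceph osd pool set $pool pgp_num $n"
done
```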

Anyway, here's the info:

# ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    28219G     11227G       15558G         55.13
POOLS:
    NAME                          ID     USED      %USED     MAX AVAIL      OBJECTS
    rbd                           0          0         0          690G            0
    KUBERNETES                    1       122G     15.11          690G        34188
    KUBERNETES_METADATA           2     49310k         0          690G         1426
    default.rgw.control           11         0         0          690G            8
    default.rgw.data.root         12    20076k         0          690G        54412
    default.rgw.gc                13         0         0          690G           32
    default.rgw.log               14         0         0          690G          127
    default.rgw.users.uid         15      4942         0          690G           15
    default.rgw.users.keys        16       126         0          690G            4
    default.rgw.users.swift       17       252         0          690G            8
    default.rgw.buckets.index     18         0         0          690G        27206
    .rgw.root                     19      1588         0          690G            4
    default.rgw.buckets.data      20     7402G     91.47          690G     30931617
    default.rgw.users.email       21         0         0          690G            0


# ceph osd pool ls detail
pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 1 'KUBERNETES' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 100 pgp_num 100 last_change 17 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 2 'KUBERNETES_METADATA' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 100 pgp_num 100 last_change 16 flags hashpspool stripe_width 0
pool 11 'default.rgw.control' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 68 flags hashpspool stripe_width 0
pool 12 'default.rgw.data.root' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 69 flags hashpspool stripe_width 0
pool 13 'default.rgw.gc' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 70 flags hashpspool stripe_width 0
pool 14 'default.rgw.log' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 71 flags hashpspool stripe_width 0
pool 15 'default.rgw.users.uid' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 72 flags hashpspool stripe_width 0
pool 16 'default.rgw.users.keys' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 73 flags hashpspool stripe_width 0
pool 17 'default.rgw.users.swift' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 74 flags hashpspool stripe_width 0
pool 18 'default.rgw.buckets.index' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 75 flags hashpspool stripe_width 0
pool 19 '.rgw.root' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 76 flags hashpspool stripe_width 0
pool 20 'default.rgw.buckets.data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 442 flags hashpspool stripe_width 0
pool 21 'default.rgw.users.email' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 16 pgp_num 16 last_change 260 flags hashpspool stripe_width 0

On Thu, 2020-10-29 at 07:05 +0000, Frank Schilder wrote:

Hi Mark,


it looks like you have some very large PGs.  You are also running with quite 
a low PG count, in particular for the large pool.  Please post the output of 
"ceph df" and "ceph osd pool ls detail" so we can see how much data is in 
each pool, along with some pool info.  I guess you need to increase the PG 
count of the large pool to split PGs up and also reduce the impact of 
imbalance.  When I look at this:


 3 1.37790  0.45013  1410G  1079G   259G 76.49 1.39  21
 4 1.37790  0.95001  1410G  1086G   253G 76.98 1.40  44


I would conclude that the PGs are too large; the reweight of 0.45 having had 
so little effect on utilization indicates that.  This weight will need to be 
rectified as well at some point.


You should be able to run with 100-200 PGs per OSD. Please be aware that PG 
planning requires caution as you cannot reduce the PG count of a pool in your 
version. You need to know how much data is in the pools right now and what the 
future plan is.
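The 100-200 PGs per OSD guideline can be turned into a rough total with the 
usual rule of thumb: OSDs times target-PGs-per-OSD, divided by the replica 
count, rounded to a power of two.  A sketch using the numbers from this 
thread (20 OSDs, size 2); the 100-per-OSD target is an assumption at the low 
end of the range:

```shell
#!/bin/sh
# Rough total-PG budget: (OSDs * PGs-per-OSD) / replica count,
# then round down to a power of two for the dominant pool.
osds=20; per_osd=100; replicas=2
total=$(( osds * per_osd / replicas ))   # 1000 PG "slots" in total
pg=1
while [ $(( pg * 2 )) -le "$total" ]; do pg=$(( pg * 2 )); done
echo "$pg"   # prints 512
```

With these numbers that comes out at 512, which would then be shared across 
all pools, with the bulk going to the big data pool.  Treat it as a starting 
point for planning, not a prescription.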


Best regards,

=================

Frank Schilder

AIT Risø Campus

Bygning 109, rum S14


________________________________________

From: Mark Johnson <ma...@iovox.com>
Sent: 29 October 2020 06:55:55
To: ceph-users@ceph.io
Subject: [ceph-users] pgs stuck backfill_toofull


I've been struggling with this one for a few days now.  We had an OSD report as 
near full a few days ago.  Had this happen a couple of times before and a 
reweight-by-utilization has sorted it out in the past.  Tried the same again 
but this time we ended up with a couple of pgs in a state of backfill_toofull 
and a handful of misplaced objects as a result.


Tried doing the reweight a few more times and it's been moving data around.  We 
did have another osd trigger the near full alert but running the reweight a 
couple more times seems to have moved some of that data around a bit better.  
However, the original near_full osd doesn't seem to have changed much and the 
backfill_toofull pgs are still there.  I'd keep doing the 
reweight-by-utilization, but I'm not sure whether I'm heading down the right 
path and whether it will eventually sort things out.


We have 14 pools, but the vast majority of data resides in just one of those 
pools (pool 20).  The pgs in the backfill state are in pool 2 (as far as I can 
tell).  That particular pool is used for some cephfs stuff and has a handful of 
large files in there (not sure if this is significant to the problem).


All up, our utilization is showing as 55.13%, but some of our OSDs are 
showing as 76% in use, with this one problem OSD sitting at 85.02%.  Right 
now, I'm just not sure what the proper corrective action is.  The last couple 
of reweights I've run have been a bit more targeted in that I've set them to 
only operate on two OSDs at a time.  If I run a test-reweight targeting only 
one OSD, it does say it will reweight OSD 9 (the one at 85.02%).  I gather 
this will move data away
from this OSD and potentially get it below the threshold.  However, at one 
point in the past couple of days, it's shown as no OSDs in a near full state, 
yet the two pgs in backfill_toofull didn't change.  So, that's why I'm not sure 
continually reweighting is going to solve this issue.


I'm a long way from knowledgeable on Ceph, so I'm not really sure what 
information is useful here.  Here's a bit of info on what I'm seeing; I can 
provide anything else that might help.



Basically, we have a three node cluster but only two have OSDs.  The third is 
there simply to enable a quorum to be established.  The OSDs are evenly spread 
across these two nodes and the configuration of each is identical.  We are 
running Jewel and are not in a position to upgrade at this stage.





# ceph --version

ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)



# ceph health detail

HEALTH_WARN 2 pgs backfill_toofull; 2 pgs stuck unclean; recovery 33/62099566 objects misplaced (0.000%); 1 near full osd(s)
pg 2.52 is stuck unclean for 201822.031280, current state active+remapped+backfill_toofull, last acting [17,3]
pg 2.18 is stuck unclean for 202114.617682, current state active+remapped+backfill_toofull, last acting [18,2]
pg 2.18 is active+remapped+backfill_toofull, acting [18,2]
pg 2.52 is active+remapped+backfill_toofull, acting [17,3]
recovery 33/62099566 objects misplaced (0.000%)
osd.9 is near full at 85%



# ceph osd df

ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 2 1.37790  1.00000  1410G   842G   496G 59.75 1.08  33
 3 1.37790  0.45013  1410G  1079G   259G 76.49 1.39  21
 4 1.37790  0.95001  1410G  1086G   253G 76.98 1.40  44
 5 1.37790  1.00000  1410G   617G   722G 43.74 0.79  43
 6 1.37790  0.65009  1410G   616G   722G 43.69 0.79  39
 7 1.37790  0.95001  1410G   495G   844G 35.10 0.64  40
 8 1.37790  1.00000  1410G   732G   606G 51.93 0.94  52
 9 1.37790  0.70007  1410G  1199G   139G 85.02 1.54  37
10 1.37790  1.00000  1410G   611G   727G 43.35 0.79  41
11 1.37790  0.75006  1410G   495G   843G 35.11 0.64  32
 0 1.37790  1.00000  1410G   731G   608G 51.82 0.94  43
12 1.37790  1.00000  1410G   851G   487G 60.36 1.09  44
13 1.37790  1.00000  1410G   378G   960G 26.82 0.49  38
14 1.37790  1.00000  1410G   969G   370G 68.68 1.25  37
15 1.37790  1.00000  1410G   724G   614G 51.35 0.93  35
16 1.37790  1.00000  1410G   491G   847G 34.84 0.63  43
17 1.37790  1.00000  1410G   862G   476G 61.16 1.11  50
18 1.37790  0.80005  1410G  1083G   255G 76.78 1.39  26
19 1.37790  0.65009  1410G   963G   375G 68.29 1.24  23
20 1.37790  1.00000  1410G   724G   614G 51.38 0.93  42
              TOTAL 28219G 15557G 11227G 55.13
MIN/MAX VAR: 0.49/1.54  STDDEV: 15.57



# ceph pg ls backfill_toofull

pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
2.18 9 0 0 18 0 0 3653 3653 active+remapped+backfill_toofull 2020-10-29 05:31:20.429912 610'549153 656:390372 [9,12] 9 [18,2] 18 594'547482 2020-10-25 20:28:39.680744 594'543841 2020-10-21 21:21:33.092868
2.52 15 0 0 15 0 0 4883 4883 active+remapped+backfill_toofull 2020-10-29 05:31:28.277898 652'502085 656:367288 [17,9] 17 [17,3] 17 594'499108 2020-10-26 11:06:48.417825 594'499108 2020-10-26 11:06:48.417825



pool : 17 18 19 11 20 21 12 13 0 14 1 15 2 16 | SUM
--------------------------------------------------------------------------------------------------------------------------------
osd.4 3 0 0 0 9 2 0 0 12 1 9 0 7 1 | 44
osd.17 1 0 0 0 7 3 1 0 8 1 17 1 11 0 | 50
osd.18 0 0 0 0 9 0 0 0 4 0 7 0 5 0 | 25
osd.5 0 0 0 2 5 1 1 0 5 0 16 0 11 2 | 43
osd.6 0 1 0 1 5 2 0 0 9 0 13 1 7 0 | 39
osd.19 0 0 1 0 8 2 0 1 2 0 6 0 3 0 | 23
osd.7 0 0 0 0 4 1 1 0 3 0 12 0 19 0 | 40
osd.8 0 1 0 0 6 3 0 2 10 1 13 1 15 0 | 52
osd.9 1 0 2 0 10 2 0 0 4 1 6 1 10 0 | 37
osd.10 0 0 1 1 5 2 0 1 7 0 12 0 11 1 | 41
osd.20 1 0 0 0 6 1 0 1 7 0 8 1 17 0 | 42
osd.11 0 0 0 0 4 1 1 1 5 0 11 0 9 0 | 32
osd.12 0 0 1 1 7 1 0 0 5 1 12 1 14 1 | 44
osd.13 0 2 0 0 3 1 0 0 10 1 11 0 10 0 | 38
osd.0 0 1 0 1 6 3 0 1 7 0 11 0 13 0 | 43
osd.14 1 0 0 0 8 1 1 0 4 1 12 0 9 0 | 37
osd.15 1 0 2 1 6 1 1 0 8 0 7 0 6 2 | 35
osd.2 0 2 1 0 7 2 1 0 7 1 4 1 6 0 | 32
osd.3 0 0 0 0 9 0 0 0 2 0 4 0 5 0 | 20
osd.16 0 1 0 1 4 3 1 1 9 0 9 1 12 1 | 43
--------------------------------------------------------------------------------------------------------------------------------
SUM : 8 8 8 8 128 32 8 8 128 8 200 8 200 8 |

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io