Hi Joseph,

thank you for the answer. But if I'm reading the 'ceph osd df' output I
posted correctly, I see there are about 195 PGs per OSD.

There are 608 OSDs in the pool, which is the only data pool. From what I
have calculated, the PG calculator says that the PG number is fine.
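
A quick way to double-check that figure (assuming jq is installed and that
'ceph osd df --format json' keeps its usual layout with a 'nodes' array and
a 'pgs' field per OSD) is:

  ceph osd df --format json | jq '[.nodes[].pgs] | (add / length), min, max'

which prints the average, minimum and maximum PG count per OSD. And if the
count really did need raising, the knob Joseph refers to would be
'ceph osd pool set <pool> pg_num <n>'.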


On 11/1/22 14:03, Joseph Mundackal wrote:
If the GB per pg is high, the balancer module won't be able to help.

Your pg count per osd also looks low (30's), so increasing pgs per pool would help with both problems.

You can use the pg calculator to determine which pools need what

On Tue, Nov 1, 2022, 08:46 Denis Polom <denispo...@gmail.com> wrote:

    Hi

    I observed on my Ceph cluster running the latest Pacific that same-size
    OSDs are utilized differently even though the balancer is running and
    reports its status as perfectly balanced.

    {
         "active": true,
         "last_optimize_duration": "0:00:00.622467",
         "last_optimize_started": "Tue Nov  1 12:49:36 2022",
         "mode": "upmap",
         "optimize_result": "Unable to find further optimization, or pool(s) pg_num is decreasing, or distribution is already perfect",
         "plans": []
    }

    balancer settings for upmap are:

    mgr  advanced  mgr/balancer/mode                     upmap
    mgr  advanced  mgr/balancer/upmap_max_deviation      1
    mgr  advanced  mgr/balancer/upmap_max_optimizations  20
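
    For reference, these values live in the mon config database and can be
    set and verified with the standard config commands, e.g.:

    ceph config set mgr mgr/balancer/upmap_max_deviation 1
    ceph config set mgr mgr/balancer/upmap_max_optimizations 20
    ceph config dump | grep mgr/balancer
    ceph balancer status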

    It's obvious from `ceph osd df` that the utilization is not the same (the
    difference is about 1 TB). The following is just a partial output:

    ID   CLASS  WEIGHT    REWEIGHT  SIZE    RAW USE  DATA    OMAP     META    AVAIL    %USE   VAR   PGS  STATUS
      0   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  3.0 MiB  37 GiB  3.6 TiB  78.09  1.05  196      up
    124   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  32 GiB  4.7 TiB  71.20  0.96  195      up
    157   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  5.3 MiB  35 GiB  3.7 TiB  77.67  1.05  195      up
      1   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  2.0 MiB  35 GiB  3.7 TiB  77.69  1.05  195      up
    243   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  31 GiB  4.7 TiB  71.16  0.96  195      up
    244   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  31 GiB  4.7 TiB  71.19  0.96  195      up
    245   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  32 GiB  4.7 TiB  71.55  0.96  196      up
    246   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  31 GiB  4.7 TiB  71.17  0.96  195      up
    249   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  30 GiB  4.7 TiB  71.18  0.96  195      up
    500   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  30 GiB  4.7 TiB  71.19  0.96  195      up
    501   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  31 GiB  4.7 TiB  71.57  0.96  196      up
    502   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  31 GiB  4.7 TiB  71.18  0.96  195      up
    532   hdd   18.00020   1.00000  16 TiB   12 TiB  12 TiB      0 B  31 GiB  4.7 TiB  71.16  0.96  195      up
    549   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  576 KiB  36 GiB  3.7 TiB  77.70  1.05  195      up
    550   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  3.8 MiB  36 GiB  3.7 TiB  77.67  1.05  195      up
    551   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  2.4 MiB  35 GiB  3.7 TiB  77.68  1.05  195      up
    552   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  5.5 MiB  35 GiB  3.7 TiB  77.69  1.05  195      up
    553   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  5.1 MiB  37 GiB  3.6 TiB  77.71  1.05  195      up
    554   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  967 KiB  36 GiB  3.6 TiB  77.71  1.05  195      up
    555   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  1.3 MiB  36 GiB  3.6 TiB  78.08  1.05  196      up
    556   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  4.7 MiB  36 GiB  3.6 TiB  78.10  1.05  196      up
    557   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  2.4 MiB  36 GiB  3.7 TiB  77.69  1.05  195      up
    558   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  4.5 MiB  36 GiB  3.6 TiB  77.72  1.05  195      up
    559   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  1.5 MiB  35 GiB  3.6 TiB  78.09  1.05  196      up
    560   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  5.2 MiB  35 GiB  3.7 TiB  77.69  1.05  195      up
    561   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  2.8 MiB  35 GiB  3.7 TiB  77.69  1.05  195      up
    562   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  1.0 MiB  36 GiB  3.7 TiB  77.68  1.05  195      up
    563   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  2.6 MiB  36 GiB  3.7 TiB  77.68  1.05  195      up
    564   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  5.1 MiB  36 GiB  3.6 TiB  78.09  1.05  196      up
    567   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  4.8 MiB  36 GiB  3.6 TiB  78.11  1.05  196      up
    568   hdd   18.00020   1.00000  16 TiB   13 TiB  13 TiB  5.2 MiB  35 GiB  3.7 TiB  77.68  1.05  195      up

    All OSDs are used by the same pool (EC).
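
    The pool layout and the balancer's own score for it can be checked with
    the commands below (substitute the actual data pool name; cephfs_data is
    the one from the second cluster's output further down). A lower score
    means a more even distribution:

    ceph osd pool ls detail | grep cephfs_data
    ceph balancer eval cephfs_data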

    I have the same issue on another Ceph cluster with the same setup. There
    I was able to even out OSD utilization by lowering the reweight from
    1.00000 on the OSDs with higher utilization, and I gained a lot of free
    space:
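
    Concretely, the workaround was nothing more exotic than per-OSD reweight
    commands along these lines (the OSD id and values are only illustrative;
    reweight-by-utilization can also pick the most over-utilized OSDs
    automatically):

    ceph osd reweight 556 0.95
    ceph osd test-reweight-by-utilization 110   # dry run
    ceph osd reweight-by-utilization 110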

    before changing reweight:

    --- RAW STORAGE ---
    CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
    hdd    3.1 PiB  510 TiB  2.6 PiB   2.6 PiB      83.77
    ssd    2.6 TiB  2.6 TiB   46 GiB    46 GiB       1.70
    TOTAL  3.1 PiB  513 TiB  2.6 PiB   2.6 PiB      83.70

    --- POOLS ---
    POOL                   ID   PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
    cephfs_data             3  8192  2.1 PiB  555.63M  2.6 PiB  91.02    216 TiB
    cephfs_metadata         4   128  7.5 GiB  140.22k   22 GiB   0.87    851 GiB
    device_health_metrics   5     1  4.1 GiB    1.15k  8.3 GiB      0    130 TiB


    after changing reweight:
    --- RAW STORAGE ---
    CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
    hdd    3.1 PiB  522 TiB  2.6 PiB   2.6 PiB      83.38
    ssd    2.6 TiB  2.6 TiB   63 GiB    63 GiB       2.36
    TOTAL  3.1 PiB  525 TiB  2.6 PiB   2.6 PiB      83.31

    --- POOLS ---
    POOL                   ID   PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
    cephfs_data             3  8192  2.1 PiB  555.63M  2.5 PiB  86.83    330 TiB
    cephfs_metadata         4   128  7.4 GiB  140.22k   22 GiB   0.87    846 GiB
    device_health_metrics   5     1  4.2 GiB    1.15k  8.4 GiB      0    198 TiB

    The free space I gained is almost 5%, which is about 100 TB!

    This is just a workaround, and I'm not happy about keeping reweight at a
    non-default value permanently.

    Do you have any advice, please, on which settings can or should be
    adjusted to keep OSD utilization the same? Because obviously neither the
    upmap balancer nor crush-compat is working correctly, at least in my
    case.
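
    In case it helps with diagnosing this: the upmap optimizer can also be
    run offline against an exported osdmap to see whether it would produce
    any further mappings at all (osdmaptool ships with Ceph; the pool name
    and deviation below are just the values from this thread):

    ceph osd getmap -o /tmp/osdmap
    osdmaptool /tmp/osdmap --upmap /tmp/upmap.sh \
        --upmap-pool cephfs_data --upmap-deviation 1
    # inspect /tmp/upmap.sh, then apply it with: source /tmp/upmap.sh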

    Many thanks!

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
