On Fri, Jul 28, 2017 at 05:43:14PM +0800, linghucongsong wrote:
>
>
> It looks like the OSDs in your cluster are not all the same size.
>
> Can you show ceph osd df output?
you're right, they're not.. here's the output:
[root@v1b ~]# ceph osd df tree
 ID   WEIGHT REWEIGHT   SIZE   USE  AVAIL  %USE  VAR PGS TYPE NAME
 -2  1.55995        -  1706G  883G   805G 51.78 2.55   0 root ssd
 -9  0.39999        -   393G  221G   171G 56.30 2.78   0     host v1c-ssd
 10  0.39999  1.00000   393G  221G   171G 56.30 2.78  98         osd.10
-10  0.59998        -   683G  275G   389G 40.39 1.99   0     host v1a-ssd
  5  0.29999  1.00000   338G  151G   187G 44.77 2.21  65         osd.5
 26  0.29999  1.00000   344G  124G   202G 36.07 1.78  52         osd.26
-11  0.34000        -   338G  219G   119G 64.68 3.19   0     host v1b-ssd
 13  0.34000  1.00000   338G  219G   119G 64.68 3.19  96         osd.13
 -7  0.21999        -   290G  166G   123G 57.43 2.83   0     host v1d-ssd
 19  0.21999  1.00000   290G  166G   123G 57.43 2.83  73         osd.19
 -1 39.29982        - 43658G 8312G 34787G 19.04 0.94   0 root default
 -4 11.89995        - 12806G 2422G 10197G 18.92 0.93   0     host v1a
  6  1.59999  1.00000  1833G  358G  1475G 19.53 0.96 366         osd.6
  8  1.79999  1.00000  1833G  313G  1519G 17.11 0.84 370         osd.8
  2  1.59999  1.00000  1833G  320G  1513G 17.46 0.86 331         osd.2
  0  1.70000  1.00000  1804G  431G  1373G 23.90 1.18 359         osd.0
  4  1.59999  1.00000  1833G  294G  1539G 16.07 0.79 360         osd.4
 25  3.59999  1.00000  3667G  704G  2776G 19.22 0.95 745         osd.25
 -5 10.39995        - 10914G 2154G  8573G 19.74 0.97   0     host v1b
  1  1.59999  1.00000  1804G  350G  1454G 19.42 0.96 409         osd.1
  3  1.79999  1.00000  1804G  360G  1444G 19.98 0.99 412         osd.3
  9  1.59999  1.00000  1804G  331G  1473G 18.37 0.91 363         osd.9
 11  1.79999  1.00000  1833G  367G  1465G 20.06 0.99 415         osd.11
 24  3.59999  1.00000  3667G  744G  2736G 20.30 1.00 834         osd.24
 -6  7.79996        -  9051G 1769G  7282G 19.54 0.96   0     host v1c
 14  1.59999  1.00000  1804G  370G  1433G 20.54 1.01 442         osd.14
 15  1.79999  1.00000  1833G  383G  1450G 20.92 1.03 447         osd.15
 16  1.39999  1.00000  1804G  295G  1508G 16.38 0.81 355         osd.16
 18  1.39999  1.00000  1804G  366G  1438G 20.29 1.00 381         osd.18
 17  1.59999  1.00000  1804G  353G  1451G 19.57 0.97 429         osd.17
 -3  9.19997        - 10885G 1965G  8733G 18.06 0.89   0     host v1d-sata
 12  1.39999  1.00000  1804G  348G  1455G 19.32 0.95 365         osd.12
 20  1.39999  1.00000  1804G  335G  1468G 18.60 0.92 371         osd.20
 21  3.59999  1.00000  3667G  695G  2785G 18.97 0.94 871         osd.21
 22  1.39999  1.00000  1804G  281G  1522G 15.63 0.77 326         osd.22
 23  1.39999  1.00000  1804G  303G  1500G 16.83 0.83 321         osd.23
                TOTAL 45365G 9195G 35592G 20.27
MIN/MAX VAR: 0.77/3.19 STDDEV: 14.69
apart from replacing the OSDs, how can I fix this?
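
the least invasive option I can think of is reweighting rather than
replacing; a sketch with illustrative values only, not recommendations:

[root@v1b ~]# ceph osd crush reweight osd.13 0.30    # shrink the CRUSH weight of the fullest ssd OSD (osd.13, 64.68% used)
[root@v1b ~]# ceph osd reweight-by-utilization 110   # or let ceph lower override weights on OSDs above 110% of mean utilization

(the 0.30 target and the 110 threshold are made up for this example, and
reweighting of course triggers more data movement itself)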
>
>
> At 2017-07-28 17:24:29, "Nikola Ciprich" <[email protected]> wrote:
> >I forgot to add that the OSD daemons really seem to be idle: no disk
> >activity, no CPU usage.. it just looks to me like some kind of
> >deadlock, as if they were waiting for each other..
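> >
> >for reference, a way to see what a stuck PG is actually waiting on (the
> >pg id 3.5d below is just an example; osd.13 sits on host v1b):
> >
> >[root@v1b ~]# ceph pg dump_stuck unclean             # list PGs stuck unclean
> >[root@v1b ~]# ceph pg 3.5d query                     # recovery_state shows what the PG is waiting for
> >[root@v1b ~]# ceph daemon osd.13 dump_ops_in_flight  # run on the OSD's own host; shows blocked ops, if any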
> >
> >and so I've been trying to get the last 1.5% of misplaced / degraded PGs
> >recovered for almost a week..
> >
> >
> >On Fri, Jul 28, 2017 at 10:56:02AM +0200, Nikola Ciprich wrote:
> >> Hi,
> >>
> >> I'm trying to find the reason for the strange recovery issues I'm seeing
> >> on our cluster..
> >>
> >> it's a mostly idle, 4-node cluster with 26 OSDs evenly distributed
> >> across the nodes, running jewel 10.2.9.
> >>
> >> the problem is that after some disk replacements and data moves, recovery
> >> progresses extremely slowly.. PGs seem to be stuck in the
> >> active+recovering+degraded state:
> >>
> >> [root@v1d ~]# ceph -s
> >>     cluster a5efbc87-3900-4c42-a977-8c93f7aa8c33
> >>      health HEALTH_WARN
> >>             159 pgs backfill_wait
> >>             4 pgs backfilling
> >>             259 pgs degraded
> >>             12 pgs recovering
> >>             113 pgs recovery_wait
> >>             215 pgs stuck degraded
> >>             266 pgs stuck unclean
> >>             140 pgs stuck undersized
> >>             151 pgs undersized
> >>             recovery 37788/2327775 objects degraded (1.623%)
> >>             recovery 23854/2327775 objects misplaced (1.025%)
> >>             noout,noin flag(s) set
> >>      monmap e21: 3 mons at {v1a=10.0.0.1:6789/0,v1b=10.0.0.2:6789/0,v1c=10.0.0.3:6789/0}
> >>             election epoch 6160, quorum 0,1,2 v1a,v1b,v1c
> >>       fsmap e817: 1/1/1 up {0=v1a=up:active}, 1 up:standby
> >>      osdmap e76002: 26 osds: 26 up, 26 in; 185 remapped pgs
> >>             flags noout,noin,sortbitwise,require_jewel_osds
> >>       pgmap v80995844: 3200 pgs, 4 pools, 2876 GB data, 757 kobjects
> >>             9215 GB used, 35572 GB / 45365 GB avail
> >>             37788/2327775 objects degraded (1.623%)
> >>             23854/2327775 objects misplaced (1.025%)
> >>                 2912 active+clean
> >>                  130 active+undersized+degraded+remapped+wait_backfill
> >>                   97 active+recovery_wait+degraded
> >>                   29 active+remapped+wait_backfill
> >>                   12 active+recovery_wait+undersized+degraded+remapped
> >>                    6 active+recovering+degraded
> >>                    5 active+recovering+undersized+degraded+remapped
> >>                    4 active+undersized+degraded+remapped+backfilling
> >>                    4 active+recovery_wait+degraded+remapped
> >>                    1 active+recovering+degraded+remapped
> >>   client io 2026 B/s rd, 146 kB/s wr, 9 op/s rd, 21 op/s wr
> >>
> >>
> >> when I restart the affected OSDs, it bumps the recovery along, but then
> >> other PGs get stuck.. all OSDs have been restarted multiple times, none
> >> are even close to nearfull, and I just can't find what I'm doing wrong..
> >>
> >> possibly related OSD options:
> >>
> >> osd max backfills = 4
> >> osd recovery max active = 15
> >> debug osd = 0/0
> >> osd op threads = 4
> >> osd backfill scan min = 4
> >> osd backfill scan max = 16
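> >>
> >> these can also be raised at runtime to test whether the limits are the
> >> bottleneck (the values below are examples only, not recommendations):
> >>
> >> ceph tell osd.* injectargs '--osd-max-backfills 8 --osd-recovery-max-active 20'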
> >>
> >> Any hints would be greatly appreciated
> >>
> >> thanks
> >>
> >> nik
> >>
--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava
tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz
mobil servis: +420 737 238 656
email servis: [email protected]
-------------------------------------
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com