On Wed, Jul 12, 2023 at 1:26 AM Frank Schilder <fr...@dtu.dk> wrote:

> Hi all,
>
> one problem solved, another coming up. For everyone ending up in the same
> situation, the trick seems to be to get all OSDs marked up and then allow
> recovery. Steps to take:
>
> - set noout, nodown, norebalance, norecover
> - wait patiently until all OSDs are shown as up
> - unset norebalance, norecover
> - wait wait wait, PGs will eventually become active as OSDs become
> responsive
> - unset nodown, noout
>

Nice work bringing the cluster back up.
Looking at the OSD logs would give more detail about why they were
flapping. Are these HDDs? Are the block.dbs on flash?
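
For reference, a quick way to check that from the cluster itself is the
"ceph osd metadata" output (just a sketch, using osd.135 from the status
below as an example; the exact field names vary a bit between releases):

  # rotational: 1 means a spinning disk; bluefs_db_rotational tells you
  # whether the DB/WAL device is on spinning media or flash
  ceph osd metadata 135 | grep -E 'rotational|objectstore|devices'

  # device classes and utilization overview for the whole cluster
  ceph osd crush class ls
  ceph osd df tree | head -40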

Generally, I've found that on clusters where OSDs are slow to boot and keep
flapping up and down, setting just "nodown" is sufficient to recover from
such issues.
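
In command form, the sequence Frank describes above boils down to something
like this (standard "ceph osd set/unset" flags; just a sketch, adapt the
order of the unsets to taste):

  ceph osd set noout
  ceph osd set nodown
  ceph osd set norebalance
  ceph osd set norecover

  # wait until "ceph osd stat" / "ceph -s" shows all OSDs up, then:
  ceph osd unset norebalance
  ceph osd unset norecover

  # wait for PGs to become active as OSDs become responsive, then:
  ceph osd unset nodown
  ceph osd unset noout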

Cheers, Dan

______________________________________________________ Clyso GmbH | Ceph
Support and Consulting | https://www.clyso.com

>
> Now the new problem: I have an ever-growing list of OSDs listed as
> rebalancing, but nothing is actually rebalancing. How can I stop this
> growth, and how can I get rid of this list:
>
> [root@gnosis ~]# ceph status
>   cluster:
>     id:     XXX
>     health: HEALTH_WARN
>             noout flag(s) set
>             Slow OSD heartbeats on back (longest 634775.858ms)
>             Slow OSD heartbeats on front (longest 635210.412ms)
>             1 pools nearfull
>
>   services:
>     mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 (age 6m)
>     mgr: ceph-25(active, since 57m), standbys: ceph-26, ceph-01, ceph-02,
> ceph-03
>     mds: con-fs2:8 4 up:standby 8 up:active
>     osd: 1260 osds: 1258 up (since 24m), 1258 in (since 45m)
>          flags noout
>
>   data:
>     pools:   14 pools, 25065 pgs
>     objects: 1.97G objects, 3.5 PiB
>     usage:   4.4 PiB used, 8.7 PiB / 13 PiB avail
>     pgs:     25028 active+clean
>              30    active+clean+scrubbing+deep
>              7     active+clean+scrubbing
>
>   io:
>     client:   1.3 GiB/s rd, 718 MiB/s wr, 7.71k op/s rd, 2.54k op/s wr
>
>   progress:
>     Rebalancing after osd.135 marked in (1s)
>       [=====================.......]
>     Rebalancing after osd.69 marked in (2s)
>       [========================....]
>     Rebalancing after osd.75 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.173 marked in (2s)
>       [========================....]
>     Rebalancing after osd.42 marked in (1s)
>       [=============...............] (remaining: 2s)
>     Rebalancing after osd.104 marked in (2s)
>       [========================....]
>     Rebalancing after osd.82 marked in (2s)
>       [========================....]
>     Rebalancing after osd.107 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.19 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.67 marked in (2s)
>       [=====================.......]
>     Rebalancing after osd.46 marked in (2s)
>       [===================.........] (remaining: 1s)
>     Rebalancing after osd.123 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.66 marked in (2s)
>       [====================........]
>     Rebalancing after osd.12 marked in (2s)
>       [==============..............] (remaining: 2s)
>     Rebalancing after osd.95 marked in (2s)
>       [=====================.......]
>     Rebalancing after osd.134 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.14 marked in (1s)
>       [===================.........]
>     Rebalancing after osd.56 marked in (2s)
>       [=====================.......]
>     Rebalancing after osd.143 marked in (1s)
>       [========================....]
>     Rebalancing after osd.118 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.96 marked in (2s)
>       [========================....]
>     Rebalancing after osd.105 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.44 marked in (1s)
>       [=======.....................] (remaining: 5s)
>     Rebalancing after osd.41 marked in (1s)
>       [==============..............] (remaining: 1s)
>     Rebalancing after osd.9 marked in (2s)
>       [=...........................] (remaining: 37s)
>     Rebalancing after osd.58 marked in (2s)
>       [======......................] (remaining: 8s)
>     Rebalancing after osd.140 marked in (1s)
>       [=======================.....]
>     Rebalancing after osd.132 marked in (2s)
>       [========================....]
>     Rebalancing after osd.31 marked in (1s)
>       [=========================...]
>     Rebalancing after osd.110 marked in (2s)
>       [========================....]
>     Rebalancing after osd.21 marked in (2s)
>       [=========================...]
>     Rebalancing after osd.114 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.83 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.23 marked in (1s)
>       [=======================.....]
>     Rebalancing after osd.25 marked in (1s)
>       [==========================..]
>     Rebalancing after osd.147 marked in (2s)
>       [========================....]
>     Rebalancing after osd.62 marked in (1s)
>       [======================......]
>     Rebalancing after osd.57 marked in (2s)
>       [======================......]
>     Rebalancing after osd.61 marked in (2s)
>       [====================........]
>     Rebalancing after osd.71 marked in (2s)
>       [===================.........]
>     Rebalancing after osd.80 marked in (2s)
>       [======================......]
>     Rebalancing after osd.92 marked in (2s)
>       [=====================.......]
>     Rebalancing after osd.171 marked in (2s)
>       [========================....]
>     Rebalancing after osd.11 marked in (2s)
>       [===========.................] (remaining: 2s)
>     Rebalancing after osd.90 marked in (2s)
>       [====================........]
>     Rebalancing after osd.54 marked in (2s)
>       [====================........]
>     Rebalancing after osd.45 marked in (2s)
>       [===================.........] (remaining: 1s)
>     Rebalancing after osd.53 marked in (1s)
>       [====================........]
>     Rebalancing after osd.22 marked in (3s)
>       [=======================.....]
>     Rebalancing after osd.27 marked in (2s)
>       [========================....]
>     Rebalancing after osd.37 marked in (2s)
>       [===.........................] (remaining: 14s)
>     Rebalancing after osd.94 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.55 marked in (2s)
>       [=====.......................] (remaining: 10s)
>     Rebalancing after osd.35 marked in (2s)
>       [=...........................] (remaining: 31s)
>     Rebalancing after osd.43 marked in (2s)
>       [================............] (remaining: 2s)
>     Rebalancing after osd.13 marked in (2s)
>       [=============...............] (remaining: 2s)
>     Rebalancing after osd.79 marked in (2s)
>       [=========================...]
>     Rebalancing after osd.50 marked in (2s)
>       [======......................] (remaining: 7s)
>     Rebalancing after osd.33 marked in (1s)
>       [............................]
>     Rebalancing after osd.20 marked in (1s)
>       [=======================.....]
>     Rebalancing after osd.59 marked in (2s)
>       [=====================.......]
>     Rebalancing after osd.101 marked in (2s)
>       [======================......]
>     Rebalancing after osd.49 marked in (2s)
>       [=====.......................] (remaining: 9s)
>     Rebalancing after osd.36 marked in (2s)
>       [==..........................] (remaining: 20s)
>     Rebalancing after osd.133 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.29 marked in (2s)
>       [======================......]
>     Rebalancing after osd.8 marked in (2s)
>       [===.........................] (remaining: 14s)
>     Rebalancing after osd.16 marked in (2s)
>       [========================....]
>     Rebalancing after osd.38 marked in (2s)
>       [===========.................] (remaining: 2s)
>     Rebalancing after osd.68 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.130 marked in (2s)
>       [======================......]
>     Rebalancing after osd.117 marked in (2s)
>       [======================......]
>     Rebalancing after osd.155 marked in (2s)
>       [========================....]
>     Rebalancing after osd.10 marked in (2s)
>       [==============..............] (remaining: 1s)
>     Rebalancing after osd.141 marked in (1s)
>       [=======================.....]
>     Rebalancing after osd.52 marked in (2s)
>       [====================........] (remaining: 1s)
>     Rebalancing after osd.177 marked in (1s)
>       [=======================.....]
>     Rebalancing after osd.97 marked in (1s)
>       [=======================.....]
>     Rebalancing after osd.98 marked in (1s)
>       [======================......]
>     Rebalancing after osd.88 marked in (2s)
>       [=====================.......]
>     Rebalancing after osd.116 marked in (2s)
>       [========================....]
>     Rebalancing after osd.108 marked in (2s)
>       [======================......]
>     Rebalancing after osd.17 marked in (1s)
>       [=====================.......]
>     Rebalancing after osd.129 marked in (2s)
>       [====================........]
>     Rebalancing after osd.167 marked in (2s)
>       [======================......]
>     Rebalancing after osd.152 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.77 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.5 marked in (2s)
>       [========....................] (remaining: 5s)
>     Rebalancing after osd.121 marked in (1s)
>       [======================......]
>     Rebalancing after osd.26 marked in (2s)
>       [==========================..]
>     Rebalancing after osd.91 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.81 marked in (2s)
>       [========================....]
>     Rebalancing after osd.48 marked in (2s)
>       [=====.......................] (remaining: 9s)
>     Rebalancing after osd.32 marked in (2s)
>       [=====================.......]
>     Rebalancing after osd.125 marked in (2s)
>       [========================....]
>     Rebalancing after osd.111 marked in (2s)
>       [======================......]
>     Rebalancing after osd.151 marked in (2s)
>       [======================......]
>     Rebalancing after osd.39 marked in (2s)
>       [============................] (remaining: 2s)
>     Rebalancing after osd.136 marked in (2s)
>       [========================....]
>     Rebalancing after osd.112 marked in (1s)
>       [=========================...]
>     Rebalancing after osd.154 marked in (1s)
>       [=========================...]
>     Rebalancing after osd.64 marked in (2s)
>       [===================.........]
>     Rebalancing after osd.34 marked in (2s)
>       [............................] (remaining: 90s)
>     Rebalancing after osd.161 marked in (1s)
>       [========================....]
>     Rebalancing after osd.160 marked in (2s)
>       [=======================.....]
>     Rebalancing after osd.142 marked in (2s)
>       [=======================.....]
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder <fr...@dtu.dk>
> Sent: Wednesday, July 12, 2023 9:53 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Cluster down after network outage
>
> Hi all,
>
> we had a network outage tonight (power loss) and restored the network in
> the morning. All OSDs were running during this period. After the network
> came back, peering hell broke loose and the cluster is having a hard time
> coming back up. OSDs get marked down all the time and come back later.
> Peering never stops.
>
> Below is the current status. I had all OSDs shown as up for a while, but
> many were not responsive. Are there any flags that help bring things up in
> a sequence that puts less load on the system?
>
> [root@gnosis ~]# ceph status
>   cluster:
>     id:     XXX
>     health: HEALTH_WARN
>             2 clients failing to respond to capability release
>             6 MDSs report slow metadata IOs
>             3 MDSs report slow requests
>             nodown,noout,nobackfill,norecover flag(s) set
>             176 osds down
>             Slow OSD heartbeats on back (longest 551718.679ms)
>             Slow OSD heartbeats on front (longest 549598.330ms)
>             Reduced data availability: 8069 pgs inactive, 3786 pgs down,
> 3161 pgs peering, 1341 pgs stale
>             Degraded data redundancy: 1187354920/16402772667 objects
> degraded (7.239%), 6222 pgs degraded, 6231 pgs undersized
>             1 pools nearfull
>             17386 slow ops, oldest one blocked for 1811 sec, daemons
> [osd.1128,osd.1152,osd.1154,osd.12,osd.1227,osd.1244,osd.328,osd.354,osd.381,osd.4]...
> have slow ops.
>
>   services:
>     mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 (age
> 28m)
>     mgr: ceph-25(active, since 30m), standbys: ceph-26, ceph-01, ceph-02,
> ceph-03
>     mds: con-fs2:8 4 up:standby 8 up:active
>     osd: 1260 osds: 1082 up (since 6m), 1258 in (since 18m); 266 remapped
> pgs
>          flags nodown,noout,nobackfill,norecover
>
>   data:
>     pools:   14 pools, 25065 pgs
>     objects: 1.91G objects, 3.4 PiB
>     usage:   3.1 PiB used, 6.0 PiB / 9.0 PiB avail
>     pgs:     0.626% pgs unknown
>              31.566% pgs not active
>              1187354920/16402772667 objects degraded (7.239%)
>              51/16402772667 objects misplaced (0.000%)
>              11706 active+clean
>              4752  active+undersized+degraded
>              3286  down
>              2702  peering
>              799   undersized+degraded+peered
>              464   stale+down
>              418   stale+active+undersized+degraded
>              214   remapped+peering
>              157   unknown
>              128   stale+peering
>              117   stale+remapped+peering
>              101   stale+undersized+degraded+peered
>              57    stale+active+undersized+degraded+remapped+backfilling
>              35    down+remapped
>              26    stale+undersized+degraded+remapped+backfilling+peered
>              23    undersized+degraded+remapped+backfilling+peered
>              14    active+clean+scrubbing+deep
>              9     stale+active+undersized+degraded+remapped+backfill_wait
>              7     active+recovering+undersized+degraded
>              7     stale+active+recovering+undersized+degraded
>              6     active+undersized+degraded+remapped+backfilling
>              6     active+undersized
>              5     active+undersized+degraded+remapped+backfill_wait
>              5     stale+remapped
>              4     stale+activating+undersized+degraded
>              3     active+undersized+remapped
>              3     stale+undersized+degraded+remapped+backfill_wait+peered
>              1     activating+undersized+degraded
>              1     activating+undersized+degraded+remapped
>              1     undersized+degraded+remapped+backfill_wait+peered
>              1     stale+active+clean
>              1     active+recovering
>              1     stale+down+remapped
>              1     undersized+peered
>              1     active+undersized+degraded+remapped
>              1     active+clean+scrubbing
>              1     active+clean+remapped
>              1     active+recovering+degraded
>
>   io:
>     client:   1.8 MiB/s rd, 18 MiB/s wr, 409 op/s rd, 796 op/s wr
>
> Thanks for any hints!
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
