My PGs are healthy now, but the underlying problem itself is not fixed. I was 
wondering whether someone knows a quick fix to get the PGs complete right away.

The down OSDs were shut down a long time ago and are sitting in a different 
crush root. Only 1 OSD (out of the 275) was temporarily down, in an HDD pool 
that I'm re-organising right now.

I should have mentioned that I know that a long-standing bug in Ceph is the 
reason for this partial data loss (https://tracker.ceph.com/issues/46847). I 
thought I had a fully functional workaround, but it turned out that I was 
wrong. My workaround fixes all incomplete PGs, except those that are in the 
state "backfilling" at the time of the OSD restart.

I will file a new tracker item, as this looks like a catastrophic bug. Any 
cluster that is rebalancing, whether after adding disks, after increasing 
pg[p]_num on a pool, or after a similar operation, is at risk. You will find 
many threads related to this problem, but the actual underlying bug has never 
been addressed completely. Some people actually lost data due to it; in 
particular, EC pools can become damaged beyond repair. From all the threads I 
found, this seems to be the one long-standing bug in ceph/rados that can cause 
data loss. A lot of clusters are affected; people are mostly just lucky. 
Reports date back to Luminous and continue all the way up to Nautilus.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Amudhan P <[email protected]>
Sent: 09 November 2020 03:16:40
To: Frank Schilder
Cc: ceph-users
Subject: Re: [ceph-users] pg xyz is stuck undersized for long time

Hi Frank,

You said only one OSD is down, but the ceph status output shows more than 20 OSDs down.

Regards,
Amudhan

On Sun 8 Nov, 2020, 12:13 AM Frank Schilder, 
<[email protected]<mailto:[email protected]>> wrote:
Hi all,

I moved the crush location of 8 OSDs and rebalancing went on happily (misplaced 
objects only). Today, osd.1 crashed, restarted and rejoined the cluster. 
However, it did not rejoin some of the PGs it was a member of, so I now have 
undersized PGs for no apparent reason:

PG_DEGRADED Degraded data redundancy: 52173/2268789087 objects degraded 
(0.002%), 2 pgs degraded, 7 pgs undersized
    pg 11.52 is stuck undersized for 663.929664, current state 
active+undersized+remapped+backfilling, last acting 
[237,60,2147483647,74,233,232,292,86]
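The 2147483647 in the acting set is not a real OSD id: it is 0x7fffffff, the 
CRUSH_ITEM_NONE sentinel that Ceph prints when a shard of an EC PG has no OSD 
mapped at all. A small Python sketch (the helper name is mine) to pick such 
shards out of an acting list taken from `ceph pg <pgid> query`:

```python
# 2147483647 == 0x7fffffff is CRUSH_ITEM_NONE: Ceph's sentinel value
# for "no OSD mapped to this shard" in an acting/up set.
CRUSH_ITEM_NONE = 0x7FFFFFFF

def missing_shards(acting):
    """Return the shard indices of an acting set that have no OSD."""
    return [i for i, osd in enumerate(acting) if osd == CRUSH_ITEM_NONE]

# Acting set of pg 11.52 as reported above: shard 2 has no OSD.
acting = [237, 60, 2147483647, 74, 233, 232, 292, 86]
print(missing_shards(acting))  # -> [2]
```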

The up and acting sets are:

    "up": [
        237,
        2,
        74,
        289,
        233,
        232,
        292,
        86
    ],
    "acting": [
        237,
        60,
        2147483647,
        74,
        233,
        232,
        292,
        86
    ],
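Lining the two sets up shard by shard makes the problem visible: shard 2 has no 
acting OSD at all, while two other shards are still being backfilled. A quick 
comparison sketch (plain Python, the function name is mine):

```python
CRUSH_ITEM_NONE = 0x7FFFFFFF  # "no OSD" sentinel in acting/up sets

def shard_mismatches(up, acting):
    """Return (shard, up_osd, acting_osd) for every shard whose acting
    OSD differs from the CRUSH-computed up OSD. An acting value of
    CRUSH_ITEM_NONE means the shard currently has no OSD at all."""
    return [(i, u, a) for i, (u, a) in enumerate(zip(up, acting)) if u != a]

# up and acting sets of pg 11.52 as reported above
up     = [237,   2,         74, 289, 233, 232, 292, 86]
acting = [237,  60, 2147483647,  74, 233, 232, 292, 86]
for shard, u, a in shard_mismatches(up, acting):
    note = " (no acting OSD -> degraded)" if a == CRUSH_ITEM_NONE else ""
    print(f"shard {shard}: up=osd.{u}, acting={a}{note}")
```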

How can I get the PG to complete peering and osd.1 to rejoin? I have an 
unreasonably large number of degraded objects whose missing part is on this OSD.

For completeness, here the cluster status:

# ceph status
  cluster:
    id:     ...
    health: HEALTH_ERR
            noout,norebalance flag(s) set
            1 large omap objects
            35815902/2268938858 objects misplaced (1.579%)
            Degraded data redundancy: 46122/2268938858 objects degraded 
(0.002%), 2 pgs degraded, 7 pgs undersized
            Degraded data redundancy (low space): 28 pgs backfill_toofull

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-01(active), standbys: ceph-03, ceph-02
    mds: con-fs2-1/1/1 up  {0=ceph-08=up:active}, 1 up:standby-replay
    osd: 299 osds: 275 up, 275 in; 301 remapped pgs
         flags noout,norebalance

  data:
    pools:   11 pools, 3215 pgs
    objects: 268.8 M objects, 675 TiB
    usage:   854 TiB used, 1.1 PiB / 1.9 PiB avail
    pgs:     46122/2268938858 objects degraded (0.002%)
             35815902/2268938858 objects misplaced (1.579%)
             2907 active+clean
             219  active+remapped+backfill_wait
             47   active+remapped+backfilling
             28   active+remapped+backfill_wait+backfill_toofull
             6    active+clean+scrubbing+deep
             5    active+undersized+remapped+backfilling
             2    active+undersized+degraded+remapped+backfilling
             1    active+clean+scrubbing

  io:
    client:   13 MiB/s rd, 196 MiB/s wr, 2.82 kop/s rd, 1.81 kop/s wr
    recovery: 57 MiB/s, 14 objects/s

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]