Thank you for this idea; it has improved the situation. Nevertheless, there are still 2 PGs in recovery_wait. ceph -s gives me:

  cluster:
    id:     23e72372-0d44-4cad-b24f-3641b14b86f4
    health: HEALTH_WARN
            3/125481112 objects unfound (0.000%)
            Degraded data redundancy: 3/497011315 objects degraded (0.000%), 2 pgs degraded

  services:
    mon: 3 daemons, quorum ceph-node03,ceph-node01,ceph-node02
    mgr: ceph-node01(active), standbys: ceph-node01.etp.kit.edu
    mds: cephfs-1/1/1 up  {0=ceph-node03.etp.kit.edu=up:active}, 3 up:standby
    osd: 96 osds: 96 up, 96 in

  data:
    pools:   2 pools, 4096 pgs
    objects: 125.48M objects, 259TiB
    usage:   370TiB used, 154TiB / 524TiB avail
    pgs:     3/497011315 objects degraded (0.000%)
             3/125481112 objects unfound (0.000%)
             4083 active+clean
             10   active+clean+scrubbing+deep
             2    active+recovery_wait+degraded
             1    active+clean+scrubbing

  io:
    client:   318KiB/s rd, 77.0KiB/s wr, 190op/s rd, 0op/s wr


and ceph health detail:

HEALTH_WARN 3/125481112 objects unfound (0.000%); Degraded data redundancy: 3/497011315 objects degraded (0.000%), 2 pgs degraded
OBJECT_UNFOUND 3/125481112 objects unfound (0.000%)
    pg 1.24c has 1 unfound objects
    pg 1.779 has 2 unfound objects
PG_DEGRADED Degraded data redundancy: 3/497011315 objects degraded (0.000%), 2 pgs degraded
    pg 1.24c is active+recovery_wait+degraded, acting [32,4,61,36], 1 unfound
    pg 1.779 is active+recovery_wait+degraded, acting [50,4,77,62], 2 unfound
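
For reference, the objects each of those PGs considers unfound can be listed with the standard `ceph pg <pgid> list_unfound` command; for the two PGs above that would be:

     # list the objects pg 1.24c and pg 1.779 report as unfound
     ceph pg 1.24c list_unfound
     ceph pg 1.779 list_unfound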


The status also changed from HEALTH_ERR to HEALTH_WARN. In addition, we did ceph osd down for all OSDs of the degraded PGs. Do you have any further suggestions on how to proceed?
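
One way to dig further, assuming the standard Ceph CLI, is to query the two PGs directly; the recovery_state section of the output shows the peering history and which OSDs the primary has already probed for the unfound objects:

     # recovery_state includes peering details and the might_have_unfound list
     ceph pg 1.24c query
     ceph pg 1.779 query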

On 23.05.19 11:08 a.m., Dan van der Ster wrote:
I think those osds (1, 11, 21, 32, ...) need a little kick to re-peer
their degraded PGs.

Open a window with `watch ceph -s`, then in another window slowly do

     ceph osd down 1
     # then wait a minute or so for that osd.1 to re-peer fully.
     ceph osd down 11
     ...

Continue that for each of the osds with stuck requests, or until there
are no more recovery_wait/degraded PGs.

After each `ceph osd down...`, you should expect to see several PGs
re-peer, and then ideally the slow requests will disappear and the
degraded PGs will become active+clean.
If anything else happens, you should stop and let us know.
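
As a rough sketch, that sequence could also be scripted (the 60-second pause is only a guess at how long peering takes; watching `ceph -s` between steps as described above is safer):

     # mark each implicated OSD down so it re-peers, pausing to let peering finish
     for osd in 1 11 21 32 43 50 65; do
         ceph osd down $osd
         sleep 60
     done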


-- dan

On Thu, May 23, 2019 at 10:59 AM Kevin Flöh <kevin.fl...@kit.edu> wrote:
This is the current status of ceph:


    cluster:
      id:     23e72372-0d44-4cad-b24f-3641b14b86f4
      health: HEALTH_ERR
              9/125481144 objects unfound (0.000%)
              Degraded data redundancy: 9/497011417 objects degraded (0.000%), 7 pgs degraded
              9 stuck requests are blocked > 4096 sec. Implicated osds 1,11,21,32,43,50,65

    services:
      mon: 3 daemons, quorum ceph-node03,ceph-node01,ceph-node02
      mgr: ceph-node01(active), standbys: ceph-node01.etp.kit.edu
      mds: cephfs-1/1/1 up  {0=ceph-node03.etp.kit.edu=up:active}, 3 up:standby
      osd: 96 osds: 96 up, 96 in

    data:
      pools:   2 pools, 4096 pgs
      objects: 125.48M objects, 259TiB
      usage:   370TiB used, 154TiB / 524TiB avail
      pgs:     9/497011417 objects degraded (0.000%)
               9/125481144 objects unfound (0.000%)
               4078 active+clean
               11   active+clean+scrubbing+deep
               7    active+recovery_wait+degraded

    io:
      client:   211KiB/s rd, 46.0KiB/s wr, 158op/s rd, 0op/s wr

On 23.05.19 10:54 a.m., Dan van der Ster wrote:
What's the full ceph status?
Normally recovery_wait just means that the relevant OSDs are busy
recovering/backfilling another PG.

On Thu, May 23, 2019 at 10:53 AM Kevin Flöh <kevin.fl...@kit.edu> wrote:
Hi,

We have set the PGs to recover, and now they are stuck in
active+recovery_wait+degraded; instructing them to deep-scrub does not
change anything, so the rados report is empty. Is there a way to stop the
recovery wait, start the deep-scrub, and get the output? I guess the
recovery_wait might be caused by missing objects. Do we need to delete them
first to get the recovery going?
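
If the unfound objects ultimately turn out to be unrecoverable, Ceph does have a last-resort command, `ceph pg <pgid> mark_unfound_lost revert|delete`, which permanently gives up on them and should only be run once every candidate OSD has been checked. A sketch, using pg 1.24c (named elsewhere in this thread) as the example:

     # LAST RESORT: give up on the unfound objects in this PG
     # "revert" rolls back to an older object version where possible;
     # "delete" forgets the objects entirely
     ceph pg 1.24c mark_unfound_lost revert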

Kevin

On 22.05.19 6:03 p.m., Robert LeBlanc wrote:

On Wed, May 22, 2019 at 4:31 AM Kevin Flöh <kevin.fl...@kit.edu> wrote:
Hi,

Thank you, it worked. The PGs are not incomplete anymore. Still, we have
another problem: there are 7 inconsistent PGs, and a ceph pg repair is
not doing anything. I just get "instructing pg 1.5dd on osd.24 to
repair" and nothing happens. Does somebody know how we can get the PGs
to repair?

Regards,

Kevin
Kevin,

I just fixed an inconsistent PG yesterday. You will need to figure out why they are inconsistent. Do these steps and then we can figure out how to proceed (example commands for one of the PGs follow below).
1. Do a deep-scrub on each PG that is inconsistent. (This may fix some of them.)
2. Print out the inconsistent report for each inconsistent PG: `rados list-inconsistent-obj <PG_NUM> --format=json-pretty`
3. Look at the error messages and see if all the shards have the same data.
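
For pg 1.5dd mentioned above, a minimal sketch of steps 1 and 2 with the stock Ceph/rados CLI would be (the inconsistency report is populated from the most recent scrub, so run it after the deep-scrub has finished):

     # step 1: trigger a deep scrub of the inconsistent PG
     ceph pg deep-scrub 1.5dd
     # step 2: dump the inconsistency report once the deep scrub completes
     rados list-inconsistent-obj 1.5dd --format=json-pretty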

Robert LeBlanc


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com