On 15.08.2019 16:38, huxia...@horebdata.cn wrote:
Dear folks,

I have a Ceph cluster with replication 2: 3 nodes, each with 3 OSDs, running Luminous 12.2.12. A few days ago one OSD went down (the disk itself is still fine) due to a RocksDB crash. I tried to restart that OSD but it failed to start. I then tried to rebalance, but now a number of PGs are inconsistent.

What can I do to get the cluster working again?

Thanks a lot for helping me out.

Samuel

**********************************************************************************
# ceph -s
   cluster:
     id:     289e3afa-f188-49b0-9bea-1ab57cc2beb8
     health: HEALTH_ERR
             pauserd,pausewr,noout flag(s) set
             191444 scrub errors
             Possible data damage: 376 pgs inconsistent
   services:
     mon: 3 daemons, quorum horeb71,horeb72,horeb73
     mgr: horeb73(active), standbys: horeb71, horeb72
     osd: 9 osds: 8 up, 8 in
          flags pauserd,pausewr,noout
   data:
     pools:   1 pools, 1024 pgs
     objects: 524.29k objects, 1.99TiB
     usage:   3.67TiB used, 2.58TiB / 6.25TiB avail
     pgs:     645 active+clean
              376 active+clean+inconsistent
              3   active+clean+scrubbing+deep


That is a lot of inconsistent PGs. When you say replication = 2, do you mean min_size=2 as in size=3/min_size=2, or that you have size=2/min_size=1?

The reason I ask is that min_size=1 is a well-known way to get into lots of problems: a single disk can accept a write on its own, and that drive can die before the write has been recovered/backfilled to another copy.
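
You can check what the pool is currently set to with something like this (the pool name "mypool" below is just a placeholder, use your own):

# ceph osd pool get mypool size
# ceph osd pool get mypool min_size

or simply "# ceph osd pool ls detail", which prints size and min_size for every pool.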

If you have min_size=1, I would recommend setting min_size=2 as the first step, to avoid creating more inconsistencies while you troubleshoot. If you have the space for it in the cluster, you should also set size=3.
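
Roughly like this, again with "mypool" standing in for your actual pool name:

# ceph osd pool set mypool min_size 2
# ceph osd pool set mypool size 3

Be aware that raising size to 3 will trigger backfill to create the third copy, so expect extra recovery traffic for a while.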

If you run "# ceph health detail" you will get a list of the PGs that are inconsistent. Check whether one OSD is a repeat offender across those PGs, and then check that disk for issues: look at dmesg, the OSD's log, and the drive's SMART data.
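
Something along these lines should give you an overview (the pg id 1.2f3 and the pool name mypool are just example placeholders):

# ceph health detail | grep inconsistent
# rados list-inconsistent-pg mypool
# ceph pg map 1.2f3                        <- shows which osds hold that pg
# rados list-inconsistent-obj 1.2f3 --format=json-pretty

and on the osd host itself:

# dmesg | grep -i error
# smartctl -a /dev/sdX                     <- replace sdX with the right device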

You can try to repair the inconsistent PGs automagically by running "# ceph pg repair <pg id>", but make sure the hardware is good first.
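
For example:

# ceph pg repair 1.2f3

(take the real pg ids from "ceph health detail"; 1.2f3 is just an example). With 376 inconsistent PGs you can loop over them, roughly like this, assuming jq is installed and "mypool" is your pool:

for pg in $(rados list-inconsistent-pg mypool | jq -r '.[]'); do
    ceph pg repair "$pg"
done

But again: only do this once you are confident the underlying disks are healthy, and keep in mind that with only two copies the repair has less information to decide which copy is the good one.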


good luck
Ronny

