Re: [ceph-users] Understanding incomplete PGs

Caspar Smit Fri, 05 Jul 2019 02:29:41 -0700

Kyle,

Was the cluster still backfilling when you removed osd 6 or did you only
check its utilization?


Running an EC pool with m=1 is a bad idea. An EC pool has min_size = k+1
(3 for k=2,m=1), so losing a single OSD leaves the affected PGs below
min_size and the data becomes inaccessible.
Your incomplete PGs are probably all EC pool PGs; please verify.
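
To make the arithmetic concrete, here is a small sketch of the rule above.
It is a simplified model and not anything Ceph ships: the function name
ec_state and its three labels are my own, and it assumes a PG serves I/O
while at least min_size shards are up and that the data can be rebuilt
while at least k shards survive.

```shell
# Simplified model of EC availability, not Ceph code.
# Usage: ec_state K M MIN_SIZE LOST_SHARDS
ec_state() {
    k=$1; m=$2; min_size=$3; lost=$4
    up=$(( k + m - lost ))          # shards still available
    if [ "$up" -ge "$min_size" ]; then
        echo "active"               # enough shards to serve I/O
    elif [ "$up" -ge "$k" ]; then
        echo "inactive"             # data intact, but below min_size
    else
        echo "lost"                 # fewer than k shards: cannot rebuild
    fi
}

ec_state 2 1 3 1   # k=2,m=1, min_size=k+1, one OSD gone
ec_state 2 1 2 1   # same loss with min_size lowered to k
ec_state 2 1 2 2   # a second OSD gone while min_size=k
```

With k=2,m=1 the first call reports the PG as inactive, the second as
active, and the third as lost, which is why m=1 leaves no safety margin.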

If that is the case, you could *temporarily* set min_size to 2 (that is, k)
on your EC pools to regain access to your data, but this is a very
dangerous action: losing another OSD during this period results in actual
data loss.
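
Roughly, the check and the temporary change could look like the sketch
below. The pool name my-ec-pool is a placeholder, and the run wrapper with
its DRY_RUN switch is my own addition so that nothing touches the cluster
until you have read the commands; the ceph subcommands themselves are the
standard ones. A PG id starts with its pool id, so comparing the incomplete
PG ids against the pool list shows whether they belong to the EC pools.

```shell
POOL="${POOL:-my-ec-pool}"      # placeholder pool name, replace with yours

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# 1. Verify: list the incomplete PGs and the pools (with ids and types).
run ceph pg ls incomplete
run ceph osd pool ls detail

# 2. Temporarily lower min_size to k on the EC pool (dangerous, see above).
run ceph osd pool get "$POOL" min_size
run ceph osd pool set "$POOL" min_size 2

# 3. Once the PGs are active+clean again, restore min_size to k+1.
run ceph osd pool set "$POOL" min_size 3
```

Leaving DRY_RUN at its default only prints the commands, which is a cheap
way to double-check the pool name before changing anything.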

Kind regards,
Caspar Smit

On Fri, 5 Jul 2019 at 01:17, Kyle <[email protected]> wrote:

> Hello,
>
> I'm working with a small ceph cluster (about 10TB, 7-9 OSDs, all
> Bluestore on lvm) and recently ran into a problem with 17 pgs marked as
> incomplete after adding/removing OSDs.
>
> Here's the sequence of events:
> 1. 7 osds in the cluster, health is OK, all pgs are active+clean
> 2. 3 new osds on a new host are added, lots of backfilling in progress
> 3. osd 6 needs to be removed, so we do "ceph osd crush reweight osd.6 0"
> 4. after a few hours we see "min osd.6 with 0 pgs" from
>    "ceph osd utilization"
> 5. ceph osd out 6
> 6. systemctl stop ceph-osd@6
> 7. the drive backing osd 6 is pulled and wiped
> 8. backfilling has now finished all pgs are active+clean except for
>    17 incomplete pgs
>
> From reading the docs, it sounds like there has been unrecoverable data
> loss in those 17 pgs. That raises some questions for me:
>
> Was "ceph osd utilization" only showing a goal of 0 pgs allocated
> instead of the current actual allocation?
>
> Why is there data loss from a single osd being removed? Shouldn't that
> be recoverable?
> All pools in the cluster are either replicated 3 or erasure-coded
> k=2,m=1 with default "host" failure domain. They shouldn't suffer data
> loss with a single osd being removed even if there were no reweighting
> beforehand. Does the backfilling temporarily reduce data durability in
> some way?
>
> Is there a way to see which pgs actually have data on a given osd?
>
> I attached an example of one of the incomplete pgs.
>
> Thanks for any help,
>
> Kyle
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
