This looks similar to a problem I had after several OSDs crashed while
trimming snapshots. In my case, the primary OSD thought the snapshot was
gone, but it was still present on some of the replicas, so scrubbing
flagged it.

First I purged all snapshots and then ran ceph pg repair on the
problematic placement groups. The first time I encountered this, that
was enough to fix the problem. The second time, however, I ended up
having to manually remove the snapshot objects.
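For the straightforward case, that amounts to something like the
following (only a rough sketch; <pool>/<image> is a placeholder for the
image the inconsistent object belongs to, and it assumes the snapshots
in question are RBD snapshots):

    rbd snap purge <pool>/<image>    # drop all snapshots of the affected image
    ceph pg repair 17.36             # then repair the inconsistent PG

The thread below covers the second case, where the leftover snapshot
objects have to be removed by hand: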

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-June/027431.html

Once I had done that, repairing the placement group fixed the issue.
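In broad strokes, the manual removal is something like the following
(again only a sketch: the OSD id, data path, and object names are
placeholders, the OSD has to be stopped while ceph-objectstore-tool
runs against it, and the actual details are in the thread above):

    systemctl stop ceph-osd@<id>
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --op list rbd_data.<prefix>.<object>          # locate the leftover snapshot object
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --pgid 17.36 '<json-from-list-output>' remove  # remove that object
    systemctl start ceph-osd@<id>
    ceph pg repair 17.36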

-Steve

On 11/16/2018 04:00 AM, Marc Roos wrote:
>  
>
> I am not sure that is going to work, because I have had this error for 
> quite some time, since before I added the 4th node. On the 3-node cluster 
> it was:
>  
> osdmap e18970 pg 17.36 (17.36) -> up [9,0,12] acting [9,0,12]
>
> If I understand correctly what you intend to do, namely moving the data 
> around, that was sort of accomplished by adding the 4th node.
>
>
>
> -----Original Message-----
> From: Frank Yu [mailto:flyxia...@gmail.com] 
> Sent: vrijdag 16 november 2018 3:51
> To: Marc Roos
> Cc: ceph-users
> Subject: Re: [ceph-users] pg 17.36 is active+clean+inconsistent head 
> expected clone 1 missing?
>
> Try restarting osd.29, then use pg repair. If that doesn't work, or the 
> error appears again after a while, check the HDD backing osd.29; there may 
> be bad sectors on the disk, in which case just replace it with a new one.
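> As a rough sketch (assuming systemd-managed OSDs; /dev/sdX below is just a 
> placeholder for whatever device backs osd.29):
>
>     systemctl restart ceph-osd@29    # restart the suspect OSD
>     ceph pg repair 17.36             # then ask the PG to repair itself
>     smartctl -a /dev/sdX             # check SMART data for bad/pending sectors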
>
>
>
> On Thu, Nov 15, 2018 at 5:00 PM Marc Roos <m.r...@f1-outsourcing.eu> 
> wrote:
>
>
>        
>       Forgot to mention: these are BlueStore OSDs.
>       
>       
>       
>       -----Original Message-----
>       From: Marc Roos 
>       Sent: donderdag 15 november 2018 9:59
>       To: ceph-users
>       Subject: [ceph-users] pg 17.36 is active+clean+inconsistent head 
>       expected clone 1 missing?
>       
>       
>       
>       I thought I would give it another try and ask again here, since 
>       there is another current thread. I have had this error for a year 
>       or so.
>       
>       These I have of course already tried:
>       ceph pg deep-scrub 17.36
>       ceph pg repair 17.36
>       
>       
>       [@c01 ~]# rados list-inconsistent-obj 17.36 
>       {"epoch":24363,"inconsistents":[]}
>       
>       
>       [@c01 ~]# ceph pg map 17.36
>       osdmap e24380 pg 17.36 (17.36) -> up [29,12,6] acting [29,12,6]
>       
>       
>       [@c04 ceph]# zgrep ERR ceph-osd.29.log*gz
>       ceph-osd.29.log-20181114.gz:2018-11-13 14:19:55.766604 7f25a05b1700 -1 
>       log_channel(cluster) log [ERR] : deep-scrub 17.36 
>       17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:head expected 
>       clone 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4 1 missing 
>       ceph-osd.29.log-20181114.gz:2018-11-13 14:24:55.943454 7f25a05b1700 -1 
>       log_channel(cluster) log [ERR] : 17.36 deep-scrub 1 errors
>       
>       
>
>
>

-- 
Steve Anthony
LTS HPC Senior Analyst
Lehigh University
sma...@lehigh.edu


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
