Re: [ceph-users] Ceph Recovery Assistance, pgs stuck peering

2016-03-08 Thread David Zafman
I expected it to return to osd.36. Oh, if you set "noout" during this process, the pg won't move around when you down osd.36. I expected osd.36 to go down and come back up quickly. Also, pg 10.4f is in the same situation, so try the same thing on osd.6. David On 3/8/16 1:05 PM, Ben Hines
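
For reference, a rough sketch of the sequence being described (the ceph commands are standard CLI; the restart command depends on your init system, so treat this as an outline rather than a verified recipe):

    # prevent CRUSH from remapping the pg while the OSD bounces
    ceph osd set noout

    # restart the acting primary so the pg re-peers (systemd shown; adjust for your distro)
    systemctl restart ceph-osd@36
    # and likewise osd.6 for pg 10.4f
    systemctl restart ceph-osd@6

    # once both OSDs are back up and the pgs have peered, clear the flag
    ceph osd unset noout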

Re: [ceph-users] Ceph Recovery Assistance, pgs stuck peering

2016-03-08 Thread Ben Hines
After applying that setting, the pg appeared to start peering, but then it actually changed the primary OSD to osd.100 and went incomplete again. Perhaps it did that because another OSD had more data? I presume I need to set that value on each OSD the pg hops to. -Ben On Tue, Mar 8, 2016
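
For anyone following along: a quick way to see which OSD is currently acting primary for a pg, and therefore where a per-OSD setting would need to be applied, is something like the following (the pg id is taken from elsewhere in the thread and is only illustrative):

    # show the up and acting sets for the pg, primary listed first
    ceph pg map 12.7a1

    # fuller picture of peering state and past intervals
    ceph pg 12.7a1 query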

Re: [ceph-users] Ceph Recovery Assistance, pgs stuck peering

2016-03-08 Thread David Zafman
Ben, I haven't looked at everything in your message, but pg 12.7a1 has lost data because of writes that went only to osd.73. The way to recover is to force recovery to ignore this fact and go with whatever data you have on the remaining OSDs. I assume that having min_size 1, having
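
The exact "force recovery to ignore this fact" step is not spelled out in the quoted excerpt; assuming it refers to the osd_find_best_info_ignore_history_les option (the usual knob for letting a pg peer from an older copy when some writes were lost), the approach would look roughly like this. It discards the writes that only reached osd.73, so it is a last resort:

    # ceph.conf on the node hosting the pg's current primary
    # (osd.100 here, per the reply above; adjust to your acting primary)
    [osd.100]
        osd find best info ignore history les = true

    # restart that OSD so the option takes effect at peering time
    systemctl restart ceph-osd@100

    # once the pg goes active+clean, remove the option and restart again
    # so it does not silently affect future peering decisions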

[ceph-users] Ceph Recovery Assistance, pgs stuck peering

2016-03-07 Thread Ben Hines
Howdy, I was hoping someone could help me recover a couple of pgs that are causing problems in my cluster. If we aren't able to resolve this soon, we may have to destroy them and lose some data. Recovery has so far been unsuccessful. Data loss would probably cause some here to reconsider Ceph
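
(The original post presumably included the full cluster status; as a general sketch, the usual starting points for pgs stuck peering or incomplete are the following, with the pg id purely illustrative:)

    # list unhealthy pgs and the OSDs they involve
    ceph health detail

    # pgs that have been inactive for a while
    ceph pg dump_stuck inactive

    # detailed recovery_state for a specific pg, including what peering is blocked on
    ceph pg 12.7a1 query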