Resolved.

After all of the involved OSDs had been down for a while, I brought them
back up and issued another ceph pg repair. We are clean now.
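
For the archives, the recovery boiled down to roughly the following (the OSD
id and PG id here are illustrative, not the actual ones from this cluster):

  systemctl start ceph-osd@160   # bring each previously stopped OSD back up
  ceph osd unset noout           # we had set noout while they were down
  ceph health detail             # wait for peering; note the inconsistent PG
  ceph pg repair 1.2f            # re-issue the repair
  ceph -s                        # confirm the cluster is back to active+clean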

On Sun, Jun 25, 2017 at 11:54 PM, Brady Deetz <[email protected]> wrote:

> I should have mentioned, I'm running ceph jewel 10.2.7
>
> On Sun, Jun 25, 2017 at 11:46 PM, Brady Deetz <[email protected]> wrote:
>
>> Over the course of the past year, I've had 3 instances where I had to
>> manually repair an object due to a size mismatch. In this case, I was
>> immediately disappointed to discover what I think is evidence that only 1
>> of the 3 replicas is good. It got worse when a segfault occurred as I
>> attempted to flush the journal for one of the seemingly bad replicas.
>>
>> Below is a segfault from ceph-osd -i 160 --flush-journal
>> https://pastebin.com/GQkCn9T9
>>
>> More logs and command history can be found here:
>> https://pastebin.com/5knjNTd0
>>
>> So far, I've copied the object file to a tmp backup location, set noout,
>> stopped the osd service on each OSD associated with that PG, flushed the
>> journals, and made a second, post-flush copy of the objects.
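>>
>> For anyone following along, the sequence was roughly this (the OSD id,
>> PG id, and paths below are only examples, not the literal ones from our
>> cluster):
>>
>>   # pre-flush copy of the suspect object (example filestore path)
>>   cp /var/lib/ceph/osd/ceph-160/current/1.2f_head/<object> /tmp/obj.pre
>>   ceph osd set noout                # keep CRUSH from marking the OSDs out
>>   systemctl stop ceph-osd@160       # stop each OSD hosting the PG
>>   ceph-osd -i 160 --flush-journal   # flush its journal (segfaulted on osd.160)
>>   # post-flush copy for comparison
>>   cp /var/lib/ceph/osd/ceph-160/current/1.2f_head/<object> /tmp/obj.post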
>>
>> Any help would be greatly appreciated.
>>
>> I'm considering just deleting the 2 known-bad files and attempting a ceph
>> pg repair. But I'm not really sure that will work with only 1 good
>> replica.
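>>
>> If I go that route, I imagine it would look something like this (again,
>> the ids and paths are only illustrative):
>>
>>   systemctl stop ceph-osd@160   # with noout still set
>>   # move rather than rm the suspect replica, so it can be put back if needed
>>   mv /var/lib/ceph/osd/ceph-160/current/1.2f_head/<object> /tmp/
>>   systemctl start ceph-osd@160
>>   ceph pg repair 1.2f           # hope repair backfills from the good copy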
>>
>
>
