Resolved. After all of the involved OSDs had been down for a while, I brought them back up and issued another ceph pg repair. We are clean now.
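For anyone hitting this in the archives, the resolution boiled down to something like the following. This is a sketch, not the exact command history: the pg id was never posted, so <pgid> is a placeholder, and ceph-osd@160 stands in for each of the affected OSDs.

    ceph health detail             # identify the pg reported inconsistent
    systemctl start ceph-osd@160   # bring each involved OSD back up
    ceph pg repair <pgid>          # re-issue the repair once the OSDs rejoin
    ceph -s                        # confirm the cluster returns to HEALTH_OK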
On Sun, Jun 25, 2017 at 11:54 PM, Brady Deetz <[email protected]> wrote:
> I should have mentioned, I'm running ceph jewel 10.2.7
>
> On Sun, Jun 25, 2017 at 11:46 PM, Brady Deetz <[email protected]> wrote:
>
>> Over the course of the past year, I've had 3 instances where I had to
>> manually repair an object due to size. In this case, I was immediately
>> disappointed to discover what I think is evidence that only 1 of 3
>> replicas is good. It got worse when a segfault occurred as I attempted
>> to flush the journal for one of the seemingly bad replicas.
>>
>> Below is the segfault from ceph-osd -i 160 --flush-journal:
>> https://pastebin.com/GQkCn9T9
>>
>> More logs and command history can be found here:
>> https://pastebin.com/5knjNTd0
>>
>> So far, I've copied the object file to a tmp backup location, set noout,
>> stopped the osd service for the associated osds for that pg, flushed the
>> journals, and made a second copy of the objects post flush.
>>
>> Any help would be greatly appreciated.
>>
>> I'm considering just deleting the 2 known bad files and attempting a
>> ceph pg repair. But, I'm not really sure that will work with only 1
>> good replica.
>
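For reference, the precautionary steps described in the quoted message map roughly onto the sequence below. This is also a sketch: osd 160 comes from the thread, but the pg id and object filename weren't posted, so <pgid> and <object> are placeholders, as are the /tmp backup paths.

    ceph osd set noout                    # stop the cluster rebalancing while OSDs are down
    cp /var/lib/ceph/osd/ceph-160/current/<pgid>_head/<object> /tmp/backup-pre-flush/
    systemctl stop ceph-osd@160           # repeat for each OSD holding a replica of the pg
    ceph-osd -i 160 --flush-journal       # this is the step that segfaulted on one replica
    cp /var/lib/ceph/osd/ceph-160/current/<pgid>_head/<object> /tmp/backup-post-flush/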
