I am running Ceph 10.2.0-0ubuntu0.16.04.1 and have run into a problem with the cephfs metadata pool. Specifically, I have a pg with an 'unfound' object.
But I can't figure out which object it is, because when I run:

    ceph pg 12.94 list_unfound

it hangs (as does ceph pg 12.94 query). I know the pg is in the cephfs metadata pool, because when I run:

    ceph pg ls-by-pool cephfs_metadata |egrep "pg_stat|12\\.94"

it shows up there:

    pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
    12.94 231 1 1 0 1 90 3092 3092 active+recovering+degraded 2016-05-18 23:49:15.718772 8957'386130 9472:367098 [1,4] 1 [1,4] 1 8935'385144 2016-05-18 10:46:46.123526 8337'379527 2016-05-14 22:37:05.974367

OK, so what is hanging, and how can I get it to unhang so I can run 'mark_unfound_lost' on it?

pg 12.94 is on osd.0:

    ID WEIGHT  TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY
    -1 5.48996 root default
    -2 0.89999     host nubo-1
     0 0.89999         osd.0         up  1.00000          1.00000
    -3 0.89999     host nubo-2
     1 0.89999         osd.1         up  1.00000          1.00000
    -4 0.89999     host nubo-3
     2 0.89999         osd.2         up  1.00000          1.00000
    -5 0.92999     host nubo-19
     3 0.92999         osd.3         up  1.00000          1.00000
    -6 0.92999     host nubo-20
     4 0.92999         osd.4         up  1.00000          1.00000
    -7 0.92999     host nubo-21
     5 0.92999         osd.5         up  1.00000          1.00000

I cranked up the logging on osd.0. I see a lot of messages, but nothing interesting. I've double-checked that all nodes can ping each other. I've run 'xfs_repair' on the underlying xfs storage to check for issues (there were none).

Can anyone suggest how to get past this hang so I can try to repair this system?
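As an aside, the single-row pg listing above is hard to read because the timestamp columns contain a space, so the header and the data fields do not line up one to one. Below is a minimal sketch for pulling one column out of that row by name. The helper name pg_field is hypothetical (it is not part of the ceph CLI), the header and row are simply copied from the output above, and it assumes every column whose name ends in "stamp" holds a "date time" pair.

```shell
#!/bin/sh
# Header and row copied verbatim from the ls-by-pool output in this post.
header='pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp'
row="12.94 231 1 1 0 1 90 3092 3092 active+recovering+degraded 2016-05-18 23:49:15.718772 8957'386130 9472:367098 [1,4] 1 [1,4] 1 8935'385144 2016-05-18 10:46:46.123526 8337'379527 2016-05-14 22:37:05.974367"

# pg_field NAME HEADER ROW  (hypothetical helper, not a ceph command)
# Prints the value of column NAME from ROW, using HEADER to locate it.
pg_field() {
    printf '%s\n%s\n' "$2" "$3" | awk -v want="$1" '
        # First line: remember the position and name of every column.
        NR == 1 { for (i = 1; i <= NF; i++) { col[$i] = i; name[i] = $i } next }
        # Second line: look up the wanted column in the data row.
        {
            if (!(want in col)) exit 1
            i = col[want]
            # Assumption: each *stamp column is "date time", i.e. two
            # whitespace-separated fields, so every such column before
            # the wanted one shifts the data right by one field.
            shift = 0
            for (j = 1; j < i; j++) if (name[j] ~ /stamp$/) shift++
            if (want ~ /stamp$/) print $(i + shift), $(i + shift + 1)
            else                 print $(i + shift)
        }'
}

echo "unfound: $(pg_field unf "$header" "$row")"
echo "state:   $(pg_field state "$header" "$row")"
echo "acting:  $(pg_field acting "$header" "$row")"
```

Run against the row above, this reports 1 unfound object, state active+recovering+degraded, and an acting set of [1,4].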