Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
Thanks Sam, I'll take a look. Seems sensible enough and worth a shot. We'll probably call it a day after this and flatten it, but I'm wondering if it's possible some rbd devices may not touch these pgs and could be exportable? Will have a tinker!

On Wed, Mar 11, 2015 at 7:06 PM, Samuel Just <sj...@redhat.com> wrote:
> For each of those pgs, you'll need to identify the pg copy you want to
> be the winner and either:
> 1) Remove all of the other ones using ceph-objectstore-tool [...]

--
$ echo kpfmAdpoofdufevq/dp/vl | perl -pe 's/(.)/chr(ord($1)-1)/ge'
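Checking whether a given image actually avoids the stuck pgs is scriptable. A minimal sketch, assuming bash, placeholder pool/image names, and the `ceph osd map` output format of roughly this era:

#!/bin/bash
# Does this image have any objects in a stuck pg? Untested sketch;
# pool/image names are placeholders, and `rados ls` may itself hang if
# it touches an inactive pg.
pool=rbd
image=test-instance

# Object name prefix for the image's data blocks
prefix=$(rbd info "$pool/$image" | awk '/block_name_prefix/ {print $2}')

# PGs currently reported stuck
stuck=$(ceph health detail | awk '/^pg .* is stuck/ {print $2}' | sort -u)

hit=0
for obj in $(rados -p "$pool" ls | grep "^$prefix"); do
    # `ceph osd map` prints "... -> pg <hash> (<pgid>) -> up ..."; the
    # pgid is the second parenthesised field
    pg=$(ceph osd map "$pool" "$obj" | awk -F'[()]' '{print $4}')
    if echo "$stuck" | grep -qx "$pg"; then
        echo "$obj lands in stuck pg $pg"
        hit=1
    fi
done

# No overlap: an export stands a fair chance of completing
[ "$hit" -eq 0 ] && rbd export "$pool/$image" "/backup/$image.img"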
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
For each of those pgs, you'll need to identify the pg copy you want to be the winner and either:

1) Remove all of the other ones using ceph-objectstore-tool, and hopefully the winner you left alone will allow the pg to recover and go active.

2) Export the winner using ceph-objectstore-tool, use ceph-objectstore-tool to delete *all* copies of the pg, use force_create_pg to recreate the pg empty, then use ceph-objectstore-tool to import the exported pg copy.

Also, the pgs which are still down still have replicas which need to be brought back or marked lost.
-Sam

On 03/11/2015 07:29 AM, joel.merr...@gmail.com wrote:
> I'd like to not have to null them if possible; there's nothing
> outlandishly valuable, it's more the time to reprovision (users have
> stuff on there, mainly testing, but I have a nasty feeling some users
> won't have backed up their test instances). When you say complicated
> and fragile, could you expand? Thanks again! Joel [...]
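As a concrete sketch of option 2 for a single pg: 7.467 and osd.12 below are hypothetical stand-ins, paths assume a default layout, and the ordering follows Sam's description, so verify it on something disposable first.

PGID=7.467                       # hypothetical stuck pg
WINNER=12                        # hypothetical OSD holding the best copy
DATA=/var/lib/ceph/osd/ceph-$WINNER
JOURNAL=$DATA/journal

# Stop the OSD holding the winner, then export that pg copy
stop ceph-osd id=$WINNER         # upstart-style; adjust for your init
ceph-objectstore-tool --data-path $DATA --journal-path $JOURNAL \
    --pgid $PGID --op export --file /root/$PGID.export

# Remove the pg from this OSD, and likewise (with each OSD stopped)
# from every other OSD that holds a copy
ceph-objectstore-tool --data-path $DATA --journal-path $JOURNAL \
    --pgid $PGID --op remove

# Recreate the pg empty, then import the saved copy back into the
# (still stopped) winner OSD and restart it
ceph pg force_create_pg $PGID
ceph-objectstore-tool --data-path $DATA --journal-path $JOURNAL \
    --op import --file /root/$PGID.export
start ceph-osd id=$WINNER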
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
For clarity too, I've tried dropping min_size before, as suggested; it doesn't make a difference, unfortunately.

On Wed, Mar 11, 2015 at 9:50 AM, joel.merr...@gmail.com <joel.merr...@gmail.com> wrote:
> Sure thing. N.b. I increased the pg count to see if it would help. Alas not. :)
> Thanks again!
>
> health_detail: https://gist.github.com/199bab6d3a9fe30fbcae
> osd_dump: https://gist.github.com/499178c542fa08cc33bb
> osd_tree: https://gist.github.com/02b62b2501cbd684f9b2
>
> Randomly selected queries:
> queries/0.19.query: https://gist.github.com/f45fea7c85d6e665edf8
> queries/1.a1.query: https://gist.github.com/dd68fbd5e862f94eb3be
> queries/7.100.query: https://gist.github.com/d4fd1fb030c6f2b5e678
> queries/7.467.query: https://gist.github.com/05dbcdc9ee089bd52d0c
> [...]

--
$ echo kpfmAdpoofdufevq/dp/vl | perl -pe 's/(.)/chr(ord($1)-1)/ge'
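For reference, the min_size change being discussed is a per-pool setting ('rbd' below is a placeholder pool name; it only buys anything on a replicated pool whose surviving copies dip below the default):

ceph osd pool get rbd min_size    # current threshold for pool 'rbd'
ceph osd pool set rbd min_size 1  # serve I/O with a single surviving copy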
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
Ok, you lost all copies from an interval where the pgs went active. The recovery from this is going to be complicated and fragile. Are the pools valuable?
-Sam

On 03/11/2015 03:35 AM, joel.merr...@gmail.com wrote:
> For clarity too, I've tried dropping min_size before, as suggested; it
> doesn't make a difference, unfortunately. [...]
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
What do you mean by unblocked but still stuck?
-Sam

On Mon, 2015-03-09 at 22:54 +0000, joel.merr...@gmail.com wrote:
> Thanks Sam, I've re-added the OSDs; they became unblocked, but there are
> still the same number of pgs stuck. [...]
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
Stuck unclean and stuck inactive. I can fire up a full query and health dump somewhere useful if you want (full pg query info on the ones listed in health detail, tree, osd dump, etc.). There were blocked_by entries referencing OSDs that no longer exist; those cleared after doing the OSD addition.

Side note: spent some time yesterday writing some bash to do this programmatically (might be useful to others, will throw it on github).

On Tue, Mar 10, 2015 at 1:41 PM, Samuel Just <sj...@redhat.com> wrote:
> What do you mean by unblocked but still stuck? [...]

--
$ echo kpfmAdpoofdufevq/dp/vl | perl -pe 's/(.)/chr(ord($1)-1)/ge'
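The sort of helper Joel describes might look like this (a sketch, not his actual script; the health detail line format matches roughly this era, and the output file layout is an assumption):

#!/bin/bash
# Dump `ceph pg query` JSON for every pg that `ceph health detail`
# reports as stuck, one file per pg, plus companion dumps for context.
set -e
mkdir -p queries

ceph osd tree > osd_tree
ceph osd dump > osd_dump
ceph health detail > health_detail

# Lines of interest look like:
#   pg 7.467 is stuck inactive for 1234.5, current state incomplete, ...
awk '/^pg .* is stuck/ {print $2}' health_detail | sort -u |
while read -r pgid; do
    echo "querying $pgid"
    ceph pg "$pgid" query > "queries/$pgid.query"
done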
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
Yeah, get a ceph pg query on one of the stuck ones.
-Sam

On Tue, 2015-03-10 at 14:41 +0000, joel.merr...@gmail.com wrote:
> Stuck unclean and stuck inactive. I can fire up a full query and health
> dump somewhere useful if you want (full pg query info on the ones listed
> in health detail, tree, osd dump, etc.). [...]
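Concretely, that looks like the following (7.467 is one of the pgs from the gists earlier in the thread; the field names match roughly this era's query output):

ceph pg 7.467 query > 7.467.query

# The fields that matter here: "blocked_by" lists the OSDs the pg is
# waiting on, and "down_osds_we_would_probe" the ones peering still
# wants to hear from before going active
grep -E '"blocked_by"|"down_osds_we_would_probe"' 7.467.query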
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
You'll probably have to recreate osds with the same ids (empty ones), let them boot, stop them, and mark them lost. There is a feature in the tracker to improve this behavior: http://tracker.ceph.com/issues/10976
-Sam

On Mon, 2015-03-09 at 12:24 +0000, joel.merr...@gmail.com wrote:
> Hi,
>
> I'm trying to fix an issue within 0.93 on our internal cloud related to
> incomplete pg's (yes, I realise the folly of running the dev release;
> it's a not-so-test env now, so I really need to recover this). Current
> outage info: 72 initial (now 65) OSDs, 6 nodes.
>
> * Update to 0.92 from Giant.
> * Fine for a day.
> * MDS outage overnight and subsequent node failure.
> * Massive increase in RAM utilisation (10G per OSD!).
> * More failure.
> * OSDs marked 'out' to try to alleviate the new, larger cluster
>   requirements; a couple died under the additional load.
> * 'Superfluous and faulty' OSDs removed, auth keys deleted.
> * RAM added to nodes (96GB each, serving 10-12 OSDs).
> * Upgrade to 0.93.
> * Fixed broken journals due to the 0.92 update.
> * No more missing objects or degradation.
>
> So, that brings me to today: I still have 73/2264 PGs listed as stuck
> incomplete/inactive, and I also have requests that are blocked. Upon
> querying said placement groups, I notice that they are 'blocked_by'
> non-existent OSDs (ones I removed due to issues). I have no way to tell
> Ceph the OSD is lost, as it's already been removed from both the osdmap
> and the crushmap.
>
> Exporting the crushmap shows the non-existent OSDs as deviceN (i.e.
> device36 for the removed osd.36). Deleting those and reimporting the
> crushmap has no effect.
>
> Some further pg detail: https://gist.github.com/joelio/cecca9b48aca6d44451b
>
> So I'm stuck: I can't recover the pgs, as I can't mark lost a
> non-existent OSD that the PGs think is blocking them. Help graciously
> accepted!
>
> Joel
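A sketch of that recreate-and-mark-lost dance for one removed id (osd.36, matching the device36 example in Joel's report; service commands are upstart-style and will vary by distro):

ID=36    # one of the removed ids; `ceph osd create` hands out the
         # lowest free id, so run it once per hole and check the result
NEWID=$(ceph osd create)
echo "got osd.$NEWID (want $ID)"

# Empty data dir plus a fresh key so the OSD can boot and register
mkdir -p /var/lib/ceph/osd/ceph-$NEWID
ceph-osd -i $NEWID --mkfs --mkkey
ceph auth add osd.$NEWID osd 'allow *' mon 'allow rwx' \
    -i /var/lib/ceph/osd/ceph-$NEWID/keyring

# Let it boot, then stop it and mark it lost
start ceph-osd id=$NEWID
sleep 30
stop ceph-osd id=$NEWID
ceph osd lost $NEWID --yes-i-really-mean-it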
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
On Mon, Mar 9, 2015 at 2:28 PM, Samuel Just <sj...@redhat.com> wrote:
> You'll probably have to recreate osds with the same ids (empty ones),
> let them boot, stop them, and mark them lost. There is a feature in the
> tracker to improve this behavior: http://tracker.ceph.com/issues/10976

Thanks Sam, I've re-added the OSDs; they became unblocked, but there are still the same number of pgs stuck. I looked at them in some more detail and it seems they all have num_bytes='0'. Tried a repair too, for good measure.

Still nothing, I'm afraid. Does this mean some underlying catastrophe has happened and they are never going to recover? Following on, would that cause data loss? There are no missing objects, and I'm hoping there's appropriate checksumming / replicas to balance that out, but now I'm not so sure.

Thanks again,
Joel
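For completeness, spotting the empty pgs and kicking a repair might look like this (a sketch; the JSON key layout is assumed to match roughly this era's pg dump, so verify it first, and 7.467 is again a stand-in pgid):

# Find pgs whose stat_sum reports zero bytes
ceph pg dump --format json 2>/dev/null | python -c '
import json, sys
for s in json.load(sys.stdin)["pg_stats"]:
    if s["stat_sum"]["num_bytes"] == 0:
        print(s["pgid"])
'

# And the repair attempt, on one of the stuck examples
ceph pg repair 7.467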