Thanks Marc,

When I next have physical access to the cluster, I’ll add some more OSDs. Would 
that cause the hanging though?

No takers on the bluestore salvage?

thanks,
rik.

> On 20 Jan 2019, at 20:36, Marc Roos <[email protected]> wrote:
> 
> 
> If you have a backfillfull, no PGs will be able to migrate. 
> Better to just add hard drives, because at least one of your OSDs is 
> too full.
> 
> I know you can set the backfillfull ratios with commands like these:
> ceph tell osd.* injectargs '--mon_osd_full_ratio=0.970000'
> ceph tell osd.* injectargs '--mon_osd_backfillfull_ratio=0.950000'
> 
> ceph tell osd.* injectargs '--mon_osd_full_ratio=0.950000'
> ceph tell osd.* injectargs '--mon_osd_backfillfull_ratio=0.900000'
> 
> Or maybe decrease the weight of the full OSD; check the OSDs with 'ceph 
> osd status' and make sure your nodes have an even distribution of 
> storage.
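A hedged sketch of those checks; osd.12 and the 0.90 weight are hypothetical placeholders, so substitute whichever OSD 'ceph osd df' shows as fullest:

```shell
# Show per-OSD utilisation alongside the CRUSH tree (%USE column)
ceph osd df tree

# Temporarily lower the reweight of the overly full OSD so PGs move off it
# (osd.12 and 0.90 are placeholders -- pick the OSD nearest backfillfull)
ceph osd reweight osd.12 0.90

# Watch recovery/backfill progress
ceph -w
```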
> 
> 
> -----Original Message-----
> From: Rik [mailto:[email protected]] 
> Sent: zondag 20 januari 2019 8:47
> To: [email protected]
> Subject: [ceph-users] Salvage CEPHFS after lost PG
> 
> Hi all,
> 
> I'm looking for some suggestions on how to do something inappropriate. 
> 
> In a nutshell, I've lost the WAL/DB for three bluestore OSDs on a small 
> cluster and, as a result of those three OSDs going offline, I've lost a 
> placement group (7.a7). How I achieved this feat is an embarrassing 
> mistake, which I don't think has bearing on my question.
> 
> The OSDs were created a few months ago with ceph-deploy:
> 
> /usr/local/bin/ceph-deploy --overwrite-conf osd create --bluestore 
> --data /dev/vdc1 --block-db /dev/vdf1 ceph-a
> 
> With the 3 OSDs out, I'm sitting at OSD_BACKFILLFULL.
> 
> First, PG 7.a7 belongs to the data pool rather than the metadata 
> pool, and if I run "cephfs-data-scan pg_files / 7.a7" I get a list 
> of 4149 files/objects, but then it hangs. I don't understand why this 
> would hang if only the data pool is impacted (since pg_files only 
> operates on the metadata pool?).
> 
> The ceph-log shows:
> 
> cluster [WRN] slow request 30.894832 seconds old, received at 
> 2019-01-20 18:00:12.555398: client_request(client.25017730:218006 
> lookup #0x10001c8ce15/000001 2019-01-20 18:00:12.550421 caller_uid=0, 
> caller_gid=0{}) currently failed to rdlock, waiting
> 
> Is the hang perhaps related to the OSD_BACKFILLFULL? If so, I could add 
> some completely new OSDs to fix that problem. I have held off doing that 
> for now as that will trigger a whole lot of data movement which might be 
> unnecessary.
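One way to test whether backfillfull is the blocker before adding disks (a sketch; exact output formats vary by release):

```shell
# Current cluster-wide thresholds (full / backfillfull / nearfull ratios)
ceph osd dump | grep ratio

# Per-OSD utilisation -- look for OSDs at or above the backfillfull ratio
ceph osd df

# Health detail names the specific OSDs that tripped the threshold
ceph health detail | grep -i full
```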
> 
> Or is the hang indeed related to the missing PG?
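The PG's own state should distinguish the two cases; a sketch, with 7.a7 taken from the report above:

```shell
# Which OSDs should host the PG, and which are actually up/acting
ceph pg map 7.a7

# Detailed peering state -- 'down', 'incomplete' or 'stale' here means
# client I/O touching this PG blocks rather than erroring out
ceph pg 7.a7 query
```

If the PG reports down or incomplete, any MDS or client operation that touches an object in it will hang indefinitely, which could explain the rdlock stall in the log.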
> 
> Second, if I try to copy files out of the CephFS filesystem, I get a few 
> hundred files and then that too hangs. None of the files I'm attempting 
> to copy are listed in the pg_files output (although since pg_files 
> hangs, perhaps it hadn't got to those files yet). Again, shouldn't I be 
> able to access files which are not associated with the missing data 
> pool PG?
> 
> Lastly, I want to know if there is some way to recreate the WAL/DB while 
> leaving the OSD data intact and/or fool one of the OSDs into thinking 
> everything is OK, allowing it to serve up the data it has in the missing 
> PG.
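For completeness, a diagnostic sketch with ceph-bluestore-tool; note this only inspects the store, it cannot regenerate a WAL/DB whose contents are gone, and whether fsck can even open a bluestore whose separate DB device is lost is doubtful. The OSD id 12 is a placeholder:

```shell
# Stop the OSD first, then attempt a consistency check of the store
systemctl stop ceph-osd@12
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12 --deep
```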
> 
> From reading the mailing list and documentation, I know that this is not 
> a "safe" operation:
> 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021713.html
> 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-January/024268.html
> 
> However, my current status indicates an unusable CephFS and limited 
> access to the data. I'd like to get as much data off it as possible and 
> then I expect to have to recreate it. With a combination of the backups 
> I have and what I can salvage from the cluster, I should hopefully have 
> most of what I need.
> 
> I know what I *should* have done, but now I'm at this point, I know I'm 
> asking for something which would never be required on a properly-run 
> cluster.
> 
> If it really is not possible to get the (possibly corrupt) PG back, 
> can I get the cluster back to a state where the remainder of the files 
> are accessible?
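If the PG really is unrecoverable, one last-resort sketch for restoring availability of everything else; this permanently discards whatever was stored in 7.a7, and the exact safety flags can vary by release, so treat it as an outline rather than a recipe:

```shell
# Confirm the PG is stuck down/incomplete before doing anything destructive
ceph pg 7.a7 query

# Recreate the lost PG as empty so peering completes and I/O unblocks.
# Everything previously stored in 7.a7 is gone for good after this.
ceph osd force-create-pg 7.a7 --yes-i-really-mean-it
```

A subsequent CephFS forward scrub would presumably be needed to identify the files whose data objects are now missing.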
> 
> Currently running mimic 13.2.4 on all nodes.
> 
> Status:
> 
> $ ceph health detail - 
> https://gist.github.com/kawaja/f59d231179b3186748eca19aae26bcd4
> 
> $ ceph fs get main - 
> https://gist.github.com/kawaja/a7ab0b285d53dee6a950a4310be4fa5a
> 
> Any advice on where I could go from here would be greatly appreciated.
> 
> thanks,
> 
> rik.
> 
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
