Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Sun, 3 Feb 2019, Philippe Van Hecke wrote: > Hello, > I am working for BELNET, the Belgian National Research Network. > > We currently manage a Luminous ceph cluster on Ubuntu 16.04 > with 144 hdd OSDs spread across two data centers, with 6 OSD nodes > in each datacenter. OSDs are 4 TB SATA

[ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
Hello, I am working for BELNET, the Belgian National Research Network. We currently manage a Luminous ceph cluster on Ubuntu 16.04 with 144 hdd OSDs spread across two data centers, with 6 OSD nodes in each datacenter. OSDs are 4 TB SATA disks. Last week we had a network incident and the link

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
From: Sage Weil Sent: 03 February 2019 18:25 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. On Sun, 3 Feb 2019, Philippe Van Hecke wrote: > Hello, > I am working

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > Hi Sage, First of all thanks for your help. > > Please find here > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9 > the osd log with debug info for osd.49. And indeed, if all buggy osds can > restart that may be
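A note on how debug logs like this are usually gathered (the values and the osd id here are illustrative, not quoted from the thread): raise the OSD's log levels and restart the daemon, e.g.

  # /etc/ceph/ceph.conf, section for the failing OSD
  [osd.49]
      debug osd = 20
      debug filestore = 20
      debug ms = 1

  systemctl restart ceph-osd@49   # then inspect /var/log/ceph/ceph-osd.49.log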

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Mon, 4 Feb 2019, Sage Weil wrote: > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > Hi Sage, First of all thanks for your help > > > > Please find here > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9 Something caused the version number on this PG to reset,
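As background, a PG's on-disk metadata (including its last_update version) can be dumped with the OSD stopped; a sketch, with paths and ids only illustrative:

  systemctl stop ceph-osd@49
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
      --journal-path /var/lib/ceph/osd/ceph-49/journal \
      --pgid 11.182 --op info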

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
Hi Sage, I tried the following: ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt but this raises an exception; find here
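For reference, the usual shape of that export-remove step, with the OSD stopped first, is roughly the following (paths, ids and the file name are illustrative; the tool's long option for the journal is --journal-path):

  systemctl stop ceph-osd@49
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
      --journal-path /var/lib/ceph/osd/ceph-49/journal \
      --pgid 11.182 --op export-remove --file /tmp/export-pg/11.182
  # the exported copy can later be loaded back into a stopped OSD with --op import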

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
Sage, Not during the network flap or before the flap, but after I had already tried the ceph-objectstore-tool export-remove, with no possibility to do it. And the conf file never had the "ignore_les" option. I was not even aware of the existence of this option, and it seems preferable to forget

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > Hi Sage, > > I tried the following: > > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug > --file /tmp/export-pg/18.182

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
result of ceph pg ls | grep 11.118

11.118  9788  0  0  0  0  40817837568  1584  1584  active+clean  2019-02-01 12:48:41.343228  70238'19811673  70493:34596887  [121,24]  121  [121,24]  121  69295'19811665
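The unlabeled columns are the ceph pg ls fields (in Luminous roughly PG_STAT, OBJECTS, MISSING_ON_PRIMARY, DEGRADED, MISPLACED, UNFOUND, BYTES, LOG, DISK_LOG, STATE, STATE_STAMP, VERSION, REPORTED, UP, UP_PRIMARY, ACTING, ACTING_PRIMARY, ...). A single PG can also be inspected in labeled form, e.g.

  ceph pg ls | head -1    # print the column header line
  ceph pg 11.118 query    # full JSON state of that one PG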

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > result of ceph pg ls | grep 11.118 > > 11.118 9788 0 0 0 0 40817837568 > 1584 1584 active+clean 2019-02-01 > 12:48:41.343228 70238'19811673 70493:34596887 [121,24]

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
root@ls-node-5-lcl:~# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op remove --debug --force 2> ceph-objectstore-tool-export-remove.txt
marking collection for removal
setting '_remove' omap key
finish_remove_pgs
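After a forced removal like this, one common sanity check (illustrative paths, not from the thread) is to confirm the PG no longer exists in that OSD's store before starting the daemon again:

  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
      --journal-path /var/lib/ceph/osd/ceph-49/journal \
      --op list-pgs | grep 11.182
  # no output means the PG is gone from this OSD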

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
So I restarted the osd, but it stops after some time. Still, this had an effect on the cluster, and the cluster is now in a partial recovery process. Please find here the log file of osd 49 after this restart: https://filesender.belnet.be/?s=download=8c9c39f2-36f6-43f7-bebb-175679d27a22 Kr Philippe.
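Recovery progress and the reason the OSD keeps stopping are typically followed with standard commands such as the following (illustrative, not quoted from the thread):

  ceph -s                         # overall health, recovery/backfill progress
  ceph health detail              # which PGs are still degraded or down
  journalctl -u ceph-osd@49 -f    # watch the daemon while it runs and stops
  tail -f /var/log/ceph/ceph-osd.49.log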