Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-10 Thread Philippe Van Hecke
Hi,

Sorry for the late reaction. With the help of Sage
we finally recovered our cluster.

How did we recover?

It seems that, due to the network flaps, some PGs of two
of our pools were left in a bad state. Before doing things properly,
I tried many things I had seen on the list and manipulated PGs without
using ceph-objectstore-tool. This probably didn't help us
and led to some loss of data.

So, under pressure to get back to an operational situation, we decided to
remove one of the two pools with problematic PGs. This pool
was mainly used for RBD images for our internal KVM infrastructure,
for which we had backups of most VMs. Before removing the pool,
we tried to extract as many images as we could. Many were completely corrupt,
but for many others we were able to extract 99% of the content, and an fsck
at OS level let us recover the data.

After removing this pool, there were still some PGs in a bad state in the
customer-facing pool.
The problem was that those PGs were blocked by OSDs that refused to rejoin
the cluster. To solve this, we created an empty OSD with a weight of 0.0.

We were able to extract the PGs from the faulty OSDs
and inject them into the freshly created OSD using the export / import commands
of ceph-objectstore-tool.
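For reference, the sequence looked roughly like the following (the OSD ids, the
pg id 11.ac and the file paths are placeholders rather than our exact values;
both OSDs must be stopped while ceph-objectstore-tool runs):

# on the node holding the blocked OSD: export the pg to a file
mkdir -p /tmp/export-pg
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ \
    --journal /var/lib/ceph/osd/ceph-49/journal \
    --pgid 11.ac --op export --file /tmp/export-pg/11.ac

# on the node holding the fresh OSD (also stopped): import the pg
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-150/ \
    --journal /var/lib/ceph/osd/ceph-150/journal \
    --op import --file /tmp/export-pg/11.ac

# keep the fresh OSD at crush weight 0 so it takes no regular data
ceph osd crush reweight osd.150 0

# start it and let the pg peer and recover
systemctl start ceph-osd@150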

After that the cluster completely recovered, though some OSDs still refused
to join the cluster.
But as the data on those OSDs is not needed any more, we decided to rebuild
them from scratch.

What we learned from this experience:

- Ensure that your network is rock solid (Ceph really dislikes an unstable
  network): avoid a layer-2 interconnection between your DCs, i.e. a flat
  layer-2 network stretched across them.
- Keep calm and first give the cluster time to do its job (this can take
  some time). See the sketch of the flags we used right after this list.
- Never manipulate PGs without using ceph-objectstore-tool, or you will be in
  trouble.
- Keep spare disks on some nodes of the cluster so that you can create empty
  OSDs for recovery.
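To illustrate the "keep calm" point: this is roughly how we froze the situation
during the incident (the same flags mentioned further down in this thread);
unset them once the network and the OSDs are stable again.

ceph osd set noout          # don't mark flapping OSDs out and trigger rebalancing
ceph osd set noscrub        # pause scrubbing while things are unstable
ceph osd set nodeep-scrub

# once the network and the OSDs are stable again
ceph osd unset noout
ceph osd unset noscrub
ceph osd unset nodeep-scrub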

I would like to thank the community again, and Sage in particular, for saving
us from a complete disaster.

Kr

Philippe.

From: Philippe Van Hecke
Sent: 04 February 2019 07:27
To: Sage Weil
Cc: ceph-users@lists.ceph.com; Belnet Services; ceph-de...@vger.kernel.org
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

Sage,

Not during or before the network flap, but afterwards I had already tried the
ceph-objectstore-tool export-remove, which it was not possible to complete.

And the conf file never had the "ignore_les" option. I was not even aware of
the existence of this option, and it seems preferable that I forget about it
immediately now that you have informed me of it :-)

Kr
Philippe.


On Mon, 4 Feb 2019, Sage Weil wrote:
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage, First of all tanks for your help
> >
> > Please find here  
> > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9

Something caused the version number on this PG to reset, from something
like 54146'56789376 to 67932'2.  Was there any operator intervention in
the cluster before or during the network flapping?  Or did someone by
chance set the (very dangerous!) ignore_les option in ceph.conf?

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-04 Thread Philippe Van Hecke
"num_read": 1274,
"num_read_kb": 33808,
"num_write": 1388,
"num_write_kb": 42956,
"num_scrub_errors": 0,
"num_shallow_scrub_errors": 0,
"num_deep_scrub_errors": 0,
"num_objects_recovered": 0,
"num_bytes_recovered": 0,
"num_keys_recovered": 0,
"num_objects_omap": 0,
"num_objects_hit_set_archive": 0,
"num_bytes_hit_set_archive": 0,
"num_flush": 0,
"num_flush_kb": 0,
"num_evict": 0,
"num_evict_kb": 0,
"num_promote": 0,
"num_flush_mode_high": 0,
"num_flush_mode_low": 0,
"num_evict_mode_some": 0,
"num_evict_mode_full": 0,
"num_objects_pinned": 0,
"num_legacy_snapsets": 0
},
"up": [
64
],
"acting": [
64
],
"blocked_by": [],
"up_primary": 64,
"acting_primary": 64
},
"empty": 0,
"dne": 0,
"incomplete": 0,
"last_epoch_started": 70656,
"hit_set_history": {
"current_last_update": "0'0",
"history": []
}
},
"peer_info": [],
"recovery_state": [
{
"name": "Started/Primary/Active",
"enter_time": "2019-02-04 07:57:08.762037",
"might_have_unfound": [
{
"osd": "9",
"status": "osd is down"
},
{
"osd": "29",
"status": "osd is down"
},
{
"osd": "49",
"status": "osd is down"
},
{
"osd": "51",
"status": "osd is down"
},
{
"osd": "63",
"status": "osd is down"
},
{
"osd": "92",
"status": "osd is down"
}
],
"recovery_progress": {
"backfill_targets": [],
"waiting_on_backfill": [],
"last_backfill_started": "MIN",
"backfill_info": {
"begin": "MIN",
"end": "MIN",
"objects": []
},
"peer_backfill_info": [],
"backfills_in_flight": [],
"recovering": [],
"pg_backend": {
"pull_from_peer": [],
"pushing": []
}
},
"scrub": {
"scrubber.epoch_start": "0",
"scrubber.active": false,
"scrubber.state": "INACTIVE",
"scrubber.start": "MIN",
"scrubber.end": "MIN",
"scrubber.subset_last_update": "0'0",
"scrubber.deep": false,
"scrubber.seed": 0,
"scrubber.waiting_on": 0,
"scrubber.waiting_on_whom": []
}
},
{
"name": "Started",
"enter_time": "2019-02-04 07:57:08.220064"
}
],
"agent_state": {}
}

For 11.ac I will try what you propose and keep you informed, but I am a little
bit anxious about losing another healthy OSD.

Kr


From: Sage Weil 
Sent: 04 February 2019 09:20
To: Philippe Van Hecke
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> So I restarted the OSD, but it stopped after some time. Still, this had an
> effect on the cluster, and the cluster is now in a partial recovery process.
>
> please find here log file of osd 49 after this restart
> https://filesender.belnet.be/?s=download=8c9c39f2-36f6-43f7-bebb-175679d27a22

It's the same PG 11.182 hitting the same assert when it tries to recover
to that OSD.  I think the problem will go away once there has been some
write traffic, but it may be tricky to prevent it from doing any recovery
until then.

I just noticed you pasted the wrong 'pg ls' result before:

> > result of  ceph pg ls | grep 11.118
> >
> > 11.118 9788  00 0   0 40817837568 
> > 1584 1584 active+clean 2019-02-01 
> > 12:48:41.343228  70238'19811673  70493:34596887  [121,24]121  
> > [121,24]121  69295'19811665 2019-02-01 12:48:41.343144  
> > 66131'19810044 2019-01-30 11:44:36.006505

What does 11.182 look like?

We can try something slightly different.  From before it looked like your
only 'incomplete' pg was 11.ac (ceph pg ls incomplete), and the needed
state is either on osd.49 or osd.63.  On osd.49, do ceph-objectstore-tool
--op export on that pg, and then find an otherwise healthy OSD (that
doesn't have 11.ac), stop it, and ceph-objectstore-tool --op import it
there.  When you start it up, 11.ac will hopefully peer and recover.  (Or,
alternatively, osd.63 may have the needed state.)

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-04 Thread Sage Weil
On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> So I restarted the OSD, but it stopped after some time. Still, this had an
> effect on the cluster, and the cluster is now in a partial recovery process.
> 
> please find here log file of osd 49 after this restart 
> https://filesender.belnet.be/?s=download=8c9c39f2-36f6-43f7-bebb-175679d27a22

It's the same PG 11.182 hitting the same assert when it tries to recover
to that OSD.  I think the problem will go away once there has been some
write traffic, but it may be tricky to prevent it from doing any recovery
until then.
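(Just as a sketch, the cluster-wide flags for holding recovery off would be
something like the following; they are a blunt instrument, so treat this as an
option rather than a recommendation, and unset them once the pg has seen some
writes.)

ceph osd set norecover     # pause recovery operations cluster-wide
ceph osd set nobackfill    # pause backfill as well
# ...wait for some client write traffic on the pg...
ceph osd unset norecover
ceph osd unset nobackfill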

I just noticed you pasted the wrong 'pg ls' result before:

> > result of  ceph pg ls | grep 11.118
> >
> > 11.118 9788  00 0   0 40817837568 
> > 1584 1584 active+clean 2019-02-01 
> > 12:48:41.343228  70238'19811673  70493:34596887  [121,24]121  
> > [121,24]121  69295'19811665 2019-02-01 12:48:41.343144  
> > 66131'19810044 2019-01-30 11:44:36.006505

What does 11.182 look like?

We can try something slightly different.  From before it looked like your
only 'incomplete' pg was 11.ac (ceph pg ls incomplete), and the needed 
state is either on osd.49 or osd.63.  On osd.49, do ceph-objectstore-tool 
--op export on that pg, and then find an otherwise healthy OSD (that 
doesn't have 11.ac), stop it, and ceph-objectstore-tool --op import it 
there.  When you start it up, 11.ac will hopefully peer and recover.  (Or,
alternatively, osd.63 may have the needed state.)
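(Something like this will show the incomplete pg and which down OSDs it wants
to probe for the missing history; the pg id assumes 11.ac as above.)

ceph pg ls incomplete
ceph pg 11.ac query | grep -A 4 down_osds_we_would_probe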

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-04 Thread Philippe Van Hecke
Hi,
It seems that the recovery process stopped and we are back to the same
situation as before.
I hope that the log can provide more info. Anyway, thanks already for your
assistance.

Kr

Philippe.


From: Philippe Van Hecke
Sent: 04 February 2019 07:53
To: Sage Weil
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

So I restarted the OSD, but it stopped after some time. Still, this had an
effect on the cluster, and the cluster is now in a partial recovery process.

please find here log file of osd 49 after this restart
https://filesender.belnet.be/?s=download=8c9c39f2-36f6-43f7-bebb-175679d27a22

Kr

Philippe.


From: Philippe Van Hecke
Sent: 04 February 2019 07:42
To: Sage Weil
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

root@ls-node-5-lcl:~# ceph-objectstore-tool --data-path 
/var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 
11.182 --op remove --debug --force  2> ceph-objectstore-tool-export-remove.txt
 marking collection for removal
setting '_remove' omap key
finish_remove_pgs 11.182_head removing 11.182
Remove successful

So now I suppose I restart the OSD and see.



From: Sage Weil 
Sent: 04 February 2019 07:37
To: Philippe Van Hecke
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> result of  ceph pg ls | grep 11.118
>
> 11.118 9788  00 0   0 40817837568 
> 1584 1584 active+clean 2019-02-01 
> 12:48:41.343228  70238'19811673  70493:34596887  [121,24]121  
> [121,24]121  69295'19811665 2019-02-01 12:48:41.343144  
> 66131'19810044 2019-01-30 11:44:36.006505
>
> cp done.
>
> So i can make  ceph-objecstore-tool --op remove command ?

yep!


>
> 
> From: Sage Weil 
> Sent: 04 February 2019 07:26
> To: Philippe Van Hecke
> Cc: ceph-users@lists.ceph.com; Belnet Services
> Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> assistance.
>
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage,
> >
> > I try to make the following.
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal 
> > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug 
> > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt
> > but this rise exception
> >
> > find here  
> > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33
> >  file ceph-objectstore-tool-export-remove.txt
>
> In that case,  cp --preserve=all
> /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then
> use the ceph-objecstore-tool --op remove command.  But first confirm that
> 'ceph pg ls' shows the PG as active.
>
> sage
>
>
>  > > Kr
> >
> > Philippe.
> >
> > ________________
> > From: Sage Weil 
> > Sent: 04 February 2019 06:59
> > To: Philippe Van Hecke
> > Cc: ceph-users@lists.ceph.com; Belnet Services
> > Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> > assistance.
> >
> > On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > > Hi Sage, First of all tanks for your help
> > >
> > > Please find here  
> > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> > > the osd log with debug info for osd.49. and indeed if all buggy osd can 
> > > restart that can may be solve the issue.
> > > But i also happy that you confirm my understanding that in the worst case 
> > > removing pool can also resolve the problem even in this case i lose data  
> > > but finish with a working cluster.
> >
> > If PGs are damaged, removing the pool would be part of getting to
> > HEALTH_OK, but you'd probably also need to remove any problematic PGs that
> > are preventing the OSD starting.
> >
> > But keep in mind that (1) i see 3 PGs that don't peer spread across pools
> > 11 and 12; not sure which one you are considering deleting.  Also (2) if
> > one pool isn't fully available it generall won't be a problem for other
> > pools, as long as the osds start.  And doing ceph-objectstore-tool
> > export-remove is a pretty safe way to move any problem PGs out of the way
> > to get your OSDs starting--just make sure you hold onto that backup/export
> > because you may need it later!
> >
> > > PS: don't know and don't want to open debat about top/bottom posting but 
> > > would like to know the preference of this list :-)
> >
> > No preference :)
> >
> > sage
> >
> >
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
So I restarted the OSD, but it stopped after some time. Still, this had an
effect on the cluster, and the cluster is now in a partial recovery process.

please find here log file of osd 49 after this restart 
https://filesender.belnet.be/?s=download=8c9c39f2-36f6-43f7-bebb-175679d27a22

Kr

Philippe.


From: Philippe Van Hecke
Sent: 04 February 2019 07:42
To: Sage Weil
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

root@ls-node-5-lcl:~# ceph-objectstore-tool --data-path 
/var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 
11.182 --op remove --debug --force  2> ceph-objectstore-tool-export-remove.txt
 marking collection for removal
setting '_remove' omap key
finish_remove_pgs 11.182_head removing 11.182
Remove successful

So now I suppose I restart the OSD and see.



From: Sage Weil 
Sent: 04 February 2019 07:37
To: Philippe Van Hecke
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> result of  ceph pg ls | grep 11.118
>
> 11.118 9788  00 0   0 40817837568 
> 1584 1584 active+clean 2019-02-01 
> 12:48:41.343228  70238'19811673  70493:34596887  [121,24]121  
> [121,24]121  69295'19811665 2019-02-01 12:48:41.343144  
> 66131'19810044 2019-01-30 11:44:36.006505
>
> cp done.
>
> So i can make  ceph-objecstore-tool --op remove command ?

yep!


>
> 
> From: Sage Weil 
> Sent: 04 February 2019 07:26
> To: Philippe Van Hecke
> Cc: ceph-users@lists.ceph.com; Belnet Services
> Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> assistance.
>
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage,
> >
> > I try to make the following.
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal 
> > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug 
> > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt
> > but this rise exception
> >
> > find here  
> > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33
> >  file ceph-objectstore-tool-export-remove.txt
>
> In that case,  cp --preserve=all
> /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then
> use the ceph-objecstore-tool --op remove command.  But first confirm that
> 'ceph pg ls' shows the PG as active.
>
> sage
>
>
>  > > Kr
> >
> > Philippe.
> >
> > ________________
> > From: Sage Weil 
> > Sent: 04 February 2019 06:59
> > To: Philippe Van Hecke
> > Cc: ceph-users@lists.ceph.com; Belnet Services
> > Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> > assistance.
> >
> > On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > > Hi Sage, First of all tanks for your help
> > >
> > > Please find here  
> > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> > > the osd log with debug info for osd.49. and indeed if all buggy osd can 
> > > restart that can may be solve the issue.
> > > But i also happy that you confirm my understanding that in the worst case 
> > > removing pool can also resolve the problem even in this case i lose data  
> > > but finish with a working cluster.
> >
> > If PGs are damaged, removing the pool would be part of getting to
> > HEALTH_OK, but you'd probably also need to remove any problematic PGs that
> > are preventing the OSD starting.
> >
> > But keep in mind that (1) i see 3 PGs that don't peer spread across pools
> > 11 and 12; not sure which one you are considering deleting.  Also (2) if
> > one pool isn't fully available it generall won't be a problem for other
> > pools, as long as the osds start.  And doing ceph-objectstore-tool
> > export-remove is a pretty safe way to move any problem PGs out of the way
> > to get your OSDs starting--just make sure you hold onto that backup/export
> > because you may need it later!
> >
> > > PS: don't know and don't want to open debat about top/bottom posting but 
> > > would like to know the preference of this list :-)
> >
> > No preference :)
> >
> > sage
> >
> >
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
root@ls-node-5-lcl:~# ceph-objectstore-tool --data-path 
/var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 
11.182 --op remove --debug --force  2> ceph-objectstore-tool-export-remove.txt 
 marking collection for removal
setting '_remove' omap key
finish_remove_pgs 11.182_head removing 11.182
Remove successful

So now I suppose I restart the OSD and see.
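Concretely, something like this (assuming systemd manages the OSDs on this
node):

systemctl start ceph-osd@49
ceph -s                       # watch the overall cluster state
ceph pg ls | grep 11.182      # see what happens to the pg now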



From: Sage Weil 
Sent: 04 February 2019 07:37
To: Philippe Van Hecke
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> result of  ceph pg ls | grep 11.118
>
> 11.118 9788  00 0   0 40817837568 
> 1584 1584 active+clean 2019-02-01 
> 12:48:41.343228  70238'19811673  70493:34596887  [121,24]121  
> [121,24]121  69295'19811665 2019-02-01 12:48:41.343144  
> 66131'19810044 2019-01-30 11:44:36.006505
>
> cp done.
>
> So i can make  ceph-objecstore-tool --op remove command ?

yep!


>
> 
> From: Sage Weil 
> Sent: 04 February 2019 07:26
> To: Philippe Van Hecke
> Cc: ceph-users@lists.ceph.com; Belnet Services
> Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> assistance.
>
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage,
> >
> > I try to make the following.
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal 
> > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug 
> > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt
> > but this rise exception
> >
> > find here  
> > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33
> >  file ceph-objectstore-tool-export-remove.txt
>
> In that case,  cp --preserve=all
> /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then
> use the ceph-objecstore-tool --op remove command.  But first confirm that
> 'ceph pg ls' shows the PG as active.
>
> sage
>
>
>  > > Kr
> >
> > Philippe.
> >
> > ________________
> > From: Sage Weil 
> > Sent: 04 February 2019 06:59
> > To: Philippe Van Hecke
> > Cc: ceph-users@lists.ceph.com; Belnet Services
> > Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> > assistance.
> >
> > On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > > Hi Sage, First of all tanks for your help
> > >
> > > Please find here  
> > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> > > the osd log with debug info for osd.49. and indeed if all buggy osd can 
> > > restart that can may be solve the issue.
> > > But i also happy that you confirm my understanding that in the worst case 
> > > removing pool can also resolve the problem even in this case i lose data  
> > > but finish with a working cluster.
> >
> > If PGs are damaged, removing the pool would be part of getting to
> > HEALTH_OK, but you'd probably also need to remove any problematic PGs that
> > are preventing the OSD starting.
> >
> > But keep in mind that (1) i see 3 PGs that don't peer spread across pools
> > 11 and 12; not sure which one you are considering deleting.  Also (2) if
> > one pool isn't fully available it generall won't be a problem for other
> > pools, as long as the osds start.  And doing ceph-objectstore-tool
> > export-remove is a pretty safe way to move any problem PGs out of the way
> > to get your OSDs starting--just make sure you hold onto that backup/export
> > because you may need it later!
> >
> > > PS: don't know and don't want to open debat about top/bottom posting but 
> > > would like to know the preference of this list :-)
> >
> > No preference :)
> >
> > sage
> >
> >
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> result of  ceph pg ls | grep 11.118
> 
> 11.118 9788  00 0   0 40817837568 
> 1584 1584 active+clean 2019-02-01 
> 12:48:41.343228  70238'19811673  70493:34596887  [121,24]121  
> [121,24]121  69295'19811665 2019-02-01 12:48:41.343144  
> 66131'19810044 2019-01-30 11:44:36.006505
> 
> cp done.
> 
> So i can make  ceph-objecstore-tool --op remove command ?

yep!


>   
> 
> From: Sage Weil 
> Sent: 04 February 2019 07:26
> To: Philippe Van Hecke
> Cc: ceph-users@lists.ceph.com; Belnet Services
> Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> assistance.
> 
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage,
> >
> > I try to make the following.
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal 
> > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug 
> > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt
> > but this rise exception
> >
> > find here  
> > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33
> >  file ceph-objectstore-tool-export-remove.txt
> 
> In that case,  cp --preserve=all
> /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then
> use the ceph-objecstore-tool --op remove command.  But first confirm that
> 'ceph pg ls' shows the PG as active.
> 
> sage
> 
> 
>  > > Kr
> >
> > Philippe.
> >
> > ________________
> > From: Sage Weil 
> > Sent: 04 February 2019 06:59
> > To: Philippe Van Hecke
> > Cc: ceph-users@lists.ceph.com; Belnet Services
> > Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> > assistance.
> >
> > On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > > Hi Sage, First of all tanks for your help
> > >
> > > Please find here  
> > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> > > the osd log with debug info for osd.49. and indeed if all buggy osd can 
> > > restart that can may be solve the issue.
> > > But i also happy that you confirm my understanding that in the worst case 
> > > removing pool can also resolve the problem even in this case i lose data  
> > > but finish with a working cluster.
> >
> > If PGs are damaged, removing the pool would be part of getting to
> > HEALTH_OK, but you'd probably also need to remove any problematic PGs that
> > are preventing the OSD starting.
> >
> > But keep in mind that (1) i see 3 PGs that don't peer spread across pools
> > 11 and 12; not sure which one you are considering deleting.  Also (2) if
> > one pool isn't fully available it generall won't be a problem for other
> > pools, as long as the osds start.  And doing ceph-objectstore-tool
> > export-remove is a pretty safe way to move any problem PGs out of the way
> > to get your OSDs starting--just make sure you hold onto that backup/export
> > because you may need it later!
> >
> > > PS: don't know and don't want to open debat about top/bottom posting but 
> > > would like to know the preference of this list :-)
> >
> > No preference :)
> >
> > sage
> >
> >
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
result of  ceph pg ls | grep 11.118

11.118 9788  00 0   0 40817837568 1584  
   1584 active+clean 2019-02-01 12:48:41.343228  
70238'19811673  70493:34596887  [121,24]121  [121,24]121  
69295'19811665 2019-02-01 12:48:41.343144  66131'19810044 2019-01-30 
11:44:36.006505

cp done.

So I can run the ceph-objectstore-tool --op remove command?
  

From: Sage Weil 
Sent: 04 February 2019 07:26
To: Philippe Van Hecke
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Hi Sage,
>
> I try to make the following.
>
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal 
> /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug 
> --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt
> but this rise exception
>
> find here  
> https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33
>  file ceph-objectstore-tool-export-remove.txt

In that case,  cp --preserve=all
/var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then
use the ceph-objectstore-tool --op remove command.  But first confirm that
'ceph pg ls' shows the PG as active.

sage


 > > Kr
>
> Philippe.
>
> 
> From: Sage Weil 
> Sent: 04 February 2019 06:59
> To: Philippe Van Hecke
> Cc: ceph-users@lists.ceph.com; Belnet Services
> Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> assistance.
>
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage, First of all tanks for your help
> >
> > Please find here  
> > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> > the osd log with debug info for osd.49. and indeed if all buggy osd can 
> > restart that can may be solve the issue.
> > But i also happy that you confirm my understanding that in the worst case 
> > removing pool can also resolve the problem even in this case i lose data  
> > but finish with a working cluster.
>
> If PGs are damaged, removing the pool would be part of getting to
> HEALTH_OK, but you'd probably also need to remove any problematic PGs that
> are preventing the OSD starting.
>
> But keep in mind that (1) i see 3 PGs that don't peer spread across pools
> 11 and 12; not sure which one you are considering deleting.  Also (2) if
> one pool isn't fully available it generall won't be a problem for other
> pools, as long as the osds start.  And doing ceph-objectstore-tool
> export-remove is a pretty safe way to move any problem PGs out of the way
> to get your OSDs starting--just make sure you hold onto that backup/export
> because you may need it later!
>
> > PS: don't know and don't want to open debat about top/bottom posting but 
> > would like to know the preference of this list :-)
>
> No preference :)
>
> sage
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
Sage,

Not during or before the network flap, but afterwards I had already tried the
ceph-objectstore-tool export-remove, which it was not possible to complete.

And the conf file never had the "ignore_les" option. I was not even aware of
the existence of this option, and it seems preferable that I forget about it
immediately now that you have informed me of it :-)

Kr
Philippe.


On Mon, 4 Feb 2019, Sage Weil wrote:
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage, First of all tanks for your help
> >
> > Please find here  
> > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9

Something caused the version number on this PG to reset, from something
like 54146'56789376 to 67932'2.  Was there any operator intervention in
the cluster before or during the network flapping?  Or did someone by
chance set the (very dangerous!) ignore_les option in ceph.conf?

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
Hi Sage,

I tried the following:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal 
/var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug 
--file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt
but this raised an exception.

find here  
https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33
 file ceph-objectstore-tool-export-remove.txt

Kr

Philippe.


From: Sage Weil 
Sent: 04 February 2019 06:59
To: Philippe Van Hecke
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Hi Sage, First of all tanks for your help
>
> Please find here  
> https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> the osd log with debug info for osd.49. and indeed if all buggy osd can 
> restart that can may be solve the issue.
> But i also happy that you confirm my understanding that in the worst case 
> removing pool can also resolve the problem even in this case i lose data  but 
> finish with a working cluster.

If PGs are damaged, removing the pool would be part of getting to
HEALTH_OK, but you'd probably also need to remove any problematic PGs that
are preventing the OSD starting.

But keep in mind that (1) I see 3 PGs that don't peer spread across pools
11 and 12; not sure which one you are considering deleting.  Also (2) if
one pool isn't fully available it generally won't be a problem for other
pools, as long as the osds start.  And doing ceph-objectstore-tool
export-remove is a pretty safe way to move any problem PGs out of the way
to get your OSDs starting--just make sure you hold onto that backup/export
because you may need it later!

> PS: don't know and don't want to open debat about top/bottom posting but 
> would like to know the preference of this list :-)

No preference :)

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Hi Sage,
> 
> I try to make the following.
> 
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal 
> /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug 
> --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt
> but this rise exception 
> 
> find here  
> https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33
>  file ceph-objectstore-tool-export-remove.txt

In that case,  cp --preserve=all 
/var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then 
use the ceph-objectstore-tool --op remove command.  But first confirm that
'ceph pg ls' shows the PG as active.
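(A sketch of the whole sequence; the backup destination is only an example, and
the OSD has to stay stopped while ceph-objectstore-tool runs.)

# the pg is a directory, so copy it recursively with attributes preserved
mkdir -p /root/pg-backup
cp -a /var/lib/ceph/osd/ceph-49/current/11.182_head /root/pg-backup/

# confirm the pg is active (served from other OSDs)
ceph pg ls | grep 11.182

# then remove the stale copy from this stopped OSD
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ \
    --journal /var/lib/ceph/osd/ceph-49/journal \
    --pgid 11.182 --op remove --force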

sage


 > > Kr
> 
> Philippe.
> 
> 
> From: Sage Weil 
> Sent: 04 February 2019 06:59
> To: Philippe Van Hecke
> Cc: ceph-users@lists.ceph.com; Belnet Services
> Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
> assistance.
> 
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage, First of all tanks for your help
> >
> > Please find here  
> > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> > the osd log with debug info for osd.49. and indeed if all buggy osd can 
> > restart that can may be solve the issue.
> > But i also happy that you confirm my understanding that in the worst case 
> > removing pool can also resolve the problem even in this case i lose data  
> > but finish with a working cluster.
> 
> If PGs are damaged, removing the pool would be part of getting to
> HEALTH_OK, but you'd probably also need to remove any problematic PGs that
> are preventing the OSD starting.
> 
> But keep in mind that (1) i see 3 PGs that don't peer spread across pools
> 11 and 12; not sure which one you are considering deleting.  Also (2) if
> one pool isn't fully available it generall won't be a problem for other
> pools, as long as the osds start.  And doing ceph-objectstore-tool
> export-remove is a pretty safe way to move any problem PGs out of the way
> to get your OSDs starting--just make sure you hold onto that backup/export
> because you may need it later!
> 
> > PS: don't know and don't want to open debat about top/bottom posting but 
> > would like to know the preference of this list :-)
> 
> No preference :)
> 
> sage
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Mon, 4 Feb 2019, Sage Weil wrote:
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage, First of all tanks for your help
> > 
> > Please find here  
> > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9

Something caused the version number on this PG to reset, from something 
like 54146'56789376 to 67932'2.  Was there any operator intervention in 
the cluster before or during the network flapping?  Or did someone by 
chance set the (very dangerous!) ignore_les option in ceph.conf?

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Hi Sage, first of all thanks for your help.
> 
> Please find here
> https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> the OSD log with debug info for osd.49. And indeed, if all the buggy OSDs can
> restart, that may solve the issue.
> But I am also happy that you confirm my understanding that, in the worst case,
> removing the pool can also resolve the problem; even if in that case I lose
> data, I end up with a working cluster.

If PGs are damaged, removing the pool would be part of getting to 
HEALTH_OK, but you'd probably also need to remove any problematic PGs that 
are preventing the OSD starting.

But keep in mind that (1) I see 3 PGs that don't peer spread across pools
11 and 12; not sure which one you are considering deleting.  Also (2) if
one pool isn't fully available it generally won't be a problem for other
pools, as long as the osds start.  And doing ceph-objectstore-tool 
export-remove is a pretty safe way to move any problem PGs out of the way 
to get your OSDs starting--just make sure you hold onto that backup/export 
because you may need it later!

> PS: don't know and don't want to open debat about top/bottom posting but 
> would like to know the preference of this list :-)

No preference :)

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke



From: Sage Weil 
Sent: 03 February 2019 18:25
To: Philippe Van Hecke
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Luminous cluster in very bad state need some 
assistance.

On Sun, 3 Feb 2019, Philippe Van Hecke wrote:
> Hello,
> I am working for BELNET, the Belgian National Research Network.
>
> We currently manage a Luminous Ceph cluster on Ubuntu 16.04,
> with 144 HDD OSDs spread across two data centers and 6 OSD nodes
> in each data center. The OSDs are 4 TB SATA disks.
>
> Last week we had a network incident and the link between our 2 DCs
> began to flap due to STP flapping. This left our Ceph
> cluster in a very bad state, with many PGs stuck in different states.
> I gave the cluster time to recover, but some OSDs would not restart.
> I read and tried different things found on this mailing list, but
> this only made the situation worse, because all my OSDs began
> falling down one by one due to some bad PGs.
>
> I then tried the solution described by our Greek colleagues:
> https://blog.noc.grnet.gr/2016/10/18/surviving-a-ceph-cluster-outage-the-hard-way/
>
> So I set noout, noscrub and nodeep-scrub, which seems to have frozen the
> situation.
>
> The cluster is only used to provide RBD disks to our cloud-compute and
> cloud-storage solutions and to our internal KVM VMs.
>
> It seems that only some pools are affected by unclean/unknown/unfound objects,
>
> and all is working well for the other pools (maybe some speed issues).
>
> I can confirm that data in the affected pools is completely corrupted.
>
> You can find here
> https://filesender.belnet.be/?s=download=1fac6b04-dd35-46f7-b4a8-c851cfa06379
> a tgz file with as much information as I could dump, to give an overview
> of the current state of the cluster.
>
> So I have 2 questions.
>
> Does removing the affected pools, with their stuck PGs, also remove the
> defective PGs?

Yes, but don't do that yet!  From a quick look this looks like it can be
worked around.

First question is why you're hitting the assert on e.g. osd.49

 0> 2019-02-01 09:23:36.963503 7fb548859e00 -1
/build/ceph-12.2.5/src/osd/PGLog.h: In function 'static void
PGLog::read_log_and_missing(ObjectStore*, coll_t, coll_t, ghobject_t,
const pg_info_t&, PGLog::IndexedLog&, missing_type&, bool,
std::ostringstream&, bool, bool*, const DoutPrefixProvider*,
std::set >*, bool) [with missing_type =
pg_missing_set; std::ostringstream =
std::__cxx11::basic_ostringstream]' thread 7fb548859e00 time
2019-02-01 09:23:36.961237
/build/ceph-12.2.5/src/osd/PGLog.h: 1354: FAILED
assert(last_e.version.version < e.version.version)

If you can set debug osd = 20 on that osd, start it, and ceph-post-file
the log, that would be helpful.  12.2.5 is a pretty old luminous release,
but I don't see this in the tracker, so a log would be great.

Your priority is probably to get the pools active, though.  For osd.49,
the problematic pg is 11.182, which your pg ls output shows as online and
undersized but usable.  You can use ceph-objectstore-tool --op
export-remove to make a backup and remove it from the osd.49 and then that
osd will likely start up.

If you look at 11.ac, your only incomplete pg in pool 11, the
query says

"down_osds_we_would_probe": [
49,
63
],

..so if you get that OSD up that PG should peer.

In pool 12, you have 12.14d

"down_osds_we_would_probe": [
9,
51
],

osd.51 won't start due to the same assert but on pg 15.246, and the pg ls
shows that pg is undersized but active, so doing the same --op
export-remove on that osd will hopefully let it start.  I'm guessing the
same will work on the other 12.* pg, but see if it works on 11.182 first
so that pool will be completely up and available.

Let us know how it goes!

sage


Hi Sage, first of all thanks for your help.

Please find here
https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
the OSD log with debug info for osd.49. And indeed, if all the buggy OSDs can
restart, that may solve the issue.
But I am also happy that you confirm my understanding that, in the worst case,
removing the pool can also resolve the problem; even if in that case I lose
data, I end up with a working cluster.

Kr
Philippe

PS: I don't know and don't want to open a debate about top/bottom posting, but
I would like to know the preference of this list :-)
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Sage Weil
On Sun, 3 Feb 2019, Philippe Van Hecke wrote:
> Hello,
> I am working for BELNET, the Belgian National Research Network.
>
> We currently manage a Luminous Ceph cluster on Ubuntu 16.04,
> with 144 HDD OSDs spread across two data centers and 6 OSD nodes
> in each data center. The OSDs are 4 TB SATA disks.
>
> Last week we had a network incident and the link between our 2 DCs
> began to flap due to STP flapping. This left our Ceph
> cluster in a very bad state, with many PGs stuck in different states.
> I gave the cluster time to recover, but some OSDs would not restart.
> I read and tried different things found on this mailing list, but
> this only made the situation worse, because all my OSDs began
> falling down one by one due to some bad PGs.
>
> I then tried the solution described by our Greek colleagues:
> https://blog.noc.grnet.gr/2016/10/18/surviving-a-ceph-cluster-outage-the-hard-way/
>
> So I set noout, noscrub and nodeep-scrub, which seems to have frozen the
> situation.
>
> The cluster is only used to provide RBD disks to our cloud-compute and
> cloud-storage solutions and to our internal KVM VMs.
>
> It seems that only some pools are affected by unclean/unknown/unfound objects,
>
> and all is working well for the other pools (maybe some speed issues).
>
> I can confirm that data in the affected pools is completely corrupted.
>
> You can find here
> https://filesender.belnet.be/?s=download=1fac6b04-dd35-46f7-b4a8-c851cfa06379
> a tgz file with as much information as I could dump, to give an overview
> of the current state of the cluster.
>
> So I have 2 questions.
>
> Does removing the affected pools, with their stuck PGs, also remove the
> defective PGs?

Yes, but don't do that yet!  From a quick look this looks like it can be 
worked around.

First question is why you're hitting the assert on e.g. osd.49

 0> 2019-02-01 09:23:36.963503 7fb548859e00 -1 
/build/ceph-12.2.5/src/osd/PGLog.h: In function 'static void 
PGLog::read_log_and_missing(ObjectStore*, coll_t, coll_t, ghobject_t, 
const pg_info_t&, PGLog::IndexedLog&, missing_type&, bool, 
std::ostringstream&, bool, bool*, const DoutPrefixProvider*, 
std::set >*, bool) [with missing_type = 
pg_missing_set; std::ostringstream = 
std::__cxx11::basic_ostringstream]' thread 7fb548859e00 time 
2019-02-01 09:23:36.961237
/build/ceph-12.2.5/src/osd/PGLog.h: 1354: FAILED 
assert(last_e.version.version < e.version.version)

If you can set debug osd = 20 on that osd, start it, and ceph-post-file 
the log, that would be helpful.  12.2.5 is a pretty old luminous release, 
but I don't see this in the tracker, so a log would be great.
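(For example, something along these lines; the log path assumes the default
location.)

# in ceph.conf on that node, under [osd.49] (or just [osd]):
#   debug osd = 20
systemctl restart ceph-osd@49
# once it has hit the assert again, upload the log
ceph-post-file /var/log/ceph/ceph-osd.49.log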

Your priority is probably to get the pools active, though.  For osd.49, 
the problematic pg is 11.182, which your pg ls output shows as online and 
undersized but usable.  You can use ceph-objectstore-tool --op 
export-remove to make a backup and remove it from the osd.49 and then that 
osd will likely start up.

If you look at 11.ac, your only incomplete pg in pool 11, the 
query says

"down_osds_we_would_probe": [
49,
63
],

..so if you get that OSD up that PG should peer.

In pool 12, you have 12.14d

"down_osds_we_would_probe": [
9,
51
],

osd.51 won't start due to the same assert but on pg 15.246, and the pg ls
shows that pg is undersized but active, so doing the same --op 
export-remove on that osd will hopefully let it start.  I'm guessing the 
same will work on the other 12.* pg, but see if it works on 11.182 first 
so that pool will be completely up and available.

Let us know how it goes!

sage



> If not, I am completely lost and would like to know if some experts can
> assist us, even if not for free.
> 
> If yes, you can contact me by mail at phili...@belnet.be.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com