Re: [ceph-users] Luminous cluster in very bad state need some assistance.
Hi,

Sorry for the late reaction. With Sage's help we finally recovered our cluster. How did we recover?

It seems that, due to the network flaps, some PGs in two of our pools were left in a bad state. Before doing things properly, I tried many things I had seen on the list and manipulated PGs without using ceph-objectstore-tool. That probably didn't help us and led to some loss of data. So, under pressure to get back to an operational situation, we decided to remove one of the two pools with problematic PGs. That pool was mainly used for RBD images for our internal KVM infrastructure, for which we had backups of most VMs. Before removing the pool, we extracted as many images as we could. Many were completely corrupt, but for many others we were able to extract 99% of the content, and an fsck at the OS level let us get the data back.

After removing this pool there were still some PGs in a bad state in the customer-facing pool. The problem was that those PGs were blocked by OSDs that refused to join the cluster again. To solve this, we created an empty OSD with a weight of 0.0. We were then able to extract the PGs from the faulty OSDs and inject them into the freshly created OSD using the export / import commands of ceph-objectstore-tool. After that the cluster recovered completely, though there were still OSDs that refused to join the cluster. As the data on those OSDs is not needed any more, we decided to rebuild them from scratch.

What we learned from this experience:
- Ensure that your network is rock solid (Ceph really dislikes an unstable network). Avoid a layer-2 interconnection between your DCs and a flat layer-2 network spanning them.
- Keep calm and first give the cluster time to do its job (this can take some time).
- Never manipulate PGs without using ceph-objectstore-tool, or you will be in trouble.
- Keep spare disks in some nodes of the cluster so you can create empty OSDs for this kind of recovery.

I would like to thank the community again, and Sage in particular, for saving us from a complete disaster.

Kr
Philippe.

From: Philippe Van Hecke
Sent: 04 February 2019 07:27
To: Sage Weil
Cc: ceph-users@lists.ceph.com; Belnet Services; ceph-de...@vger.kernel.org
Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance.

Sage,

Not during the network flap or before the flap, but afterwards I had already tried the ceph-objectstore-tool export-remove, without being able to complete it. And the conf file never had the "ignore_les" option. I was not even aware that this option existed, and it seems preferable that I forget about it immediately :-)

Kr
Philippe.

On Mon, 4 Feb 2019, Sage Weil wrote:
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage, First of all thanks for your help
> >
> > Please find here
> > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9

Something caused the version number on this PG to reset, from something like 54146'56789376 to 67932'2. Was there any operator intervention in the cluster before or during the network flapping? Or did someone by chance set the (very dangerous!) ignore_les option in ceph.conf?

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
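For anyone who wants to follow the same route, here is a rough sketch of the export / import step described above, for FileStore OSDs laid out like the ones in this thread. The OSD ids (osd.63 as the OSD that refuses to join, osd.200 as the freshly created spare kept at weight 0) and the dump path are illustrative, not the exact commands that were run.

  # On the host of the faulty OSD: the daemon must not be running.
  systemctl stop ceph-osd@63
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-63 \
      --journal-path /var/lib/ceph/osd/ceph-63/journal \
      --pgid 11.ac --op export --file /root/pg-11.ac.export

  # On the spare OSD: keep its crush weight at 0 so it receives no new data,
  # stop it, import the PG, then start it again so the PG can peer.
  ceph osd crush reweight osd.200 0
  systemctl stop ceph-osd@200
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-200 \
      --journal-path /var/lib/ceph/osd/ceph-200/journal \
      --op import --file /root/pg-11.ac.export
  systemctl start ceph-osd@200

  # Watch the PG peer and recover.
  ceph pg ls incomplete
  ceph -s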
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
"num_read": 1274, "num_read_kb": 33808, "num_write": 1388, "num_write_kb": 42956, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0 }, "up": [ 64 ], "acting": [ 64 ], "blocked_by": [], "up_primary": 64, "acting_primary": 64 }, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 70656, "hit_set_history": { "current_last_update": "0'0", "history": [] } }, "peer_info": [], "recovery_state": [ { "name": "Started/Primary/Active", "enter_time": "2019-02-04 07:57:08.762037", "might_have_unfound": [ { "osd": "9", "status": "osd is down" }, { "osd": "29", "status": "osd is down" }, { "osd": "49", "status": "osd is down" }, { "osd": "51", "status": "osd is down" }, { "osd": "63", "status": "osd is down" }, { "osd": "92", "status": "osd is down" } ], "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "pull_from_peer": [], "pushing": [] } }, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.seed": 0, "scrubber.waiting_on": 0, "scrubber.waiting_on_whom": [] } }, { "name": "Started", "enter_time": "2019-02-04 07:57:08.220064" } ], "agent_state": {} } For 11.ac i will try wath you propose and keep you informed but i am a litle bit anxious to lose another healthy osd. Kr From: Sage Weil Sent: 04 February 2019 09:20 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > So i restarted the osd but he stop after some time. But this is an effect on > the cluster and cluster is on a partial recovery process. > > please find here log file of osd 49 after this restart > https://filesender.belnet.be/?s=download=8c9c39f2-36f6-43f7-bebb-175679d27a22 It's the same PG 11.182 hitting the same assert when it tries to recover to that OSD. I think the problem will go away once there has been some write traffic, but it may be tricky to prevent it from doing any recovery until then. I just noticed you pasted the wrong 'pg ls' result before: > > result of ceph pg ls | grep 11.118 > > > > 11.118 9788 00 0 0 40817837568 > > 1584 1584 active+clean 2019-02-01 > > 12:48:41.343228 70238'19811673 70493:34596887 [121,24]121 > > [121,24]121 69295'19811665 2019-02-01 12:48:41.343144 > > 66131'19810044 2019-01-30 11:44:36.006505 What does 11.182 look like? We can try something slighty different. From before it looked like your only 'incomplete' pg was 11.ac (ceph pg ls incomplete), and the needed state is either on osd.49 or osd.63. On osd.49, do ceph-objectstore-tool --op export on that pg, and then find an otherwise healthy OSD (that doesn't have 11.ac), stop it, and ceph-objectstore-tool --op import it there. When you start it up, 11.ac will hopefull peer and recover. 
(Or, alternatively, osd.63 may have the needed state.) sage ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> So I restarted the osd but it stopped again after some time. But this had an effect on
> the cluster, and the cluster is now in a partial recovery process.
>
> please find here the log file of osd 49 after this restart
> https://filesender.belnet.be/?s=download=8c9c39f2-36f6-43f7-bebb-175679d27a22

It's the same PG 11.182 hitting the same assert when it tries to recover to that OSD. I think the problem will go away once there has been some write traffic, but it may be tricky to prevent it from doing any recovery until then.

I just noticed you pasted the wrong 'pg ls' result before:

> > result of ceph pg ls | grep 11.118
> >
> > 11.118 9788 00 0 0 40817837568
> > 1584 1584 active+clean 2019-02-01
> > 12:48:41.343228 70238'19811673 70493:34596887 [121,24] 121
> > [121,24] 121 69295'19811665 2019-02-01 12:48:41.343144
> > 66131'19810044 2019-01-30 11:44:36.006505

What does 11.182 look like?

We can try something slightly different. From before it looked like your only 'incomplete' pg was 11.ac (ceph pg ls incomplete), and the needed state is either on osd.49 or osd.63. On osd.49, do ceph-objectstore-tool --op export on that pg, and then find an otherwise healthy OSD (that doesn't have 11.ac), stop it, and ceph-objectstore-tool --op import it there. When you start it up, 11.ac will hopefully peer and recover.

(Or, alternatively, osd.63 may have the needed state.)

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
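As a side note, the pieces of state Sage refers to here can be pulled out with the commands below. This is a minimal sketch assuming only the standard CLI; the grep is just a quick way to see which down OSDs are blocking peering for the PG.

  # List PGs stuck incomplete.
  ceph pg ls incomplete

  # Ask the PG itself which down OSDs it would still like to probe;
  # those are the OSDs holding the state it needs in order to peer.
  ceph pg 11.ac query | grep -A 4 down_osds_we_would_probe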
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
Hi, Seem that the recovery process stop and get back to the same situation as before. I hope that the log can provide more info. Any way thanks already for your assistance. Kr Philippe. From: Philippe Van Hecke Sent: 04 February 2019 07:53 To: Sage Weil Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. So i restarted the osd but he stop after some time. But this is an effect on the cluster and cluster is on a partial recovery process. please find here log file of osd 49 after this restart https://filesender.belnet.be/?s=download=8c9c39f2-36f6-43f7-bebb-175679d27a22 Kr Philippe. From: Philippe Van Hecke Sent: 04 February 2019 07:42 To: Sage Weil Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. oot@ls-node-5-lcl:~# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op remove --debug --force 2> ceph-objectstore-tool-export-remove.txt marking collection for removal setting '_remove' omap key finish_remove_pgs 11.182_head removing 11.182 Remove successful So now i suppose i restart the osd and see From: Sage Weil Sent: 04 February 2019 07:37 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > result of ceph pg ls | grep 11.118 > > 11.118 9788 00 0 0 40817837568 > 1584 1584 active+clean 2019-02-01 > 12:48:41.343228 70238'19811673 70493:34596887 [121,24]121 > [121,24]121 69295'19811665 2019-02-01 12:48:41.343144 > 66131'19810044 2019-01-30 11:44:36.006505 > > cp done. > > So i can make ceph-objecstore-tool --op remove command ? yep! > > > From: Sage Weil > Sent: 04 February 2019 07:26 > To: Philippe Van Hecke > Cc: ceph-users@lists.ceph.com; Belnet Services > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > assistance. > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > Hi Sage, > > > > I try to make the following. > > > > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal > > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug > > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt > > but this rise exception > > > > find here > > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33 > > file ceph-objectstore-tool-export-remove.txt > > In that case, cp --preserve=all > /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then > use the ceph-objecstore-tool --op remove command. But first confirm that > 'ceph pg ls' shows the PG as active. > > sage > > > > > Kr > > > > Philippe. > > > > ________________ > > From: Sage Weil > > Sent: 04 February 2019 06:59 > > To: Philippe Van Hecke > > Cc: ceph-users@lists.ceph.com; Belnet Services > > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > > assistance. > > > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > > Hi Sage, First of all tanks for your help > > > > > > Please find here > > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9 > > > the osd log with debug info for osd.49. and indeed if all buggy osd can > > > restart that can may be solve the issue. 
> > > But i also happy that you confirm my understanding that in the worst case > > > removing pool can also resolve the problem even in this case i lose data > > > but finish with a working cluster. > > > > If PGs are damaged, removing the pool would be part of getting to > > HEALTH_OK, but you'd probably also need to remove any problematic PGs that > > are preventing the OSD starting. > > > > But keep in mind that (1) i see 3 PGs that don't peer spread across pools > > 11 and 12; not sure which one you are considering deleting. Also (2) if > > one pool isn't fully available it generall won't be a problem for other > > pools, as long as the osds start. And doing ceph-objectstore-tool > > export-remove is a pretty safe way to move any problem PGs out of the way > > to get your OSDs starting--just make sure you hold onto that backup/export > > because you may need it later! > > > > > PS: don't know and don't want to open debat about top/bottom posting but > > > would like to know the preference of this list :-) > > > > No preference :) > > > > sage > > > > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
So i restarted the osd but he stop after some time. But this is an effect on the cluster and cluster is on a partial recovery process. please find here log file of osd 49 after this restart https://filesender.belnet.be/?s=download=8c9c39f2-36f6-43f7-bebb-175679d27a22 Kr Philippe. From: Philippe Van Hecke Sent: 04 February 2019 07:42 To: Sage Weil Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. oot@ls-node-5-lcl:~# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op remove --debug --force 2> ceph-objectstore-tool-export-remove.txt marking collection for removal setting '_remove' omap key finish_remove_pgs 11.182_head removing 11.182 Remove successful So now i suppose i restart the osd and see From: Sage Weil Sent: 04 February 2019 07:37 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > result of ceph pg ls | grep 11.118 > > 11.118 9788 00 0 0 40817837568 > 1584 1584 active+clean 2019-02-01 > 12:48:41.343228 70238'19811673 70493:34596887 [121,24]121 > [121,24]121 69295'19811665 2019-02-01 12:48:41.343144 > 66131'19810044 2019-01-30 11:44:36.006505 > > cp done. > > So i can make ceph-objecstore-tool --op remove command ? yep! > > > From: Sage Weil > Sent: 04 February 2019 07:26 > To: Philippe Van Hecke > Cc: ceph-users@lists.ceph.com; Belnet Services > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > assistance. > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > Hi Sage, > > > > I try to make the following. > > > > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal > > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug > > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt > > but this rise exception > > > > find here > > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33 > > file ceph-objectstore-tool-export-remove.txt > > In that case, cp --preserve=all > /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then > use the ceph-objecstore-tool --op remove command. But first confirm that > 'ceph pg ls' shows the PG as active. > > sage > > > > > Kr > > > > Philippe. > > > > ________________ > > From: Sage Weil > > Sent: 04 February 2019 06:59 > > To: Philippe Van Hecke > > Cc: ceph-users@lists.ceph.com; Belnet Services > > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > > assistance. > > > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > > Hi Sage, First of all tanks for your help > > > > > > Please find here > > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9 > > > the osd log with debug info for osd.49. and indeed if all buggy osd can > > > restart that can may be solve the issue. > > > But i also happy that you confirm my understanding that in the worst case > > > removing pool can also resolve the problem even in this case i lose data > > > but finish with a working cluster. > > > > If PGs are damaged, removing the pool would be part of getting to > > HEALTH_OK, but you'd probably also need to remove any problematic PGs that > > are preventing the OSD starting. 
> > > > But keep in mind that (1) i see 3 PGs that don't peer spread across pools > > 11 and 12; not sure which one you are considering deleting. Also (2) if > > one pool isn't fully available it generall won't be a problem for other > > pools, as long as the osds start. And doing ceph-objectstore-tool > > export-remove is a pretty safe way to move any problem PGs out of the way > > to get your OSDs starting--just make sure you hold onto that backup/export > > because you may need it later! > > > > > PS: don't know and don't want to open debat about top/bottom posting but > > > would like to know the preference of this list :-) > > > > No preference :) > > > > sage > > > > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
oot@ls-node-5-lcl:~# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op remove --debug --force 2> ceph-objectstore-tool-export-remove.txt marking collection for removal setting '_remove' omap key finish_remove_pgs 11.182_head removing 11.182 Remove successful So now i suppose i restart the osd and see From: Sage Weil Sent: 04 February 2019 07:37 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > result of ceph pg ls | grep 11.118 > > 11.118 9788 00 0 0 40817837568 > 1584 1584 active+clean 2019-02-01 > 12:48:41.343228 70238'19811673 70493:34596887 [121,24]121 > [121,24]121 69295'19811665 2019-02-01 12:48:41.343144 > 66131'19810044 2019-01-30 11:44:36.006505 > > cp done. > > So i can make ceph-objecstore-tool --op remove command ? yep! > > > From: Sage Weil > Sent: 04 February 2019 07:26 > To: Philippe Van Hecke > Cc: ceph-users@lists.ceph.com; Belnet Services > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > assistance. > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > Hi Sage, > > > > I try to make the following. > > > > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal > > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug > > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt > > but this rise exception > > > > find here > > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33 > > file ceph-objectstore-tool-export-remove.txt > > In that case, cp --preserve=all > /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then > use the ceph-objecstore-tool --op remove command. But first confirm that > 'ceph pg ls' shows the PG as active. > > sage > > > > > Kr > > > > Philippe. > > > > ________________ > > From: Sage Weil > > Sent: 04 February 2019 06:59 > > To: Philippe Van Hecke > > Cc: ceph-users@lists.ceph.com; Belnet Services > > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > > assistance. > > > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > > Hi Sage, First of all tanks for your help > > > > > > Please find here > > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9 > > > the osd log with debug info for osd.49. and indeed if all buggy osd can > > > restart that can may be solve the issue. > > > But i also happy that you confirm my understanding that in the worst case > > > removing pool can also resolve the problem even in this case i lose data > > > but finish with a working cluster. > > > > If PGs are damaged, removing the pool would be part of getting to > > HEALTH_OK, but you'd probably also need to remove any problematic PGs that > > are preventing the OSD starting. > > > > But keep in mind that (1) i see 3 PGs that don't peer spread across pools > > 11 and 12; not sure which one you are considering deleting. Also (2) if > > one pool isn't fully available it generall won't be a problem for other > > pools, as long as the osds start. And doing ceph-objectstore-tool > > export-remove is a pretty safe way to move any problem PGs out of the way > > to get your OSDs starting--just make sure you hold onto that backup/export > > because you may need it later! 
> > > > > PS: don't know and don't want to open debat about top/bottom posting but > > > would like to know the preference of this list :-) > > > > No preference :) > > > > sage > > > > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
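For readers following along, the follow-up after such a remove is roughly the sketch below (osd and pg ids as used in the thread). The expectation, not a guarantee, is that the OSD starts now that the broken PG collection is gone, and that 11.182 is backfilled again from its surviving replica.

  systemctl start ceph-osd@49

  # Confirm the OSD is up/in again and watch recovery progress.
  ceph osd tree | grep 'osd.49'
  ceph -s

  # The removed PG should still be served by its remaining replica and
  # eventually return to active+clean once backfill to osd.49 finishes.
  ceph pg ls | grep '^11.182'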
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > result of ceph pg ls | grep 11.118 > > 11.118 9788 00 0 0 40817837568 > 1584 1584 active+clean 2019-02-01 > 12:48:41.343228 70238'19811673 70493:34596887 [121,24]121 > [121,24]121 69295'19811665 2019-02-01 12:48:41.343144 > 66131'19810044 2019-01-30 11:44:36.006505 > > cp done. > > So i can make ceph-objecstore-tool --op remove command ? yep! > > > From: Sage Weil > Sent: 04 February 2019 07:26 > To: Philippe Van Hecke > Cc: ceph-users@lists.ceph.com; Belnet Services > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > assistance. > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > Hi Sage, > > > > I try to make the following. > > > > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal > > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug > > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt > > but this rise exception > > > > find here > > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33 > > file ceph-objectstore-tool-export-remove.txt > > In that case, cp --preserve=all > /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then > use the ceph-objecstore-tool --op remove command. But first confirm that > 'ceph pg ls' shows the PG as active. > > sage > > > > > Kr > > > > Philippe. > > > > ________________ > > From: Sage Weil > > Sent: 04 February 2019 06:59 > > To: Philippe Van Hecke > > Cc: ceph-users@lists.ceph.com; Belnet Services > > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > > assistance. > > > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > > Hi Sage, First of all tanks for your help > > > > > > Please find here > > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9 > > > the osd log with debug info for osd.49. and indeed if all buggy osd can > > > restart that can may be solve the issue. > > > But i also happy that you confirm my understanding that in the worst case > > > removing pool can also resolve the problem even in this case i lose data > > > but finish with a working cluster. > > > > If PGs are damaged, removing the pool would be part of getting to > > HEALTH_OK, but you'd probably also need to remove any problematic PGs that > > are preventing the OSD starting. > > > > But keep in mind that (1) i see 3 PGs that don't peer spread across pools > > 11 and 12; not sure which one you are considering deleting. Also (2) if > > one pool isn't fully available it generall won't be a problem for other > > pools, as long as the osds start. And doing ceph-objectstore-tool > > export-remove is a pretty safe way to move any problem PGs out of the way > > to get your OSDs starting--just make sure you hold onto that backup/export > > because you may need it later! > > > > > PS: don't know and don't want to open debat about top/bottom posting but > > > would like to know the preference of this list :-) > > > > No preference :) > > > > sage > > > > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
result of ceph pg ls | grep 11.118 11.118 9788 00 0 0 40817837568 1584 1584 active+clean 2019-02-01 12:48:41.343228 70238'19811673 70493:34596887 [121,24]121 [121,24]121 69295'19811665 2019-02-01 12:48:41.343144 66131'19810044 2019-01-30 11:44:36.006505 cp done. So i can make ceph-objecstore-tool --op remove command ? From: Sage Weil Sent: 04 February 2019 07:26 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > Hi Sage, > > I try to make the following. > > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt > but this rise exception > > find here > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33 > file ceph-objectstore-tool-export-remove.txt In that case, cp --preserve=all /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then use the ceph-objecstore-tool --op remove command. But first confirm that 'ceph pg ls' shows the PG as active. sage > > Kr > > Philippe. > > > From: Sage Weil > Sent: 04 February 2019 06:59 > To: Philippe Van Hecke > Cc: ceph-users@lists.ceph.com; Belnet Services > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > assistance. > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > Hi Sage, First of all tanks for your help > > > > Please find here > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9 > > the osd log with debug info for osd.49. and indeed if all buggy osd can > > restart that can may be solve the issue. > > But i also happy that you confirm my understanding that in the worst case > > removing pool can also resolve the problem even in this case i lose data > > but finish with a working cluster. > > If PGs are damaged, removing the pool would be part of getting to > HEALTH_OK, but you'd probably also need to remove any problematic PGs that > are preventing the OSD starting. > > But keep in mind that (1) i see 3 PGs that don't peer spread across pools > 11 and 12; not sure which one you are considering deleting. Also (2) if > one pool isn't fully available it generall won't be a problem for other > pools, as long as the osds start. And doing ceph-objectstore-tool > export-remove is a pretty safe way to move any problem PGs out of the way > to get your OSDs starting--just make sure you hold onto that backup/export > because you may need it later! > > > PS: don't know and don't want to open debat about top/bottom posting but > > would like to know the preference of this list :-) > > No preference :) > > sage > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
Sage,

Not during the network flap or before the flap, but afterwards I had already tried the ceph-objectstore-tool export-remove, without being able to complete it. And the conf file never had the "ignore_les" option. I was not even aware that this option existed, and it seems preferable that I forget about it immediately :-)

Kr
Philippe.

On Mon, 4 Feb 2019, Sage Weil wrote:
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage, First of all thanks for your help
> >
> > Please find here
> > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9

Something caused the version number on this PG to reset, from something like 54146'56789376 to 67932'2. Was there any operator intervention in the cluster before or during the network flapping? Or did someone by chance set the (very dangerous!) ignore_les option in ceph.conf?

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
Hi Sage,

I tried the following:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt

but this raised an exception.

You can find the file ceph-objectstore-tool-export-remove.txt here:
https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33

Kr
Philippe.

From: Sage Weil
Sent: 04 February 2019 06:59
To: Philippe Van Hecke
Cc: ceph-users@lists.ceph.com; Belnet Services
Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance.

On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Hi Sage, First of all thanks for your help
>
> Please find here
> https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> the osd log with debug info for osd.49. And indeed, if all the buggy osds can
> restart, that may solve the issue.
> But I am also happy that you confirm my understanding that, in the worst case,
> removing the pool can also resolve the problem, even if in this case I lose data
> but finish with a working cluster.

If PGs are damaged, removing the pool would be part of getting to HEALTH_OK, but you'd probably also need to remove any problematic PGs that are preventing the OSD from starting.

But keep in mind that (1) I see 3 PGs that don't peer spread across pools 11 and 12; not sure which one you are considering deleting. Also (2) if one pool isn't fully available it generally won't be a problem for other pools, as long as the osds start. And doing ceph-objectstore-tool export-remove is a pretty safe way to move any problem PGs out of the way to get your OSDs starting--just make sure you hold onto that backup/export because you may need it later!

> PS: don't know and don't want to open a debate about top/bottom posting but
> would like to know the preference of this list :-)

No preference :)

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > Hi Sage, > > I try to make the following. > > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal > /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug > --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt > but this rise exception > > find here > https://filesender.belnet.be/?s=download=e2b1fdbc-0739-423f-9d97-0bd258843a33 > file ceph-objectstore-tool-export-remove.txt In that case, cp --preserve=all /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then use the ceph-objecstore-tool --op remove command. But first confirm that 'ceph pg ls' shows the PG as active. sage > > Kr > > Philippe. > > > From: Sage Weil > Sent: 04 February 2019 06:59 > To: Philippe Van Hecke > Cc: ceph-users@lists.ceph.com; Belnet Services > Subject: Re: [ceph-users] Luminous cluster in very bad state need some > assistance. > > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > Hi Sage, First of all tanks for your help > > > > Please find here > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9 > > the osd log with debug info for osd.49. and indeed if all buggy osd can > > restart that can may be solve the issue. > > But i also happy that you confirm my understanding that in the worst case > > removing pool can also resolve the problem even in this case i lose data > > but finish with a working cluster. > > If PGs are damaged, removing the pool would be part of getting to > HEALTH_OK, but you'd probably also need to remove any problematic PGs that > are preventing the OSD starting. > > But keep in mind that (1) i see 3 PGs that don't peer spread across pools > 11 and 12; not sure which one you are considering deleting. Also (2) if > one pool isn't fully available it generall won't be a problem for other > pools, as long as the osds start. And doing ceph-objectstore-tool > export-remove is a pretty safe way to move any problem PGs out of the way > to get your OSDs starting--just make sure you hold onto that backup/export > because you may need it later! > > > PS: don't know and don't want to open debat about top/bottom posting but > > would like to know the preference of this list :-) > > No preference :) > > sage > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
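Written out as commands, the fallback Sage describes here (for when export-remove itself crashes) looks roughly like the sketch below. The backup destination is only an example; the data path and pg id follow the ones used elsewhere in the thread, and the OSD daemon must be stopped.

  # First confirm the PG is still served elsewhere (it should show as active in pg ls).
  ceph pg ls | grep '^11.182'

  # Keep a copy of the PG's on-disk collection before removing it.
  cp -r --preserve=all /var/lib/ceph/osd/ceph-49/current/11.182_head /root/backup-11.182_head

  # Then drop the PG from the stopped OSD; --force is required for a plain remove
  # (as opposed to export-remove, which makes its own backup).
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
      --journal-path /var/lib/ceph/osd/ceph-49/journal \
      --pgid 11.182 --op remove --force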
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
On Mon, 4 Feb 2019, Sage Weil wrote: > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > Hi Sage, First of all tanks for your help > > > > Please find here > > https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9 Something caused the version number on this PG to reset, from something like 54146'56789376 to 67932'2. Was there any operator intervention in the cluster before or during the network flapping? Or did someone by chance set the (very dangerous!) ignore_les option in ceph.conf? sage ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Hi Sage, First of all thanks for your help
>
> Please find here
> https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> the osd log with debug info for osd.49. And indeed, if all the buggy osds can
> restart, that may solve the issue.
> But I am also happy that you confirm my understanding that, in the worst case,
> removing the pool can also resolve the problem, even if in this case I lose data
> but finish with a working cluster.

If PGs are damaged, removing the pool would be part of getting to HEALTH_OK, but you'd probably also need to remove any problematic PGs that are preventing the OSD from starting.

But keep in mind that (1) I see 3 PGs that don't peer spread across pools 11 and 12; not sure which one you are considering deleting. Also (2) if one pool isn't fully available it generally won't be a problem for other pools, as long as the osds start. And doing ceph-objectstore-tool export-remove is a pretty safe way to move any problem PGs out of the way to get your OSDs starting--just make sure you hold onto that backup/export because you may need it later!

> PS: don't know and don't want to open a debate about top/bottom posting but
> would like to know the preference of this list :-)

No preference :)

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
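For reference, an export-remove run of the kind Sage recommends looks roughly like the sketch below; the dump file path is just an example. The dump can later be brought back with --op import if it turns out to be needed.

  # With the OSD daemon stopped:
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
      --journal-path /var/lib/ceph/osd/ceph-49/journal \
      --pgid 11.182 --op export-remove --file /root/pg-11.182.export

  # Keep the dump safe; if the PG's data is needed again it can be re-imported
  # into a (stopped) OSD with:
  #   ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-NNN \
  #       --journal-path /var/lib/ceph/osd/ceph-NNN/journal \
  #       --op import --file /root/pg-11.182.export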
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
From: Sage Weil Sent: 03 February 2019 18:25 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. On Sun, 3 Feb 2019, Philippe Van Hecke wrote: > Hello, > I'am working for BELNET the Belgian Natioanal Research Network > > We currently a manage a luminous ceph cluster on ubuntu 16.04 > with 144 hdd osd spread across two data centers with 6 osd nodes > on each datacenter. Osd(s) are 4 TB sata disk. > > Last week we had a network incident and the link between our 2 DC > begin to flap due top spt flap. This let our ceph > cluster in a very bad state with many pg stuck in different state. > I let the cluster the time to recover , but some osd doesn't restart. > I have read and try different stuff found in this mailing list but > this had the effect to be in worst situation because all my osds began to > falling down one due to some bad pg. > > I then try the solution describ by our grec coleagues > https://blog.noc.grnet.gr/2016/10/18/surviving-a-ceph-cluster-outage-the-hard-way/ > > So i put a set noout and noscrub nodeep-scrub to osd that seem to freeze the > situation. > > The cluster is only used to provide rbd disk to our cloud-compute and > cloud-storage solution > and to our internal kvm vm > > It seem that only some pool are affected by unclean/unknown/unfound object > > And all is working well for other pool ( may be some speed issue ) > > I can confirm that data on affected pool are completly corrupted. > > You can find here > https://filesender.belnet.be/?s=download=1fac6b04-dd35-46f7-b4a8-c851cfa06379 > a tgz file with a maximum information i can dump to give an overview > of the current state of the cluster. > > So i have 2 questions. > > Does removing affected pools w with stuck pg associated will remove the > deffect pg ? Yes, but don't do that yet! From a quick look this looks like it can be worked around. First question is why you're hitting the assert on e.g. osd.49 0> 2019-02-01 09:23:36.963503 7fb548859e00 -1 /build/ceph-12.2.5/src/osd/PGLog.h: In function 'static void PGLog::read_log_and_missing(ObjectStore*, coll_t, coll_t, ghobject_t, const pg_info_t&, PGLog::IndexedLog&, missing_type&, bool, std::ostringstream&, bool, bool*, const DoutPrefixProvider*, std::set >*, bool) [with missing_type = pg_missing_set; std::ostringstream = std::__cxx11::basic_ostringstream]' thread 7fb548859e00 time 2019-02-01 09:23:36.961237 /build/ceph-12.2.5/src/osd/PGLog.h: 1354: FAILED assert(last_e.version.version < e.version.version) If you can set debug osd = 20 on that osd, start it, and ceph-post-file the log, that would be helpful. 12.2.5 is a pretty old luminous release, but I don't see this in the tracker, so a log would be great. Your priority is probably to get the pools active, though. For osd.49, the problematic pg is 11.182, which your pg ls output shows as online and undersized but usable. You can use ceph-objectstore-tool --op export-remove to make a backup and remove it from the osd.49 and then that osd will likely start up. If you look at 11.ac, your only incomplete pg in pool 11, the query says "down_osds_we_would_probe": [ 49, 63 ], ..so if you get that OSD up that PG should peer. In pool 12, you have 12.14d "down_osds_we_would_probe": [ 9, 51 ], osd.51 won't start due to the same assert but on pg 15.246, and hte pg ls shows that pg is undersized but active, so doing the same --op export-remove on that osd will hopefully let it start. 
I'm guessing the same will work on the other 12.* pg, but see if it works on 11.182 first so that pool will be completely up and available.

Let us know how it goes!

sage

Hi Sage,

First of all thanks for your help. Please find here
https://filesender.belnet.be/?s=download=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
the osd log with debug info for osd.49. And indeed, if all the buggy osds can restart, that may solve the issue.
But I am also happy that you confirm my understanding that, in the worst case, removing the pool can also resolve the problem, even if in this case I lose data but finish with a working cluster.

Kr
Philippe

PS: I don't know and don't want to open a debate about top/bottom posting, but I would like to know the preference of this list :-)
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
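As a footnote for readers, the debug log Sage asked for can be produced roughly as follows. This is a sketch under the assumption that osd.49 crashes at start, so the setting has to go into ceph.conf rather than be injected into a running daemon.

  # On the node hosting osd.49, add to /etc/ceph/ceph.conf:
  #   [osd.49]
  #   debug osd = 20
  systemctl start ceph-osd@49          # let it hit the assert again with verbose logging

  # Upload the resulting log for the developers; the command prints an id
  # that can be referenced on the list or in the tracker.
  ceph-post-file /var/log/ceph/ceph-osd.49.log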
Re: [ceph-users] Luminous cluster in very bad state need some assistance.
On Sun, 3 Feb 2019, Philippe Van Hecke wrote:
> Hello,
> I am working for BELNET, the Belgian National Research Network.
>
> We currently manage a Luminous ceph cluster on Ubuntu 16.04,
> with 144 hdd osds spread across two data centers, with 6 osd nodes
> in each datacenter. The osds are 4 TB sata disks.
>
> Last week we had a network incident and the link between our 2 DCs
> began to flap due to STP flaps. This left our ceph
> cluster in a very bad state, with many pgs stuck in different states.
> I gave the cluster time to recover, but some osds didn't restart.
> I read and tried different things found on this mailing list, but
> this put us in a worse situation, because my osds began
> falling down one by one due to some bad pgs.
>
> I then tried the solution described by our Greek colleagues:
> https://blog.noc.grnet.gr/2016/10/18/surviving-a-ceph-cluster-outage-the-hard-way/
>
> So I set noout, noscrub and nodeep-scrub, which seems to have frozen the
> situation.
>
> The cluster is only used to provide rbd disks to our cloud-compute and
> cloud-storage solution
> and to our internal kvm vms.
>
> It seems that only some pools are affected by unclean/unknown/unfound objects,
> and all is working well for the other pools (maybe some speed issues).
>
> I can confirm that the data in the affected pools is completely corrupted.
>
> You can find here
> https://filesender.belnet.be/?s=download=1fac6b04-dd35-46f7-b4a8-c851cfa06379
> a tgz file with as much information as I could dump, to give an overview
> of the current state of the cluster.
>
> So I have 2 questions.
>
> Will removing the affected pools, with their stuck pgs, also remove the
> defective pgs?

Yes, but don't do that yet! From a quick look this looks like it can be worked around.

First question is why you're hitting the assert on e.g. osd.49:

0> 2019-02-01 09:23:36.963503 7fb548859e00 -1 /build/ceph-12.2.5/src/osd/PGLog.h: In function 'static void PGLog::read_log_and_missing(ObjectStore*, coll_t, coll_t, ghobject_t, const pg_info_t&, PGLog::IndexedLog&, missing_type&, bool, std::ostringstream&, bool, bool*, const DoutPrefixProvider*, std::set >*, bool) [with missing_type = pg_missing_set; std::ostringstream = std::__cxx11::basic_ostringstream]' thread 7fb548859e00 time 2019-02-01 09:23:36.961237
/build/ceph-12.2.5/src/osd/PGLog.h: 1354: FAILED assert(last_e.version.version < e.version.version)

If you can set debug osd = 20 on that osd, start it, and ceph-post-file the log, that would be helpful. 12.2.5 is a pretty old luminous release, but I don't see this in the tracker, so a log would be great.

Your priority is probably to get the pools active, though. For osd.49, the problematic pg is 11.182, which your pg ls output shows as online and undersized but usable. You can use ceph-objectstore-tool --op export-remove to make a backup and remove it from osd.49, and then that osd will likely start up.

If you look at 11.ac, your only incomplete pg in pool 11, the query says

  "down_osds_we_would_probe": [ 49, 63 ],

..so if you get that OSD up that PG should peer.

In pool 12, you have 12.14d,

  "down_osds_we_would_probe": [ 9, 51 ],

osd.51 won't start due to the same assert but on pg 15.246, and the pg ls shows that pg is undersized but active, so doing the same --op export-remove on that osd will hopefully let it start.

I'm guessing the same will work on the other 12.* pg, but see if it works on 11.182 first so that pool will be completely up and available.

Let us know how it goes!

sage

> If not, I am completely lost and would like to know if some experts can assist
> us, even if not for free.
>
> If yes, you can contact me by mail at phili...@belnet.be.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com