Re: [ceph-users] backfill_toofull, but OSDs not full
ceph 0.80.1. Same question here. I have deleted 1/4 of the data, but the problem didn't disappear. Does anyone have another way to solve it?

At 2015-01-10 05:31:30, Udo Lembke ulem...@polarzone.de wrote:

Hi,
I had a similar effect two weeks ago - one PG was backfill_toofull, and after reweighting and deleting data there was enough free space, but the rebuild process stopped after a while. After stopping and starting ceph on the second node, the rebuild ran without trouble and the backfill_toofull states were gone. This happened with firefly.

Udo

On 09.01.2015 21:29, c3 wrote:

In this case the root cause was half-denied reservations: http://tracker.ceph.com/issues/9626

This stopped backfills, since the PGs listed as backfilling were actually half denied and doing nothing. The toofull status is not checked until a backfill slot frees up, so everything was just stuck. Interestingly, the toofull condition was created by other backfills which were not stopped: http://tracker.ceph.com/issues/9594

Quite the log jam to clear.

Quoting Craig Lewis cle...@centraldesktop.com:

What was the osd_backfill_full_ratio? That's the config that controls backfill_toofull. By default, it's 85%. The mon_osd_*_ratio options affect the ceph status.

I've noticed that it takes a while for backfilling to restart after changing osd_backfill_full_ratio. Backfilling usually restarts for me in 10-15 minutes. Some PGs will stay in that state until the cluster is nearly done recovering.

I've only seen backfill_toofull happen after the OSD exceeds the ratio (so it's reactive, not proactive). Mine usually happen when I'm rebalancing a nearfull cluster and an OSD backfills itself toofull.

On Mon, Jan 5, 2015 at 11:32 AM, c3 ceph-us...@lopkop.com wrote:

Hi,

I am wondering how a PG gets marked backfill_toofull.

I reweighted several OSDs using ceph osd crush reweight. As expected, PGs began moving around (backfilling). Some PGs got marked +backfilling (~10), some +wait_backfill (~100). But some are marked +backfill_toofull.

My OSDs are between 25% and 72% full. Looking at ceph pg dump, I can find the backfill_toofull PGs and verified that the OSDs involved are less than 72% full.

Do backfill reservations include a size? Are these OSDs projected to be toofull once the current backfills complete? Some of the backfill_toofull and backfilling PGs point to the same OSDs.

I did adjust the full ratios, but that did not change the backfill_toofull status:

ceph tell mon.\* injectargs '--mon_osd_full_ratio 0.95'
ceph tell osd.\* injectargs '--osd_backfill_full_ratio 0.92'

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
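[Editor's note] The "projected toofull" question can be made concrete with a toy calculation: an OSD can be under the ratio now yet cross it once in-flight backfills land. This is purely illustrative - it is not Ceph's actual reservation accounting, and the function names are made up:

```python
# Toy model: would an OSD cross osd_backfill_full_ratio once the PGs
# currently backfilling toward it land?  Illustrative only - this is
# not Ceph's actual reservation accounting.

def would_be_toofull(used_bytes, total_bytes, incoming_pg_bytes,
                     backfill_full_ratio=0.85):
    """Projected utilization after pending backfills, vs. the ratio."""
    projected = (used_bytes + sum(incoming_pg_bytes)) / total_bytes
    return projected >= backfill_full_ratio

TB, GB = 10**12, 10**9
# A 72%-full 1 TB OSD with two ~50 GB PGs inbound: projected 0.82, OK.
print(would_be_toofull(0.72 * TB, TB, [50 * GB, 50 * GB]))   # False
# The same OSD with 150 GB inbound: projected 0.87, over the ratio.
print(would_be_toofull(0.72 * TB, TB, [100 * GB, 50 * GB]))  # True
```

Under a model like this, a 72%-full OSD being refused backfill would be consistent with reservations carrying a size.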
Re: [ceph-users] backfill_toofull, but OSDs not full
Hi,
I had a similar effect two weeks ago - one PG was backfill_toofull, and after reweighting and deleting data there was enough free space, but the rebuild process stopped after a while. After stopping and starting ceph on the second node, the rebuild ran without trouble and the backfill_toofull states were gone. This happened with firefly.

Udo
Re: [ceph-users] backfill_toofull, but OSDs not full
What was the osd_backfill_full_ratio? That's the config that controls backfill_toofull. By default, it's 85%. The mon_osd_*_ratio options affect the ceph status.

I've noticed that it takes a while for backfilling to restart after changing osd_backfill_full_ratio. Backfilling usually restarts for me in 10-15 minutes. Some PGs will stay in that state until the cluster is nearly done recovering.

I've only seen backfill_toofull happen after the OSD exceeds the ratio (so it's reactive, not proactive). Mine usually happen when I'm rebalancing a nearfull cluster and an OSD backfills itself toofull.
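[Editor's note] To keep the knobs straight: the mon_osd_*_ratio options drive the cluster health flags, while osd_backfill_full_ratio gates backfill, and the latter is checked reactively against current utilization. A hedged sketch (ratio defaults as commonly documented; the real checks live in the mon/OSD code):

```python
# Which ratio gates what.  Defaults shown are the commonly documented
# ones; the real checks live in the mon/OSD code - this is a sketch.

def cluster_flag(utilization, nearfull_ratio=0.85, full_ratio=0.95):
    """mon_osd_nearfull_ratio / mon_osd_full_ratio drive `ceph status`."""
    if utilization >= full_ratio:
        return "full"
    if utilization >= nearfull_ratio:
        return "nearfull"
    return "ok"

def backfill_allowed(utilization, backfill_full_ratio=0.85):
    """osd_backfill_full_ratio gates backfill - reactively, against the
    OSD's *current* utilization, when a backfill slot frees up."""
    return utilization < backfill_full_ratio

print(cluster_flag(0.72), backfill_allowed(0.72))  # ok True
print(cluster_flag(0.90), backfill_allowed(0.90))  # nearfull False
```

Note how an OSD at 90% would still look merely "nearfull" in ceph status while already being refused backfill - the two thresholds are independent.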
Re: [ceph-users] backfill_toofull, but OSDs not full
In this case the root cause was half-denied reservations: http://tracker.ceph.com/issues/9626

This stopped backfills, since the PGs listed as backfilling were actually half denied and doing nothing. The toofull status is not checked until a backfill slot frees up, so everything was just stuck. Interestingly, the toofull condition was created by other backfills which were not stopped: http://tracker.ceph.com/issues/9594

Quite the log jam to clear.
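[Editor's note] The "log jam" is easier to see in a toy model: half-denied backfills hold reservation slots without making progress, and waiting PGs (including toofull ones) are only re-evaluated when a slot frees, so nothing moves until the stuck daemons are restarted. A sketch of that dynamic - not Ceph's scheduler, purely illustrative:

```python
# Toy model of the log jam: half-denied backfills hold slots and make
# no progress, and waiting PGs are only re-checked when a slot frees.
# Purely illustrative - not Ceph's actual backfill scheduler.

def run_backfills(active, waiting, max_backfills=2):
    """active: list of (pg, half_denied) tuples holding slots;
    waiting: queued PG names.  Returns the PGs that complete."""
    done = []
    active = list(active)
    waiting = list(waiting)
    progress = True
    while progress:
        progress = False
        for pg, half_denied in list(active):
            if not half_denied:          # only healthy backfills finish
                active.remove((pg, half_denied))
                done.append(pg)
                progress = True
        # waiting PGs are only admitted when a slot frees up
        while len(active) < max_backfills and waiting:
            active.append((waiting.pop(0), False))
            progress = True
    return done

# Two half-denied backfills hold both slots: nothing ever completes.
print(run_backfills([("1.a", True), ("1.b", True)], ["2.c", "2.d"]))  # []
# After a restart clears the half-denied state, the queue drains.
print(run_backfills([("1.a", False), ("1.b", False)], ["2.c", "2.d"]))
```

This matches the observed behavior in the thread: restarting the affected OSDs clears the stale reservations and the remaining backfills (and toofull re-checks) proceed.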