That will fix itself over time. "remapped" just means that Ceph is moving
the data around; it's normal to see PGs in the remapped and/or backfilling
state after OSD restarts.
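
If you want to see exactly which PGs those are and which OSDs they map to,
something like this should do it (the output format varies a little between
releases):

  # list PGs that are not active+clean, with their up/acting OSD sets
  ceph pg dump_stuck unclean

  # per-PG detail behind the current health warnings
  ceph health detail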

The number of remapped PGs should go down steadily over time. How long that
takes depends on how much data is in the PGs, how fast your hardware is, how
many OSDs are affected, and how much you allow recovery to impact cluster
performance. Mine currently take about 20 minutes per PG. If all 47 are on
the same OSD, it'll be a while; if they're evenly split between multiple
OSDs, parallelism will speed that up.
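
If you want to trade recovery speed against client impact, the backfill and
recovery limits can be changed at runtime; the values below are just a
conservative sketch (raise them to finish sooner, lower them if client IO
suffers):

  # throttle recovery on all OSDs without restarting them
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

  # then watch the remapped/backfilling counts shrink
  ceph -w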

On Tue, Oct 21, 2014 at 1:22 AM, Harald Rößler <[email protected]>
wrote:

> Hi all,
>
> Thank you for your support; the file system is not degraded any more.
> Now I have negative degradation :-)
>
> 2014-10-21 10:15:22.303139 mon.0 [INF] pgmap v43376478: 3328 pgs: 3281
> active+clean, 47 active+remapped; 1609 GB data, 5022 GB used, 1155 GB /
> 6178 GB avail; 8034B/s rd, 3548KB/s wr, 161op/s; -1638/1329293 degraded
> (-0.123%)
>
> but ceph still reports HEALTH_WARN 47 pgs stuck unclean; recovery
> -1638/1329293 degraded (-0.123%)
>
> I think this warning is reported because of the 47 active+remapped PGs;
> any ideas how to fix that now?
>
> Kind Regards
> Harald Roessler
>
>
> On 21.10.2014 at 01:03, Craig Lewis <[email protected]> wrote:
>
> I've been in a state where reweight-by-utilization was deadlocked (not the
> daemons, but the remap scheduling). After successive ceph osd reweight
> commands, two OSDs wanted to swap PGs, but they were both toofull. I ended
> up temporarily increasing mon_osd_nearfull_ratio to 0.87. That removed the
> impediment, everything finished remapping smoothly, and I changed the ratio
> back when all the remapping was done.
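>
> For reference, this is roughly what that looked like (a sketch; on the
> releases I was running, the nearfull threshold could be changed at runtime
> like this, though the exact command may differ on yours):
>
>   # temporarily raise the nearfull threshold so the stuck swap can proceed
>   ceph pg set_nearfull_ratio 0.87
>   # ...and put it back once the remapping has finished
>   ceph pg set_nearfull_ratio 0.85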
>
> Just be careful if you need to get close to mon_osd_full_ratio. Ceph does a
> greater-than comparison on these ratios, not greater-than-or-equal. You
> really don't want any disk to go above mon_osd_full_ratio, because all
> external IO will stop until you resolve that.
>
>
> On Mon, Oct 20, 2014 at 10:18 AM, Leszek Master <[email protected]> wrote:
>
>> You can set a lower weight on the full OSDs, or try changing the
>> osd_near_full_ratio parameter in your cluster from 85 to, for example, 89.
>> But I don't know what can go wrong when you do that.
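>>
>> For example (osd.12 and the 0.9 weight are just placeholders here, pick the
>> OSDs that are actually the fullest):
>>
>>   # move some PGs off an overly full OSD by lowering its reweight value
>>   ceph osd reweight 12 0.9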
>>
>>
>> 2014-10-20 17:12 GMT+02:00 Wido den Hollander <[email protected]>:
>>
>>> On 10/20/2014 05:10 PM, Harald Rößler wrote:
>>> > Yes, tomorrow I will get the replacement for the failed disk; getting a
>>> > new node with many disks will take a few days.
>>> > No other ideas?
>>> >
>>>
>>> If the disks are all full, then, no.
>>>
>>> Sorry to say this, but it came down to poor capacity management. Never
>>> let any disk in your cluster fill over 80% to prevent these situations.
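>>>
>>> An easy way to keep an eye on that (the paths below assume the default OSD
>>> data directory layout, adjust to yours):
>>>
>>>   # on each OSD host, check how full the OSD filesystems are
>>>   df -h /var/lib/ceph/osd/ceph-*
>>>
>>>   # and review the CRUSH weights and reweight values per OSD
>>>   ceph osd tree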
>>>
>>> Wido
>>>
>>> > Harald Rößler
>>> >
>>> >
>>> >> On 20.10.2014 at 16:45, Wido den Hollander <[email protected]> wrote:
>>> >>
>>> >> On 10/20/2014 04:43 PM, Harald Rößler wrote:
>>> >>> Yes, I had some OSDs which were near full. I tried to fix the problem
>>> >>> with "ceph osd reweight-by-utilization", but that did not help. After
>>> >>> that I set the nearfull ratio to 88% with the idea that the remapping
>>> >>> would fix the issue. Restarting the OSDs didn't help either. At the
>>> >>> same time I had a hardware failure of one disk :-(. After that failure
>>> >>> the recovery process started at degraded ~13% and stopped at 7%.
>>> >>> Honestly, I am scared that at the moment I am doing the wrong operation.
>>> >>>
>>> >>
>>> >> Any chance of adding a new node with some fresh disks? It seems like
>>> >> you are operating at the storage capacity limit of the nodes and that
>>> >> your only remedy would be adding more spindles.
>>> >>
>>> >> Wido
>>> >>
>>> >>> Regards
>>> >>> Harald Rößler
>>> >>>
>>> >>>
>>> >>>
>>> >>>> On 20.10.2014 at 14:51, Wido den Hollander <[email protected]> wrote:
>>> >>>>
>>> >>>> On 10/20/2014 02:45 PM, Harald Rößler wrote:
>>> >>>>> Dear All
>>> >>>>>
>>> >>>>> I have at the moment an issue with my cluster: the recovery
>>> >>>>> process has stopped.
>>> >>>>>
>>> >>>>
>>> >>>> See this: 2 active+degraded+remapped+backfill_toofull
>>> >>>>
>>> >>>> 156 pgs backfill_toofull
>>> >>>>
>>> >>>> You have one or more OSDs which are too full, and that causes
>>> >>>> recovery to stop.
>>> >>>>
>>> >>>> If you add more capacity to the cluster, recovery will continue and
>>> >>>> finish.
>>> >>>>
>>> >>>>> ceph -s
>>> >>>>>  health HEALTH_WARN 188 pgs backfill; 156 pgs backfill_toofull; 4
>>> pgs backfilling; 55 pgs degraded; 49 pgs recovery_wait; 297 pgs stuck
>>> unclean; recovery 111487/1488290 degraded (7.491%)
>>> >>>>>  monmap e2: 3 mons at {0=
>>> 10.99.10.10:6789/0,12=10.99.10.22:6789/0,6=10.99.10.16:6789/0},
>>> election epoch 332, quorum 0,1,2 0,12,6
>>> >>>>>  osdmap e6748: 24 osds: 23 up, 23 in
>>> >>>>>   pgmap v43314672: 3328 pgs: 3031 active+clean, 43
>>> active+remapped+wait_backfill, 3 active+degraded+wait_backfill, 96
>>> active+remapped+wait_backfill+backfill_toofull, 31 active+recovery_wait, 19
>>> active+degraded+wait_backfill+backfill_toofull, 36 active+remapped, 3
>>> active+remapped+backfilling, 18 active+remapped+backfill_toofull, 6
>>> active+degraded+remapped+wait_backfill, 15 active+recovery_wait+remapped,
>>> 21 active+degraded+remapped+wait_backfill+backfill_toofull, 1
>>> active+recovery_wait+degraded, 1 active+degraded+remapped+backfilling, 2
>>> active+degraded+remapped+backfill_toofull, 2
>>> active+recovery_wait+degraded+remapped; 1698 GB data, 5206 GB used, 971 GB
>>> / 6178 GB avail; 24382B/s rd, 12411KB/s wr, 320op/s; 111487/1488290
>>> degraded (7.491%)
>>> >>>>>
>>> >>>>>
>>> >>>>> I have tried restarting all OSDs in the cluster, but that does not
>>> >>>>> help to finish the recovery.
>>> >>>>>
>>> >>>>> Does someone have any idea?
>>> >>>>>
>>> >>>>> Kind Regards
>>> >>>>> Harald Rößler
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>> --
>>> >>>> Wido den Hollander
>>> >>>> Ceph consultant and trainer
>>> >>>> 42on B.V.
>>> >>>>
>>> >>>> Phone: +31 (0)20 700 9902
>>> >>>> Skype: contact42on
>>> >>>
>>> >>
>>> >>
>>> >> --
>>> >> Wido den Hollander
>>> >> Ceph consultant and trainer
>>> >> 42on B.V.
>>> >>
>>> >> Phone: +31 (0)20 700 9902
>>> >> Skype: contact42on
>>> >
>>>
>>>
>>> --
>>> Wido den Hollander
>>> Ceph consultant and trainer
>>> 42on B.V.
>>>
>>> Phone: +31 (0)20 700 9902
>>> Skype: contact42on
>>>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
