Yes, tomorrow I will get the replacement for the failed disk; getting a new node with many disks will take a few days. Any other ideas?
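In the meantime, a possible stopgap (a sketch only, assuming a Firefly-era CLI; the 0.90 ratio and the OSD id below are illustrative placeholders, not values from this thread) is to raise the backfill-full threshold so the backfill_toofull PGs can proceed, and to push data off the fullest OSDs:

    # list the OSDs that are over the near-full / backfill-full thresholds
    ceph health detail | grep -i full

    # temporarily raise the backfill-full ratio (default 0.85); keep it
    # safely below the hard full ratio of 0.95 or OSDs may fill up entirely
    ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.90'

    # lower the reweight of a too-full OSD so data moves off it
    # (osd id 7 and weight 0.85 are placeholders)
    ceph osd reweight 7 0.85

With only 971 GB free out of 6178 GB, this only buys time until the replacement disk arrives.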
Harald Rößler

> On 20.10.2014 at 16:45, Wido den Hollander <[email protected]> wrote:
>
> On 10/20/2014 04:43 PM, Harald Rößler wrote:
>> Yes, I had some OSDs which were near full. I tried to fix the problem
>> with "ceph osd reweight-by-utilization", but this did not help. After
>> that I set the near-full ratio to 88% with the idea that the remapping
>> would fix the issue. A restart of the OSDs did not help either. At the
>> same time I had a hardware failure of one disk. :-( After that failure
>> the recovery process started at ~13% degraded and stopped at 7%.
>> Honestly, I am scared at the moment that I am doing the wrong operation.
>>
>
> Any chance of adding a new node with some fresh disks? It seems like you
> are operating at the storage capacity limit of the nodes, and your only
> remedy would be adding more spindles.
>
> Wido
>
>> Regards
>> Harald Rößler
>>
>>
>>> On 20.10.2014 at 14:51, Wido den Hollander <[email protected]> wrote:
>>>
>>> On 10/20/2014 02:45 PM, Harald Rößler wrote:
>>>> Dear All,
>>>>
>>>> At the moment I have an issue with my cluster: the recovery process
>>>> has stopped.
>>>>
>>>
>>> See this: 2 active+degraded+remapped+backfill_toofull
>>>
>>> 156 pgs backfill_toofull
>>>
>>> You have one or more OSDs which are too full, and that causes recovery
>>> to stop.
>>>
>>> If you add more capacity to the cluster, recovery will continue and
>>> finish.
>>>
>>>> ceph -s
>>>>   health HEALTH_WARN 188 pgs backfill; 156 pgs backfill_toofull; 4 pgs
>>>> backfilling; 55 pgs degraded; 49 pgs recovery_wait; 297 pgs stuck
>>>> unclean; recovery 111487/1488290 degraded (7.491%)
>>>>   monmap e2: 3 mons at
>>>> {0=10.99.10.10:6789/0,12=10.99.10.22:6789/0,6=10.99.10.16:6789/0},
>>>> election epoch 332, quorum 0,1,2 0,12,6
>>>>   osdmap e6748: 24 osds: 23 up, 23 in
>>>>   pgmap v43314672: 3328 pgs: 3031 active+clean, 43
>>>> active+remapped+wait_backfill, 3 active+degraded+wait_backfill, 96
>>>> active+remapped+wait_backfill+backfill_toofull, 31 active+recovery_wait,
>>>> 19 active+degraded+wait_backfill+backfill_toofull, 36 active+remapped,
>>>> 3 active+remapped+backfilling, 18 active+remapped+backfill_toofull, 6
>>>> active+degraded+remapped+wait_backfill, 15 active+recovery_wait+remapped,
>>>> 21 active+degraded+remapped+wait_backfill+backfill_toofull, 1
>>>> active+recovery_wait+degraded, 1 active+degraded+remapped+backfilling,
>>>> 2 active+degraded+remapped+backfill_toofull, 2
>>>> active+recovery_wait+degraded+remapped; 1698 GB data, 5206 GB used,
>>>> 971 GB / 6178 GB avail; 24382 B/s rd, 12411 KB/s wr, 320 op/s;
>>>> 111487/1488290 degraded (7.491%)
>>>>
>>>>
>>>> I have tried to restart all OSDs in the cluster, but that does not
>>>> help to finish the recovery.
>>>>
>>>> Does someone have any idea?
>>>>
>>>> Kind Regards
>>>> Harald Rößler
>>>>
>>>
>>> --
>>> Wido den Hollander
>>> Ceph consultant and trainer
>>> 42on B.V.
>>>
>>> Phone: +31 (0)20 700 9902
>>> Skype: contact42on
>>
>
> --
> Wido den Hollander
> Ceph consultant and trainer
> 42on B.V.
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on
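For reference, once a new node or fresh disks are available, bringing them in as OSDs would typically look like this in a 2014-era cluster (a sketch assuming ceph-deploy manages the cluster; the hostname and device path are placeholders):

    # wipe the fresh disk and create a new OSD on it
    ceph-deploy disk zap newnode:/dev/sdb
    ceph-deploy osd create newnode:/dev/sdb

CRUSH then rebalances data onto the new OSD, and the stalled backfills should drain once enough free capacity exists.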
