Yes, tomorrow I will get the replacement for the failed disk; getting a new node with many disks will take a few days. Any other ideas?
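In the meantime, a possible stopgap (a sketch only, assuming a Firefly-era CLI; the 0.90 ratio and the OSD id below are illustrative placeholders, not values from this thread) is to raise the backfill-full threshold so the backfill_toofull PGs can proceed, and to push data off the fullest OSDs:

    # list the OSDs that are over the near-full / backfill-full thresholds
    ceph health detail | grep -i full

    # temporarily raise the backfill-full ratio (default 0.85); keep it
    # safely below the hard full ratio of 0.95 or OSDs may fill up entirely
    ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.90'

    # lower the reweight of a too-full OSD so data moves off it
    # (osd id 7 and weight 0.85 are placeholders)
    ceph osd reweight 7 0.85

With only 971 GB free out of 6178 GB, this only buys time until the replacement disk arrives.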
Harald Rößler

> On 20.10.2014 at 16:45, Wido den Hollander <[email protected]> wrote:
>
> On 10/20/2014 04:43 PM, Harald Rößler wrote:
>> Yes, I had some OSDs which were near full. I tried to fix the problem
>> with "ceph osd reweight-by-utilization", but this did not help. After
>> that I set the near-full ratio to 88% with the idea that the remapping
>> would fix the issue. A restart of the OSDs did not help either. At the
>> same time I had a hardware failure of one disk. :-( After that failure
>> the recovery process started at ~13% degraded and stopped at 7%.
>> Honestly, I am scared at the moment that I am doing the wrong operation.
>>
>
> Any chance of adding a new node with some fresh disks? It seems like you
> are operating at the storage capacity limit of the nodes, and your only
> remedy would be adding more spindles.
>
> Wido
>
>> Regards
>> Harald Rößler
>>
>>
>>> On 20.10.2014 at 14:51, Wido den Hollander <[email protected]> wrote:
>>>
>>> On 10/20/2014 02:45 PM, Harald Rößler wrote:
>>>> Dear All,
>>>>
>>>> At the moment I have an issue with my cluster: the recovery process
>>>> has stopped.
>>>>
>>>
>>> See this: 2 active+degraded+remapped+backfill_toofull
>>>
>>> 156 pgs backfill_toofull
>>>
>>> You have one or more OSDs which are too full, and that causes recovery
>>> to stop.
>>>
>>> If you add more capacity to the cluster, recovery will continue and
>>> finish.
>>>
>>>> ceph -s
>>>>   health HEALTH_WARN 188 pgs backfill; 156 pgs backfill_toofull; 4 pgs
>>>> backfilling; 55 pgs degraded; 49 pgs recovery_wait; 297 pgs stuck
>>>> unclean; recovery 111487/1488290 degraded (7.491%)
>>>>   monmap e2: 3 mons at
>>>> {0=10.99.10.10:6789/0,12=10.99.10.22:6789/0,6=10.99.10.16:6789/0},
>>>> election epoch 332, quorum 0,1,2 0,12,6
>>>>   osdmap e6748: 24 osds: 23 up, 23 in
>>>>   pgmap v43314672: 3328 pgs: 3031 active+clean, 43
>>>> active+remapped+wait_backfill, 3 active+degraded+wait_backfill, 96
>>>> active+remapped+wait_backfill+backfill_toofull, 31 active+recovery_wait,
>>>> 19 active+degraded+wait_backfill+backfill_toofull, 36 active+remapped,
>>>> 3 active+remapped+backfilling, 18 active+remapped+backfill_toofull, 6
>>>> active+degraded+remapped+wait_backfill, 15 active+recovery_wait+remapped,
>>>> 21 active+degraded+remapped+wait_backfill+backfill_toofull, 1
>>>> active+recovery_wait+degraded, 1 active+degraded+remapped+backfilling,
>>>> 2 active+degraded+remapped+backfill_toofull, 2
>>>> active+recovery_wait+degraded+remapped; 1698 GB data, 5206 GB used,
>>>> 971 GB / 6178 GB avail; 24382 B/s rd, 12411 KB/s wr, 320 op/s;
>>>> 111487/1488290 degraded (7.491%)
>>>>
>>>>
>>>> I have tried to restart all OSDs in the cluster, but that does not
>>>> help to finish the recovery.
>>>>
>>>> Does someone have any idea?
>>>>
>>>> Kind Regards
>>>> Harald Rößler
>>>>
>>>
>>> --
>>> Wido den Hollander
>>> Ceph consultant and trainer
>>> 42on B.V.
>>>
>>> Phone: +31 (0)20 700 9902
>>> Skype: contact42on
>>
>
> --
> Wido den Hollander
> Ceph consultant and trainer
> 42on B.V.
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on
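For reference, once a new node or fresh disks are available, bringing them in as OSDs would typically look like this in a 2014-era cluster (a sketch assuming ceph-deploy manages the cluster; the hostname and device path are placeholders):

    # wipe the fresh disk and create a new OSD on it
    ceph-deploy disk zap newnode:/dev/sdb
    ceph-deploy osd create newnode:/dev/sdb

CRUSH then rebalances data onto the new OSD, and the stalled backfills should drain once enough free capacity exists.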
