Works great. It seemed to have a lot less impact than just letting it peer all 
PGs at the same time. I used an increment of 0.05 without issue, and then a 
ceph tell 'osd.*' injectargs '--osd-max-backfills 2' keeps the HDD at around 
85-100% util without really affecting the clients.
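For the record, the sequence above boils down to something like this (a sketch using Hammer-era syntax; the OSD id 11 is assumed from the df tree later in the thread, and the step size is the 0.05 mentioned above):

```shell
# Raise the reweight in small steps, letting peering settle between steps.
ceph osd reweight 11 0.05      # repeat with 0.10, 0.15, ... up to 1.0
# Then allow two concurrent backfills per OSD.
ceph tell 'osd.*' injectargs '--osd-max-backfills 2'
```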
Solid advice, cheers.

Kind Regards,
David Majchrzak


> On 29 Jan 2018, at 23:14, David Majchrzak <[email protected]> wrote:
> 
> Thanks Steve!
> 
> So the peering won't actually move any blocks around, but will make sure that 
> all PGs know what state they are in? That means that when I start increasing 
> reweight, PGs will be allocated to the disk, but won't actually recover yet. 
> However, they will be set as "degraded".
> So when all of the peering is done, I'll unset the norecover/nobackfill flags 
> and backfill will commence but will be less I/O intensive than peering and 
> backfilling at the same time?
> 
> Kind Regards,
> 
> David Majchrzak
> 
>> On 29 Jan 2018, at 22:57, Steve Taylor <[email protected]> wrote:
>> 
>> There are two concerns with setting the reweight to 1.0. The first is 
>> peering and the second is backfilling. Peering is going to block client I/O 
>> on the affected OSDs, while backfilling will only potentially slow things 
>> down.
>> 
>> I don't know what your client I/O looks like, but personally I would 
>> probably set the norecover and nobackfill flags, then slowly increment your 
>> reweight value by 0.01, or whatever you deem appropriate for your 
>> environment, waiting for peering to complete between steps. Also allow 
>> any resulting blocked requests to clear up before incrementing the 
>> reweight again.
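As a sketch, that first phase might look like this (osd.11 is assumed from the df tree later in the thread; the 0.01 step is only an example):

```shell
# Prevent data movement while reweighting and peering.
ceph osd set norecover
ceph osd set nobackfill
# Raise the reweight one small step at a time.
ceph osd reweight 11 0.01      # then 0.02, 0.03, ... up to 1.00
# Between steps: watch 'ceph -s' until peering finishes and any
# blocked requests have cleared.
```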
>> 
>> When your reweight is all the way up to 1.0, inject osd_max_backfills to 
>> whatever you like (or don't if you're happy with it as is) and unset the 
>> norecover and nobackfill flags to let backfilling begin. If you are unable 
>> to handle the impact of backfilling with osd_max_backfills set to 1, then 
>> you need to add some new OSDs to your cluster before doing any of this. They 
>> will have to backfill too, but at least you'll have more spindles to handle 
>> it.
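And the final step, once the reweight reaches 1.0, might look like this (the backfill value is an example; tune it for your environment):

```shell
# Optionally adjust the backfill concurrency, then let backfilling begin.
ceph tell 'osd.*' injectargs '--osd-max-backfills 1'
ceph osd unset norecover
ceph osd unset nobackfill
```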
>> 
>> 
>> 
>> Steve Taylor | Senior Software Engineer | StorageCraft Technology 
>> Corporation <https://storagecraft.com/>
>> 380 Data Drive Suite 300 | Draper | Utah | 84020
>> Office: 801.871.2799 |
>> 
>> 
>> On Mon, 2018-01-29 at 22:43 +0100, David Majchrzak wrote:
>>> And I totally forgot to attach the df tree to the mail.
>>> Here's the interesting bit from the first two nodes, where osd.11 has 
>>> weight but is reweighted to 0.
>>> 
>>> 
>>> root@osd1:~# ceph osd df tree
>>> ID WEIGHT    REWEIGHT SIZE   USE    AVAIL  %USE  VAR  TYPE NAME
>>> -1 181.99997        -   109T 50848G 60878G     0    0 root default
>>> -2  36.39999        - 37242G 16792G 20449G 45.09 0.99     host osd1
>>>  0   3.64000  1.00000  3724G  1730G  1993G 46.48 1.02         osd.0
>>>  1   3.64000  1.00000  3724G  1666G  2057G 44.75 0.98         osd.1
>>>  2   3.64000  1.00000  3724G  1734G  1989G 46.57 1.02         osd.2
>>>  3   3.64000  1.00000  3724G  1387G  2336G 37.25 0.82         osd.3
>>>  4   3.64000  1.00000  3724G  1722G  2002G 46.24 1.01         osd.4
>>>  6   3.64000  1.00000  3724G  1840G  1883G 49.43 1.08         osd.6
>>>  7   3.64000  1.00000  3724G  1651G  2072G 44.34 0.97         osd.7
>>>  8   3.64000  1.00000  3724G  1747G  1976G 46.93 1.03         osd.8
>>>  9   3.64000  1.00000  3724G  1697G  2026G 45.58 1.00         osd.9
>>>  5   3.64000  1.00000  3724G  1614G  2109G 43.34 0.95         osd.5
>>> -3  36.39999        -      0      0      0     0    0     host osd2
>>> 12   3.64000  1.00000  3724G  1730G  1993G 46.46 1.02         osd.12
>>> 13   3.64000  1.00000  3724G  1745G  1978G 46.88 1.03         osd.13
>>> 14   3.64000  1.00000  3724G  1707G  2016G 45.84 1.01         osd.14
>>> 15   3.64000  1.00000  3724G  1540G  2184G 41.35 0.91         osd.15
>>> 16   3.64000  1.00000  3724G  1484G  2239G 39.86 0.87         osd.16
>>> 18   3.64000  1.00000  3724G  1928G  1796G 51.77 1.14         osd.18
>>> 20   3.64000  1.00000  3724G  1767G  1956G 47.45 1.04         osd.20
>>> 10   3.64000  1.00000  3724G  1797G  1926G 48.27 1.06         osd.10
>>> 49   3.64000  1.00000  3724G  1847G  1877G 49.60 1.09         osd.49
>>> 11   3.64000        0      0      0      0     0    0         osd.11
>>> 
>>>> 
>>>> On 29 Jan 2018, at 22:40, David Majchrzak <[email protected]> wrote:
>>>> 
>>>> Hi!
>>>> 
>>>> Cluster: 5 HW nodes, 10 HDDs with SSD journals, filestore, 0.94.9 hammer, 
>>>> debian wheezy (scheduled to upgrade once this is fixed).
>>>> 
>>>> I have a replaced HDD that another admin set to reweight 0 instead of 
>>>> weight 0 (I can't remember the reason).
>>>> What would be the best way to slowly backfill it? Usually I'm using weight 
>>>> and slowly growing it to max size.
>>>> 
>>>> I guess if I just set reweight to 1.0, it will backfill as fast as I let 
>>>> it, that is max 1 backfill per OSD, but it will probably disrupt client 
>>>> I/O (this being on Hammer).
>>>> 
>>>> And if I set its weight to 0 instead, the node will get less weight, and 
>>>> data will start moving around everywhere, right?
>>>> 
>>>> Can I use reweight the same way as weight here, slowly increasing it up 
>>>> to 1.0 in increments of, say, 0.01?
>>>> 
>>>> Kind Regards,
>>>> David Majchrzak
>>>> 
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> [email protected] <mailto:[email protected]>
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
>>>> <http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com>
>>> 
> 
