Re: [ceph-users] Hammer reduce recovery impact

2015-09-16 Thread Robert LeBlanc
I was out of the office for a few days. We have some more hosts to add. I'll send some logs for examination. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Sep 11, 2015 at 12:45 AM,

Re: [ceph-users] Hammer reduce recovery impact

2015-09-11 Thread GuangYang
If we are talking about requests being blocked for 60+ seconds, those tunings might not help (they help a lot with average latency during recovery/backfilling). It would be interesting to see the logs for those blocked requests on the OSD side (they are logged at level 0); the pattern to search for might be "slow
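The log search suggested above could be sketched roughly as follows. The search pattern is cut off in this archive snippet, so the sketch matches only on its visible prefix, "slow", and takes the pattern as a parameter. The log location `/var/log/ceph/ceph-osd.*.log` and the function names are assumptions for illustration, not something stated in the thread.

```python
import glob
import re
from typing import Iterable, Iterator, Tuple

# Assumption: only the visible prefix of the truncated pattern is used here.
DEFAULT_PATTERN = "slow"
# Assumption: default OSD log location; adjust for your deployment.
DEFAULT_LOG_GLOB = "/var/log/ceph/ceph-osd.*.log"


def matching_lines(lines: Iterable[str], pattern: str = DEFAULT_PATTERN) -> Iterator[str]:
    """Yield every line that contains the literal pattern."""
    rx = re.compile(re.escape(pattern))
    for line in lines:
        if rx.search(line):
            yield line.rstrip("\n")


def scan_osd_logs(log_glob: str = DEFAULT_LOG_GLOB,
                  pattern: str = DEFAULT_PATTERN) -> Iterator[Tuple[str, str]]:
    """Yield (path, line) for each matching line in each OSD log file."""
    for path in sorted(glob.glob(log_glob)):
        with open(path, errors="replace") as fh:
            for hit in matching_lines(fh, pattern):
                yield path, hit


if __name__ == "__main__":
    for path, hit in scan_osd_logs():
        print(f"{path}: {hit}")
```

Passing a more specific pattern once the full search string is known should narrow the output considerably.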

Re: [ceph-users] Hammer reduce recovery impact

2015-09-11 Thread Paweł Sadowski
On 09/10/2015 10:56 PM, Robert LeBlanc wrote: > Things I've tried: > > * Lowered nr_requests on the spindles from 1000 to 100. This reduced > the max latency, which was sometimes up to 3000 ms, down to a max of 500-700 ms. > It has also reduced the huge swings in latency, but has also reduced > throughput
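For reference, the nr_requests change quoted above can be sketched as below. On Linux the block-layer queue depth is exposed at `/sys/block/<device>/queue/nr_requests`; the helper names and the `sysfs_root` parameter (there so the code can be tried against a scratch directory) are hypothetical. Writing the real file needs root, and the value does not persist across reboots.

```python
from pathlib import Path


def nr_requests_path(device: str, sysfs_root: str = "/sys") -> Path:
    """Return the sysfs file holding the queue depth for a block device."""
    return Path(sysfs_root) / "block" / device / "queue" / "nr_requests"


def read_nr_requests(device: str, sysfs_root: str = "/sys") -> int:
    """Read the current nr_requests value for the device."""
    return int(nr_requests_path(device, sysfs_root).read_text().strip())


def set_nr_requests(device: str, value: int, sysfs_root: str = "/sys") -> int:
    """Write a new nr_requests value and return the previous one."""
    if value < 1:
        raise ValueError("nr_requests must be a positive integer")
    path = nr_requests_path(device, sysfs_root)
    previous = int(path.read_text().strip())
    path.write_text(f"{value}\n")
    return previous
```

Calling `set_nr_requests("sdb", 100)` would correspond to the 1000 to 100 change described in the quote, with `sdb` standing in for whichever spindle is being tuned.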

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Christian Balzer
Hello, On Thu, 10 Sep 2015 16:16:10 -0600 Robert LeBlanc wrote: > Do the recovery options kick in when there is only backfill going on? > Aside from having these set just in case as your cluster (and one of mine) is clearly at the limits of

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Somnath Roy
6 PM To: Somnath Roy Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Hammer reduce recovery impact Do the recovery options kick in when there is only backfill going on? - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Lincoln Bryant
On 9/10/2015 5:39 PM, Lionel Bouton wrote: For example deep-scrubs were a problem on our installation when at times there were several going on. We implemented a scheduler that enforces limits on simultaneous deep-scrubs and these problems are gone. Hi Lionel, Out of curiosity, how many was
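The thread does not show how Lionel's scheduler works, so the following is only a sketch of the limiting idea under stated assumptions: PG state strings use Ceph's usual `+`-joined form (for example `active+clean+scrubbing+deep`), and a caller would obtain them from something like `ceph pg dump` and start new scrubs with `ceph pg deep-scrub <pgid>`. The function names and the `max_simultaneous` parameter are hypothetical.

```python
from typing import Iterable


def count_deep_scrubs(pg_states: Iterable[str]) -> int:
    """Count PGs whose state includes both 'scrubbing' and 'deep'."""
    count = 0
    for state in pg_states:
        flags = state.split("+")
        if "scrubbing" in flags and "deep" in flags:
            count += 1
    return count


def deep_scrub_slots(pg_states: Iterable[str], max_simultaneous: int) -> int:
    """Return how many more deep-scrubs may start without exceeding the cap."""
    return max(0, max_simultaneous - count_deep_scrubs(pg_states))
```

A periodic job could call `deep_scrub_slots` and start at most that many new deep-scrubs, choosing the PGs that have gone longest without one.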

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Lionel Bouton
On 10/09/2015 22:56, Robert LeBlanc wrote: > We are trying to add some additional OSDs to our cluster, but the > impact of the backfilling has been very disruptive to client I/O and > we have been trying to figure out how to reduce the impact. We have > seen some client I/O blocked for more

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Robert LeBlanc
Do the recovery options kick in when there is only backfill going on? - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Sep 10, 2015 at 3:01 PM, Somnath Roy wrote: > Try all these.. > >

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Robert LeBlanc
I don't think the script will help our situation as it is just setting osd_max_backfills from 1 to 0. It looks like that change doesn't go into effect until after it finishes the PG. It would be nice if backfill/recovery would skip the journal, but

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Lionel Bouton
On 11/09/2015 00:20, Robert LeBlanc wrote: > I don't think the script will help our situation as it is just setting > osd_max_backfills from 1 to 0. It looks like that change doesn't go > into effect until after it finishes the PG. That was what I was afraid of. Note that it should help a

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Lionel Bouton
On 11/09/2015 01:24, Lincoln Bryant wrote: > On 9/10/2015 5:39 PM, Lionel Bouton wrote: >> For example deep-scrubs were a problem on our installation when at >> times there were several going on. We implemented a scheduler that >> enforces limits on simultaneous deep-scrubs and these problems

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Somnath Roy
Try all these..

osd recovery max active = 1
osd max backfills = 1
osd recovery threads = 1
osd recovery op priority = 1

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Robert LeBlanc
Sent: Thursday, September 10, 2015
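As a rough illustration, the four settings Somnath lists could be turned into a `ceph tell osd.* injectargs` command line for a running cluster, as sketched below. The helper names are hypothetical. Whether every option takes effect at runtime is an assumption: some (possibly `osd recovery threads`) may only apply after an OSD restart, in which case they belong in the `[osd]` section of ceph.conf instead.

```python
import shlex
from typing import Dict, List

# The four values suggested in the message above.
RECOVERY_SETTINGS: Dict[str, int] = {
    "osd recovery max active": 1,
    "osd max backfills": 1,
    "osd recovery threads": 1,
    "osd recovery op priority": 1,
}


def to_injectargs(settings: Dict[str, int]) -> str:
    """Render settings as '--option-name value' pairs for injectargs."""
    return " ".join(
        f"--{name.replace(' ', '-')} {value}" for name, value in settings.items()
    )


def injectargs_command(settings: Dict[str, int], target: str = "osd.*") -> List[str]:
    """Build the argv for 'ceph tell <target> injectargs ...' without running it."""
    return ["ceph", "tell", target, "injectargs", to_injectargs(settings)]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(injectargs_command(RECOVERY_SETTINGS)))
```

The sketch only prints the command, so it can be checked before being run against a production cluster.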