I'm forcing a flush by lowering cache_target_dirty_ratio. This forces writes to the EC pool, and those writes are the operations I'm trying to throttle a bit. Am I understanding you correctly that the throttling only works the other way around, i.e. for promoting cold objects into the hot cache?
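For reference, this is roughly the command I use to force the flush (the pool name "hotpool" and the ratio are just placeholders for my actual cache pool and target):

ceph osd pool set hotpool cache_target_dirty_ratio 0.4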
The measurement is a problem for me at the moment. I'm trying to get the perf dumps into collectd/graphite, but it seems I need to hand-roll a solution since the plugins I found are no longer working. What I'm doing now is just summing the bandwidth statistics from my nodes to get an approximate number. I hope to make some time this week to write a collectd plugin that fetches the actual stats from the perf dumps (a rough sketch of what I have in mind is at the bottom of this mail).

I confirmed the settings are indeed picked up correctly across the nodes in the cluster.

I tried switching my pool to readforward, since for my needs the EC pool is fast enough for reads, but I got scared when I saw the warning about data corruption. How safe is readforward really at this point? I noticed the option was removed from the latest docs while still living on in the Google-cached version: http://webcache.googleusercontent.com/search?q=cache:http://docs.ceph.com/docs/master/rados/operations/cache-tiering/

On Mon, May 16, 2016 at 11:14 AM Nick Fisk <n...@fisk.me.uk> wrote:

> > -----Original Message-----
> > From: Peter Kerdisle [mailto:peter.kerdi...@gmail.com]
> > Sent: 15 May 2016 08:04
> > To: Nick Fisk <n...@fisk.me.uk>
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Erasure pool performance expectations
> >
> > Hey Nick,
> >
> > I've been playing around with the osd_tier_promote_max_bytes_sec setting
> > but I'm not really seeing any changes.
> >
> > What would be expected when setting a max bytes value? I would expect
> > that my OSDs would throttle themselves to this rate when doing promotes,
> > but this doesn't seem to be the case. When I set it to 2MB I would expect a
> > node with 10 OSDs to do a max of 20MB/s during promotions. Is this math
> > correct?
>
> Yes, that sounds about right, but this will only be for optional promotions
> (ie reads that meet the recency/hitset settings). If you are doing any
> writes, they will force the object to be promoted as you can't directly
> write to an EC pool. And also don't forget that once the cache pool is
> full, it will start evicting/flushing cold objects for every new object
> that gets promoted into it.
>
> Few questions:
> 1. What promotion rates are you seeing?
> 2. How are you measuring the promotion rate, just out of interest?
> 3. Can you confirm that the OSD is picking up that setting correctly by
> running something like (sudo ceph --admin-daemon
> /var/run/ceph/ceph-osd.0.asok config show | grep promote)?
>
> > Thanks,
> >
> > Peter
> >
> > On Tue, May 10, 2016 at 3:48 PM, Nick Fisk <n...@fisk.me.uk> wrote:
> >
> > > -----Original Message-----
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > > Peter Kerdisle
> > > Sent: 10 May 2016 14:37
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] Erasure pool performance expectations
> > >
> > > To answer my own question, it seems that you can change settings on the
> > > fly using
> > >
> > > ceph tell osd.* injectargs '--osd_tier_promote_max_bytes_sec 5242880'
> > > osd.0: osd_tier_promote_max_bytes_sec = '5242880' (unchangeable)
> > >
> > > However, the response seems to imply I can't change this setting. Is
> > > there another way to change these settings?
> >
> > Sorry Peter, I missed your last email. You can also specify that setting
> > in the ceph.conf, ie I have in mine
> >
> > osd_tier_promote_max_bytes_sec = 4000000
> >
> > On Sun, May 8, 2016 at 2:37 PM, Peter Kerdisle <peter.kerdi...@gmail.com> wrote:
> > > Hey guys,
> > >
> > > I noticed the merge request that fixes the switch around here:
> > > https://github.com/ceph/ceph/pull/8912
> > >
> > > I had two questions:
> > >
> > > • Does this affect my performance in any way? Could it explain the slow
> > >   requests I keep having?
> > > • Can I modify these settings manually myself on my cluster?
> > >
> > > Thanks,
> > >
> > > Peter
> > >
> > > On Fri, May 6, 2016 at 9:58 AM, Peter Kerdisle <peter.kerdi...@gmail.com> wrote:
> > > Hey Mark,
> > >
> > > Sorry I missed your message as I'm only subscribed to daily digests.
> > >
> > > Date: Tue, 3 May 2016 09:05:02 -0500
> > > From: Mark Nelson <mnel...@redhat.com>
> > > To: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] Erasure pool performance expectations
> > > Message-ID: <df3de049-a7f9-7f86-3ed3-47079e401...@redhat.com>
> > > Content-Type: text/plain; charset=windows-1252; format=flowed
> > >
> > > In addition to what Nick said, it's really valuable to watch your cache
> > > tier write behavior during heavy IO. One thing I noticed is you said
> > > you have 2 SSDs for journals and 7 SSDs for data.
> > >
> > > I thought the hardware recommendations were 1 journal disk per 3 or 4
> > > data disks, but I think I might have misunderstood it. Looking at my
> > > journal reads/writes they seem to be ok though:
> > > https://www.dropbox.com/s/er7bei4idd56g4d/Screenshot%202016-05-06%2009.55.30.png?dl=0
> > >
> > > However, I started running into a lot of slow requests (made a separate
> > > thread for those: Diagnosing slow requests) and now I'm hoping these
> > > could be related to my journaling setup.
> > >
> > > If they are all of the same type, you're likely bottlenecked by the
> > > journal SSDs for writes, which compounded with the heavy promotions is
> > > going to really hold you back.
> > >
> > > What you really want:
> > >
> > > 1) (assuming filestore) equal large write throughput between the
> > > journals and data disks.
> > >
> > > How would one achieve that?
> > >
> > > 2) promotions to be limited by some reasonable fraction of the cache
> > > tier and/or network throughput (say 70%). This is why the
> > > user-configurable promotion throttles were added in jewel.
> > >
> > > Are these already in the docs somewhere?
> > >
> > > 3) The cache tier to fill up quickly when empty but change slowly once
> > > it's full (ie limiting promotions and evictions). No real way to do
> > > this yet.
> > >
> > > Mark
> > >
> > > Thanks for your thoughts.
> > >
> > > Peter
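For what it's worth, here is the rough sketch I mentioned above of what I plan to wrap into a collectd plugin: just polling every OSD admin socket on a node and pulling the cache-tier counters out of the perf dump. The socket path and the tier_* counter prefix are what I see on my Jewel nodes, so treat them as assumptions rather than gospel.

#!/bin/sh
# Rough sketch: print the cache-tier (tier_*) perf counters for every OSD
# whose admin socket lives in the default /var/run/ceph location.
# Requires jq; counter names may differ between Ceph versions.
for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo "== $sock =="
    sudo ceph --admin-daemon "$sock" perf dump \
        | jq '.osd | with_entries(select(.key | startswith("tier_")))'
done

The idea would be to have collectd diff those counters between polls and push the per-second rates to graphite, which should give a proper answer to your question 2 instead of my summed bandwidth approximation.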