I'm forcing a flush by lowering cache_target_dirty_ratio. This forces writes to the EC pool, and those writes are the operations I'm trying to throttle a bit. Am I understanding you correctly that the throttling only works the other way around, i.e. for promoting cold objects into the hot cache?
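For reference, this is roughly the command I use to force the flush (the pool name "hotpool" and the ratio are just placeholders for my actual cache pool and target):

ceph osd pool set hotpool cache_target_dirty_ratio 0.4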
The measurement is a problem for me at the moment. I'm trying to get the perf dumps into collectd/graphite, but it seems I need to hand-roll a solution since the plugins I found are no longer working. What I'm doing now is just summing the bandwidth statistics from my nodes to get an approximate number. I hope to make some time this week to write a collectd plugin that fetches the actual stats from the perf dumps (a rough sketch of what I have in mind is at the bottom of this mail).

I confirmed the settings are indeed picked up correctly across the nodes in the cluster.

I tried switching my pool to readforward, since for my needs the EC pool is fast enough for reads, but I got scared when I saw the warning about data corruption. How safe is readforward really at this point? I noticed the option was removed from the latest docs while still living on in the Google-cached version: http://webcache.googleusercontent.com/search?q=cache:http://docs.ceph.com/docs/master/rados/operations/cache-tiering/

On Mon, May 16, 2016 at 11:14 AM Nick Fisk <n...@fisk.me.uk> wrote:

> > -----Original Message-----
> > From: Peter Kerdisle [mailto:peter.kerdi...@gmail.com]
> > Sent: 15 May 2016 08:04
> > To: Nick Fisk <n...@fisk.me.uk>
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Erasure pool performance expectations
> >
> > Hey Nick,
> >
> > I've been playing around with the osd_tier_promote_max_bytes_sec setting
> > but I'm not really seeing any changes.
> >
> > What would be expected when setting a max bytes value? I would expect
> > that my OSDs would throttle themselves to this rate when doing promotes,
> > but this doesn't seem to be the case. When I set it to 2MB I would expect a
> > node with 10 OSDs to do a max of 20MB/s during promotions. Is this math
> > correct?
>
> Yes, that sounds about right, but this will only be for optional promotions
> (ie reads that meet the recency/hitset settings). If you are doing any
> writes, they will force the object to be promoted as you can't directly
> write to an EC pool. And also don't forget that once the cache pool is
> full, it will start evicting/flushing cold objects for every new object
> that gets promoted into it.
>
> Few questions:
> 1. What promotion rates are you seeing?
> 2. How are you measuring the promotion rate, just out of interest?
> 3. Can you confirm that the OSD is picking up that setting correctly by
> running something like (sudo ceph --admin-daemon
> /var/run/ceph/ceph-osd.0.asok config show | grep promote)?
>
> > Thanks,
> >
> > Peter
> >
> > On Tue, May 10, 2016 at 3:48 PM, Nick Fisk <n...@fisk.me.uk> wrote:
> >
> > > -----Original Message-----
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > > Peter Kerdisle
> > > Sent: 10 May 2016 14:37
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] Erasure pool performance expectations
> > >
> > > To answer my own question, it seems that you can change settings on the
> > > fly using
> > >
> > > ceph tell osd.* injectargs '--osd_tier_promote_max_bytes_sec 5242880'
> > > osd.0: osd_tier_promote_max_bytes_sec = '5242880' (unchangeable)
> > >
> > > However, the response seems to imply I can't change this setting. Is
> > > there another way to change these settings?
> >
> > Sorry Peter, I missed your last email. You can also specify that setting
> > in the ceph.conf, ie I have in mine
> >
> > osd_tier_promote_max_bytes_sec = 4000000
> >
> > On Sun, May 8, 2016 at 2:37 PM, Peter Kerdisle <peter.kerdi...@gmail.com> wrote:
> > > Hey guys,
> > >
> > > I noticed the merge request that fixes the switch around here:
> > > https://github.com/ceph/ceph/pull/8912
> > >
> > > I had two questions:
> > >
> > > • Does this affect my performance in any way? Could it explain the slow
> > >   requests I keep having?
> > > • Can I modify these settings manually myself on my cluster?
> > >
> > > Thanks,
> > >
> > > Peter
> > >
> > > On Fri, May 6, 2016 at 9:58 AM, Peter Kerdisle <peter.kerdi...@gmail.com> wrote:
> > > Hey Mark,
> > >
> > > Sorry I missed your message as I'm only subscribed to daily digests.
> > >
> > > Date: Tue, 3 May 2016 09:05:02 -0500
> > > From: Mark Nelson <mnel...@redhat.com>
> > > To: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] Erasure pool performance expectations
> > > Message-ID: <df3de049-a7f9-7f86-3ed3-47079e401...@redhat.com>
> > > Content-Type: text/plain; charset=windows-1252; format=flowed
> > >
> > > In addition to what Nick said, it's really valuable to watch your cache
> > > tier write behavior during heavy IO. One thing I noticed is you said
> > > you have 2 SSDs for journals and 7 SSDs for data.
> > >
> > > I thought the hardware recommendations were 1 journal disk per 3 or 4
> > > data disks, but I think I might have misunderstood it. Looking at my
> > > journal reads/writes they seem to be ok though:
> > > https://www.dropbox.com/s/er7bei4idd56g4d/Screenshot%202016-05-06%2009.55.30.png?dl=0
> > >
> > > However, I started running into a lot of slow requests (made a separate
> > > thread for those: Diagnosing slow requests) and now I'm hoping these
> > > could be related to my journaling setup.
> > >
> > > If they are all of the same type, you're likely bottlenecked by the
> > > journal SSDs for writes, which compounded with the heavy promotions is
> > > going to really hold you back.
> > >
> > > What you really want:
> > >
> > > 1) (assuming filestore) equal large write throughput between the
> > > journals and data disks.
> > >
> > > How would one achieve that?
> > >
> > > 2) promotions to be limited by some reasonable fraction of the cache
> > > tier and/or network throughput (say 70%). This is why the
> > > user-configurable promotion throttles were added in jewel.
> > >
> > > Are these already in the docs somewhere?
> > >
> > > 3) The cache tier to fill up quickly when empty but change slowly once
> > > it's full (ie limiting promotions and evictions). No real way to do
> > > this yet.
> > >
> > > Mark
> > >
> > > Thanks for your thoughts.
> > >
> > > Peter
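For what it's worth, here is the rough sketch I mentioned above of what I plan to wrap into a collectd plugin: just polling every OSD admin socket on a node and pulling the cache-tier counters out of the perf dump. The socket path and the tier_* counter prefix are what I see on my Jewel nodes, so treat them as assumptions rather than gospel.

#!/bin/sh
# Rough sketch: print the cache-tier (tier_*) perf counters for every OSD
# whose admin socket lives in the default /var/run/ceph location.
# Requires jq; counter names may differ between Ceph versions.
for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo "== $sock =="
    sudo ceph --admin-daemon "$sock" perf dump \
        | jq '.osd | with_entries(select(.key | startswith("tier_")))'
done

The idea would be to have collectd diff those counters between polls and push the per-second rates to graphite, which should give a proper answer to your question 2 instead of my summed bandwidth approximation.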