On Thu, Aug 21, 2014 at 07:40:45AM +0000, Dan Van Der Ster wrote:
> On 20 Aug 2014, at 17:54, Hugo Mills <h.r.mi...@reading.ac.uk> wrote:
> >> Does your hardware provide enough IOPS for what your users need?
> >> (e.g. what is the op/s from ceph -w)
> > 
> >   Not really an answer to your question, but: Before the ceph cluster
> > went in, we were running the system on two 5-year-old NFS servers for
> > a while. We have about half the total number of spindles that we used
> > to, but more modern drives.
> 
> NFS exported async or sync? If async, it can’t be compared to
> CephFS. Also, if those NFS servers had RAID cards with a wb-cache,
> it can’t really be compared.

   Hmm. Yes, async. Probably wouldn't have been my choice... (I only
started working with this system recently -- about the same time that
the ceph cluster was deployed to replace the older machines. I haven't
had much of a say in what's implemented here, but I have to try to
support it.)

   I'm tempted to put the users' home directories back on an NFS
server, and keep ceph for the research data. That at least should give
us more in the way of interactivity (which is the main thing I'm
getting complaints about).

> >   I'll look at how the op/s values change when we have the problem.
> > At the moment (with what I assume to be normal desktop usage from the
> > 3-4 users in the lab), they're flapping wildly somewhere around a
> > median of 350-400, with peaks up to 800. Somewhere around 15-20 MB/s
> > read and write.
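
   For tracking how those op/s figures move when the problem shows up,
a minimal sketch along these lines might do. It assumes the status
lines from ceph -w have been saved to a list and that each one ends in
an "NNN op/s" field; the sample lines below are hypothetical, and the
exact layout varies between releases.

```python
import re
import statistics

# Hypothetical sample lines, loosely modelled on the pgmap status lines
# printed by `ceph -w`. Real output differs between Ceph releases.
SAMPLE_LINES = [
    "pgmap v101: 192 pgs: 192 active+clean; 15360 kB/s rd, 18432 kB/s wr, 352 op/s",
    "pgmap v102: 192 pgs: 192 active+clean; 16384 kB/s rd, 20480 kB/s wr, 401 op/s",
    "pgmap v103: 192 pgs: 192 active+clean; 17408 kB/s rd, 19456 kB/s wr, 798 op/s",
    "pgmap v104: 192 pgs: 192 active+clean",  # idle line, no op/s field
    "pgmap v105: 192 pgs: 192 active+clean; 15872 kB/s rd, 17920 kB/s wr, 377 op/s",
]

# Matches the integer immediately before "op/s".
OPS_RE = re.compile(r"(\d+)\s*op/s")


def ops_samples(lines):
    """Return the op/s figure from every line that reports one."""
    samples = []
    for line in lines:
        match = OPS_RE.search(line)
        if match:
            samples.append(int(match.group(1)))
    return samples


def summarise(samples):
    """Median and peak of a non-empty list of op/s samples."""
    return {
        "count": len(samples),
        "median": statistics.median(samples),
        "peak": max(samples),
    }


if __name__ == "__main__":
    print(summarise(ops_samples(SAMPLE_LINES)))
```

   Comparing the summary from a quiet period with one taken while
users are complaining should show whether the cluster is actually
busier at that point, or just slower.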

> Another tunable to look at is the filestore max sync interval — in
> my experience the colocated journal/OSD setup suffers with the
> default (5s, IIRC), especially when an OSD is getting a constant
> stream of writes. When this happens, the disk heads are constantly
> seeking back and forth between synchronously writing to the journal
> and flushing the outstanding writes. If we had a dedicated
> (spinning) disk for the journal, then the synchronous writes (to the
> journal) could be done sequentially (thus, quickly) and the flushes
> would also be quick(er). SSD journals can obviously also help with
> this.

   Not sure what you mean about colocated journal/OSD. The journals
aren't on the same device as the OSDs. However, all three journals on
each machine are on the same SSD.

> For a short test I would try increasing filestore max sync interval
> to 30s or maybe even 60s to see if it helps. (I know that at least
> one of the Inktank experts advises against changing the filestore max
> sync interval — but in my experience 5s is much too short for the
> colocated journal setup.) You need to make sure your journals are
> large enough to store 30/60s of writes, but when you have
> predominantly small writes even a few GB of journal ought to be
> enough.

   I'll have a play with that.
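
   As a rough check on the journal sizing, here is a sketch of the
arithmetic. It assumes the sustained write rate per OSD is no more
than the ~20 MB/s cluster-wide figure quoted above (so it overestimates),
and it uses the rule of thumb from the Ceph documentation that the
journal should hold twice the expected throughput multiplied by the
filestore max sync interval. The function name and the defaults are
mine, for illustration only.

```python
def journal_size_mb(write_mb_per_s, sync_interval_s, safety_factor=2):
    """Rule-of-thumb journal size in MB.

    journal size = safety_factor * throughput * sync interval
    (safety_factor=2 follows the Ceph documentation's guideline).
    """
    return safety_factor * write_mb_per_s * sync_interval_s


if __name__ == "__main__":
    # Assumed worst case: 20 MB/s of writes landing on a single OSD.
    for interval in (5, 30, 60):
        size = journal_size_mb(20, interval)
        print(f"sync interval {interval:>2}s -> journal >= {size} MB")
```

   At 20 MB/s that gives about 1.2 GB for a 30s interval and 2.4 GB
for 60s, which fits the "few GB" estimate. The result would go into
"osd journal size" (in MB) alongside "filestore max sync interval" in
the [osd] section of ceph.conf.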

   Thanks for all the help so far -- it's been useful. I'm learning
what the right kind of questions are.

   Hugo.

-- 
Hugo Mills :: IT Services, University of Reading
Specialist Engineer, Research Servers :: x6943 :: R07 Harry Pitt Building
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
