Thanks very much for the insights Greg! My most recent suspicion around the resource consumption is that, with my current configuration, Xen is provisioning rbd-nbd storage for guests rather than just using the kernel module as I was last time around. And, while I'm unsure of exactly how this works, it seems there is a tapdisk process for each guest on each XenServer host, alongside the rbd-nbd processes. Perhaps, because of this use of NBD, XenServer is taking a scenic route through userspace that it wasn't taking before... That said, Gluster is attached via FUSE, so I apparently need to dig more into how Xen attaches to Ceph vs. Gluster....
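If it helps to illustrate what I mean, here's a rough sketch of how I've been planning to quantify that per-guest overhead in dom-0: sum the CPU and memory of the rbd-nbd, tapdisk, and gluster FUSE processes over a short sampling window. This assumes psutil is available in dom-0 (it probably isn't by default), and the process-name matching is just my guess at which helpers matter:

import time
import psutil  # assumption: psutil is installed in dom-0

# Userspace storage helpers I want to compare; the names are my guess at
# what shows up in the dom-0 process table on this XenServer version.
HELPERS = ("rbd-nbd", "tapdisk", "glusterfs")

def helper_procs():
    for proc in psutil.process_iter(["name"]):
        name = proc.info["name"] or ""
        if any(h in name for h in HELPERS):
            yield proc

procs = list(helper_procs())
for p in procs:
    p.cpu_percent(None)      # prime the per-process CPU counters
time.sleep(10)               # sample over a 10 second window

totals = {}
for p in procs:
    try:
        name, cpu, rss = p.name(), p.cpu_percent(None), p.memory_info().rss
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        continue             # process went away mid-sample, skip it
    key = next(h for h in HELPERS if h in name)
    count, cpu_sum, rss_sum = totals.get(key, (0, 0.0, 0))
    totals[key] = (count + 1, cpu_sum + cpu, rss_sum + rss)

for key, (count, cpu_sum, rss_sum) in sorted(totals.items()):
    print("%-10s procs=%3d cpu=%7.1f%% rss=%8.1f MiB"
          % (key, count, cpu_sum, rss_sum / 2.0 ** 20))

My thinking is that if the rbd-nbd + tapdisk totals grow roughly linearly with Ceph-backed guest count while the glusterfs totals don't, that would at least confirm the userspace path is where the extra dom-0 load comes from.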
Anyway, thanks again!
    Nate

On Tue, Jun 13, 2017 at 5:30 PM, Gregory Farnum <[email protected]> wrote:
>
> On Thu, Jun 8, 2017 at 11:11 PM Nathanial Byrnes <[email protected]> wrote:
>
>> Hi All,
>>    First, some background:
>>    I have been running a small (4 compute node) XenServer cluster backed by both a small Ceph cluster (4 other nodes with a total of 18x single-spindle OSDs) and a small Gluster cluster (2 nodes, each with a 14-spindle RAID array). I started with Gluster 3-4 years ago, at first using NFS to access it, then upgraded to the Gluster FUSE client. However, I had been fascinated with Ceph since I first read about it, and probably added Ceph as soon as XCP released a kernel with RBD support, approaching 2 years ago now.
>>    With Ceph, since I started out with the kernel RBD client, I believe it locked me to Bobtail tunables. I connected to XCP via a project that tricks XCP into running LVM on the RBDs, managing all of this through the iSCSI management infrastructure somehow... Only recently have I switched to a newer project that uses RBD-NBD mapping instead. This should let me use whatever tunables my client software supports, AFAIK. I have not yet changed my tunables, as the data re-org will probably take a day or two (only 1Gb networking...).
>>
>>    Over this time period, I've observed that my Gluster-backed guests tend not to consume as much of domain-0's (the Xen VM management host's) resources as my Ceph-backed guests do. To me, this is somewhat intuitive, as the Ceph client has to do more "thinking" than the Gluster client. However, it seems to me that the gap in guest IO performance is well beyond what the difference in spindle count would suggest. I am open to the notion that there are probably quite a few sub-optimal design choices/constraints within the environment; however, I haven't the resources to conduct all that many experiments and benchmarks... So, over time I've ended up treating Ceph as my resilient storage and Gluster as my more performant storage (3x vs. 2x replication, and, as mentioned above, my Gluster guests have quicker guest IO and lower dom-0 load).
>>
>>    So, on to my questions:
>>
>>    Would setting my tunables to jewel (my present release), or anything newer than bobtail (which is what I think I am set to, if I read the ceph status warning correctly), reduce my dom-0 load and/or improve any aspect of client IO performance?
>
> Unfortunately no. The tunables are entirely about how CRUSH works, and while it's possible to construct pessimal CRUSH maps that are impossible to satisfy and take a long time to churn through calculations, it's hard and you clearly haven't done that here. I think you're just seeing that the basic CPU cost of a Ceph IO is higher than in Gluster, or else there is something unusual about the Xen configuration you have here compared to more common deployments.
>
>>    Will adding nodes to the Ceph cluster reduce load on dom-0 and/or improve client IO performance (I doubt the former and would expect the latter...)?
>
> In general, adding nodes will increase parallel throughput (i.e., async IO on one client or the performance of multiple clients), but won't reduce latencies. It shouldn't have much (any?) impact on client CPU usage (other than that if the client is pushing through more IO, it will use proportionally more CPU), nor on the CPU usage of existing daemons.
>
>>    So, why did I bring up Gluster at all? In an ideal world, I would like to have just one storage environment that would satisfy all my organization's needs. If forced to choose with the knowledge I have today, I would have to select Gluster. I am hoping to come up with some actionable data points that might help me discover some of my mistakes, which might explain my experience to date and maybe even help remedy said mistakes. As I mentioned earlier, I like Ceph more than Gluster and would like to employ it more within my environment. But, given budgetary constraints, I need to do what's best for my organization.
>
> Yeah. I'm a little surprised you noticed it in the environment you described, but there aren't many people running Xen on Ceph, so perhaps there's something odd happening with the setup it has there which I and others aren't picking up on. :/
>
> Good luck!
> -Greg
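P.S. For anyone following along, this is roughly how I plan to confirm which tunables profile the cluster is actually running before I kick off the re-org. It's only a sketch using the python-rados mon_command interface; the conf path is whatever your environment uses, and the output fields I'm printing are what I believe "osd crush show-tunables" reports:

import json
import rados  # python-rados bindings, from the ceph/python-rados packages

# Assumption: standard conf/keyring locations; adjust for your cluster.
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    cmd = json.dumps({"prefix": "osd crush show-tunables", "format": "json"})
    ret, out, errs = cluster.mon_command(cmd, b"")
    if ret != 0:
        raise RuntimeError(errs)
    tunables = json.loads(out)
    # Dump the whole blob if these field names differ on your release.
    print("profile: %s" % tunables.get("profile"))
    print("minimum required client version: %s"
          % tunables.get("minimum_required_version"))
finally:
    cluster.shutdown()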
