Re: rados bench throughput with no disk or network activity
-----Original Message-----
From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Deneau, Tom
Sent: Friday, May 29, 2015 1:10 AM
To: ceph-devel
Subject: rados bench throughput with no disk or network activity

> I've noticed that
> * with a single node cluster with 4 osds
> * and running rados bench rand on that same node so no network traffic
> * with a number of objects small enough so that everything is in the cache so no disk traffic
> we still peak out at about 1600 MB/sec.
>
> And the cpu is 40% idle. (and a good chunk of the cpu activity is the rados benchmark itself)
>
> What is likely causing the throttling here?

You may want to try this: https://github.com/ceph/ceph/pull/4690
This one will help too: https://github.com/ceph/ceph/pull/4728

I ran into a similar problem (rados bench peaking at 2300MB/s when no rados i/o was taking place; in fact, I ripped out every call to librados in the benchmark loop) and those two were solutions for me.

With best regards / Pozdrawiam
Piotr Dałek
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: rados bench throughput with no disk or network activity
On Thu, May 28, 2015 at 4:09 PM, Deneau, Tom tom.den...@amd.com wrote:
> I've noticed that
> * with a single node cluster with 4 osds
> * and running rados bench rand on that same node so no network traffic
> * with a number of objects small enough so that everything is in the cache so no disk traffic
> we still peak out at about 1600 MB/sec.
>
> And the cpu is 40% idle. (and a good chunk of the cpu activity is the rados benchmark itself)
>
> What is likely causing the throttling here?

Well, rados bench itself is essentially single-threaded, so if it's using 100% CPU that's probably the bottleneck you're hitting. Otherwise, by default it will limit itself to 100MB of outstanding IO (there's an objecter config value you can change for this; it's been discussed recently) and that might not be enough given the latencies of hopping packets across different CPUs, and the OSDs have a slightly-embarrassing amount of CPU computation and thread hopping they have to perform on every op (around half a millisecond's worth on each read, I think?).
-Greg
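The point about a cap on outstanding IO can be made concrete with a back-of-envelope bound: a client that keeps a fixed number of bytes in flight cannot exceed (bytes in flight) / (per-op latency). The sketch below is only that arithmetic. The 100 MB window and the 1600 MB/sec figure come from the thread; the 16 concurrent 4 MB ops are an assumed rados bench setup, and both function names are hypothetical helpers, not part of any Ceph tool.

```python
# Back-of-envelope bound: throughput <= bytes in flight / per-op latency.
# Function names are hypothetical; the 16 x 4 MB window is an assumption.

MB = 1024 * 1024


def throughput_bound_mb_s(bytes_in_flight, latency_s):
    """Upper bound on throughput (MB/s) for a fixed window of outstanding bytes."""
    return bytes_in_flight / latency_s / MB


def latency_at_throughput_ms(bytes_in_flight, throughput_mb_s):
    """Per-op latency (ms) at which the window alone caps throughput at the given rate."""
    return bytes_in_flight / (throughput_mb_s * MB) * 1000.0


if __name__ == "__main__":
    observed = 1600  # MB/sec, the plateau reported in the thread

    # The 100 MB outstanding-IO cap mentioned above.
    print("100 MB window:", latency_at_throughput_ms(100 * MB, observed), "ms")

    # Assumed bench setup: 16 concurrent ops of 4 MB each.
    print("16 x 4 MB window:", latency_at_throughput_ms(16 * 4 * MB, observed), "ms")
```

If measured per-op latency is well below the printed values, the window is probably not what limits throughput, and CPU or contention becomes the more likely suspect.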
RE: rados bench throughput with no disk or network activity
-----Original Message-----
From: Gregory Farnum [mailto:g...@gregs42.com]
Sent: Thursday, May 28, 2015 6:18 PM
To: Deneau, Tom
Cc: ceph-devel
Subject: Re: rados bench throughput with no disk or network activity

> On Thu, May 28, 2015 at 4:09 PM, Deneau, Tom tom.den...@amd.com wrote:
> > I've noticed that
> > * with a single node cluster with 4 osds
> > * and running rados bench rand on that same node so no network traffic
> > * with a number of objects small enough so that everything is in the cache so no disk traffic
> > we still peak out at about 1600 MB/sec.
> >
> > And the cpu is 40% idle. (and a good chunk of the cpu activity is the rados benchmark itself)
> >
> > What is likely causing the throttling here?
>
> Well, rados bench itself is essentially single-threaded, so if it's using 100% CPU that's probably the bottleneck you're hitting. Otherwise, by default it will limit itself to 100MB of outstanding IO (there's an objecter config value you can change for this; it's been discussed recently) and that might not be enough given the latencies of hopping packets across different CPUs, and the OSDs have a slightly-embarrassing amount of CPU computation and thread hopping they have to perform on every op (around half a millisecond's worth on each read, I think?).
> -Greg

Right. I was involved in the objecter config discussion :) and I have set the limits higher.

And this 1600 MB/sec limit seems to be the same whatever the size of the objects.

rados bench is using about 30% of the cpu and the total cpu usage is about 60% (the rest being mostly from the 4 osds).

Hmm, I just tried running 4 copies of rados bench rand, and I can get slightly higher combined totals, but not much higher, maybe 1800 MB/sec.

-- Tom
Re: rados bench throughput with no disk or network activity
On Thu, May 28, 2015 at 7:50 PM, Deneau, Tom tom.den...@amd.com wrote:
> > -----Original Message-----
> > From: Gregory Farnum [mailto:g...@gregs42.com]
> > Sent: Thursday, May 28, 2015 6:18 PM
> > To: Deneau, Tom
> > Cc: ceph-devel
> > Subject: Re: rados bench throughput with no disk or network activity
> >
> > On Thu, May 28, 2015 at 4:09 PM, Deneau, Tom tom.den...@amd.com wrote:
> > > I've noticed that
> > > * with a single node cluster with 4 osds
> > > * and running rados bench rand on that same node so no network traffic
> > > * with a number of objects small enough so that everything is in the cache so no disk traffic
> > > we still peak out at about 1600 MB/sec.
> > >
> > > And the cpu is 40% idle. (and a good chunk of the cpu activity is the rados benchmark itself)
> > >
> > > What is likely causing the throttling here?
> >
> > Well, rados bench itself is essentially single-threaded, so if it's using 100% CPU that's probably the bottleneck you're hitting. Otherwise, by default it will limit itself to 100MB of outstanding IO (there's an objecter config value you can change for this; it's been discussed recently) and that might not be enough given the latencies of hopping packets across different CPUs, and the OSDs have a slightly-embarrassing amount of CPU computation and thread hopping they have to perform on every op (around half a millisecond's worth on each read, I think?).
> > -Greg
>
> Right. I was involved in the objecter config discussion :) and I have set the limits higher.
>
> And this 1600 MB/sec limit seems to be the same whatever the size of the objects.
>
> rados bench is using about 30% of the cpu and the total cpu usage is about 60% (the rest being mostly from the 4 osds).
>
> Hmm, I just tried running 4 copies of rados bench rand, and I can get a little bit higher combined totals, but not much higher maybe 1800 MB/sec.

I take it you have multiple CPUs/cores and you mean 60% averaged across all of them, not 60% of a single core? I just want to be explicit.

In any case, I wonder if what you're seeing is some kind of serialization (contention), either in the kernel or somewhere in the OSD.

> -- Tom

--
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016
p: 646-253-9055
e: mil...@adfin.com
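One way to settle the "average across cores or one core" question is to sample per-core counters while the benchmark runs. The following is a minimal sketch, assuming a Linux host that exposes /proc/stat; the helper names parse_proc_stat and busy_percent are hypothetical, and nothing here comes from the thread itself. A 60% average can hide one core pinned near 100%, which would point at a serialized code path.

```python
import os
import time


def parse_proc_stat(text):
    """Return {cpu_name: (idle_ticks, total_ticks)} for each per-core 'cpuN' line."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip non-CPU lines and the aggregate 'cpu' line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        ticks = [int(v) for v in fields[1:9]]
        if len(ticks) < 5:
            continue
        idle = ticks[3] + ticks[4]  # idle + iowait
        counters[fields[0]] = (idle, sum(ticks))
    return counters


def busy_percent(before, after):
    """Per-core busy percentage between two parse_proc_stat() samples."""
    result = {}
    for name, (idle_after, total_after) in after.items():
        if name not in before:
            continue
        idle_before, total_before = before[name]
        delta_total = total_after - total_before
        if delta_total <= 0:
            continue
        delta_idle = idle_after - idle_before
        result[name] = 100.0 * (1.0 - delta_idle / delta_total)
    return result


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    with open("/proc/stat") as f:
        first = parse_proc_stat(f.read())
    time.sleep(1.0)
    with open("/proc/stat") as f:
        second = parse_proc_stat(f.read())
    for name, pct in sorted(busy_percent(first, second).items()):
        print(f"{name}: {pct:.1f}% busy")
```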
Re: rados bench throughput with no disk or network activity
On Thu, May 28, 2015 at 4:50 PM, Deneau, Tom tom.den...@amd.com wrote:
> > -----Original Message-----
> > From: Gregory Farnum [mailto:g...@gregs42.com]
> > Sent: Thursday, May 28, 2015 6:18 PM
> > To: Deneau, Tom
> > Cc: ceph-devel
> > Subject: Re: rados bench throughput with no disk or network activity
> >
> > On Thu, May 28, 2015 at 4:09 PM, Deneau, Tom tom.den...@amd.com wrote:
> > > I've noticed that
> > > * with a single node cluster with 4 osds
> > > * and running rados bench rand on that same node so no network traffic
> > > * with a number of objects small enough so that everything is in the cache so no disk traffic
> > > we still peak out at about 1600 MB/sec.
> > >
> > > And the cpu is 40% idle. (and a good chunk of the cpu activity is the rados benchmark itself)
> > >
> > > What is likely causing the throttling here?
> >
> > Well, rados bench itself is essentially single-threaded, so if it's using 100% CPU that's probably the bottleneck you're hitting. Otherwise, by default it will limit itself to 100MB of outstanding IO (there's an objecter config value you can change for this; it's been discussed recently) and that might not be enough given the latencies of hopping packets across different CPUs, and the OSDs have a slightly-embarrassing amount of CPU computation and thread hopping they have to perform on every op (around half a millisecond's worth on each read, I think?).
> > -Greg
>
> Right. I was involved in the objecter config discussion :) and I have set the limits higher.
>
> And this 1600 MB/sec limit seems to be the same whatever the size of the objects.
>
> rados bench is using about 30% of the cpu and the total cpu usage is about 60% (the rest being mostly from the 4 osds).
>
> Hmm, I just tried running 4 copies of rados bench rand, and I can get a little bit higher combined totals, but not much higher maybe 1800 MB/sec.

You might also just be approaching the limits of your hardware's effective memory bandwidth in a configuration like that. That seems low to me, but between shuffling data back and forth on a bunch of sockets and other things, it adds up.

I don't know if there's a good way to measure that.
-Greg
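For a rough sense of the memory-bandwidth ceiling raised above, one crude option is to time a large in-memory copy. The sketch below is only that: a single-threaded estimate with a hypothetical helper name (copy_bandwidth_mb_s) and arbitrary buffer sizes, not a method proposed in the thread. It reports bytes copied per second, so the underlying memory traffic (one read plus one write) is roughly double the printed figure, and a purpose-built benchmark would be more rigorous.

```python
import time


def copy_bandwidth_mb_s(buf_mb=64, repeats=3):
    """Crude single-threaded copy rate in MB/s (best of `repeats` runs)."""
    src = bytearray(buf_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full read of src plus one full write of dst
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    return buf_mb / best


if __name__ == "__main__":
    rate = copy_bandwidth_mb_s()
    print(f"approximate copy rate: {rate:.0f} MB/s")
```

Each byte a benchmark delivers is typically copied several times on its way from OSD to client (socket buffers, message assembly), so the achievable benchmark throughput may be a fraction of the raw copy rate.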