Re: rados bench throughput with no disk or network activity

2015-05-29 Thread Dałek, Piotr
 -Original Message-
 From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
 ow...@vger.kernel.org] On Behalf Of Deneau, Tom
 Sent: Friday, May 29, 2015 1:10 AM
 To: ceph-devel
 Subject: rados bench throughput with no disk or network activity
 
 I've noticed that
* with a single node cluster with 4 osds
* and running rados bench rand on that same node so no network traffic
* with a number of objects small enough so that everything is in the cache
 so no disk traffic
 
 we still peak out at about 1600 MB/sec.
 
 And the cpu is 40% idle. (and a good chunk of the cpu activity is the rados
 benchmark itself)
 
 What is likely causing the throttling here?

You may want to try this: https://github.com/ceph/ceph/pull/4690
This one will help too: https://github.com/ceph/ceph/pull/4728

I ran into a similar problem (rados bench peaking at 2300 MB/s when no rados
I/O was taking place; in fact, I ripped out every call to librados in the
benchmark loop), and those two pull requests solved it for me.

With best regards / Pozdrawiam
Piotr Dałek


--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: rados bench throughput with no disk or network activity

2015-05-28 Thread Gregory Farnum
On Thu, May 28, 2015 at 4:09 PM, Deneau, Tom tom.den...@amd.com wrote:
 I've noticed that
* with a single node cluster with 4 osds
* and running rados bench rand on that same node so no network traffic
* with a number of objects small enough so that everything is in the cache 
 so no disk traffic

 we still peak out at about 1600 MB/sec.

 And the cpu is 40% idle. (and a good chunk of the cpu activity is the rados 
 benchmark itself)

 What is likely causing the throttling here?

Well, rados bench itself is essentially single-threaded, so if it's
using 100% CPU that's probably the bottleneck you're hitting.

Otherwise, by default it will limit itself to 100MB of outstanding IO
(there's an objecter config value you can change for this; it's been
discussed recently) and that might not be enough given the latencies
of hopping packets across different CPUs, and the OSDs have a
slightly-embarrassing amount of CPU computation and thread hopping
they have to perform on every op (around half a millisecond's worth on
each read, I think?).
-Greg
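
The in-flight limit described above can be reasoned about with Little's law: sustained throughput is at most the number of bytes in flight divided by the average per-op latency. The following is a minimal sketch of that arithmetic, not a description of how rados bench or the objecter actually behaves. The 4 MB object size and 16 concurrent ops are assumptions (they are not stated in this thread), and the 0.5 ms figure is the rough guess quoted above.

```python
MB = 1024 * 1024


def max_throughput_mb_s(inflight_bytes, latency_s):
    """Upper bound on throughput (MB/s) given bytes in flight and per-op latency."""
    return inflight_bytes / latency_s / MB


def implied_latency_ms(inflight_bytes, throughput_mb_s):
    """Average per-op latency (ms) implied by an observed throughput."""
    return inflight_bytes / (throughput_mb_s * MB) * 1000.0


# Assumed benchmark settings (hypothetical, not taken from the thread).
op_size = 4 * MB
concurrent_ops = 16
inflight = op_size * concurrent_ops  # 64 MB actually in flight

# Latency implied by the observed 1600 MB/s under these assumptions.
print(implied_latency_ms(inflight, 1600))

# Ceiling imposed by a 100 MB in-flight cap if each op took only 0.5 ms.
print(max_throughput_mb_s(100 * MB, 0.0005))
```

Under these assumptions, 1600 MB/sec corresponds to an average latency of about 40 ms per op, well above half a millisecond of per-op CPU work, and a 100 MB cap on its own would not be the binding limit. If the real settings differ, the same two functions can be rerun with the actual values.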


RE: rados bench throughput with no disk or network activity

2015-05-28 Thread Deneau, Tom


 -Original Message-
 From: Gregory Farnum [mailto:g...@gregs42.com]
 Sent: Thursday, May 28, 2015 6:18 PM
 To: Deneau, Tom
 Cc: ceph-devel
 Subject: Re: rados bench throughput with no disk or network activity
 
 On Thu, May 28, 2015 at 4:09 PM, Deneau, Tom tom.den...@amd.com wrote:
  I've noticed that
 * with a single node cluster with 4 osds
 * and running rados bench rand on that same node so no network traffic
 * with a number of objects small enough so that everything is in
  the cache so no disk traffic
 
  we still peak out at about 1600 MB/sec.
 
  And the cpu is 40% idle. (and a good chunk of the cpu activity is the
  rados benchmark itself)
 
  What is likely causing the throttling here?
 
 Well, rados bench itself is essentially single-threaded, so if it's using
 100% CPU that's probably the bottleneck you're hitting.
 
 Otherwise, by default it will limit itself to 100MB of outstanding IO
 (there's an objecter config value you can change for this; it's been
 discussed recently) and that might not be enough given the latencies of
 hopping packets across different CPUs, and the OSDs have a slightly-
 embarrassing amount of CPU computation and thread hopping they have to
 perform on every op (around half a millisecond's worth on each read, I
 think?).
 -Greg

Right.  I was involved in the objecter config discussion :) and 
I have set the limits higher.   And this 1600 MB/sec limit seems to be
the same whatever the size of the objects.

rados bench is using about 30% of the cpu and the total cpu usage is about 60%
(the rest being mostly from the 4 osds).

Hmm, I just tried running 4 copies of rados bench rand, and I can get a little
bit higher combined totals, but not much higher, maybe 1800 MB/sec.
 
-- Tom


Re: rados bench throughput with no disk or network activity

2015-05-28 Thread Milosz Tanski
On Thu, May 28, 2015 at 7:50 PM, Deneau, Tom tom.den...@amd.com wrote:


 -Original Message-
 From: Gregory Farnum [mailto:g...@gregs42.com]
 Sent: Thursday, May 28, 2015 6:18 PM
 To: Deneau, Tom
 Cc: ceph-devel
 Subject: Re: rados bench throughput with no disk or network activity

 On Thu, May 28, 2015 at 4:09 PM, Deneau, Tom tom.den...@amd.com wrote:
  I've noticed that
 * with a single node cluster with 4 osds
 * and running rados bench rand on that same node so no network traffic
 * with a number of objects small enough so that everything is in
  the cache so no disk traffic
 
  we still peak out at about 1600 MB/sec.
 
  And the cpu is 40% idle. (and a good chunk of the cpu activity is the
  rados benchmark itself)
 
  What is likely causing the throttling here?

 Well, rados bench itself is essentially single-threaded, so if it's using
 100% CPU that's probably the bottleneck you're hitting.

 Otherwise, by default it will limit itself to 100MB of outstanding IO
 (there's an objecter config value you can change for this; it's been
 discussed recently) and that might not be enough given the latencies of
 hopping packets across different CPUs, and the OSDs have a slightly-
 embarrassing amount of CPU computation and thread hopping they have to
 perform on every op (around half a millisecond's worth on each read, I
 think?).
 -Greg

 Right.  I was involved in the objecter config discussion :) and
 I have set the limits higher.   And this 1600 MB/sec limit seems to be
 the same whatever the size of the objects.

 rados bench is using about 30% of the cpu and the total cpu usage is about 60%
 (the rest being mostly from the 4 osds).

 Hmm, I just tried running 4 copies of rados bench rand, and I can get a little
 bit higher combined totals, but not much higher, maybe 1800 MB/sec.

I take it you have multiple different CPUs/cores and you mean 60%
average across all the CPUs/cores? Not 60% of a single core? Just
wanting to be explicit.

In any case, I wonder if what you're seeing is some kind of
serialization (contention) either in the kernel or somewhere in the
OSD.
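
The difference between "60% averaged across all cores" and "one core saturated" can be checked from two snapshots of /proc/stat on Linux. The sketch below is only an illustration: the function names are made up for this example, and the two snapshots are invented numbers for a two-core machine, not measurements from the system discussed here.

```python
def parse_proc_stat(text):
    """Return {cpu_name: (busy_ticks, total_ticks)} from /proc/stat content."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("cpu"):
            continue
        ticks = [int(v) for v in fields[1:]]
        # Columns: user nice system idle iowait irq softirq steal guest guest_nice
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        total = sum(ticks[:8])  # guest time is already counted in user/nice
        stats[fields[0]] = (total - idle, total)
    return stats


def utilization(before, after):
    """Percent busy per CPU line between two /proc/stat snapshots."""
    first, second = parse_proc_stat(before), parse_proc_stat(after)
    result = {}
    for name, (busy1, total1) in second.items():
        busy0, total0 = first.get(name, (0, 0))
        elapsed = total1 - total0
        result[name] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return result


# Invented snapshots: "cpu" is the aggregate line, cpu0/cpu1 are the cores.
before = (
    "cpu  100 0 100 800 0 0 0 0 0 0\n"
    "cpu0 90 0 90 20 0 0 0 0 0 0\n"
    "cpu1 10 0 10 780 0 0 0 0 0 0\n"
)
after = (
    "cpu  170 0 140 890 0 0 0 0 0 0\n"
    "cpu0 150 0 120 30 0 0 0 0 0 0\n"
    "cpu1 20 0 20 860 0 0 0 0 0 0\n"
)

util = utilization(before, after)
for name in sorted(util):
    print(name, round(util[name], 1))
```

In this made-up example the aggregate line shows 55% busy while cpu0 alone is at 90%, which is the kind of imbalance an average would hide. On a live system, the two strings would come from reading /proc/stat twice, some seconds apart, while the benchmark runs.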


 -- Tom




-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: mil...@adfin.com


Re: rados bench throughput with no disk or network activity

2015-05-28 Thread Gregory Farnum
On Thu, May 28, 2015 at 4:50 PM, Deneau, Tom tom.den...@amd.com wrote:


 -Original Message-
 From: Gregory Farnum [mailto:g...@gregs42.com]
 Sent: Thursday, May 28, 2015 6:18 PM
 To: Deneau, Tom
 Cc: ceph-devel
 Subject: Re: rados bench throughput with no disk or network activity

 On Thu, May 28, 2015 at 4:09 PM, Deneau, Tom tom.den...@amd.com wrote:
  I've noticed that
 * with a single node cluster with 4 osds
 * and running rados bench rand on that same node so no network traffic
 * with a number of objects small enough so that everything is in
  the cache so no disk traffic
 
  we still peak out at about 1600 MB/sec.
 
  And the cpu is 40% idle. (and a good chunk of the cpu activity is the
  rados benchmark itself)
 
  What is likely causing the throttling here?

 Well, rados bench itself is essentially single-threaded, so if it's using
 100% CPU that's probably the bottleneck you're hitting.

 Otherwise, by default it will limit itself to 100MB of outstanding IO
 (there's an objecter config value you can change for this; it's been
 discussed recently) and that might not be enough given the latencies of
 hopping packets across different CPUs, and the OSDs have a slightly-
 embarrassing amount of CPU computation and thread hopping they have to
 perform on every op (around half a millisecond's worth on each read, I
 think?).
 -Greg

 Right.  I was involved in the objecter config discussion :) and
 I have set the limits higher.   And this 1600 MB/sec limit seems to be
 the same whatever the size of the objects.

 rados bench is using about 30% of the cpu and the total cpu usage is about 60%
 (the rest being mostly from the 4 osds).

 Hmm, I just tried running 4 copies of rados bench rand, and I can get a little
 bit higher combined totals, but not much higher, maybe 1800 MB/sec.

You might also just be approaching the limits of your hardware's
effective memory bandwidth in a configuration like that. That seems
low to me, but shuffling data back and forth across a bunch of
sockets adds up. I don't know if there's a good way to measure that.
-Greg
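
One crude way to get an order-of-magnitude figure for memory copy bandwidth is to time a large in-memory copy. The sketch below is only that: a single-threaded copy test with a hypothetical function name, assumed buffer sizes, and interpreter overhead included. It is not a substitute for a dedicated benchmark such as STREAM and says nothing about how Ceph itself moves data.

```python
import time


def copy_bandwidth_mb_s(size_mb=256, repeats=5):
    """Best-of-N single-threaded copy rate in MB/s for a size_mb buffer."""
    src = bytearray(size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one read pass and one write pass over the buffer
        elapsed = time.perf_counter() - start
        best = min(best, max(elapsed, 1e-9))  # guard against a zero reading
        del dst
    return size_mb / best


if __name__ == "__main__":
    # Small buffer so the example finishes quickly; larger buffers defeat caches better.
    print(round(copy_bandwidth_mb_s(size_mb=64, repeats=3)), "MB/s copied")
```

Each copy reads the buffer once and writes it once, so the memory traffic is roughly twice the reported copy rate. If every benchmark byte is copied several times between sockets and buffers on the same host, the usable throughput would be a fraction of this figure.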