Three things come to my mind when looking at your setup/results:

1) The number of PGs.  According to the documentation it should be:

(number of OSDs * 100) / number of replicas

Your pool has pg_num 2000, so experimenting with that number a bit might
yield better results.
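
To make that formula concrete for your cluster (3 nodes x 10 OSDs = 30 OSDs, 1 replica), here is a small sketch. The function names are mine, and the final round-up to a power of two is not part of what I wrote above; it is what I recall the placement group documentation recommending, so please treat it as an assumption and check it against the docs for your release.

```python
def raw_pg_target(num_osds, replicas, pgs_per_osd=100):
    """Rule of thumb from the message: (number of OSDs * 100) / number of replicas."""
    return num_osds * pgs_per_osd / replicas


def round_up_to_power_of_two(value):
    """Assumed extra step: round the raw target up to the next power of two."""
    power = 1
    while power < value:
        power *= 2
    return power


# 3 storage nodes with 10 OSDs each, pool size 1 (figures from the original post)
raw = raw_pg_target(num_osds=30, replicas=1)
print(raw, round_up_to_power_of_two(raw))  # 3000.0 4096
```

By this estimate your pg_num of 2000 is somewhat below the suggested value, though I would not expect that alone to explain the whole gap.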


2) Although I am NOT using SSDs as journals, I am using 10GigE with 6 nodes
and 72 OSDs, and I am able to get around 1100 MB/s using rados bench with the
default 4 MB object size, the default number of threads (16) and 1 replica.

It seems like the consensus is that you should front around 4 to 6 OSDs total
per SSD in order to get the best performance.  You also need to make sure that
the SSDs you are using can actually handle the combined write throughput
from 10 OSDs.

Otherwise the SSDs become the bottleneck.
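
A rough back-of-the-envelope check of the journal load, assuming writes are spread evenly over the 3 nodes and every write passes through the single journal SSD on its node first (both are my assumptions; the function name is just for illustration):

```python
def mb_s_per_journal_ssd(cluster_mb_s, nodes, ssds_per_node=1, replicas=1):
    """Write throughput each journal SSD must absorb, assuming an even spread."""
    return cluster_mb_s * replicas / (nodes * ssds_per_node)


# Figures from the original post: 379.974 MB/s measured, 900 MB/s hoped for,
# 3 storage nodes, one journal SSD per node, pool size 1.
observed = mb_s_per_journal_ssd(379.974, nodes=3)
target = mb_s_per_journal_ssd(900, nodes=3)
print(f"observed per SSD: {observed:.1f} MB/s")    # 126.7 MB/s
print(f"needed for 900 MB/s: {target:.1f} MB/s")   # 300.0 MB/s
```

If the sustained sequential write speed of your SSD model is close to the observed figure, the journals would be my first suspect. I do not know which SSDs you have, so that part you will need to check yourself.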

3) What is the output if you just use the default number of threads (instead of 
100)?

For example:

'rados bench -p test3 30 write'
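
The reason I ask about the thread count: with a fixed number of writes in flight, throughput is roughly (threads * object size) / average latency. Here is a sketch using the numbers from your output quoted below. The helper names are mine, and this is only an approximation that assumes the client keeps all threads busy for the whole run.

```python
OBJECT_MB = 4194304 / (1024 * 1024)  # "Write size" from the bench output, in MB


def bandwidth_from_totals(writes, seconds):
    """Bandwidth as total data written divided by total run time."""
    return writes * OBJECT_MB / seconds


def bandwidth_from_latency(threads, avg_latency_s):
    """Estimate: concurrent writes * object size / average latency."""
    return threads * OBJECT_MB / avg_latency_s


print(round(bandwidth_from_totals(2876, 30.275789), 1))  # 380.0
print(round(bandwidth_from_latency(100, 1.03858), 1))    # 385.1
```

Both figures land close to the 380 MB/s you reported, which suggests the extra threads are mostly waiting in a queue (about 1 second per write) rather than adding throughput. A run with the default 16 threads would show whether latency drops accordingly.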


Shain

Shain Miley | Manager of Systems and Infrastructure, Digital Media | 
[email protected] | 202.513.3649

________________________________________
From: [email protected] [[email protected]] on 
behalf of Kuo Hugo [[email protected]]
Sent: Thursday, December 19, 2013 7:17 AM
To: [email protected]
Subject: [ceph-users] [Rados Bench Result] Maximum bandwidth only 380MB/sec with 10Gb Network.

Hi folks,

I'm running a rados bench test now. The cluster was deployed with ceph-deploy.

Ceph version: Emperor
FS : XFS

I created a pool test3 with size 1:
pool 13 'test3' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins 
pg_num 2000 pgp_num 2000 last_change 166 owner 0

The rados bench command was issued from an independent benchmark machine.

Public network : 10GbE
Cluster network : 10GbE
Result : rados bench -p test3 30 write -t 100
Total time run:         30.275789
Total writes made:      2876
Write size:             4194304
Bandwidth (MB/sec):     379.974

Stddev Bandwidth:       90.2077
Max bandwidth (MB/sec): 420
Min bandwidth (MB/sec): 0
Average Latency:        1.03858
Stddev Latency:         0.112756
Max latency:            2.20545
Min latency:            0.10155

Why is the bandwidth only 379 MB/sec?  I expected the speed to be over
900 MB/sec, since the pool size is only 1.

There are 3 storage nodes here.
10 OSDs on each storage node, with journals on one SSD split into 10 partitions.

Any suggestions?

Thanks

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
