Hi Matteo,

 

Ceph introduces latency into the write path, so what you are seeing is
typical. If you increase the iodepth of the fio test you should get higher
results, until you start maxing out your CPU.
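
For example, something along these lines could be a useful next test (just a sketch; tune iodepth and numjobs to your setup, and the filename and size here are placeholders):

fio --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=1G --readwrite=randwrite --iodepth=32 --numjobs=4 --group_reporting

With libaio and direct=1, iodepth controls how many writes each job keeps in flight, which is what hides the per-write network and replication latency.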

 

Nick

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Matteo Dacrema
Sent: 26 October 2015 11:20
To: ceph-us...@ceph.com
Subject: [ceph-users] BAD nvme SSD performance

 

Hi all,

 

I've recently bought two Samsung SM951 256GB NVMe PCIe SSDs and built a 2-OSD
Ceph cluster with min_size = 1.

I've tested them with fio and obtained two very different results in the two
situations below.

This is the command: fio --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=100M --readwrite=randwrite --numjobs=200 --group_reporting
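
(Note: without an explicit --iodepth, fio defaults to a queue depth of 1, so each of the 200 jobs keeps only one write in flight.)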

 

On the OSD host I've obtained this result:

bw=575493KB/s, iops=143873

 

On the client host, running the same fio test against a mounted volume, I've
obtained this result:

bw=9288.1KB/s, iops=2322

 

I've obtained these results both with the journal and data on the same disk
and with the journal on a separate SSD.

 

I have two OSD hosts, each with 64GB of RAM and 2x Intel Xeon E5-2620 @ 2.00GHz,
and one MON host with 128GB of RAM and 2x Intel Xeon E5-2620 @ 2.00GHz.

I'm using 10G Mellanox NICs and a switch with jumbo frames.

 

I also did other tests with this configuration (see the attached Excel
workbook).

Hardware configuration for each of the two OSD nodes:

                3x 100GB Intel SSD DC S3700, with 3x 30GB partitions on every SSD

                9x 1TB Seagate HDD

Results: about 12k IOPS with 4k bs and the same fio test.

 

I can't understand where the problem is with the NVMe SSDs.

Can anyone help me?

 

Here is the ceph.conf:

[global]

fsid = 3392a053-7b48-49d3-8fc9-50f245513cc7

mon_initial_members = mon1

mon_host = 192.168.1.3

auth_cluster_required = cephx

auth_service_required = cephx

auth_client_required = cephx

osd_pool_default_size = 2

mon_client_hung_interval = 1.0

mon_client_ping_interval = 5.0

public_network = 192.168.1.0/24

cluster_network = 192.168.1.0/24

mon_osd_full_ratio = .90

mon_osd_nearfull_ratio = .85

 

[mon]

mon_warn_on_legacy_crush_tunables = false

 

[mon.1]

host = mon1

mon_addr = 192.168.1.3:6789

 

[osd]

osd_journal_size = 30000

journal_dio = true

journal_aio = true

osd_op_threads = 24

osd_op_thread_timeout = 60

osd_disk_threads = 8

osd_recovery_threads = 2

osd_recovery_max_active = 1

osd_max_backfills = 2

osd_mkfs_type = xfs

osd_mkfs_options_xfs = "-f -i size=2048"

osd_mount_options_xfs = "rw,noatime,inode64,logbsize=256k,delaylog"

filestore_xattr_use_omap = false

filestore_max_inline_xattr_size = 512

filestore_max_sync_interval = 10

filestore_merge_threshold = 40

filestore_split_multiple = 8

filestore_flusher = false

filestore_queue_max_ops = 2000

filestore_queue_max_bytes = 536870912

filestore_queue_committing_max_ops = 500

filestore_queue_committing_max_bytes = 268435456

filestore_op_threads = 2

 

Best regards,

Matteo

 


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
