Hi,

Inspired by all the other performance threads going on here, I started 
investigating on my own as well.

I'm seeing a lot of iowait on the OSDs, with the journals utilised at 2-7% and 
writing about 8-30MB/s (mostly around 8MB/s). This is a dumpling cluster. The 
goal here is to get the journal utilisation up to maybe 50%.
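(For anyone who wants to check the same thing on their own cluster, journal 
utilisation like this can be read straight from iostat on the journal device, 
e.g.

  iostat -xm 1 /dev/sdb

and watching the %util and wMB/s columns; /dev/sdb is just an example device.)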

Journals: Intel DC S3700, OSD: HGST 4TB

I did some initial testing to get the wbthrottle to hold more in its buffer, and 
I think I managed to, but it didn't affect the journal utilisation (settings below).

There are 12 cores per machine for the 10 OSDs to utilise, and they use about 20% 
of them, so I guess there's no bottleneck there.
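(Per-OSD CPU is easy to eyeball with top on the ceph-osd processes, e.g.

  top -p $(pgrep -d, ceph-osd)

nothing fancier than that.)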

Well, that's the problem: I really can't see any bottleneck with the current 
layout. Maybe it's our copper 10Gb network that's adding too much latency?

It would be nice to have some kind of bottleneck troubleshooting guide in the Ceph docs :)
I'm guessing I'm not the only one on these kinds of specs, and it would be 
interesting to see if there's more optimisation to be done.

Hope you guys have a nice weekend :)

Cheers,
Josef

Ping from a host to an OSD node:

6 packets transmitted, 6 received, 0% packet loss, time 4998ms
rtt min/avg/max/mdev = 0.063/0.107/0.193/0.048 ms

Settings on the OSD:

{ "filestore_wbthrottle_xfs_ios_start_flusher": "5000"}
{ "filestore_wbthrottle_xfs_inodes_start_flusher": "5000"}
{ "filestore_wbthrottle_xfs_ios_hard_limit": "10000"}
{ "filestore_wbthrottle_xfs_inodes_hard_limit": "10000"}
{ "filestore_max_sync_interval": "30”}

The defaults, for comparison:

{ "filestore_wbthrottle_xfs_ios_start_flusher": "500"}
{ "filestore_wbthrottle_xfs_inodes_start_flusher": "500"}
{ "filestore_wbthrottle_xfs_ios_hard_limit": “5000"}
{ "filestore_wbthrottle_xfs_inodes_hard_limit": “5000"}
{ "filestore_max_sync_interval": “5”}


A single entry from dump_historic_ops:

        { "description": "osd_op(client.47765822.0:99270434 
rbd_data.1da982c2eb141f2.0000000000005825 [stat,write 2093056~8192] 3.8130048c 
e19290)",
          "rmw_flags": 6,
          "received_at": "2015-04-26 08:24:03.226255",
          "age": "87.026653",
          "duration": "0.801927",
          "flag_point": "commit sent; apply or cleanup",
          "client_info": { "client": "client.47765822",
              "tid": 99270434},
          "events": [
                { "time": "2015-04-26 08:24:03.226329",
                  "event": "waiting_for_osdmap"},
                { "time": "2015-04-26 08:24:03.230921",
                  "event": "reached_pg"},
                { "time": "2015-04-26 08:24:03.230928",
                  "event": "started"},
                { "time": "2015-04-26 08:24:03.230931",
                  "event": "started"},
                { "time": "2015-04-26 08:24:03.231791",
                  "event": "waiting for subops from [22,48]"},
                { "time": "2015-04-26 08:24:03.231813",
                  "event": "commit_queued_for_journal_write"},
                { "time": "2015-04-26 08:24:03.231849",
                  "event": "write_thread_in_journal_buffer"},
                { "time": "2015-04-26 08:24:03.232075",
                  "event": "journaled_completion_queued"},
                { "time": "2015-04-26 08:24:03.232492",
                  "event": "op_commit"},
                { "time": "2015-04-26 08:24:03.233134",
                  "event": "sub_op_commit_rec"},
                { "time": "2015-04-26 08:24:03.233183",
                  "event": "op_applied"},
                { "time": "2015-04-26 08:24:04.028167",
                  "event": "sub_op_commit_rec"},
                { "time": "2015-04-26 08:24:04.028174",
                  "event": "commit_sent"},
                { "time": "2015-04-26 08:24:04.028182",
                  "event": "done"}]},

