Hello all,
Has anybody tried using CephFS?
I have two servers running RHEL 7.1 (latest kernel: 3.10.0-229.14.1.el7.x86_64).
Each server has 15 GB of flash for the Ceph journal and 12 x 2 TB SATA disks for
data. The nodes are connected with a 56 Gb/s InfiniBand (IPoIB) interconnect.
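For reference, the raw IPoIB link itself can be sanity-checked with iperf3
(assuming it is installed, e.g. from EPEL; the addresses are the same as in the
config below):
ak35# iperf3 -s
ak34# iperf3 -c 172.24.32.135 -t 30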
Cluster version
# ceph -v
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
Cluster config
# cat /etc/ceph/ceph.conf
[global]
auth service required = cephx
auth client required = cephx
auth cluster required = cephx
fsid = 0f05deaf-ee6f-4342-b589-5ecf5527aa6f
mon osd full ratio = .95
mon osd nearfull ratio = .90
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 32
osd pool default pgp num = 32
max open files = 131072
osd crush chooseleaf type = 1
[mds]
[mds.a]
host = ak34
[mon]
mon_initial_members = a,b
[mon.a]
host = ak34
mon addr = 172.24.32.134:6789
[mon.b]
host = ak35
mon addr = 172.24.32.135:6789
[osd]
osd journal size = 1000
[osd.0]
osd uuid = b3b3cd37-8df5-4455-8104-006ddba2c443
host = ak34
public addr = 172.24.32.134
osd journal = /CEPH_JOURNAL/osd/ceph-0/journal
.....
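To double-check that the journals really sit on the flash device (and not on
the data disks), the effective value can be read back via the OSD admin socket,
e.g.:
# ceph daemon osd.0 config show | grep journal
# ls -lL /CEPH_JOURNAL/osd/ceph-0/journal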
Below is the cluster tree
# ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 45.75037 root default
-2 45.75037 region RU
-3 45.75037 datacenter ru-msk-ak48t
-4 22.87518 host ak34
0 1.90627 osd.0 up 1.00000 1.00000
1 1.90627 osd.1 up 1.00000 1.00000
2 1.90627 osd.2 up 1.00000 1.00000
3 1.90627 osd.3 up 1.00000 1.00000
4 1.90627 osd.4 up 1.00000 1.00000
5 1.90627 osd.5 up 1.00000 1.00000
6 1.90627 osd.6 up 1.00000 1.00000
7 1.90627 osd.7 up 1.00000 1.00000
8 1.90627 osd.8 up 1.00000 1.00000
9 1.90627 osd.9 up 1.00000 1.00000
10 1.90627 osd.10 up 1.00000 1.00000
11 1.90627 osd.11 up 1.00000 1.00000
-5 22.87518 host ak35
12 1.90627 osd.12 up 1.00000 1.00000
13 1.90627 osd.13 up 1.00000 1.00000
14 1.90627 osd.14 up 1.00000 1.00000
15 1.90627 osd.15 up 1.00000 1.00000
16 1.90627 osd.16 up 1.00000 1.00000
17 1.90627 osd.17 up 1.00000 1.00000
18 1.90627 osd.18 up 1.00000 1.00000
19 1.90627 osd.19 up 1.00000 1.00000
20 1.90627 osd.20 up 1.00000 1.00000
21 1.90627 osd.21 up 1.00000 1.00000
22 1.90627 osd.22 up 1.00000 1.00000
23 1.90627 osd.23 up 1.00000 1.00000
Cluster status
# ceph -s
cluster 0f05deaf-ee6f-4342-b589-5ecf5527aa6f
health HEALTH_OK
monmap e1: 2 mons at {a=172.24.32.134:6789/0,b=172.24.32.135:6789/0}
election epoch 10, quorum 0,1 a,b
mdsmap e14: 1/1/1 up {0=a=up:active}
osdmap e194: 24 osds: 24 up, 24 in
pgmap v2305: 384 pgs, 3 pools, 271 GB data, 72288 objects
545 GB used, 44132 GB / 44678 GB avail
384 active+clean
Pools for cephfs
# ceph osd dump | grep pg
pool 1 'cephfs_data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 256 pgp_num 256 last_change 154 flags hashpspool
crash_replay_interval 45 stripe_width 0
pool 2 'cephfs_metadata' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 64 pgp_num 64 last_change 144 flags hashpspool
stripe_width 0
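For completeness, the pools and the filesystem were created in the standard
way, along the lines of:
# ceph osd pool create cephfs_data 256 256
# ceph osd pool create cephfs_metadata 64 64
# ceph fs new cephfs cephfs_metadata cephfs_data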
Rados bench
# rados bench -p cephfs_data 300 write --no-cleanup && rados bench -p cephfs_data 300 seq
Maintaining 16 concurrent writes of 4194304 bytes for up to 300 seconds or 0
objects
Object prefix: benchmark_data_XXXXXXXXXXXXXXXXXXXX_8108
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
0 0 0 0 0 0 - 0
1 16 170 154 615.74 616 0.109984 0.0978277
2 16 335 319 637.817 660 0.0623079 0.0985001
3 16 496 480 639.852 644 0.0992808 0.0982317
4 16 662 646 645.862 664 0.0683485 0.0980203
5 16 831 815 651.796 676 0.0773545 0.0973635
6 15 994 979 652.479 656 0.112323 0.096901
7 16 1164 1148 655.826 676 0.107592 0.0969845
8 16 1327 1311 655.335 652 0.0960067 0.0968445
9 16 1488 1472 654.066 644 0.0780589 0.0970879
.....
297 16 43445 43429 584.811 596 0.0569516 0.109399
298 16 43601 43585 584.942 624 0.0707439 0.109388
299 16 43756 43740 585.059 620 0.20408 0.109363
2015-10-15 14:16:59.622610 min lat: 0.0109677 max lat: 0.951389 avg lat: 0.109344
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
300 13 43901 43888 585.082 592 0.0768806 0.109344
Total time run: 300.329089
Total reads made: 43901
Read size: 4194304
Bandwidth (MB/sec): 584.705
Average Latency: 0.109407
Max latency: 0.951389
Min latency: 0.0109677
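I have not yet varied the bench concurrency; if useful, the same test can be
repeated with more threads than the default 16, e.g.:
# rados bench -p cephfs_data 60 write -t 32 --no-cleanup
# rados bench -p cephfs_data 60 seq -t 32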
But the actual write speed on the mounted CephFS is very low
# dd if=/dev/zero | pv | dd oflag=direct of=44444 bs=4k count=10k
10240+0 records in
10240+0 records out
41943040 bytes (42 MB) copied, 25.9155 s, 1.6 MB/s
40.1MiB 0:00:25 [1.55MiB/s]
# dd if=/dev/zero | pv | dd oflag=direct of=44444 bs=32k count=10k
10240+0 records in
10240+0 records out
335544320 bytes (336 MB) copied, 28.2998 s, 11.9 MB/s
320MiB 0:00:28 [11.3MiB/s]
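If I read these numbers right, per-request latency dominates, since each direct
write is a synchronous round trip:
25.9 s / 10240 requests ~= 2.5 ms per 4k write  ->  4 KiB / 2.5 ms ~= 1.6 MB/s
28.3 s / 10240 requests ~= 2.8 ms per 32k write ->  32 KiB / 2.8 ms ~= 11.9 MB/s
A much larger block size should confirm this, if latency is the bottleneck (not
yet tested):
# dd if=/dev/zero of=44444 bs=4M count=100 oflag=direct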
Does anybody know the root cause of the low write speed to the filesystem?
Thanks in advance for your help!
--
Best Regards,
Stanislav Butkeev