Hi,
1. The two iSCSI gateways are software gateways (LIO targets) deployed on the OSD/monitor nodes and created with lrbd (a targetcli sanity check of the resulting config is sketched after point 3).
The lrbd configuration:
{
    "auth": [
        {
            "target": "iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol",
            "authentication": "none"
        }
    ],
    "targets": [
        {
            "target": "iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol",
            "hosts": [
                { "host": "node2", "portal": "east" },
                { "host": "node3", "portal": "west" }
            ]
        }
    ],
    "portals": [
        { "name": "east", "addresses": [ "10.0.52.92" ] },
        { "name": "west", "addresses": [ "10.0.52.93" ] }
    ],
    "pools": [
        {
            "pool": "rbd",
            "gateways": [
                {
                    "target": "iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol",
                    "tpg": [
                        { "image": "testvol" }
                    ]
                }
            ]
        }
    ]
}
2. The Ceph cluster itself performs fine. I created an RBD image on one of the Ceph nodes and the fio results there are good: 4K randwrite IOPS=3013, bw=100MB/s.
So I think the Ceph cluster itself is not the bottleneck (a native-RBD cross-check is sketched after point 3).
3. The SSD is an Intel S3510 480GB, an enterprise model, not a consumer one.
New test: cloning a VM in VMware can reach 100MB/s, but fio and dd tests inside a VM are still poor (see the dd note below).
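
For reference, a quick sanity check of what lrbd actually configured on the gateways (a minimal sketch, assuming targetcli is installed on node2/node3):

    targetcli ls
    # should show the iqn.2016-07.org.linux-iscsi.iscsi.x86:testvol target,
    # an RBD-backed backstore for the testvol image, and the east/west portal IPs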
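
For the native-RBD cross-check mentioned under point 2, something along these lines reproduces the baseline without the iSCSI layer (a sketch, assuming the image is testvol in pool rbd and fio was built with the rbd engine):

    fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testvol \
        --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 \
        --runtime=60 --time_based --name=rbd-4k-randwrite
    # alternatively, with the rbd CLI:
    rbd bench-write rbd/testvol --io-size 4096 --io-threads 16 --io-pattern rand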
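
On the dd note: dd with direct I/O is a single stream at queue depth 1, so its throughput is roughly block size divided by per-I/O latency and it mostly measures latency rather than bandwidth. A direct-I/O run along these lines makes that comparison explicit (hypothetical device name; it overwrites /dev/sdb):

    dd if=/dev/zero of=/dev/sdb bs=4k count=25600 oflag=direct
    # a larger block size helps separate latency effects from bandwidth limits:
    dd if=/dev/zero of=/dev/sdb bs=1M count=1024 oflag=direct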
> On 1 Jul 2016, at 4:18 PM, Christian Balzer <[email protected]> wrote:
>
>
> Hello,
>
> On Fri, 1 Jul 2016 13:04:45 +0800 mq wrote:
>
>> Hi list
>> I have tested SUSE Enterprise Storage 3 using 2 iSCSI gateways attached
>> to VMware. The performance is bad.
>
> First off, it's somewhat funny that you're testing the repackaged SUSE
> Ceph, but asking for help here (with Ceph being owned by Red Hat).
>
> Aside from that, you're not telling us what these 2 iSCSI gateways are
> (SW, HW specs/configuration).
>
> Having iSCSI on top of Ceph is by the very nature of things going to be
> slower than native Ceph.
>
> Use "rbd bench" or a VM client with RBD to get a base number for what your
> Ceph cluster is capable of; this will help identify where the slowdown
> is.
>
>> I have turned off VAAI following
>> https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1033665
>> My cluster: 3 Ceph nodes, each with 2*E5-2620, 64G mem, 2*1Gbps network,
>> 3*10K SAS + 1*480G SSD per node, SSD as journal; 1 VMware node with
>> 2*E5-2620, 64G mem, 2*1Gbps.
>
> That's a slow (latency wise) network, but not your problem.
> What SSD model?
> A 480GB size suggests a consumer model and that would explain a lot.
>
> Check your storage nodes with atop during the fio runs and see if you can
> spot a bottleneck.
>
> Christian
>
>> # ceph -s
>>     cluster 0199f68d-a745-4da3-9670-15f2981e7a15
>>      health HEALTH_OK
>>      monmap e1: 3 mons at {node1=192.168.50.91:6789/0,node2=192.168.50.92:6789/0,node3=192.168.50.93:6789/0}
>>             election epoch 22, quorum 0,1,2 node1,node2,node3
>>      osdmap e200: 9 osds: 9 up, 9 in
>>             flags sortbitwise
>>       pgmap v1162: 448 pgs, 1 pools, 14337 MB data, 4935 objects
>>             18339 MB used, 5005 GB / 5023 GB avail
>>                  448 active+clean
>>   client io 87438 kB/s wr, 0 op/s rd, 213 op/s wr
>>
>> sudo ceph osd tree
>> ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -1 4.90581 root default
>> -2 1.63527 host node1
>> 0 0.54509 osd.0 up 1.00000 1.00000
>> 1 0.54509 osd.1 up 1.00000 1.00000
>> 2 0.54509 osd.2 up 1.00000 1.00000
>> -3 1.63527 host node2
>> 3 0.54509 osd.3 up 1.00000 1.00000
>> 4 0.54509 osd.4 up 1.00000 1.00000
>> 5 0.54509 osd.5 up 1.00000 1.00000
>> -4 1.63527 host node3
>> 6 0.54509 osd.6 up 1.00000 1.00000
>> 7 0.54509 osd.7 up 1.00000 1.00000
>> 8 0.54509 osd.8 up 1.00000 1.00000
>>
>>
>>
>> A Linux VM in VMware, running fio: 4k randwrite gives just 64 IOPS with
>> high latency, and a dd test just 11MB/s.
>> fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=100G
>> -filename=/dev/sdb -name="EBS 4KB randwrite test" -iodepth=32 -runtime=60
>>
>> EBS 4KB randwrite test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
>> fio-2.0.13
>> Starting 1 thread
>> Jobs: 1 (f=1): [w] [100.0% done] [0K/131K/0K /s] [0 /32 /0 iops] [eta 00m:00s]
>> EBS 4KB randwrite test: (groupid=0, jobs=1): err= 0: pid=6766: Wed Jun 29 21:28:06 2016
>>   write: io=15696KB, bw=264627 B/s, iops=64, runt= 60737msec
>>     slat (usec): min=10, max=213, avg=35.54, stdev=16.41
>>     clat (msec): min=1, max=31368, avg=495.01, stdev=1862.52
>>      lat (msec): min=2, max=31368, avg=495.04, stdev=1862.52
>>     clat percentiles (msec):
>>      |  1.00th=[    7],  5.00th=[    8], 10.00th=[    8], 20.00th=[    9],
>>      | 30.00th=[    9], 40.00th=[   10], 50.00th=[  198], 60.00th=[  204],
>>      | 70.00th=[  208], 80.00th=[  217], 90.00th=[  799], 95.00th=[ 1795],
>>      | 99.00th=[ 7177], 99.50th=[12649], 99.90th=[16712], 99.95th=[16712],
>>      | 99.99th=[16712]
>>     bw (KB/s) : min=   36, max=11960, per=100.00%, avg=264.77, stdev=1110.81
>>     lat (msec) : 2=0.03%, 4=0.23%, 10=40.93%, 20=0.48%, 50=0.03%
>>     lat (msec) : 100=0.08%, 250=39.55%, 500=5.63%, 750=2.91%, 1000=1.35%
>>     lat (msec) : 2000=4.03%, >=2000=4.77%
>>   cpu          : usr=0.02%, sys=0.22%, ctx=2973, majf=0, minf=18446744073709538907
>>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.4%, 32=99.2%, >=64=0.0%
>>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>>      issued    : total=r=0/w=3924/d=0, short=r=0/w=0/d=0
>>
>> Run status group 0 (all jobs):
>>   WRITE: io=15696KB, aggrb=258KB/s, minb=258KB/s, maxb=258KB/s, mint=60737msec, maxt=60737msec
>>
>> Disk stats (read/write):
>>   sdb: ios=83/3921, merge=0/0, ticks=60/1903085, in_queue=1931694, util=100.00%
>>
>> Can anyone give me some suggestions to improve the performance?
>>
>> Regards
>>
>> MQ
>>
>>
>
>
> --
> Christian Balzer           Network/Systems Engineer
> [email protected]          Global OnLine Japan/Rakuten Communications
> http://www.gol.com/
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com