Got it, Robert. It was my mistake: I had put post-up instead of pre-up. Now it
changes OK. I'll run new tests with this config and let you know.
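
For reference, here's a sketch of the corrected ib0 stanza with pre-up (same
addressing as in the config quoted below, ib1 analogous; untested outside this
setup):

## IB0 PUBLIC_CEPH
auto ib0
iface ib0 inet static
        address 172.23.17.8
        netmask 255.255.240.0
        network 172.23.16.0
    pre-up echo connected > /sys/class/net/ib0/mode
    pre-up /sbin/ifconfig ib0 mtu 65520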

Regards,


*German*

2015-11-23 15:36 GMT-03:00 German Anders <[email protected]>:

> Hi Robert,
>
> Thanks for the response. It was configured as 'datagram', so I tried to
> change it in the /etc/network/interfaces file and added the following:
>
> ## IB0 PUBLIC_CEPH
> auto ib0
> iface ib0 inet static
>         address 172.23.17.8
>         netmask 255.255.240.0
>         network 172.23.16.0
>     post-up echo connected > /sys/class/net/ib0/mode
>     post-up /sbin/ifconfig ib0 mtu 65520
>
> ## IB1 CLUS_CEPH
> auto ib1
> iface ib1 inet static
>         address 172.23.32.8
>         netmask 255.255.240.0
>         network 172.23.47.254
>     post-up echo connected > /sys/class/net/ib1/mode
>     post-up /sbin/ifconfig ib1 mtu 65520
>
>
>
> then rebooted, but when it came up again the mode still says
> 'datagram' instead of 'connected'. Any idea?
>
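> I'm checking it after boot like this, in case I'm looking in the wrong place:
>
> cat /sys/class/net/ib0/mode
> ip link show ib0
>
> and the first command is what keeps printing 'datagram'.
>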
> Regards,
>
>
> *German*
>
> 2015-11-23 15:06 GMT-03:00 Robert LeBlanc <[email protected]>:
>
>> Are you using datagram (unconnected) mode or connected mode? With connected
>> mode you can up your MTU to 64K, which may help on the network side.
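>>
>> At runtime that would be something along these lines (rough sketch, ib0
>> assumed as the interface name; the mode write usually only sticks while the
>> interface is down):
>>
>> ip link set ib0 down
>> echo connected > /sys/class/net/ib0/mode
>> ip link set ib0 mtu 65520
>> ip link set ib0 up
>>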
>> ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>>
>>
>> On Mon, Nov 23, 2015 at 10:40 AM, German Anders  wrote:
>> > Hi Mark,
>> >
>> > Thanks a lot for the quick response. Regarding the numbers that you sent
>> > me, they look REALLY nice. I have the following setup:
>> >
>> > 4 OSD nodes:
>> >
>> > 2 x Intel Xeon E5-2650v2 @2.60Ghz
>> > 1 x Network controller: Mellanox Technologies MT27500 Family [ConnectX-3] Dual-Port (1 for PUB and 1 for CLUS)
>> > 1 x SAS2308 PCI-Express Fusion-MPT SAS-2
>> > 8 x Intel SSD DC S3510 800GB (1 OSD on each drive + journal on the same
>> > drive, so 1:1 relationship)
>> > 3 x Intel SSD DC S3710 200GB (to be used maybe as a cache tier)
>> > 128GB RAM
>> >
>> > [0:0:0:0]    disk    ATA      INTEL SSDSC2BA20 0110  /dev/sdc
>> > [0:0:1:0]    disk    ATA      INTEL SSDSC2BA20 0110  /dev/sdd
>> > [0:0:2:0]    disk    ATA      INTEL SSDSC2BA20 0110  /dev/sde
>> > [0:0:3:0]    disk    ATA      INTEL SSDSC2BB80 0130  /dev/sdf
>> > [0:0:4:0]    disk    ATA      INTEL SSDSC2BB80 0130  /dev/sdg
>> > [0:0:5:0]    disk    ATA      INTEL SSDSC2BB80 0130  /dev/sdh
>> > [0:0:6:0]    disk    ATA      INTEL SSDSC2BB80 0130  /dev/sdi
>> > [0:0:7:0]    disk    ATA      INTEL SSDSC2BB80 0130  /dev/sdj
>> > [0:0:8:0]    disk    ATA      INTEL SSDSC2BB80 0130  /dev/sdk
>> > [0:0:9:0]    disk    ATA      INTEL SSDSC2BB80 0130  /dev/sdl
>> > [0:0:10:0]   disk    ATA      INTEL SSDSC2BB80 0130  /dev/sdm
>> >
>> > sdf                                8:80   0 745.2G  0 disk
>> > |-sdf1                             8:81   0 740.2G  0 part
>> > /var/lib/ceph/osd/ceph-16
>> > `-sdf2                             8:82   0     5G  0 part
>> > sdg                                8:96   0 745.2G  0 disk
>> > |-sdg1                             8:97   0 740.2G  0 part
>> > /var/lib/ceph/osd/ceph-17
>> > `-sdg2                             8:98   0     5G  0 part
>> > sdh                                8:112  0 745.2G  0 disk
>> > |-sdh1                             8:113  0 740.2G  0 part
>> > /var/lib/ceph/osd/ceph-18
>> > `-sdh2                             8:114  0     5G  0 part
>> > sdi                                8:128  0 745.2G  0 disk
>> > |-sdi1                             8:129  0 740.2G  0 part
>> > /var/lib/ceph/osd/ceph-19
>> > `-sdi2                             8:130  0     5G  0 part
>> > sdj                                8:144  0 745.2G  0 disk
>> > |-sdj1                             8:145  0 740.2G  0 part
>> > /var/lib/ceph/osd/ceph-20
>> > `-sdj2                             8:146  0     5G  0 part
>> > sdk                                8:160  0 745.2G  0 disk
>> > |-sdk1                             8:161  0 740.2G  0 part
>> > /var/lib/ceph/osd/ceph-21
>> > `-sdk2                             8:162  0     5G  0 part
>> > sdl                                8:176  0 745.2G  0 disk
>> > |-sdl1                             8:177  0 740.2G  0 part
>> > /var/lib/ceph/osd/ceph-22
>> > `-sdl2                             8:178  0     5G  0 part
>> > sdm                                8:192  0 745.2G  0 disk
>> > |-sdm1                             8:193  0 740.2G  0 part
>> > /var/lib/ceph/osd/ceph-23
>> > `-sdm2                             8:194  0     5G  0 part
>> >
>> >
>> > $ rados bench -p rbd 20 write --no-cleanup -t 4
>> >  Maintaining 4 concurrent writes of 4194304 bytes for up to 20 seconds or 0 objects
>> >  Object prefix: benchmark_data_cibm01_1409
>> >    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>> >      0       0         0         0         0         0         -         0
>> >      1       4       121       117   467.894       468 0.0337203 0.0336809
>> >      2       4       244       240   479.895       492 0.0304306 0.0330524
>> >      3       4       372       368   490.559       512 0.0361914 0.0323822
>> >      4       4       491       487   486.899       476 0.0346544 0.0327169
>> >      5       4       587       583   466.302       384  0.110718 0.0342427
>> >      6       4       701       697   464.575       456 0.0324953 0.0343136
>> >      7       4       811       807   461.053       440 0.0400344 0.0345994
>> >      8       4       923       919   459.412       448 0.0255677 0.0345767
>> >      9       4      1032      1028   456.803       436 0.0309743 0.0349256
>> >     10       4      1119      1115   445.917       348  0.229508 0.0357856
>> >     11       4      1222      1218   442.826       412 0.0277902 0.0360635
>> >     12       4      1315      1311   436.919       372 0.0303377 0.0365673
>> >     13       4      1424      1420   436.842       436 0.0288001   0.03659
>> >     14       4      1524      1520   434.206       400 0.0360993 0.0367697
>> >     15       4      1632      1628   434.054       432 0.0296406 0.0366877
>> >     16       4      1740      1736   433.921       432 0.0310995 0.0367746
>> >     17       4      1836      1832    430.98       384 0.0250518 0.0370169
>> >     18       4      1941      1937   430.366       420  0.027502 0.0371341
>> >     19       4      2049      2045   430.448       432 0.0260257 0.0370807
>> > 2015-11-23 12:10:58.587087 min lat: 0.0229266 max lat: 0.27063 avg lat: 0.0373936
>> >    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>> >     20       4      2141      2137   427.322       368 0.0351276 0.0373936
>> >  Total time run:         20.186437
>> > Total writes made:      2141
>> > Write size:             4194304
>> > Bandwidth (MB/sec):     424.245
>> >
>> > Stddev Bandwidth:       102.136
>> > Max bandwidth (MB/sec): 512
>> > Min bandwidth (MB/sec): 0
>> > Average Latency:        0.0376536
>> > Stddev Latency:         0.032886
>> > Max latency:            0.27063
>> > Min latency:            0.0229266
>> >
>> >
>> > $ rados bench -p rbd 20 seq --no-cleanup -t 4
>> >    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>> >      0       0         0         0         0         0         -         0
>> >      1       4       394       390   1559.52      1560 0.0148888 0.0102236
>> >      2       4       753       749   1496.68      1436 0.0129162 0.0106595
>> >      3       4      1137      1133   1509.65      1536 0.0101854 0.0105731
>> >      4       4      1526      1522   1521.17      1556 0.0122154 0.0103827
>> >      5       4      1890      1886   1508.07      1456 0.00825445 0.0105908
>> >  Total time run:        5.675418
>> > Total reads made:     2141
>> > Read size:            4194304
>> > Bandwidth (MB/sec):    1508.964
>> >
>> > Average Latency:       0.0105951
>> > Max latency:           0.211469
>> > Min latency:           0.00603694
>> >
>> >
>> > I'm not even close to the numbers that you are getting... :( Any ideas or
>> > hints? Also, I've configured NOOP as the scheduler for all the SSD disks. I
>> > really don't know what else to look for in order to improve performance and
>> > get numbers similar to what you are getting.
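>> >
>> > In case it matters, the NOOP scheduler is set per drive through sysfs,
>> > something like this for each OSD disk (sdf shown as an example):
>> >
>> > echo noop > /sys/block/sdf/queue/scheduler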
>> >
>> >
>> > Thanks in advance,
>> >
>> > Cheers,
>> >
>> >
>> > German
>> >
>> > 2015-11-23 13:32 GMT-03:00 Mark Nelson :
>> >>
>> >> Hi German,
>> >>
>> >> I don't have exactly the same setup, but on the ceph community cluster I
>> >> have run tests with:
>> >>
>> >> 4 nodes, each of which is configured in some tests with:
>> >>
>> >> 2 x Intel Xeon E5-2650
>> >> 1 x Intel XL710 40GbE (currently limited to about 2.5GB/s each)
>> >> 1 x Intel P3700 800GB (4 OSDs per card, using 4 data and 4 journal partitions)
>> >> 64GB RAM
>> >>
>> >> With filestore, I can get an aggregate throughput of:
>> >>
>> >> 1MB randread: 8715.3MB/s
>> >> 4MB randread: 8046.2MB/s
>> >>
>> >> This is with 4 fio instances on the same nodes as the OSDs, using the fio
>> >> librbd engine.
>> >>
>> >> A couple of things I would suggest trying:
>> >>
>> >> 1) See how rados bench does.  This is an easy test and you can see how
>> >> different the numbers look.
>> >>
>> >> 2) try fio with librbd to see if it might be a qemu limitation.
>> >>
>> >> 3) Assuming you are using IPoIB, try some iperf tests to see how your
>> >> network is doing (rough example invocations below).
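>> >>
>> >> To make those concrete (pool/image names and the iperf target are
>> >> placeholders, and the fio rbd engine needs a fio build with rbd support):
>> >>
>> >> # 1) rados bench: 4MB writes, then sequential reads of the same objects
>> >> rados bench -p rbd 20 write --no-cleanup
>> >> rados bench -p rbd 20 seq
>> >>
>> >> # 2) fio straight against an RBD image with librbd (no qemu in the path)
>> >> fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimg \
>> >>     --rw=randread --bs=1m --iodepth=32 --numjobs=4 --runtime=22 \
>> >>     --time_based --group_reporting --name=librbd-randread
>> >>
>> >> # 3) raw IPoIB throughput between two OSD nodes
>> >> iperf -s                   # on one node
>> >> iperf -c <ib0 addr> -P 4   # from another node, pointed at the first node's IPoIB address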
>> >>
>> >> Mark
>> >>
>> >>
>> >> On 11/23/2015 10:17 AM, German Anders wrote:
>> >>>
>> >>> Thanks a lot for the quick update Greg. This leads me to ask if there's
>> >>> anything out there to improve performance in an InfiniBand environment
>> >>> with Ceph. In the cluster that I mentioned earlier I've set up 4 OSD
>> >>> server nodes, each with 8 OSD daemons running on 800GB Intel SSD
>> >>> DC S3710 disks (740.2G for the OSD and 5G for the journal), also using IB FDR
>> >>> 56Gb/s for the PUB and CLUS networks, and I'm getting the following fio
>> >>> numbers:
>> >>>
>> >>>
>> >>> # fio --rw=randread --bs=1m --numjobs=4 --iodepth=32 --runtime=22 --time_based --size=16777216k --loops=1 --ioengine=libaio --direct=1 --invalidate=1 --fsync_on_close=1 --randrepeat=1 --norandommap --group_reporting --exitall --name dev-ceph-randread-1m-4thr-libaio-32iodepth-22sec --filename=/mnt/rbd/test1
>> >>> dev-ceph-randread-1m-4thr-libaio-32iodepth-22sec: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
>> >>> ...
>> >>> dev-ceph-randread-1m-4thr-libaio-32iodepth-22sec: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=32
>> >>> fio-2.1.3
>> >>> Starting 4 processes
>> >>> dev-ceph-randread-1m-4thr-libaio-32iodepth-22sec: Laying out IO file(s) (1 file(s) / 16384MB)
>> >>> Jobs: 4 (f=4): [rrrr] [33.8% done] [1082MB/0KB/0KB /s] [1081/0/0 iops] [eta 00m:45s]
>> >>> dev-ceph-randread-1m-4thr-libaio-32iodepth-22sec: (groupid=0, jobs=4): err= 0: pid=63852: Mon Nov 23 10:48:07 2015
>> >>>    read : io=21899MB, bw=988.23MB/s, iops=988, runt= 22160msec
>> >>>      slat (usec): min=192, max=186274, avg=3990.48, stdev=7533.77
>> >>>      clat (usec): min=10, max=808610, avg=125099.41, stdev=90717.56
>> >>>       lat (msec): min=6, max=809, avg=129.09, stdev=91.14
>> >>>      clat percentiles (msec):
>> >>>       |  1.00th=[   27],  5.00th=[   38], 10.00th=[   45], 20.00th=[   61],
>> >>>       | 30.00th=[   74], 40.00th=[   85], 50.00th=[  100], 60.00th=[  117],
>> >>>       | 70.00th=[  141], 80.00th=[  174], 90.00th=[  235], 95.00th=[  297],
>> >>>       | 99.00th=[  482], 99.50th=[  578], 99.90th=[  717], 99.95th=[  750],
>> >>>       | 99.99th=[  775]
>> >>>      bw (KB  /s): min=134691, max=335872, per=25.08%, avg=253748.08, stdev=40454.88
>> >>>      lat (usec) : 20=0.01%
>> >>>      lat (msec) : 10=0.02%, 20=0.27%, 50=12.90%, 100=36.93%, 250=41.39%
>> >>>      lat (msec) : 500=7.59%, 750=0.84%, 1000=0.05%
>> >>>    cpu          : usr=0.11%, sys=26.76%, ctx=39695, majf=0, minf=405
>> >>>    IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.3%, 32=99.4%, >=64=0.0%
>> >>>       submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>> >>>       complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>> >>>       issued    : total=r=21899/w=0/d=0, short=r=0/w=0/d=0
>> >>>
>> >>> Run status group 0 (all jobs):
>> >>>     READ: io=21899MB, aggrb=988.23MB/s, minb=988.23MB/s, maxb=988.23MB/s, mint=22160msec, maxt=22160msec
>> >>>
>> >>> Disk stats (read/write):
>> >>>    rbd1: ios=43736/163, merge=0/5, ticks=3189484/15276, in_queue=3214988, util=99.78%
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> ############################################################################################################################################################
>> >>>
>> >>>
>> >>> # fio --rw=randread --bs=4m --numjobs=4 --iodepth=32 --runtime=22 --time_based --size=16777216k --loops=1 --ioengine=libaio --direct=1 --invalidate=1 --fsync_on_close=1 --randrepeat=1 --norandommap --group_reporting --exitall --name dev-ceph-randread-4m-4thr-libaio-32iodepth-22sec --filename=/mnt/rbd/test2
>> >>>
>> >>> fio-2.1.3
>> >>> Starting 4 processes
>> >>> dev-ceph-randread-4m-4thr-libaio-32iodepth-22sec: Laying out IO file(s) (1 file(s) / 16384MB)
>> >>> Jobs: 4 (f=4): [rrrr] [28.7% done] [894.3MB/0KB/0KB /s] [223/0/0 iops] [eta 00m:57s]
>> >>> dev-ceph-randread-4m-4thr-libaio-32iodepth-22sec: (groupid=0, jobs=4): err= 0: pid=64654: Mon Nov 23 10:51:58 2015
>> >>>    read : io=18952MB, bw=876868KB/s, iops=214, runt= 22132msec
>> >>>      slat (usec): min=518, max=81398, avg=18576.88, stdev=14840.55
>> >>>      clat (msec): min=90, max=1915, avg=570.37, stdev=166.51
>> >>>       lat (msec): min=123, max=1936, avg=588.95, stdev=169.19
>> >>>      clat percentiles (msec):
>> >>>       |  1.00th=[  258],  5.00th=[  343], 10.00th=[  383], 20.00th=[  437],
>> >>>       | 30.00th=[  482], 40.00th=[  519], 50.00th=[  553], 60.00th=[  594],
>> >>>       | 70.00th=[  627], 80.00th=[  685], 90.00th=[  775], 95.00th=[  865],
>> >>>       | 99.00th=[ 1057], 99.50th=[ 1156], 99.90th=[ 1680], 99.95th=[ 1860],
>> >>>       | 99.99th=[ 1909]
>> >>>      bw (KB  /s): min= 5665, max=383251, per=24.61%, avg=215755.74, stdev=61735.70
>> >>>      lat (msec) : 100=0.02%, 250=0.80%, 500=33.88%, 750=53.31%, 1000=10.26%
>> >>>      lat (msec) : 2000=1.73%
>> >>>    cpu          : usr=0.07%, sys=12.52%, ctx=32466, majf=0, minf=372
>> >>>    IO depths    : 1=0.1%, 2=0.2%, 4=0.3%, 8=0.7%, 16=1.4%, 32=97.4%, >=64=0.0%
>> >>>       submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>> >>>       complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>> >>>       issued    : total=r=4738/w=0/d=0, short=r=0/w=0/d=0
>> >>>
>> >>> Run status group 0 (all jobs):
>> >>>     READ: io=18952MB, aggrb=876868KB/s, minb=876868KB/s, maxb=876868KB/s, mint=22132msec, maxt=22132msec
>> >>>
>> >>> Disk stats (read/write):
>> >>>    rbd1: ios=37721/177, merge=0/5, ticks=3075924/11408, in_queue=3097448, util=99.77%
>> >>>
>> >>>
>> >>> Can anyone share some results from a similar environment?
>> >>>
>> >>> Thanks in advance,
>> >>>
>> >>> Best,
>> >>>
>> >>>
>> >>> *German*
>> >>>
>> >>> 2015-11-23 13:08 GMT-03:00 Gregory Farnum:
>> >>>
>> >>>     On Mon, Nov 23, 2015 at 10:05 AM, German Anders wrote:
>> >>>     > Hi all,
>> >>>     >
>> >>>     > I want to know if there's any improvement or update regarding ceph 0.94.5
>> >>>     > with accelio. I have an already configured cluster (with no data on it) and I
>> >>>     > would like to know if there's a way to 'modify' the cluster in order to use
>> >>>     > accelio. Any info would be really appreciated.
>> >>>
>> >>>     The XioMessenger is still experimental. As far as I know it's not
>> >>>     expected to be stable any time soon and I can't imagine it will be
>> >>>     backported to Hammer even when done.
>> >>>     -Greg
>> >>>
>> >>>
>> >>>
>> >>>
>> >
>>
>>
>
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
