Thanks Richard, I appreciate your advice. I was able to saturate the channel using xdd with 10 threads writing to 10 OSTs, each OST on a different OSS. These are the results:
ETHERNET
                 T   Q  Bytes        Ops     Time     Rate      IOPS      Latency  %CPU
TARGET  Average  0   1  2147483648   65536   140.156  15.322    467.59    0.0021   39.16
TARGET  Average  1   1  2147483648   65536   140.785  15.254    465.50    0.0021   39.11
TARGET  Average  2   1  2147483648   65536   140.559  15.278    466.25    0.0021   39.14
TARGET  Average  3   1  2147483648   65536   176.141  12.192    372.07    0.0027   38.02
TARGET  Average  4   1  2147483648   65536   168.234  12.765    389.55    0.0026   38.54
TARGET  Average  5   1  2147483648   65536   140.823  15.250    465.38    0.0021   39.11
TARGET  Average  6   1  2147483648   65536   140.183  15.319    467.50    0.0021   39.16
TARGET  Average  8   1  2147483648   65536   176.432  12.172    371.45    0.0027   38.02
TARGET  Average  9   1  2147483648   65536   167.944  12.787    390.23    0.0026   38.57
        Combined 10 10  21474836480  655360  180.000  119.305   3640.89   0.0003  387.99

INFINIBAND
                 T   Q  Bytes        Ops     Time     Rate      IOPS      Latency  %CPU
TARGET  Average  0   1  2147483648   65536   9.369    229.217   6995.16   0.0001  480.40
TARGET  Average  1   1  2147483648   65536   9.540    225.110   6869.80   0.0001  474.25
TARGET  Average  2   1  2147483648   65536   8.963    239.582   7311.45   0.0001  479.85
TARGET  Average  3   1  2147483648   65536   9.480    226.521   6912.86   0.0001  478.21
TARGET  Average  4   1  2147483648   65536   9.109    235.748   7194.47   0.0001  480.83
TARGET  Average  5   1  2147483648   65536   9.284    231.299   7058.69   0.0001  479.04
TARGET  Average  6   1  2147483648   65536   8.839    242.947   7414.15   0.0001  480.55
TARGET  Average  7   1  2147483648   65536   9.210    233.166   7115.65   0.0001  480.17
TARGET  Average  8   1  2147483648   65536   9.373    229.125   6992.33   0.0001  475.13
TARGET  Average  9   1  2147483648   65536   9.184    233.828   7135.86   0.0001  480.25
        Combined 10 10  21474836480  655360  9.540    2251.097  68698.03  0.0000  4788.69

My estimate is 0.6 Gbit/s over ethernet (max 1 Gbit/s) and 16 Gbit/s over InfiniBand (max 40 Gbit/s).

Regards!

On 19/05/2014, at 17:37, Mohr Jr, Richard Frank (Rick Mohr) <rm...@utk.edu> wrote:

> Alfonso,
>
> Based on my attempts to benchmark single client Lustre performance, here is
> some advice/comments that I have.
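As a quick sanity check on the estimate above, the combined xdd rates can be converted directly to Gbit/s. The conversion below assumes the Rate column is reported in MB/s (xdd's default), which puts the runs close to ethernet line rate and around 18 Gbit/s on InfiniBand:

```shell
# Convert the combined xdd rates to Gbit/s.
# Assumes xdd's Rate column is MB/s: Gbit/s = rate * 8 / 1000.
awk 'BEGIN {
    printf "ethernet: %.2f Gbit/s\n",   119.305  * 8 / 1000
    printf "infiniband: %.2f Gbit/s\n", 2251.097 * 8 / 1000
}'
```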
> (YMMV)
>
> 1) On the IB client, I recommend disabling checksums (lctl set_param
> osc.*.checksums=0). Having checksums enabled sometimes results in a
> significant performance hit.
>
> 2) Single-threaded tests (like dd) will usually bottleneck before you can max
> out the total client performance. You need to use a multi-threaded tool
> (like xdd) and have several threads perform IO at the same time in order to
> measure aggregate single-client performance.
>
> 3) When using a tool like xdd, set up the test to run for a fixed amount of
> time rather than having each thread write a fixed amount of data. If all
> threads write a fixed amount of data (say 1 GB), and any of the threads
> run slower than the others, the stragglers can skew the aggregate
> throughput results.
>
> 4) To avoid contention at the OST level among the multiple threads on a
> single client, precreate the output files with stripe_count=1 and
> statically assign them evenly to the different OSTs. Have each thread write
> to a different file so that no two processes write to the same OST. If you
> don't have enough OSTs to saturate the client, you can always have two files
> per OST. Going beyond that will likely hurt more than help, at least for an
> ldiskfs backend.
>
> 5) In my testing, I seem to get worse results using direct I/O for write
> tests, so I usually just use buffered I/O. Based on my understanding, the
> max_dirty_mb parameter on the client (which defaults to 32 MB) limits the
> amount of dirty written data that can be cached for each OST. Unless you have
> increased this to a very large number, that parameter will likely mitigate
> any effects of client caching on the test results. (NOTE: This reasoning
> only applies to write tests. Any written data can still be cached by the
> client, and a subsequent read test might very well pull data from cache
> unless you have taken steps to flush the cached data.)
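Items 1, 3 and 4 above can be sketched as a short command sequence. This is a sketch only: it assumes a Lustre mount at /mnt/lustre and OST indices 0-9 (both hypothetical), and xdd option names vary between versions, so check the exact flags against your build before running:

```shell
# Sketch only -- requires a mounted Lustre client; the mount point and
# OST indices are hypothetical, and xdd flags vary by version.

# Item 1: disable client-side checksums for the benchmark
lctl set_param osc.*.checksums=0

# Item 4: precreate one single-stripe file per OST so that no two
# threads share an OST (-c sets stripe count, -i pins the OST index)
for i in $(seq 0 9); do
    lfs setstripe -c 1 -i "$i" /mnt/lustre/xdd_test.$i
done

# Item 3: run the write test for a fixed time rather than a fixed size
xdd -op write -targets 10 /mnt/lustre/xdd_test.{0..9} \
    -blocksize 4096 -reqsize 256 -timelimit 60
```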
>
> If you have 10 OSS nodes and 20 OSTs in your file system, I would start by
> running a test with 10 threads and have each thread write to a single OST on
> a different server. You can increase/decrease the number of threads as needed
> to see if the aggregate performance gets better/worse. On my clients with
> QDR IB, I typically see aggregate write speeds in the range of 2.5-3.0 GB/s.
>
> You are probably already aware of this, but just in case, make sure that the
> IB clients you use for testing don't also have ethernet connections to your
> OSS servers. If the client has an ethernet and an IB path to the same
> server, it will choose one of the paths to use. It could end up choosing
> ethernet instead of IB and mess up your results.
>
> --
> Rick Mohr
> Senior HPC System Administrator
> National Institute for Computational Sciences
> http://www.nics.tennessee.edu
>
>
> On May 19, 2014, at 6:33 AM, "Pardo Diaz, Alfonso" <alfonso.pa...@ciemat.es>
> wrote:
>
>> Hi,
>>
>> I have migrated my Lustre 2.2 to 2.5.1 and I have equipped my OSS/MDS and
>> clients with InfiniBand QDR interfaces.
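Rick's last warning (the client silently preferring the ethernet path) can be checked from the client with lctl. The NID below is a hypothetical example; substitute one of your OSS NIDs:

```shell
# Verify the client actually reaches the servers over o2ib, not tcp.
lctl list_nids                        # client should list an @o2ib NID
lctl ping 10.0.0.1@o2ib               # hypothetical OSS NID; should succeed
lctl get_param osc.*.import \
    | grep current_connection         # shows the NID each OSC is really using
```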
>> I have compiled Lustre against OFED 3.2 and configured the lnet module with:
>>
>> options lnet networks="o2ib(ib0),tcp(eth0)"
>>
>> But when I compare Lustre performance over InfiniBand (o2ib), I get the
>> same performance as over ethernet (tcp):
>>
>> INFINIBAND TEST:
>> dd if=/dev/zero of=test.dat bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1,0 GB) copied, 5,88433 s, 178 MB/s
>>
>> ETHERNET TEST:
>> dd if=/dev/zero of=test.dat bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1,0 GB) copied, 5,97423 s, 154 MB/s
>>
>> This is my scenario:
>>
>> - 1 MDS with an SSD RAID10 MDT
>> - 10 OSS with 2 OSTs per OSS
>> - InfiniBand interfaces in connected mode
>> - CentOS 6.5
>> - Lustre 2.5.1
>> - Striped filesystem: "lfs setstripe -s 1M -c 10"
>>
>> I know my InfiniBand is working correctly, because with iperf3 between
>> client and servers I get 40 Gb/s over InfiniBand and 1 Gb/s over the
>> ethernet connections.
>>
>> Could you help me?
>>
>> Regards,
>>
>> Alfonso Pardo Diaz
>> System Administrator / Researcher
>> c/ Sola nº 1; 10200 Trujillo, ESPAÑA
>> Tel: +34 927 65 93 17  Fax: +34 927 32 32 37
>>
>> ----------------------------
>> Disclaimer:
>> This message and its attached files are intended exclusively for their
>> recipients and may contain confidential information.
>> If you received this e-mail in error, you are hereby notified that any
>> dissemination, copy or disclosure of this communication is strictly
>> prohibited and may be unlawful. In this case, please notify us by reply
>> and delete this email and its contents immediately.
>> ----------------------------
>>
>> _______________________________________________
>> HPDD-discuss mailing list
>> hpdd-disc...@lists.01.org
>> https://lists.01.org/mailman/listinfo/hpdd-discuss
>
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss