Thanks Richard, I appreciate your advice.

I was able to saturate the channel using xdd with 10 threads, each thread
writing to a different OST, and each OST on a different OSS. These are the
results:
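For reference, an invocation of roughly this shape produces runs like the ones below. The mount point and file names are placeholders, and exact flag spellings can vary between xdd versions, so treat this as an illustrative sketch rather than the exact command used:

```shell
# 10 targets (one file per OST, precreated with stripe_count=1),
# queue depth 1, 65536 requests of 32 KiB each = 2 GiB per target.
# -reqsize is commonly given in 1024-byte blocks; check your xdd docs.
xdd -op write \
    -targets 10 /lustre/test/ost0 /lustre/test/ost1 /lustre/test/ost2 \
                /lustre/test/ost3 /lustre/test/ost4 /lustre/test/ost5 \
                /lustre/test/ost6 /lustre/test/ost7 /lustre/test/ost8 \
                /lustre/test/ost9 \
    -reqsize 32 -numreqs 65536 -queuedepth 1 -verbose
```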
ETHERNET (Time in seconds, Rate in MB/s):

                 T   Q        Bytes     Ops     Time     Rate     IOPS  Latency    %CPU
TARGET Average   0   1   2147483648   65536  140.156   15.322   467.59   0.0021   39.16
TARGET Average   1   1   2147483648   65536  140.785   15.254   465.50   0.0021   39.11
TARGET Average   2   1   2147483648   65536  140.559   15.278   466.25   0.0021   39.14
TARGET Average   3   1   2147483648   65536  176.141   12.192   372.07   0.0027   38.02
TARGET Average   4   1   2147483648   65536  168.234   12.765   389.55   0.0026   38.54
TARGET Average   5   1   2147483648   65536  140.823   15.250   465.38   0.0021   39.11
TARGET Average   6   1   2147483648   65536  140.183   15.319   467.50   0.0021   39.16
TARGET Average   8   1   2147483648   65536  176.432   12.172   371.45   0.0027   38.02
TARGET Average   9   1   2147483648   65536  167.944   12.787   390.23   0.0026   38.57
Combined        10  10  21474836480  655360  180.000  119.305  3640.89   0.0003  387.99
INFINIBAND (Time in seconds, Rate in MB/s):

                 T   Q        Bytes     Ops     Time      Rate      IOPS  Latency    %CPU
TARGET Average   0   1   2147483648   65536    9.369   229.217   6995.16   0.0001  480.40
TARGET Average   1   1   2147483648   65536    9.540   225.110   6869.80   0.0001  474.25
TARGET Average   2   1   2147483648   65536    8.963   239.582   7311.45   0.0001  479.85
TARGET Average   3   1   2147483648   65536    9.480   226.521   6912.86   0.0001  478.21
TARGET Average   4   1   2147483648   65536    9.109   235.748   7194.47   0.0001  480.83
TARGET Average   5   1   2147483648   65536    9.284   231.299   7058.69   0.0001  479.04
TARGET Average   6   1   2147483648   65536    8.839   242.947   7414.15   0.0001  480.55
TARGET Average   7   1   2147483648   65536    9.210   233.166   7115.65   0.0001  480.17
TARGET Average   8   1   2147483648   65536    9.373   229.125   6992.33   0.0001  475.13
TARGET Average   9   1   2147483648   65536    9.184   233.828   7135.86   0.0001  480.25
Combined        10  10  21474836480  655360    9.540  2251.097  68698.03   0.0000 4788.69
From the combined rates above, that works out to roughly 0.95 Gbit/s over
Ethernet (link maximum 1 Gbit/s) and about 18 Gbit/s over InfiniBand (nominal
maximum 40 Gbit/s).
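As a quick cross-check, assuming xdd's "Rate" column is in decimal MB/s (which matches Bytes divided by Time in the tables above), the two combined rates convert to line rate in Gbit/s as follows:

```python
# Convert xdd's combined Rate (decimal MB/s) to line rate in Gbit/s.
def mb_s_to_gbit_s(rate_mb_s: float) -> float:
    return rate_mb_s * 8 / 1000.0

print(f"Ethernet:   {mb_s_to_gbit_s(119.305):.2f} Gbit/s")   # prints 0.95
print(f"InfiniBand: {mb_s_to_gbit_s(2251.097):.2f} Gbit/s")  # prints 18.01
```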
Regards!
On 19/05/2014, at 17:37, Mohr Jr, Richard Frank (Rick Mohr) <[email protected]>
wrote:
> Alfonso,
>
> Based on my attempts to benchmark single client Lustre performance, here is
> some advice/comments that I have. (YMMV)
>
> 1) On the IB client, I recommend disabling checksums (lctl set_param
> osc.*.checksums=0). Having checksums enabled sometimes results in a
> significant performance hit.
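Concretely, point (1) can be applied and verified along these lines (a runtime setting; it does not persist across a remount unless made permanent via the usual configuration mechanisms):

```shell
# Disable client-side data checksums on all OSC devices.
lctl set_param osc.*.checksums=0

# Verify the setting took effect (should print 0 for each OSC).
lctl get_param osc.*.checksums
```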
>
> 2) Single-threaded tests (like dd) will usually bottleneck before you can max
> out the total client performance. You need to use a multi-threaded tool
> (like xdd) and have several threads perform IO at the same time in order to
> measure aggregate single client performance.
>
> 3) When using a tool like xdd, set up the test to run for a fixed amount of
> time rather than having each thread write a fixed amount of data. If all
> threads write a fixed amount of data (say 1 GB), and if any of the threads
> run slower than others, you might get skewed results for the aggregate
> throughput because of the stragglers.
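A fixed-duration run per point (3) might be sketched like this; flag names follow common xdd usage but may vary between xdd releases, so verify against your version:

```shell
# Write for a fixed 120 seconds per target instead of a fixed byte
# count, so slow threads cannot skew the aggregate throughput figure.
xdd -op write -targets 2 /lustre/test/f0 /lustre/test/f1 \
    -reqsize 1024 -queuedepth 1 -timelimit 120
```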
>
> 4) In order to avoid contention at the ost level among the multiple threads
> on a single client, precreate the output files with stripe_count=1 and
> statically assign them evenly to the different osts. Have each thread write
> to a different file so that no two processes write to the same ost. If you
> don't have enough osts to saturate the client, you can always have two files
> per ost. Going beyond that will likely hurt more than help, at least for an
> ldiskfs backend.
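The precreation step in (4) can be sketched as follows; the path is a placeholder, `-c` sets the stripe count, and `-i` pins the file's starting OST index:

```shell
# Precreate one output file per OST: stripe_count=1, statically
# assigned to OST indices 0..9, so no two threads share an OST.
for i in $(seq 0 9); do
    lfs setstripe -c 1 -i "$i" /lustre/test/file-ost$i
done
```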
>
> 5) In my testing, I seem to get worse results using direct I/O for write
> tests, so I usually just use buffered I/O. Based on my understanding, the
> max_dirty_mb parameter on the client (which defaults to 32 MB) limits the
> amount of dirty written data that can be cached on each ost. Unless you have
> increased this to a very large number, that parameter will likely mitigate
> any effects of client caching on the test results. (NOTE: This reasoning
> only applies to write tests. Any written data can still be cached by the
> client, and a subsequent read test might very well pull data from cache
> unless you have taken steps to flush the cached data.)
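To inspect that limit, and to drop cached data before a read-back test, something like the following works (run on the client as root; the drop_caches step is the standard Linux mechanism, not Lustre-specific):

```shell
# Show the per-OSC dirty-data limit (defaults to 32 MB per OST).
lctl get_param osc.*.max_dirty_mb

# Before a read test: flush dirty pages, then drop the page cache so
# reads actually travel over the network instead of hitting cache.
sync
echo 3 > /proc/sys/vm/drop_caches
```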
>
> If you have 10 oss nodes and 20 osts in your file system, I would start by
> running a test with 10 threads and have each thread write to a single ost on
> different servers. You can increase/decrease the number of threads as needed
> to see if the aggregate performance gets better/worse. On my clients with
> QDR IB, I typically see aggregate write speeds in the range of 2.5-3.0 GB/s.
>
> You are probably already aware of this, but just in case, make sure that the
> IB clients you use for testing don't also have ethernet connections to your
> OSS servers. If the client has an ethernet and an IB path to the same
> server, it will choose one of the paths to use. It could end up choosing
> ethernet instead of IB and mess up your results.
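To confirm which network a client is actually using, commands along these lines help (the NID address here is a placeholder):

```shell
# NIDs configured on the local node; both tcp and o2ib NIDs appear
# if LNet brought up both networks.
lctl list_nids

# LNet-level ping of the server's IB NID (address is an example).
lctl ping 10.0.0.1@o2ib

# Inspect which connection each OSC import is actually using.
lctl get_param osc.*.import | grep -i connection
```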
>
> --
> Rick Mohr
> Senior HPC System Administrator
> National Institute for Computational Sciences
> http://www.nics.tennessee.edu
>
>
> On May 19, 2014, at 6:33 AM, "Pardo Diaz, Alfonso" <[email protected]>
> wrote:
>
>> Hi,
>>
>> I have migrated my Lustre 2.2 filesystem to 2.5.1 and equipped my OSS/MDS
>> and clients with InfiniBand QDR interfaces.
>> I compiled Lustre against OFED 3.2 and configured the lnet module with:
>>
>> options lnet networks="o2ib(ib0),tcp(eth0)"
>>
>>
>> But when I test Lustre performance over InfiniBand (o2ib), I get roughly
>> the same performance as over Ethernet (tcp):
>>
>> INFINIBAND TEST:
>> dd if=/dev/zero of=test.dat bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1,0 GB) copied, 5,88433 s, 178 MB/s
>>
>> ETHERNET TEST:
>> dd if=/dev/zero of=test.dat bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1,0 GB) copied, 5,97423 s, 154 MB/s
>>
>>
>> And this is my scenario:
>>
>> - 1 MDS with an SSD RAID10 MDT
>> - 10 OSS with 2 OSTs per OSS
>> - InfiniBand interfaces in connected mode
>> - CentOS 6.5
>> - Lustre 2.5.1
>> - Striped filesystem: "lfs setstripe -s 1M -c 10"
>>
>>
>> I know my InfiniBand is working correctly because, with iperf3 between
>> client and servers, I get 40 Gbit/s over InfiniBand and 1 Gbit/s over
>> Ethernet.
>>
>>
>>
>> Could you help me?
>>
>>
>> Regards,
>>
>>
>>
>>
>>
>> Alfonso Pardo Diaz
>> System Administrator / Researcher
>> c/ Sola nº 1; 10200 Trujillo, ESPAÑA
>> Tel: +34 927 65 93 17 Fax: +34 927 32 32 37
>>
>>
>>
>>
>> ----------------------------
>>
>> Disclaimer:
>> This message and its attached files is intended exclusively for its
>> recipients and may contain confidential information. If you received this
>> e-mail in error you are hereby notified that any dissemination, copy or
>> disclosure of this communication is strictly prohibited and may be unlawful.
>> In this case, please notify us by a reply and delete this email and its
>> contents immediately.
>> ----------------------------
>>
>> _______________________________________________
>> HPDD-discuss mailing list
>> [email protected]
>> https://lists.01.org/mailman/listinfo/hpdd-discuss
>
>
>
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss