Re: [lustre-discuss] weird issue w. lnet routers

2017-11-29 Thread John Casu

thanks guys for all your help.

Looks like the issue is fundamentally poor performance across 100GbE, where I'm only
getting ~50Gb/s using iperf. I believe the MTU is set correctly across all my systems.
Using connectx-4 cards in 100GbE mode.
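
(For reference, the kind of iperf check I mean. iperf3 syntax shown, and the router
hostname is just a placeholder; a single TCP stream rarely fills a 100GbE link, so it is
worth comparing one stream against several parallel streams:)

    iperf3 -c router01 -t 30           # single TCP stream to an lnet router (placeholder host)
    iperf3 -c router01 -t 30 -P 8      # eight parallel streams for comparison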

thanks again,
-john



On 11/28/17 9:03 PM, Colin Faber wrote:

Are peer credits set appropriately across your fabric?

On Nov 28, 2017 8:40 PM, "john casu" wrote:

Thanks,
just about to try that MTU setting.

It's a small lustre system... 2 routers, MDS/MGS pair, OSS pair, JBOD pair 
(112 drives for OST)
and yes, routing between EDR & 100GbE

-john

On 11/28/17 7:28 PM, Raj wrote:

John, increasing the MTU size on the Ethernet side should increase the b/w. I
also have a feeling that some lnet routers and/or
intermediate switches/routers do not have jumbo frames turned on (some
switches need to be set at 9212 bytes).
How many LNet routers are you using? I believe you are routing between
EDR IB and 100GbE.


On Tue, Nov 28, 2017 at 7:21 PM John Casu wrote:

     just built a system w. lnet routers that bridge Infiniband & 100GbE, using the CentOS built-in Infiniband support.
     servers are Infiniband, clients are 100GbE (connectx-4 cards)

     my direct write performance from clients over Infiniband is around 15GB/s

     When I introduced the lnet routers, performance dropped to 10GB/s

     Thought the problem was an MTU of 1500, but when I changed the MTUs to 9000,
     performance dropped to 3GB/s.

     When I tuned according to John Fragalla's LUG slides, things went even slower (1.5GB/s write)

     does anyone have any ideas on what I'm doing wrong??

     thanks,
     -john c.




___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] weird issue w. lnet routers

2017-11-28 Thread Colin Faber
Are peer credits set appropriately across your fabric?
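
(For illustration, peer credits are LNet module parameters that are usually kept
consistent on clients, routers, and servers. The values below are placeholders, not
recommendations:)

    # /etc/modprobe.d/lustre.conf  (example values only; keep them consistent across the fabric)
    options ko2iblnd peer_credits=128 concurrent_sends=256 credits=1024
    options ksocklnd peer_credits=128 credits=1024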

On Nov 28, 2017 8:40 PM, "john casu"  wrote:

> Thanks,
> just about to try that MTU setting.
>
> It's a small lustre system... 2 routers, MDS/MGS pair, OSS pair, JBOD pair
> (112 drives for OST)
> and yes, routing between EDR & 100GbE
>
> -john
>
> On 11/28/17 7:28 PM, Raj wrote:
>
>> John, increasing the MTU size on the Ethernet side should increase the b/w. I
>> also have a feeling that some lnet routers and/or
>> intermediate switches/routers do not have jumbo frames turned on (some
>> switches need to be set at 9212 bytes).
>> How many LNet routers are you using? I believe you are routing between
>> EDR IB and 100GbE.
>>
>>
>> On Tue, Nov 28, 2017 at 7:21 PM John Casu wrote:
>>
>> just built a system w. lnet routers that bridge Infiniband & 100GbE,
>> using the CentOS built-in Infiniband support.
>> servers are Infiniband, clients are 100GbE (connectx-4 cards)
>>
>> my direct write performance from clients over Infiniband is around
>> 15GB/s
>>
>> When I introduced the lnet routers, performance dropped to 10GB/s
>>
>> Thought the problem was an MTU of 1500, but when I changed the MTUs
>> to 9000
>> performance dropped to 3GB/s.
>>
>> When I tuned according to John Fragalla's LUG slides, things went
>> even slower (1.5GB/s write)
>>
>> does anyone have any ideas on what I'm doing wrong??
>>
>> thanks,
>> -john c.
>>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] weird issue w. lnet routers

2017-11-28 Thread john casu

Thanks,
just about to try that MTU setting.

It's a small lustre system... 2 routers, MDS/MGS pair, OSS pair, JBOD pair (112 drives for OST)
and yes, routing between EDR & 100GbE

-john

On 11/28/17 7:28 PM, Raj wrote:

John, increasing the MTU size on the Ethernet side should increase the b/w. I also have
a feeling that some lnet routers and/or intermediate switches/routers do not have
jumbo frames turned on (some switches need to be set at 9212 bytes).
How many LNet routers are you using? I believe you are routing between EDR IB
and 100GbE.


On Tue, Nov 28, 2017 at 7:21 PM John Casu wrote:

just built a system w. lnet routers that bridge Infiniband & 100GbE, using the
CentOS built-in Infiniband support.
servers are Infiniband, clients are 100GbE (connectx-4 cards)

my direct write performance from clients over Infiniband is around 15GB/s

When I introduced the lnet routers, performance dropped to 10GB/s

Thought the problem was an MTU of 1500, but when I changed the MTUs to 9000
performance dropped to 3GB/s.

When I tuned according to John Fragalla's LUG slides, things went even
slower (1.5GB/s write)

does anyone have any ideas on what I'm doing wrong??

thanks,
-john c.



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] weird issue w. lnet routers

2017-11-28 Thread john casu

I set MTUs on switches to 9100 and nodes to 9000.
It was kind of bizarre to see what happened.

The only tuning I've done is to set up lnet.conf according to John's slides.
Haven't yet done any other tuning.

my IOR benchmark for Infiniband had 20 clients/node (1 per core).
Halving that number improved my write bandwidth from 1.5 GB/s to 9 GB/s.
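
(For illustration, by clients/node I mean IOR tasks per client node; roughly a run along
these lines, where the task counts, transfer/block sizes, and mount point are
placeholders:)

    # 20 tasks per node (1 per core) vs. 10 tasks per node; -np = tasks_per_node * client_nodes
    mpirun -np 160 --map-by ppr:20:node ior -w -t 1m -b 4g -F -o /mnt/lustre/ior_test
    mpirun -np  80 --map-by ppr:10:node ior -w -t 1m -b 4g -F -o /mnt/lustre/ior_test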

-john

On 11/28/17 5:42 PM, John Fragalla wrote:

Hi John C,

This is interesting.  When you changed the MTU size, did you change it end to end,
including the switches?  If any path does not have jumbo frames enabled, it will revert
back to 1500.

Did you tune the servers, routers, and clients with the same lnet params?  Did you tune
the Lustre clients regarding max rpcs, dirty_mb, LRU, checksums, max_pages_per_rpc, etc.?

On the switch, the MTU should be set to the max, bigger than 9000, to accommodate the
payload size coming from the nodes.

Thanks.

jnf

--
John Fragalla
Senior Storage Engineer
High Performance Computing
Cray Inc.
jfraga...@cray.com 
+1-951-258-7629



On 11/28/17 5:21 PM, John Casu wrote:

just built a system w. lnet routers that bridge Infiniband & 100GbE, using the
CentOS built-in Infiniband support.
servers are Infiniband, clients are 100GbE (connectx-4 cards)

my direct write performance from clients over Infiniband is around 15GB/s

When I introduced the lnet routers, performance dropped to 10GB/s

Thought the problem was an MTU of 1500, but when I changed the MTUs to 9000
performance dropped to 3GB/s.

When I tuned according to John Fragalla's LUG slides, things went even slower
(1.5GB/s write)

does anyone have any ideas on what I'm doing wrong??

thanks,
-john c.




___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] weird issue w. lnet routers

2017-11-28 Thread Raj
John, increasing the MTU size on the Ethernet side should increase the b/w. I also
have a feeling that some lnet routers and/or intermediate switches/routers
do not have jumbo frames turned on (some switches need to be set at 9212
bytes).
How many LNet routers are you using? I believe you are routing between EDR
IB and 100GbE.
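
(A quick way to confirm jumbo frames survive the whole path; the interface name and the
router address below are placeholders:)

    ip link show enp1s0 | grep mtu        # expect mtu 9000 on clients and routers
    ping -M do -s 8972 -c 3 10.0.10.1     # 8972 B payload + 28 B headers = 9000; fails if any hop is still at 1500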


On Tue, Nov 28, 2017 at 7:21 PM John Casu  wrote:

> just built a system w. lnet routers that bridge Infiniband & 100GbE, using
> the CentOS built-in Infiniband support.
> servers are Infiniband, clients are 100GbE (connectx-4 cards)
>
> my direct write performance from clients over Infiniband is around 15GB/s
>
> When I introduced the lnet routers, performance dropped to 10GB/s
>
> Thought the problem was an MTU of 1500, but when I changed the MTUs to 9000
> performance dropped to 3GB/s.
>
> When I tuned according to John Fragalla's LUG slides, things went even
> slower (1.5GB/s write)
>
> does anyone have any ideas on what I'm doing wrong??
>
> thanks,
> -john c.
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] weird issue w. lnet routers

2017-11-28 Thread John Fragalla

Hi John C,

This is interesting.  When you changed the MTU size, did you change it end
to end, including the switches?  If any path does not have jumbo frames
enabled, it will revert back to 1500.


Did you tune the servers, routers, and clients with the same lnet
params?  Did you tune the Lustre clients regarding max rpcs, dirty_mb, LRU,
checksums, max_pages_per_rpc, etc.?


On the switch, the MTU should be set to the max, bigger than 9000, to
accommodate the payload size coming from the nodes.
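
(For reference, these are the kinds of client-side knobs I mean; the parameter names are
from the Lustre manual and the values shown are examples only, not recommendations:)

    lctl get_param osc.*.max_rpcs_in_flight osc.*.max_dirty_mb osc.*.checksums osc.*.max_pages_per_rpc
    lctl set_param osc.*.max_rpcs_in_flight=16
    lctl set_param osc.*.max_dirty_mb=256
    lctl set_param osc.*.checksums=0
    lctl set_param ldlm.namespaces.*.lru_size=128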


Thanks.

jnf

--
John Fragalla
Senior Storage Engineer
High Performance Computing
Cray Inc.
jfraga...@cray.com 
+1-951-258-7629

On 11/28/17 5:21 PM, John Casu wrote:
just built a system w. lnet routers that bridge Infiniband & 100GbE,
using the CentOS built-in Infiniband support.

servers are Infiniband, clients are 100GbE (connectx-4 cards)

my direct write performance from clients over Infiniband is around 15GB/s

When I introduced the lnet routers, performance dropped to 10GB/s

Thought the problem was an MTU of 1500, but when I changed the MTUs to 9000,
performance dropped to 3GB/s.

When I tuned according to John Fragalla's LUG slides, things went even
slower (1.5GB/s write)


does anyone have any ideas on what I'm doing wrong??

thanks,
-john c.



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] weird issue w. lnet routers

2017-11-28 Thread Jeff Johnson
John,

I can't speak to Fragalla's tuning making things worse, but...

Have you run iperf3 and lnet_selftest from your Ethernet clients to each of
the lnet routers to establish what your top end is? It'd be good to
determine if you have an Ethernet problem vs a lnet problem.

Also, are you running Ethernet RDMA? If not, interrupts on the receive end
can be vexing.
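
(A minimal lnet_selftest sketch, client to router over the Ethernet side; the NIDs below
are placeholders:)

    export LST_SESSION=$$
    lst new_session eth_to_router
    lst add_group clients 192.168.10.21@tcp
    lst add_group routers 192.168.10.1@tcp
    lst add_batch bulk
    lst add_test --batch bulk --from clients --to routers brw write size=1M
    lst run bulk
    lst stat clients routers      # Ctrl-C to stop sampling
    lst end_session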

--Jeff

On Tue, Nov 28, 2017 at 17:21 John Casu  wrote:

> just built a system w. lnet routers that bridge Infiniband & 100GbE, using
> the CentOS built-in Infiniband support.
> servers are Infiniband, clients are 100GbE (connectx-4 cards)
>
> my direct write performance from clients over Infiniband is around 15GB/s
>
> When I introduced the lnet routers, performance dropped to 10GB/s
>
> Thought the problem was an MTU of 1500, but when I changed the MTUs to 9000
> performance dropped to 3GB/s.
>
> When I tuned according to John Fragalla's LUG slides, things went even
> slower (1.5GB/s write)
>
> does anyone have any ideas on what I'm doing wrong??
>
> thanks,
> -john c.
>
-- 
--
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite D - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] weird issue w. lnet routers

2017-11-28 Thread John Casu

just built a system w. lnet routers that bridge Infiniband & 100GbE, using the
CentOS built-in Infiniband support.
servers are Infiniband, clients are 100GbE (connectx-4 cards)
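
(Roughly the shape of the lnet configuration, as a sketch only; the network names,
interface names, and gateway NIDs below are placeholders, not my actual settings:)

    # on the lnet routers (dual-homed, forwarding enabled)
    options lnet networks="o2ib0(ib0),tcp0(enp1s0)" forwarding="enabled"
    # on the 100GbE clients
    options lnet networks="tcp0(enp1s0)" routes="o2ib0 192.168.10.1@tcp0"
    # on the IB servers
    options lnet networks="o2ib0(ib0)" routes="tcp0 10.0.10.1@o2ib0"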

my direct write performance from clients over Infiniband is around 15GB/s

When I introduced the lnet routers, performance dropped to 10GB/s

Thought the problem was an MTU of 1500, but when I changed the MTUs to 9000
performance dropped to 3GB/s.

When I tuned according to John Fragalla's LUG slides, things went even slower
(1.5GB/s write)

does anyone have any ideas on what I'm doing wrong??

thanks,
-john c.

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org