Re: [lustre-discuss] Lustre traffic slow on OPA fabric network

2018-08-15 Thread Robin Humble
Hi Kurt,

On Thu, Jul 12, 2018 at 02:36:49PM -0400, Kurt Strosahl wrote:
>   That's really helpful.  The version on the servers is IEEL 2.5.42, while 
> the routers and OPA nodes are all running 2.10.4... We've been looking at 
> upgrading our old system to 2.10 or 2.11.

just an update on this. we moved our old 2.5 IEEL lustre to 2.10.4
(still rhel6.x) but sadly it didn't solve our lnet routing problem.
sorry for the bad advice.

>   I checked the opa clients and the lnet routers; they all use the same 
> parameters that you do, except for map_on_demand (which our system 
> defaults to 256).

we eventually realised that with the "new" ways of setting ko2iblnd and
lnet options we could configure each card (qib/mlnx, opa) separately and
have them "optimal", but it still doesn't run error-free so far.

haven't 100% ruled out shonky FINSTAR opa optical cables yet, but it
seems quite unlikely.
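
(how we're checking the cables, in case it's useful: the intel opa tools
can dump per-port link quality and error counters. this assumes the
opa-basic-tools / fastfabric packages are installed:)

  opainfo               # local hfi port state, link quality, cable info
  opareport -o errors   # fabric-wide scan for ports with error counters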

did you make any progress?

cheers,
robin

>
>w/r,
>Kurt
>
>- Original Message -
>From: "Robin Humble" 
>To: "Kurt Strosahl" 
>Cc: lustre-discuss@lists.lustre.org
>Sent: Tuesday, July 10, 2018 5:03:30 AM
>Subject: Re: [lustre-discuss] Lustre traffic slow on OPA fabric network
>
>Hi Kurt,
>
>On Tue, Jul 03, 2018 at 02:59:22PM -0400, Kurt Strosahl wrote:
>>   I've been seeing a great deal of slowness from clients on an OPA network 
>> accessing lustre through lnet routers.  The nodes take a very long time to 
>> complete things like lfs df, and show lots of dropped / reestablished 
>> connections.  The OSS systems show this as well, and occasionally will 
>> report that all routes are down to a host on the omnipath fabric.  They 
>> also show large numbers of bulk callback errors.  The lnet routers show 
>> large numbers of PUT_NACK messages, as well as Abort reconnection messages 
>> for nodes on the OPA fabric.
>
>I don't suppose you're talking to a super-old Lustre version via the
>lnet routers?
>
>we see excellent performance OPA to IB via lnet routers with 2.10.x
>clients and 2.9 servers, but when we try to talk to IEEL 2.5.41
>servers we see pretty much exactly the symptoms you describe.
>
>strangely, direct mounts of old lustre on new clients on IB work ok, but
>not via lnet routers to OPA. old lustre to new clients on tcp networks
>are ok. lnet self tests OPA to IB also work fine, it's just when we do
>the actual mounts...
>anyway, we are going to try to resolve the problem by updating the
>IEEL to 2.9 or 2.10.
>
>hmm, now that I think of it, we did have to tweak the ko2iblnd options
>a lot on the lnet router to get it this stable. I forget the symptoms
>we were seeing though, sorry.
>we found the lowest common denominator settings between the IB network
>and the OPA, and tuned ko2iblnd on the lnet routers down to that. if it
>finds one OPA card then Lustre imposes an aggressive OPA config on all
>IB networks, which made our mlx4 cards on an ipath/qib fabric unhappy.
>
>FWIW, for our hardware combo, ko2iblnd options are
>
>  options ko2iblnd-opa peer_credits=8 peer_credits_hiw=0 credits=256 
> concurrent_sends=0 ntx=512 map_on_demand=0 fmr_pool_size=512 
> fmr_flush_trigger=384 fmr_cache=1 conns_per_peer=1
>
>I don't know what most of these do, so please take with a grain of salt.
>
>cheers,
>robin


Re: [lustre-discuss] Lustre traffic slow on OPA fabric network

2018-07-12 Thread Kurt Strosahl
Thanks, 

   That's really helpful.  The version on the servers is IEEL 2.5.42, while the 
routers and OPA nodes are all running 2.10.4... We've been looking at upgrading 
our old system to 2.10 or 2.11.

   I checked the opa clients and the lnet routers; they all use the same 
parameters that you do, except for map_on_demand (which our system defaults 
to 256).
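
(checked by reading the parameters back out of sysfs, which shows what the 
running module actually loaded rather than what modprobe.d asks for, e.g.:)

  # values the loaded ko2iblnd module is actually using
  cat /sys/module/ko2iblnd/parameters/map_on_demand
  # or everything at once, with names:
  grep . /sys/module/ko2iblnd/parameters/*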


w/r,
Kurt

- Original Message -
From: "Robin Humble" 
To: "Kurt Strosahl" 
Cc: lustre-discuss@lists.lustre.org
Sent: Tuesday, July 10, 2018 5:03:30 AM
Subject: Re: [lustre-discuss] Lustre traffic slow on OPA fabric network

Hi Kurt,

On Tue, Jul 03, 2018 at 02:59:22PM -0400, Kurt Strosahl wrote:
>   I've been seeing a great deal of slowness from clients on an OPA network 
> accessing lustre through lnet routers.  The nodes take a very long time to 
> complete things like lfs df, and show lots of dropped / reestablished 
> connections.  The OSS systems show this as well, and occasionally will report 
> that all routes are down to a host on the omnipath fabric.  They also show 
> large numbers of bulk callback errors.  The lnet routers show large numbers 
> of PUT_NACK messages, as well as Abort reconnection messages for nodes on 
> the OPA fabric.

I don't suppose you're talking to a super-old Lustre version via the
lnet routers?

we see excellent performance OPA to IB via lnet routers with 2.10.x
clients and 2.9 servers, but when we try to talk to IEEL 2.5.41
servers we see pretty much exactly the symptoms you describe.

strangely, direct mounts of old lustre on new clients on IB work ok, but
not via lnet routers to OPA. old lustre to new clients on tcp networks
are ok. lnet self tests OPA to IB also work fine, it's just when we do
the actual mounts...
anyway, we are going to try to resolve the problem by updating the
IEEL to 2.9 or 2.10.

hmm, now that I think of it, we did have to tweak the ko2iblnd options
a lot on the lnet router to get it this stable. I forget the symptoms
we were seeing though, sorry.
we found the lowest common denominator settings between the IB network
and the OPA, and tuned ko2iblnd on the lnet routers down to that. if it
finds one OPA card then Lustre imposes an aggressive OPA config on all
IB networks, which made our mlx4 cards on an ipath/qib fabric unhappy.

FWIW, for our hardware combo, ko2iblnd options are

  options ko2iblnd-opa peer_credits=8 peer_credits_hiw=0 credits=256 
concurrent_sends=0 ntx=512 map_on_demand=0 fmr_pool_size=512 
fmr_flush_trigger=384 fmr_cache=1 conns_per_peer=1

I don't know what most of these do, so please take with a grain of salt.

cheers,
robin


Re: [lustre-discuss] Lustre traffic slow on OPA fabric network

2018-07-10 Thread Robin Humble
Hi Kurt,

On Tue, Jul 03, 2018 at 02:59:22PM -0400, Kurt Strosahl wrote:
>   I've been seeing a great deal of slowness from clients on an OPA network 
> accessing lustre through lnet routers.  The nodes take a very long time to 
> complete things like lfs df, and show lots of dropped / reestablished 
> connections.  The OSS systems show this as well, and occasionally will report 
> that all routes are down to a host on the omnipath fabric.  They also show 
> large numbers of bulk callback errors.  The lnet routers show large numbers 
> of PUT_NACK messages, as well as Abort reconnection messages for nodes on 
> the OPA fabric.

I don't suppose you're talking to a super-old Lustre version via the
lnet routers?

we see excellent performance OPA to IB via lnet routers with 2.10.x
clients and 2.9 servers, but when we try to talk to IEEL 2.5.41
servers we see pretty much exactly the symptoms you describe.

strangely, direct mounts of old lustre on new clients on IB work ok, but
not via lnet routers to OPA. old lustre to new clients on tcp networks
are ok. lnet self tests OPA to IB also work fine, it's just when we do
the actual mounts...
anyway, we are going to try to resolve the problem by updating the
IEEL to 2.9 or 2.10.

hmm, now that I think of it, we did have to tweak the ko2iblnd options
a lot on the lnet router to get it this stable. I forget the symptoms
we were seeing though, sorry.
we found the lowest common denominator settings between the IB network
and the OPA, and tuned ko2iblnd on the lnet routers down to that. if it
finds one OPA card then Lustre imposes an aggressive OPA config on all
IB networks, which made our mlx4 cards on an ipath/qib fabric unhappy.

FWIW, for our hardware combo, ko2iblnd options are

  options ko2iblnd-opa peer_credits=8 peer_credits_hiw=0 credits=256 
concurrent_sends=0 ntx=512 map_on_demand=0 fmr_pool_size=512 
fmr_flush_trigger=384 fmr_cache=1 conns_per_peer=1

I don't know what most of these do, so please take with a grain of salt.
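
in case anyone wants to replicate: ours live in /etc/modprobe.d/ko2iblnd.conf,
where (at least in the lustre builds we use) the ko2iblnd-opa alias is only
applied by the ko2iblnd-probe helper when it detects an OPA hfi. roughly:

  # /etc/modprobe.d/ko2iblnd.conf
  alias ko2iblnd-opa ko2iblnd
  options ko2iblnd-opa peer_credits=8 peer_credits_hiw=0 credits=256 \
      concurrent_sends=0 ntx=512 map_on_demand=0 fmr_pool_size=512 \
      fmr_flush_trigger=384 fmr_cache=1 conns_per_peer=1
  install ko2iblnd /usr/sbin/ko2iblnd-probe

remember the module has to be reloaded (lustre_rmmod, or a reboot) before
edits take effect.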

cheers,
robin


Re: [lustre-discuss] Lustre traffic slow on OPA fabric network

2018-07-05 Thread Cory Spitz
It sounds like you've diagnosed the problem as your OPA fabric.  Do you have 
network errors that would help confirm your theory?  Can you test your network 
without Lustre & LNet to prove its fitness?  That is, do you pass network 
diagnostics?  If that goes well, maybe LNet Self Test can help as a diagnostic.  
There is a guide at http://wiki.lustre.org/LNET_Selftest.
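
For a routed setup like yours, something along these lines will drive bulk 
reads from the OPA side to the IB side (the NIDs and group membership are 
made up; lnet_selftest must be loaded on every node involved):

  modprobe lnet_selftest              # console node, clients, and servers
  export LST_SESSION=$$
  lst new_session rtr_check
  lst add_group opa_clients 192.168.1.[10-12]@o2ib1
  lst add_group ib_servers 10.10.0.[1-2]@o2ib0
  lst add_batch bulk
  lst add_test --batch bulk --from opa_clients --to ib_servers \
      brw read check=simple size=1M
  lst run bulk
  lst stat opa_clients ib_servers     # watch rates and errors, Ctrl-C to stop
  lst end_session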

-Cory

-- 

On 7/3/18, 1:59 PM, "lustre-discuss on behalf of Kurt Strosahl" 
 wrote:

Good Afternoon,

   I've been seeing a great deal of slowness from clients on an OPA network 
accessing lustre through lnet routers.  The nodes take a very long time to 
complete things like lfs df, and show lots of dropped / reestablished 
connections.  The OSS systems show this as well, and occasionally will report 
that all routes are down to a host on the omnipath fabric.  They also show 
large numbers of bulk callback errors.  The lnet routers show large numbers of 
PUT_NACK messages, as well as Abort reconnection messages for nodes on the OPA 
fabric.

w/r, 
Kurt J. Strosahl
System Administrator: Lustre, HPC
Scientific Computing Group, Thomas Jefferson National Accelerator Facility


[lustre-discuss] Lustre traffic slow on OPA fabric network

2018-07-03 Thread Kurt Strosahl
Good Afternoon,

   I've been seeing a great deal of slowness from clients on an OPA network 
accessing lustre through lnet routers.  The nodes take a very long time to 
complete things like lfs df, and show lots of dropped / reestablished 
connections.  The OSS systems show this as well, and occasionally will report 
that all routes are down to a host on the omnipath fabric.  They also show 
large numbers of bulk callback errors.  The lnet routers show large numbers of 
PUT_NACK messages, as well as Abort reconnection messages for nodes on the OPA 
fabric.
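
(For anyone wanting to reproduce the checks, this is the sort of thing I'm 
looking at; the OSS NID below is made up:)

  # from an OPA client: ping an OSS through the lnet routers
  lctl ping 172.16.0.10@o2ib0
  # route and statistics view on the 2.10 routers/clients
  lnetctl route show -v
  lnetctl stats show
  # older equivalent on the servers
  lctl route_list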

w/r, 
Kurt J. Strosahl
System Administrator: Lustre, HPC
Scientific Computing Group, Thomas Jefferson National Accelerator Facility
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org