no problem
On Wed, 6 Mar 2019 at 12:15, Riccardo Veraldi wrote:
> On 3/6/19 11:29 AM, Amir Shehata wrote:
>
> The load is split across tcp and o2ib0 on the 2.12 client because the
> Multi-Rail (MR) code sees both interfaces, realizes it can use both of
> them, and so it does.
>
The load is split across tcp and o2ib0 on the 2.12 client because the
Multi-Rail (MR) code sees both interfaces, realizes it can use both of
them, and so it does.
To turn this behavior off, you can disable discovery on the 2.12 client. I
think that should just get the client to only
Hello Amir, I answer inline.
On 3/5/19 3:42 PM, Amir Shehata wrote:
It looks like the ping is passing. Did you try it several times to
make sure it always pings successfully?
The way it works is the MDS (2.12) discovers all the interfaces on the
peer. There is a concept of the primary NID for
It is not exactly this problem.
Here is my setup:
* MDS is on tcp0
* client is on tcp0 and o2ib0
* OSS is on tcp0 and o2ib0
The problem is that the MDS is discovering both the Lustre client and
the OSS over o2ib as well, and it should not, because the MDS has only one
Ethernet interface. I
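
As an illustration of the setup described above, here is a minimal Python sketch. It is a toy model only, not Lustre or LNet code: the node names and network labels come from the message, while the dictionary layout and the function names (shared_networks, learned_networks) are made up for this example. It assumes, as stated elsewhere in the thread, that discovery reports every interface on the peer, and it contrasts that with the networks two nodes actually have in common.

```python
# Toy model of the setup in this message. This is NOT Lustre/LNet code;
# the data layout and function names are illustrative assumptions.

NODE_NETWORKS = {
    "MDS": {"tcp0"},                # MDS has a single Ethernet interface
    "client": {"tcp0", "o2ib0"},    # client is on Ethernet and InfiniBand
    "OSS": {"tcp0", "o2ib0"},       # OSS is on Ethernet and InfiniBand
}


def learned_networks(peer, topology=NODE_NETWORKS):
    """Networks a discovering node learns about for `peer`.

    Models the statement in the thread that discovery reports all
    interfaces on the peer, whatever the discovering node itself has.
    """
    return set(topology[peer])


def shared_networks(node, peer, topology=NODE_NETWORKS):
    """Networks on which both `node` and `peer` have an interface."""
    return topology[node] & topology[peer]


if __name__ == "__main__":
    for node, peer in [("MDS", "client"), ("MDS", "OSS"), ("client", "OSS")]:
        learned = sorted(learned_networks(peer))
        shared = sorted(shared_networks(node, peer))
        print(f"{node} -> {peer}: learned={learned} shared={shared}")
```

In this model the MDS learns about o2ib0 on both the client and the OSS, yet the only network it has in common with either of them is tcp0. The client and the OSS have both tcp0 and o2ib0 in common.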
Take a look at this: https://jira.whamcloud.com/browse/LU-11840
Let me know if this is the same issue you're seeing.
On Tue, 5 Mar 2019 at 14:04, Amir Shehata wrote:
> Hi Riccardo,
>
> It's not LNet Health. It's Dynamic Discovery. What's happening is that
> 2.12 is discovering all the
Hi Riccardo,
It's not LNet Health. It's Dynamic Discovery. What's happening is that 2.12
is discovering all the interfaces on the peer. That's why you see all the
interfaces in the peer show.
Multi-Rail doesn't enable o2ib. It just sees it. If the node doing the
discovery has only tcp, then it
I think I figured out the problem.
My problem is related to the LNet Network Health feature:
https://jira.whamcloud.com/browse/LU-9120
The Lustre MDS and the Lustre client, both running version 2.12.0,
negotiate a Multi-Rail peer connection, while this does not happen with
the other clients (2.10.5).
Riccardo,
Since 2.12 is still a relatively new maintenance release, it would be helpful
if you could open an LU ticket and provide more detail there, such as what
the clients were doing, whether you were using any new features (like DoM or
FLR), and full dmesg from the clients and servers involved in these