Or Gerlitz wrote:
>> If I am not mistaken the issue you mention is a little different from the 
>> one I pointed out.
>> Without bonding I see the following:
>> kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
>> However, with bonding what I see is:
>> ib0: multicast join failed for 0001:0000:0000:0000:0000:0000:0000:0000, status -22
> 
> Please note that -11 is EAGAIN (try again) and -22 is EINVAL (invalid
> argument). You can get EAGAIN when the underlying core SA agent is
> not ready to send SA queries, while you get EINVAL when attempting to
> join on a junk MGID. I am confident that we have seen joins on junk
> MGIDs for a long time and that this has been reported on this list
> (google...) in the past, with no resolution yet.
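
For reference, the two status values in the log lines can be decoded with a
short Python sketch. It assumes the status is a negated Linux errno value, as
the quoted explanation says; the helper name describe_status is hypothetical.

```python
import errno
import os


def describe_status(status):
    """Return (symbolic name, message) for a negated errno-style status code."""
    code = -status
    # errno.errorcode maps numeric values to names for the platform Python runs on.
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


# The two status values reported in the "multicast join failed" messages.
for status in (-11, -22):
    name, message = describe_status(status)
    print(status, name, message)
```

The numeric values are platform-specific, so this is only meaningful when run
on Linux, where 11 is EAGAIN and 22 is EINVAL.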

Or,

I looked through the mailing list going back more than a year. The closest
report I can find to this issue (-EINVAL) was when you reported problems with
a junk MGID on a child interface (and that works properly now).

I agree that the -EAGAIN problem has been known for some time now. However,
this issue with IPoIB bonding is new. My recollection is that it all worked
properly around the end of October. I have not tested since then, so this is
something that must have crept in during the interim.

> 
> Under bonding there might be a window in time where, from the kernel
> network stack's perspective, the bonding device ether-type is ethernet
> and not infiniband, and hence the wrong function (ip_eth_mc_map instead
> of ip_ib_mc_map) would be called to do the mapping from the IP
> multicast address to the HW multicast address.
> 
> 
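
A rough Python sketch of the two mappings shows why this hypothesis fits the
log. It is an illustration, not the kernel code: the function names mirror the
kernel helpers, the IPoIB layout follows RFC 4391 with an assumed link-local
scope and the default P_Key 0xffff, and the group 224.0.0.1 and the zero
padding of the 20-byte hardware address are assumptions.

```python
import socket
import struct


def ip_eth_mc_map(ip):
    """Ethernet mapping: 01:00:5e followed by the low 23 bits of the IPv4 group."""
    addr = struct.unpack("!I", socket.inet_aton(ip))[0]
    return bytes([0x01, 0x00, 0x5E, (addr >> 16) & 0x7F, (addr >> 8) & 0xFF, addr & 0xFF])


def ip_ib_mc_map(ip, scope=0x2, pkey=0xFFFF):
    """IPoIB mapping: 4-byte QPN field, then a 16-byte MGID with the low 28 bits."""
    addr = struct.unpack("!I", socket.inet_aton(ip))[0] & 0x0FFFFFFF
    mgid = (bytes([0xFF, 0x10 | scope, 0x40, 0x1B])  # ff1x prefix, IPv4 signature 0x401b
            + struct.pack("!H", pkey)                # partition key
            + bytes(6)                               # reserved, zero
            + struct.pack("!I", addr))               # low 28 bits of the group address
    return bytes([0x00, 0xFF, 0xFF, 0xFF]) + mgid


def format_mgid(hw_addr):
    """Render bytes 4..19 of a 20-byte hardware address the way the log prints an MGID."""
    mgid = hw_addr[4:20]
    return ":".join(mgid[i:i + 2].hex() for i in range(0, 16, 2))


group = "224.0.0.1"  # assumed example group (all-hosts)
print(format_mgid(ip_ib_mc_map(group)))
# Assumption: the 6-byte Ethernet result lands in an otherwise zeroed 20-byte buffer.
print(format_mgid(ip_eth_mc_map(group).ljust(20, b"\x00")))
```

Under those assumptions, the Ethernet mapping of 224.0.0.1 read back as an
IPoIB hardware address gives the MGID 0001:0000:0000:0000:0000:0000:0000:0000,
which is the value in the -22 message above.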
>> Subsequently an ib-bond status does not reveal any slave as active as shown 
>> below:
>> ib-bond --status
>> bond0: 80:00:04:04:fe:80:00:00:00:00:00:00:00:05:ad:00:00:03:05:b9
>> slave0: ib0
>> slave1: ib1
> 
> As this script is non-standard and deprecated, I would recommend not
> using it and instead checking the classic /proc/net/bonding/bond0
> entry, along with ip addr show on bond0, ib0, and ib1.
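
As a sketch of how the /proc entry could be checked from a script, the Python
below pulls the active slave and the per-slave link status out of the file
text. The field names ("Currently Active Slave", "Slave Interface", "MII
Status") are assumed from the usual layout of the bonding driver's proc
output, and parse_bonding_status is a hypothetical helper.

```python
import os


def parse_bonding_status(text):
    """Return (active slave or None, {slave: MII status}) from bonding proc text."""
    active = None
    slaves = {}
    current = None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Currently Active Slave":
            # The driver prints the literal word "None" when no slave is active.
            active = None if value in ("", "None") else value
        elif key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # A bond-level "MII Status" line before the first slave is skipped.
            slaves[current] = value
    return active, slaves


if __name__ == "__main__":
    path = "/proc/net/bonding/bond0"
    if os.path.exists(path):
        with open(path) as handle:
            print(parse_bonding_status(handle.read()))
```

An active slave of None with both slaves listed would correspond to the
situation described above, where no slave is reported as active.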
Thanks for alerting me to the fact that the ib-bond script is deprecated.
Again, this all seemed to work about 6 weeks ago. Is that (ib-bond being
deprecated) documented somewhere?

Pradeep

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general