From a quick look at the code, it does look like there are some races
in ipoib_multicast.c.  The place where a QP is actually attached to a
group is essentially (trimming debug prints):

                if (test_and_set_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags))
                        return 0;

                ret = ipoib_mcast_attach(dev, be16_to_cpu(mcast->mcmember.mlid),
                                         &mcast->mcmember.mgid);

and the place where a QP is detached is:

        if (test_and_clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) {
                ret = ipoib_mcast_detach(dev, be16_to_cpu(mcast->mcmember.mlid),
                                         &mcast->mcmember.mgid);
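
To make the concern concrete, here is a minimal user-space sketch of one
interleaving that the two snippets above would seem to allow. It assumes
that nothing else serializes the attach and detach paths, which is an
assumption and not something the trimmed code establishes. Every name in
it (fake_mcast, fake_attach, fake_detach, replay_interleaving) is a
hypothetical stand-in, not driver code, and the two "paths" are replayed
by hand in a single thread so the ordering is deterministic:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the multicast group state; not the driver's struct. */
struct fake_mcast {
	atomic_bool attached_flag;	/* models IPOIB_MCAST_FLAG_ATTACHED */
	bool hw_attached;		/* models whether the QP is really attached */
};

/* Models test_and_set_bit(): set the flag, return its previous value. */
static bool flag_test_and_set(struct fake_mcast *m)
{
	return atomic_exchange(&m->attached_flag, true);
}

/* Models test_and_clear_bit(): clear the flag, return its previous value. */
static bool flag_test_and_clear(struct fake_mcast *m)
{
	return atomic_exchange(&m->attached_flag, false);
}

/* Models the attach call: the QP becomes attached. */
static int fake_attach(struct fake_mcast *m)
{
	m->hw_attached = true;
	return 0;
}

/* Models the detach call: fails if the QP is not attached yet. */
static int fake_detach(struct fake_mcast *m)
{
	if (!m->hw_attached)
		return -1;
	m->hw_attached = false;
	return 0;
}

/*
 * Replays one assumed interleaving of the attach and detach paths.
 * Returns true if the flag and the modeled hardware state disagree
 * at the end.
 */
static bool replay_interleaving(void)
{
	struct fake_mcast m;
	int detach_ret = 0;
	bool already;

	atomic_init(&m.attached_flag, false);
	m.hw_attached = false;

	/* Attach path, step 1: the flag is set before the QP is attached. */
	already = flag_test_and_set(&m);

	/* Detach path runs in the window: sees the flag, detaches too early. */
	if (flag_test_and_clear(&m))
		detach_ret = fake_detach(&m);

	/* Attach path, step 2: the attach now completes. */
	if (!already)
		fake_attach(&m);

	printf("flag=%d hw_attached=%d detach_ret=%d\n",
	       (int)atomic_load(&m.attached_flag), (int)m.hw_attached,
	       detach_ret);

	return atomic_load(&m.attached_flag) != m.hw_attached;
}
```

Calling replay_interleaving() from a small main() ends with the flag
clear while the modeled QP is still attached, so a later detach attempt
would skip the group. Whether the real code can reach this ordering
depends on locking that the trimmed snippets do not show.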

Going back to 2.6.20 (pre-multicast changes), this area of the code
looks like it has the same race. Was IPoIB HA testing done on 2.6.20
or earlier versions of the code, and if so, were any issues found?
(I'm not sure we've found all of the problems yet.)

- Sean