Yes, the kernel crashed after the watchdog detected a lockup. This happened while running udaddy -p 0x2 over RoCE. The crash isn't 100% reproducible, but I have a pair of nodes where it used to crash with a probability of 50%. The fix you suggest seems to detect the mismatch between port space and link layer earlier, so I guess it handles the bug better. I'll rewrite, retest and send a fix on Sunday.
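To make the idea concrete, here is a minimal userspace sketch of the kind of early check being discussed: reject the IPoIB port space as soon as the device's link layer turns out not to be InfiniBand. The enums and the function name below are hypothetical stand-ins for illustration only; they are not the kernel's definitions, and this is not the actual patch to cma_acquire_dev().

```c
#include <errno.h>

/* Hypothetical stand-ins; the kernel has its own port space and
 * link layer definitions in the RDMA headers. */
enum sketch_port_space { SKETCH_PS_IPOIB, SKETCH_PS_TCP, SKETCH_PS_UDP };
enum sketch_link_layer { SKETCH_LL_INFINIBAND, SKETCH_LL_ETHERNET };

/*
 * Return 0 if the port space can be used on a port with the given
 * link layer, or -EINVAL if the IPoIB port space is requested on a
 * port that is not InfiniBand (for example RoCE, where no SA query
 * is possible).
 */
int sketch_check_port_space(enum sketch_port_space ps,
                            enum sketch_link_layer ll)
{
	if (ps == SKETCH_PS_IPOIB && ll != SKETCH_LL_INFINIBAND)
		return -EINVAL;
	return 0;
}
```

With a check like this done at device acquisition time, the mismatch would be reported as an error to the caller before any qkey setup is attempted.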
- monis

On Thu, Jul 7, 2011 at 7:55 PM, Hefty, Sean <[email protected]> wrote:
>> In general, when link layer is ETHERNET it is wrong to use IPoIB port space
>> since no IPoIB interface is available. Specifically, setting qkey when port
>> space is RDMA_PS_IPOIB, requires SA query which is impossible when link layer
>> is IB_LINK_LAYER_ETHERNET.
>
> Can you describe the problem that the current code causes? Does it lead to a
> kernel crash?
>
> Maybe the issue is that port space ipoib should never be associated with a
> non-IB device, with a check added to cma_acquire_dev().
>
> - Sean
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
