On Thu, Dec 24, 2015 at 9:46 AM, Matan Barak <[email protected]> wrote:
> On Wed, Dec 23, 2015 at 10:04 PM, Doug Ledford <[email protected]> wrote:
>> On 10/15/2015 12:58 PM, Hefty, Sean wrote:
>>>>>> ib_create_ah_from_wc needs to resolve the DMAC in order to create the
>>>>>> AH (this may result sending an ARP and waiting for response).
>>>>>> CM uses this function (which is now sleepable).
>>>>>
>>>>> This is a significant change to the CM. The CM calls are invoked
>>>> assuming that they return relatively quickly. They're invoked from
>>>> callbacks and internally. Having the calls now wait for an ARP response
>>>> requires that this be re-architected, so the calling thread doesn't go out
>>>> to lunch for several seconds.
>>>>
>>>> Agree - this is a significant change, but it was done a long time ago
>>>> (at v4.3 if I recall). When we need to send a message we need to
>>>
>>> We're at 4.3-rc5?
>>>
>>>> figure out the destination MAC. Even the passive side needs to do that
>>>> as some vendors don't report the source MAC of the packet in their wc.
>>>> Even if they did, since IP based addressing is rout-able by its
>>>> nature, it should follow the networking stack rules. Some crazy
>>>> configurations could force sending responses to packets that came from
>>>> router1 to router2 - so we have no choice than resolving the DMAC at
>>>> every side.
>>>
>>> Ib_create_ah_from_wc is broken. It is now an asynchronous operation, only
>>> the call itself was left as synchronous. We can't block kernel threads for
>>> a minute, or however long ARP takes to resolve. The call itself must
>>> change to be async, and all users of it updated to allocate some request,
>>> queue it, and handle all race conditions that result -- such as state
>>> changes or destruction of the work that caused the request to be initiated.
>>>
>>
>> I don't know who had intended to address this, but it got left out of
>> the 4.4 work. We need to not let this drop through the cracks (for
>> another release). Can someone please put fixing this properly on their
>> TODO list?
>>
>
> IMHO, the proposed patch makes things better. Not applying the current
> patch means we have a "sleeping while atomic" error (in addition to
> the fact that kernel threads could wait until the ARP process
> finishes), which is pretty bad. I tend to agree that adding another CM
> state is probably a better approach, but unless someone steps up and
> add this for v4.5, I think that's the best thing we have.
>
>> --
>> Doug Ledford <[email protected]>
>> GPG KeyID: 0E572FDD
>>
>>
>
> Matan
Yishai has found a double free bug in the error flow of this patch.
The fix is pretty simple.
Thanks Yishai for catching and testing this fix.
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 07a3bbf..832674f 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -296,10 +296,9 @@ static int _cm_alloc_response_msg(struct cm_port *port,
0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
GFP_ATOMIC,
IB_MGMT_BASE_VERSION);
- if (IS_ERR(m)) {
- ib_destroy_ah(ah);
+ if (IS_ERR(m))
return PTR_ERR(m);
- }
+
m->ah = ah;
*msg = m;
return 0;
@@ -310,13 +309,18 @@ static int cm_alloc_response_msg(struct cm_port *port,
struct ib_mad_send_buf **msg)
{
struct ib_ah *ah;
+ int ret;
ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
mad_recv_wc->recv_buf.grh, port->port_num);
if (IS_ERR(ah))
return PTR_ERR(ah);
- return _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
+ ret = _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
+ if (ret)
+ ib_destroy_ah(ah);
+
+ return ret;
}
static void cm_free_msg(struct ib_mad_send_buf *msg)
Doug, if you intend to take this patch. I can squash this fix and respin it.
Thanks,
Matan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html