On Thu, Dec 24, 2015 at 9:46 AM, Matan Barak <mat...@dev.mellanox.co.il> wrote:
> On Wed, Dec 23, 2015 at 10:04 PM, Doug Ledford <dledf...@redhat.com> wrote:
>> On 10/15/2015 12:58 PM, Hefty, Sean wrote:
>>>>>> ib_create_ah_from_wc needs to resolve the DMAC in order to create the
>>>>>> AH (this may result sending an ARP and waiting for response).
>>>>>> CM uses this function (which is now sleepable).
>>>>>
>>>>> This is a significant change to the CM.  The CM calls are invoked
>>>> assuming that they return relatively quickly.  They're invoked from
>>>> callbacks and internally.  Having the calls now wait for an ARP response
>>>> requires that this be re-architected, so the calling thread doesn't go out
>>>> to lunch for several seconds.
>>>>
>>>> Agree - this is a significant change, but it was done a long time ago
>>>> (at v4.3 if I recall). When we need to send a message we need to
>>>
>>> We're at 4.3-rc5?
>>>
>>>> figure out the destination MAC. Even the passive side needs to do that
>>>> as some vendors don't report the source MAC of the packet in their wc.
>>>> Even if they did, since IP based addressing is rout-able by its
>>>> nature, it should follow the networking stack rules. Some crazy
>>>> configurations could force sending responses to packets that came from
>>>> router1 to router2 - so we have no choice than resolving the DMAC at
>>>> every side.
>>>
>>> Ib_create_ah_from_wc is broken.   It is now an asynchronous operation, only 
>>> the call itself was left as synchronous.  We can't block kernel threads for 
>>> a minute, or however long ARP takes to resolve.  The call itself must 
>>> change to be async, and all users of it updated to allocate some request, 
>>> queue it, and handle all race conditions that result -- such as state 
>>> changes or destruction of the work that caused the request to be initiated.
>>>
>>
>> I don't know who had intended to address this, but it got left out of
>> the 4.4 work.  We need to not let this drop through the cracks (for
>> another release).  Can someone please put fixing this properly on their
>> TODO list?
>>
>
> IMHO, the proposed patch makes things better. Not applying the current
> patch means we have a "sleeping while atomic" error (in addition to
> the fact that kernel threads could wait until the ARP process
> finishes), which is pretty bad. I tend to agree that adding another CM
> state is probably a better approach, but unless someone steps up and
> add this for v4.5, I think that's the best thing we have.
>
>> --
>> Doug Ledford <dledf...@redhat.com>
>>               GPG KeyID: 0E572FDD
>>
>>
>
> Matan

Yishai has found a double free bug in the error flow of this patch.
The fix is pretty simple.
Thanks Yishai for catching and testing this fix.

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 07a3bbf..832674f 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -296,10 +296,9 @@ static int _cm_alloc_response_msg(struct cm_port *port,
                               0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
                               GFP_ATOMIC,
                               IB_MGMT_BASE_VERSION);
-       if (IS_ERR(m)) {
-               ib_destroy_ah(ah);
+       if (IS_ERR(m))
                return PTR_ERR(m);
-       }
+
        m->ah = ah;
        *msg = m;
        return 0;
@@ -310,13 +309,18 @@ static int cm_alloc_response_msg(struct cm_port *port,
                                 struct ib_mad_send_buf **msg)
 {
        struct ib_ah *ah;
+       int ret;

        ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
                                  mad_recv_wc->recv_buf.grh, port->port_num);
        if (IS_ERR(ah))
                return PTR_ERR(ah);

-       return _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
+       ret = _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
+       if (ret)
+               ib_destroy_ah(ah);
+
+       return ret;
 }

 static void cm_free_msg(struct ib_mad_send_buf *msg)


Doug, if you intend to take this patch. I can squash this fix and respin it.

Thanks,
Matan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to