On 2011/03/29, at 17:18, Steven Dake wrote:
> On 03/28/2011 09:06 PM, Zane Bitter wrote:
>>
>> On 2011/03/28, at 02:31, Steven Dake wrote:
>>
>>>
>>> Yes, the receive buffers handling was changed by your patch. What
>>> happens if a recv buffer is delivered to totemsrp? That recv buffer is
>>> put back in the "send buffers free list". Prior to this change, the
>>> recv buffer would be posted back to the recv queue via ibv_post_recv.
>>> totemsrp doesn't know anything about recv buffer vs send buffer, and I'd
>>> prefer to keep the complexities of rdma out of totemsrp.c.
>>>
>>> Regards
>>> -steve
>>
>> OK, that's true, I am assuming that anything passed to
>> iba_mcast_flush_send() or iba_mcast_noflush_send() is actually a send_buf
>> (which is wrong!). I did a hunt through the code to trace back where each
>> buffer that is sent to either of these functions is allocated. Full details
>> are below, but the executive summary:
>>
>> * buffers passed to totemrrp_token_send() are allocated on the stack or in
>> the totemsrp instance.
>> - I think we already decided to keep doing the memcpy() in the iba driver
>> for these
>> * buffers passed to totemrrp_mcast_flush_send() are allocated on the stack.
>> - This makes my patch wrong - we should probably continue to do a memcpy()
>> in the iba driver for these too
>> * buffers passed to totemrrp_mcast_noflush_send() are ultimately allocated
>> by totemsrp_buffer_alloc()
>>
>> I definitely agree that passing a receive buffer to iba_mcast_noflush_send()
>> would be bad, but if it's happening I can't find where. Intuitively, this
>> seems plausible to me: if we receive a packet we are not going to just
>> transmit it straight away; we need to make a copy in case we need to
>> retransmit it again later, and the net driver will want its buffer back as
>> soon as the deliver call up to totemsrp is finished (in the iba case it gets
>> reposted to the receive queue immediately once iba_deliver_fn() returns).
>>
>
> When a message is received via totem, it triggers:
> static int mcast_cq_recv_event_fn (hdb_handle_t poll_handle, int
> events, int suck, void *context)
>
> This causes the recv buffer to be posted back to the hardware recv
> buffer list:
> mcast_recv_buf_post in the same funtion
Yep, that's still there:
res = ibv_poll_cq (instance->mcast_recv_cq, 64, wc);
if (res > 0) {
for (i = 0; i < res; i++) {
iba_deliver_fn (instance, wc[i].wr_id, wc[i].byte_len);
mcast_recv_buf_post (instance, wrid2void(wc[i].wr_id));
// <<< This one
}
}
And it still posts the receive buffers back to the queue:
static inline int mcast_recv_buf_post (struct totemiba_instance *instance,
struct recv_buf *recv_buf)
{
struct ibv_recv_wr *fail_recv;
int res;
res = ibv_post_recv (instance->mcast_cma_id->qp, &recv_buf->recv_wr,
&fail_recv);
return (res);
}
> In your patches, when a message is received via totem, it is delivered,
> and then freed via totemsrp.c:messages_free() (which adds it to a list
> of free messages for the send queue) but never posts it back to the recv
> buffer.
totemsrp.c:messages_free() is freeing the messages that are stored in the
regular_sort_queue in the totemsrp instance. Per yesterday's analysis (below),
all of those appear to originate from a totemsrp_buffer_alloc() allocation
(i.e. they are all transmit buffers).
Are we looking at the same code? I have the patches applied on top of b4bef1cb
cheers,
Zane.
> Regards
> -steve
>
>> Tracing (somewhat less exhaustively) in the other direction, through the
>> deliver callbacks, I see the following things happening to the receive
>> buffers:
>> message_handler_orf_token() copied to the stack, sometimes retransmitted
>> from there
>> message_handler_mcast() copied to a buffer allocated with
>> totemsrp_buffer_alloc() and that buffer added to a sort queue
>> message_handler_memb_merge_detect() copied to the stack
>> message_handler_memb_join() passed to memb_join_process(), no transmit
>> message_handler_memb_commit_token() copied to the stack, sometimes
>> retransmitted from there
>> message_handler_token_hold_cancel() nothing
>>
>> But if I am still missing something, I would be _very_ happy to find out
>> where it is ;)
>>
>> thanks!
>> Zane.
>>
>>
>>
>> totemrrp_token_send()
>> - totemsrp.c:2585 instance->orf_token_retransmit
>> - totemsrp.c:2669 passed to token_send()
>> - totemsrp.c:2848 passed to memb_state_commit_token_send_recovery()
>> - totemsrp.c:2876 instance->commit_token (== instance->commit_token_storage)
>>
>> token_send()
>> - totemsrp.c:2741 <stack>
>> - totemsrp.c:3619 <stack>
>>
>> memb_state_commit_token_send_recovery()
>> - totemsrp.c:1992 passed to memb_state_recovery_enter()
>>
>> memb_state_recovery_enter()
>> - totemsrp.c:4309 alloca()
>>
>>
>> totemrrp_mcast_flush_send()
>> - totemsrp.c:2701 <stack>
>> - totemsrp.c:3011 <stack>
>> - totemsrp.c:3081 <stack>
>> - totemsrp.c:3101 <stack>
>>
>>
>> totemrrp_mcast_noflush_send()
>> - totemsrp.c:2273 instance->regular_sort_queue
>> - totemsrp.c:2273 instance->recovery_sort_queue
>> - totemsrp.c:2424 instance->retrans_message_queue
>> - totemsrp.c:2424 instance->new_message_queue
>>
>> instance->regular_sort_queue
>> - totemsrp.c:1691 instance->recovery_sort_queue
>> - totemsrp.c:1790 instance->recovery_sort_queue
>> - totemsrp.c:2422 instance->new_message_queue
>> - totemsrp.c:3869 totemsrp_buffer_alloc()
>>
>> instance->recovery_sort_queue
>> - totemsrp.c:2422 instance->retrans_message_queue
>> - totemsrp.c:3869 totemsrp_buffer_alloc()
>>
>> instance->retrans_message_queue
>> - totemsrp.c:2127 totemsrp_buffer_alloc()
>>
>> instance->new_message_queue
>> - totemsrp.c:2214 totemsrp_buffer_alloc()
>>
>
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais