nes: add support of iWARP multicast acceleration over IB_QPT_RAW_ETY QP type

Walukiewicz, Miroslaw Wed, 05 May 2010 06:42:30 -0700

Steve, 

> ud_post_send and friends implement the transmit path for IMA. Our RAW ETH QP 
> needs access to physical addresses from user space. For security reasons 
> we should do the virtual-to-physical address translation in the kernel. 
>
>   
Steve Wise wrote:
But why couldn't you just use the normal memory registration paths?  IE 
the user mode app does ibv_reg_mr() and then uses lkey/addr/len in SGEs 
in the ibv_post_send() which could do kernel bypass.

I see some misunderstanding here. Let me explain better how our transmit path 
works. 

In our implementation we use the normal memory registration path via 
ibv_reg_mr() and we call ibv_post_send() with lkey/vaddr/len.

The implementation of ibv_post_send() (nes_post_send in libnes) for the RAW QP 
passes the lkey/virtual_addr/len information to our kernel device driver 
(ud_post_send) through a shared page. There is no data copy here; the driver 
is used only for fast synchronization.
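
As a rough illustration of the shared-page handoff described above (this is 
not the actual libnes/nes code): user space writes lkey/vaddr/len descriptors 
into a ring that lives in a page mapped by both sides, and the driver consumes 
them without copying any packet data. The names struct tx_desc, 
struct shared_page, ring_post, ring_consume and RING_SLOTS are hypothetical, 
and the memory barriers a real implementation needs are only noted in comments.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SLOTS 8u  /* hypothetical ring size, must be a power of two */

/* One transmit request: the same lkey/vaddr/len triple an SGE carries. */
struct tx_desc {
    uint32_t lkey;
    uint32_t len;
    uint64_t vaddr;
};

/* Hypothetical layout of the page shared by the library and the driver. */
struct shared_page {
    uint32_t head;                    /* advanced by user space only */
    uint32_t tail;                    /* advanced by the driver only */
    struct tx_desc ring[RING_SLOTS];
};

/* User side: queue one descriptor. Returns 0 on success, -1 if the ring is full. */
static int ring_post(struct shared_page *sp, uint32_t lkey,
                     uint64_t vaddr, uint32_t len)
{
    assert(sp != 0);
    if (sp->head - sp->tail == RING_SLOTS)
        return -1;
    struct tx_desc *d = &sp->ring[sp->head % RING_SLOTS];
    d->lkey = lkey;
    d->vaddr = vaddr;
    d->len = len;
    /* A real implementation needs a write barrier here before publishing. */
    sp->head++;
    return 0;
}

/* Driver side: fetch the next descriptor. Returns 0 on success, -1 if empty. */
static int ring_consume(struct shared_page *sp, struct tx_desc *out)
{
    assert(sp != 0 && out != 0);
    if (sp->tail == sp->head)
        return -1;
    *out = sp->ring[sp->tail % RING_SLOTS];
    sp->tail++;
    return 0;
}
```

Only the small descriptor crosses the user/kernel boundary in this sketch; the 
packet buffer itself stays where ibv_reg_mr() pinned it.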

Because our RAW ETH QP must use physical addresses only, ud_post_send() in the 
kernel does the virtual-to-physical address translation and accesses the QP HW 
for packet transmission. The packet buffer memory has previously been 
registered and pinned by ibv_reg_mr(), which provides the information needed 
for this translation.
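
The translation step can be sketched roughly as follows. This is a simplified 
model and not the nes driver code: struct mr_entry, mr_translate and the 
*_SK constants are hypothetical names, the region is assumed to start on a 
page boundary, and the sketch rejects any request that crosses a page instead 
of splitting it. The point it shows is that the kernel looks up the region by 
lkey, checks that vaddr/len fall inside it, and only then produces a physical 
address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_SK 12u                  /* assumed 4 KiB pages */
#define PAGE_SIZE_SK  (1u << PAGE_SHIFT_SK)

/* Hypothetical record kept for each registered and pinned region. */
struct mr_entry {
    uint32_t lkey;
    uint64_t vaddr;         /* page-aligned user virtual start address */
    uint64_t length;        /* region length in bytes */
    const uint64_t *pages;  /* physical address of each pinned page */
};

/*
 * Translate (lkey, vaddr, len) into a physical address.
 * Returns 0 on success, -1 if the lkey is unknown, the range is empty or
 * outside the region, or the range crosses a page boundary.
 */
static int mr_translate(const struct mr_entry *tbl, size_t n, uint32_t lkey,
                        uint64_t vaddr, uint32_t len, uint64_t *paddr)
{
    assert(tbl != NULL && paddr != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct mr_entry *mr = &tbl[i];
        if (mr->lkey != lkey)
            continue;
        if (len == 0 || vaddr < mr->vaddr)
            return -1;
        uint64_t off = vaddr - mr->vaddr;
        /* Bounds check written so that it cannot overflow. */
        if (off > mr->length || len > mr->length - off)
            return -1;
        uint64_t first = off >> PAGE_SHIFT_SK;
        uint64_t last = (off + len - 1) >> PAGE_SHIFT_SK;
        if (first != last)
            return -1;      /* sketch handles single-page requests only */
        *paddr = mr->pages[first] + (off & (PAGE_SIZE_SK - 1));
        return 0;
    }
    return -1;
}
```

Because user space only ever supplies an lkey and a virtual address, it never 
sees or chooses a physical address, which is the security property mentioned 
above.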

Steve Wise wrote:
Seems like maybe you could fix the non-bypass post_send/recv paths 
instead of implementing an entirely new user<->kernel interface...

The non-bypass post_send/recv channel (using /dev/infiniband/rdma_cm) is shared 
with all other user-kernel communication and it is quite complex. It is a 
perfect path for QP/CQ/PD/memory management, but in my opinion it is too 
complex for traffic acceleration. 

The user<->kernel path through an additional driver, with a shared page for 
passing lkey/vaddr/len and SW memory translation in the kernel, is much more 
efficient. 

Maybe it is a good idea to make that API more official after some kind of 
standardization. Our tests proved that it works. We achieved twice as good 
performance and latency. This approach could open the way for adding some 
non-RDMA devices to the set of devices supported by the OFED API. 

Regards,

Mirek

-----Original Message-----
From: Steve Wise [mailto:[email protected]] 
Sent: Tuesday, May 04, 2010 8:15 PM
To: Walukiewicz, Miroslaw
Cc: [email protected]; [email protected]
Subject: Re: [PATCH 2/2] RDMA/nes: add support of iWARP multicast acceleration 
over IB_QPT_RAW_ETY QP type

Walukiewicz, Miroslaw wrote:
> Hello Steve,
>
> Our HW QP is not a UD type QP but an L2 raw QP. In the verbs API there is an 
> assumption that the user provides only the data payload for TX and similarly 
> receives only a payload. The protocol headers (in case of UD - MAC/IP/UDP) 
> are attached by HW. 
>
> Our QP implementation in HW does not provide the possibility of attaching 
> headers by HW for UD traffic, so for multicast acceleration we chose the L2 
> raw path. It adds some overhead for the user application but it is still a 
> zero-copy approach.
>
> I thought about simulating the UD path using the L2 raw QP to get the same 
> result as for a true UD QP (the user handles a payload only). Such an 
> approach costs an additional copy of the payload in SW, because the headers 
> and then the payload must be placed in a single tx buffer. The situation is 
> similar for rx: the payload has to be copied to the posted buffers, or the 
> data has to be provided at some offset. 
>
> ud_post_send and friends implement the transmit path for IMA. Our RAW ETH QP 
> needs access to physical addresses from user space. For security reasons 
> we should do the virtual-to-physical address translation in the kernel. 
>
>   

But why couldn't you just use the normal memory registration paths?  IE 
the user mode app does ibv_reg_mr() and then uses lkey/addr/len in SGEs 
in the ibv_post_send() which could do kernel bypass.

> Unfortunately the OFED path for ibv_post_send going into the kernel is quite 
> slow due to a number of dynamic memory allocations in the path. We chose to 
> create our own private post_send channel, using ud_post_send and friends, to 
> increase tx bandwidth.

Seems like maybe you could fix the non-bypass post_send/recv paths 
instead of implementing an entirely new user<->kernel interface...

Steve.

>  
>
> Regards,
>
> Mirek
>
> -----Original Message-----
> From: Steve Wise [mailto:[email protected]] 
> Sent: Tuesday, May 04, 2010 7:19 PM
> To: Walukiewicz, Miroslaw
> Cc: [email protected]; [email protected]
> Subject: Re: [PATCH 2/2] RDMA/nes: add support of iWARP multicast 
> acceleration over IB_QPT_RAW_ETY QP type
>
> Hey Mirek,
>
> It looks like this patch adds a new file interface for a UD service.  
> Why didn't you extend the existing UD interface as needed? 
>
> What IO is supported with these changes?  IMA via the raw QP, but what are 
> ud_post_send() and friends used for?
>
>
> Steve.
>
>
>
> [email protected] wrote:
>   
>> This patch implements iWARP multicast acceleration (IMA)
>> over the IB_QPT_RAW_ETY QP type in the nes driver.
>>
>> The application creates a raw eth QP (IBV_QPT_RAW_ETH in user-space) and
>> manages the multicast via ibv_attach_mcast and ibv_detach_mcast calls.
>>
>> Calling ibv_attach_mcast/ibv_detach_mcast has the effect of
>> enabling/disabling L2 MAC address filters in HW.
>>
>> Signed-off-by: Mirek Walukiewicz <[email protected]>
>>
>>
>>
>>     

