Re: [driver-discuss] [networking-discuss] A question: can we avoid using 'bcopy' in Tx of the NIC driver?

Brian Xu - Sun Microsystems - Beijing China Mon, 02 Mar 2009 23:15:32 -0800

zeeshanul huq - Sun Microsystems - Beijing China wrote:

Brian,
Brian Xu - Sun Microsystems - Beijing China wrote:
zeeshanul huq - Sun Microsystems - Beijing China wrote:
Hi Brian,

The overhead is not only the DMA binding, but also the unbinding.
If no copybuf is used, the overhead of the unbinding is quite small compared to the binding.
I'm sure it is much smaller. But I'm also thinking of the overhead of maintaining the list of DMA handles for each packet. If we could use a single DMA handle for each packet, maintaining it would be much easier. In that case, unbinding should be much faster.
Some other drawbacks are:
1) We have to hold the MBLKs until packet transmission completes. With bcopy we are able to free them immediately. So when the system is close to running out of MBLKs, bcopy works better.
I don't know when running out of MBLKs occurs. When the system is short of kernel memory?
I observed it from time to time during netstress test runs with a large number of UDP sessions.
If that is the case, then the extra bcopy also consumes kernel memory.
I think there is a difference. In our NIC driver, we use pre-allocated, dedicated memory for bcopy. MBLKs, on the other hand, are a more widely shared resource, so they are more easily exhausted in some cases.

Ok. I see.

Thanks,
Brian

2) Some drivers, like bge, have only a small number of Tx buffer descriptors (BDs). bcopy guarantees one BD per transmitted packet, while dma_bind may require more than one. So with dma_bind, the driver will run out of Tx BDs more quickly during heavy traffic.
Yes. This is reasonable.
That's part of the reason why we use both bcopy and dma_bind in our NIC driver. I agree we need a faster DMA binding and unbinding solution.
What I suggested is one way to get much faster dma binding.
Of course, the original binding is also kept to meet the bcopy requirement.
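
As a rough illustration of the trade-off in points 1) and 2) above, here is a minimal userland sketch in C of a copy-below-threshold policy. It is not taken from bge or any other driver: struct frag, tx_descriptors_needed() and the threshold value are hypothetical names chosen for this example. The count for the bind path is an optimistic lower bound, since a real bind may return several cookies per fragment.

```c
#include <assert.h>
#include <stddef.h>

/* One fragment of an outgoing packet (stands in for one mblk in a chain). */
struct frag {
	size_t len;
};

/*
 * Estimate how many Tx buffer descriptors (BDs) a packet consumes.
 *
 * Packets whose total length is at or below copy_threshold are assumed
 * to be copied into one pre-allocated buffer, so they use exactly one BD.
 * Larger packets are assumed to be DMA-bound fragment by fragment, which
 * needs at least one BD per fragment (assumption: one cookie per fragment).
 */
size_t
tx_descriptors_needed(const struct frag *frags, size_t nfrags,
    size_t copy_threshold)
{
	size_t total = 0;
	size_t i;

	assert(frags != NULL || nfrags == 0);

	if (nfrags == 0)
		return (0);		/* nothing to send */

	for (i = 0; i < nfrags; i++)
		total += frags[i].len;

	if (total <= copy_threshold)
		return (1);		/* copy path: one BD per packet */

	return (nfrags);		/* bind path: lower bound */
}
```

With a 256-byte threshold, a three-fragment packet of 64 bytes in total costs one BD, while a three-fragment packet carrying a 1400-byte payload costs at least three.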
Thanks,
Brian
Regards,
Zeeshanul Huq

Garrett D'Amore wrote:
Brian Xu - Sun Microsystems - Beijing China wrote:
Hi there,

I have a question here:
Why do all of the NIC drivers have to bcopy the MBLKs for transmit? (Some of them always bcopy, and others bcopy below a packet-length threshold.)
I think one of the reasons is that the overhead of setting up DMA on the fly is greater than the overhead of bcopy for short packets. I want to know if this is the case and if there are any other reasons.
Yes. For any reasonably sized packet (ETHERMTU or smaller), bcopy is faster on *all* recent hardware. (This is confirmed even on an older 300MHz Via C3.) (Hmm... I've heard that for some Niagara systems this might not be true, however. But I've not tested it myself.)
I think the situation is different with jumbo frames, though.
If what I guess is the major cause, I have a proposal and I want to hear your advice on whether it makes sense.
The most time-consuming part of the DMA setup is the DMA bind, more specifically, calling into the VM layer to get the PFN for the vaddr (hat_getpfnum()), since it needs to search the huge page table. MBLKs, however, are essentially slab objects: the PFN has already been determined in the slab layer, and for most of their usage we only touch the magazine layer, where the PFN is predetermined. That is, the PFN should be considered constructed state, but we don't leverage it for the DMA bind.
In storage, we have a field 'b_shadow' in buf(9S) to store the pages which were recently used, through which the PFNs can easily be obtained. So in the case that b_shadow works, ddi_dma_buf_bind_handle() is much faster than ddi_dma_mem_bind_handle(). Another example: by moving the DMA bind of the HBA driver (mpt) from the Tx path to the kmem cache constructor, the mpt driver got a 26% throughput increase. See CR6707308.
If the mblk could store the PFN info and we had a ddi_dma_mblk_bind_handle()-like interface, then I think it would benefit the performance of the NIC drivers. I consulted the PAE, and got an answer that the bcopy is typically about 10-15% of a NIC TX workload.
There are things we can do to make DMA faster, better, and simpler. In an ideal world, the GLDv3 could do most of this work, and the mblk could just carry the ddi_dma_cookie with it.
   -- Garrett
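
The idea of treating the address translation as constructed state, so that it is paid once per buffer rather than once per packet, can be sketched in plain userland C. This is only a simulation under stated assumptions: slow_translate(), struct tx_buf, tx_buf_construct() and tx_buf_bind() are hypothetical names, the "physical address" is fake, and no DDI or kmem interface is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts simulated page-table walks, to show how often we pay for one. */
unsigned long slow_lookups;

/*
 * Stand-in for an expensive virtual-to-physical translation.
 * The returned value is fake: it is just the virtual address.
 */
uint64_t
slow_translate(const void *vaddr)
{
	slow_lookups++;
	return ((uint64_t)(uintptr_t)vaddr);
}

/*
 * A reusable transmit buffer. Assumption for this sketch: the buffer is
 * physically contiguous, so a single cached address describes all of it.
 */
struct tx_buf {
	unsigned char data[2048];
	uint64_t cached_paddr;	/* filled in once by the constructor */
};

/* Cache constructor: runs once per object, not once per packet. */
void
tx_buf_construct(struct tx_buf *b)
{
	assert(b != NULL);
	b->cached_paddr = slow_translate(b->data);
}

/* Per-packet "bind": no translation, just reuse the constructed state. */
uint64_t
tx_buf_bind(const struct tx_buf *b, size_t offset)
{
	assert(b != NULL && offset < sizeof (b->data));
	return (b->cached_paddr + offset);
}
```

Constructing a buffer once and then calling tx_buf_bind() for every packet leaves slow_lookups at 1 no matter how many packets are sent, which is the effect the proposal is after.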
Thanks,
Brian

_______________________________________________
driver-discuss mailing list
driver-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/driver-discuss
_______________________________________________
networking-discuss mailing list
networking-disc...@opensolaris.org