Mark Johnson wrote:



Brian Xu - Sun Microsystems - Beijing China wrote:
> If the mblk could store the PFN info and we had a
> ddi_dma_mblk_bind_handle()-like interface, then I think it would
> benefit the performance of the NIC drivers. I consulted the PAE and
> got an answer that bcopy is typically about 10-15% of a NIC TX
> workload.

This is a good area to investigate. I don't believe a new
DDI interface is the way to approach it though (e.g.
ddi_dma_mblk_bind_handle()). It should be an option that is
done for you in gld, i.e. gld gives you a list of cookies
that fit within your dma constraints (dma_attr).
I like the idea of gld giving a list of cookies to the driver. This would make driver writers' lives easier.

Then how does gld get those cookies? gld still needs to do a DMA bind on the mblks, so caching PFNs and using a ddi_dma_mblk_bind_handle()-like interface is one way to approach it.
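For reference, a minimal sketch of the per-fragment bind work being
discussed, assuming the driver has pre-allocated one DMA handle per
fragment against its dma_attr. The function name and the handle/cookie
array arguments are illustrative only; ddi_dma_addr_bind_handle(),
ddi_dma_nextcookie() and ddi_dma_unbind_handle() are the real DDI
calls. This is the loop that a ddi_dma_mblk_bind_handle()-style
interface, or a gld-provided cookie list, would take off the driver's
hands:

#include <sys/types.h>
#include <sys/stream.h>
#include <sys/strsun.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/*
 * Bind every fragment of an mblk chain and collect the resulting
 * cookies.  Each fragment needs its own (pre-allocated) handle, and
 * each bind walks the page tables for that fragment's pages again.
 */
static int
bind_mblk_chain(ddi_dma_handle_t *handles, uint_t nhandles, mblk_t *mp,
    ddi_dma_cookie_t *cookies, uint_t max_cookies, uint_t *ncookies)
{
        ddi_dma_cookie_t cookie;
        uint_t ccount, nbound = 0, total = 0;
        mblk_t *bp;

        for (bp = mp; bp != NULL; bp = bp->b_cont) {
                if (MBLKL(bp) == 0)
                        continue;
                if (nbound == nhandles)
                        goto fail;

                if (ddi_dma_addr_bind_handle(handles[nbound], NULL,
                    (caddr_t)bp->b_rptr, MBLKL(bp),
                    DDI_DMA_WRITE | DDI_DMA_STREAMING, DDI_DMA_DONTWAIT,
                    NULL, &cookie, &ccount) != DDI_DMA_MAPPED)
                        goto fail;
                nbound++;

                /* a successful bind returns at least one cookie */
                for (;;) {
                        if (total == max_cookies)
                                goto fail;
                        cookies[total++] = cookie;
                        if (--ccount == 0)
                                break;
                        ddi_dma_nextcookie(handles[nbound - 1], &cookie);
                }
        }
        *ncookies = total;
        return (DDI_SUCCESS);

fail:
        while (nbound > 0)
                (void) ddi_dma_unbind_handle(handles[--nbound]);
        return (DDI_FAILURE);
}

The matching ddi_dma_unbind_handle() calls on tx completion, and the
sync/ordering details, are left out of the sketch.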

There's more to it than just caching a PFN though. The real
solution is to bring the rx and tx code path optimizations
around buffer management into a common piece of code (I know
it sounds blue sky :-), but it's what's needed long term).
It's more important to bring all the NICs up to a consistent
level. I would expect different code paths for both different
platforms and different NIC properties.
Are there any existing docs on this project? I am interested and would like to have a look.

Thanks,
Brian

hat_getpfnum() on x86 is not cheap. You're not trying to improve
performance as much as reduce CPU overhead. Secondly,
assuming that bcopy will always be as fast as it is
today is a bad assumption. As you can see with the Niagara
family, and as you will (I predict) see in x86 as soon as
they increase their thread/core count per socket to
>= 16-32ish, the per "thread/core" copy bandwidth is
lower since the H/W needs to provide fair memory/bus
bandwidth to all of the "threads/cores". On Niagara today,
this requires multiple rx copy threads for 10G NICs.
These are the types of optimizations which need to be in
a common piece of code so that all NICs can take advantage
of them without having to hard-code platform knowledge
in each one.
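To make the PFN-caching idea concrete, here is a minimal sketch, under
the assumption of a page-aligned, long-lived kernel buffer, of paying
the hat_getpfnum() cost once at setup and reusing the cached PFNs at
transmit time. The pfn_cache structure and function names are made up
for illustration, and a production driver would normally stay within
the ddi_dma_* framework rather than call the HAT layer directly:

#include <sys/types.h>
#include <sys/param.h>
#include <sys/kmem.h>
#include <vm/hat.h>
#include <vm/as.h>

/* illustrative only: one cached PFN per page of a long-lived buffer */
typedef struct pfn_cache {
        caddr_t pc_base;        /* kernel virtual base (page aligned) */
        size_t  pc_len;         /* buffer length in bytes */
        uint_t  pc_npages;
        pfn_t   *pc_pfns;
} pfn_cache_t;

/* setup time: do the expensive VA -> PFN lookups exactly once */
static void
pfn_cache_init(pfn_cache_t *pc, caddr_t base, size_t len)
{
        uint_t i;

        pc->pc_base = base;
        pc->pc_len = len;
        pc->pc_npages = (uint_t)btopr(len);
        pc->pc_pfns = kmem_alloc(pc->pc_npages * sizeof (pfn_t), KM_SLEEP);

        for (i = 0; i < pc->pc_npages; i++)
                pc->pc_pfns[i] = hat_getpfnum(kas.a_hat, base + ptob(i));
}

/* transmit time: physical address without another hat_getpfnum() call */
static uint64_t
pfn_cache_paddr(pfn_cache_t *pc, caddr_t addr)
{
        uint_t idx = (uint_t)((addr - pc->pc_base) >> PAGESHIFT);

        return (((uint64_t)pc->pc_pfns[idx] << PAGESHIFT) |
            ((uintptr_t)addr & PAGEOFFSET));
}

This only works for buffers the driver owns for their whole lifetime;
for arbitrary mblks handed down by the stack the translation would
have to be captured when the data is first mapped, which is what
storing the PFN info in the mblk would buy.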

When a significantly different new platform comes out, there
should be only one piece of code which has to be optimized,
instead of having to touch every NIC driver again. Obviously
that is only a goal :-). Not 100% achievable, but a close-enough
solution is sufficient.



Garrett D'Amore wrote:
Actually, IOMMU resources are a bigger issue.
With bcopy, the pre-allocated DMA buffer also occupies IOMMU entries; without bcopy, more IOMMU entries are needed and they are allocated on the fly. So do you mean there may not be enough IOMMU entries? Please clarify.

Without bcopy, you might have to allocate more IOMMU entries. It's a
bigger problem on the rx path when you do loanup and buffer recycling
(using esballoc), but even on the tx side, if you have a packet that
is spread across multiple pages (or chained mblks even!), then you
might need more IOMMU entries. And usually you still have the IOMMU
entries for bcopy, because you *really* want to bcopy small packets
unless you want to have terrible small-packet performance.
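A minimal sketch of the tx dispatch decision being described, assuming
a per-packet tx buffer whose DMA memory was allocated and bound once
at attach time. The TX_BCOPY_THRESH value, the tx_buf layout and the
function names are illustrative; msgsize(), mcopymsg() and
ddi_dma_addr_bind_handle() are the real interfaces, and only the
single-fragment bind case is shown:

#include <sys/types.h>
#include <sys/stream.h>
#include <sys/strsun.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

#define TX_BCOPY_THRESH 512     /* illustrative; tuned per NIC/platform */

typedef struct tx_buf {
        caddr_t                 tb_kaddr;       /* ddi_dma_mem_alloc()'d */
        ddi_dma_handle_t        tb_dmah;        /* bound once at attach */
        ddi_dma_cookie_t        tb_cookie;      /* cached cookie */
} tx_buf_t;

static int
tx_send(tx_buf_t *tb, ddi_dma_handle_t bind_dmah, mblk_t *mp,
    ddi_dma_cookie_t *cookiep, uint_t *ccountp)
{
        size_t len = msgsize(mp);

        if (len <= TX_BCOPY_THRESH) {
                /*
                 * Copy path: the destination buffer was bound at attach
                 * time, so any IOMMU entry for it already exists and the
                 * cached cookie goes straight into the descriptor.
                 */
                mcopymsg(mp, tb->tb_kaddr);     /* copies data, frees mp */
                *cookiep = tb->tb_cookie;
                cookiep->dmac_size = len;       /* only the bytes copied */
                *ccountp = 1;
                return (DDI_SUCCESS);
        }

        /*
         * Bind path: map the packet's own pages on the fly.  This is
         * where additional IOMMU entries get consumed per packet; mp
         * must be held until tx completion and then unbound.
         */
        if (ddi_dma_addr_bind_handle(bind_dmah, NULL, (caddr_t)mp->b_rptr,
            MBLKL(mp), DDI_DMA_WRITE | DDI_DMA_STREAMING, DDI_DMA_DONTWAIT,
            NULL, cookiep, ccountp) != DDI_DMA_MAPPED)
                return (DDI_FAILURE);

        return (DDI_SUCCESS);
}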

Any changes should address both the IOMMU and non-IOMMU cases.
You *cannot* assume that FORCE_PHYSICAL will always work
or that x86 will never use an IOMMU. There are some
folks who will take the performance penalty for the extra
memory protection (I am not one of them :-) ).

If the device is using an IOMMU, one suggestion for optimizing
the tx code path is to have a ddi_dma_alloc_handle()-like
interface to which you can pass the max buf size. This could
then allocate the DMA handle, reserve IOMMU space, and cache
the IOMMU address in the DMA handle. This would work well for
NICs since their buffers are small. It would take the place of
the old dvma_reserve interfaces.
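A purely hypothetical prototype for what such an interface might look
like; nothing with this name or signature exists in the DDI today, it
is only a sketch of the suggestion above (maximum transfer size known
at handle allocation, IOMMU space reserved up front and cached in the
handle, much as the old dvma_reserve()/dvma_kaddr_load() pair did on
SPARC):

#include <sys/ddi.h>
#include <sys/sunddi.h>

/*
 * HYPOTHETICAL -- not an existing DDI routine.  Allocate a DMA handle
 * whose maximum transfer size is known now, so the implementation can
 * reserve IOMMU space once and cache the IOMMU address in the handle;
 * later binds against this handle would only update the IOMMU mappings
 * in place instead of allocating entries per packet.
 */
int ddi_dma_alloc_handle_premapped(dev_info_t *dip, ddi_dma_attr_t *attrp,
    size_t max_xfer_size,               /* upper bound for any future bind */
    int (*waitfp)(caddr_t), caddr_t arg,
    ddi_dma_handle_t *handlep);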

Again, this is another place where the code should
be common for all NICs. For TX, all they should
have to care about is a cookie list to send that
fits within their dma constraints. The
common code should take the IOMMU (or lack of one),
platform, etc., into consideration.
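Under that model, the per-NIC part of tx reduces to something like the
sketch below: walk a cookie list that already satisfies the driver's
ddi_dma_attr and load the hardware descriptors. The tx_desc layout,
flag and function name are hypothetical stand-ins for whatever a
particular NIC defines; ddi_dma_cookie_t and its dmac_laddress and
dmac_size fields are the real parts:

#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

typedef struct tx_desc {                /* hypothetical NIC descriptor */
        uint64_t td_paddr;
        uint32_t td_len;
        uint32_t td_flags;
} tx_desc_t;

#define TD_FLAG_EOP     0x1             /* hypothetical end-of-packet bit */

static void
tx_load_descriptors(tx_desc_t *ring, uint_t start, uint_t ring_size,
    const ddi_dma_cookie_t *cookies, uint_t ncookies)
{
        uint_t i;

        for (i = 0; i < ncookies; i++) {
                tx_desc_t *td = &ring[(start + i) % ring_size];

                td->td_paddr = cookies[i].dmac_laddress;
                td->td_len = (uint32_t)cookies[i].dmac_size;
                td->td_flags = (i == ncookies - 1) ? TD_FLAG_EOP : 0;
        }
}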



MRJ






