Mark Johnson wrote:



Brian Xu - Sun Microsystems - Beijing China wrote:
> If the mblk could store the PFN info and we had a
> ddi_dma_mblk_bind_handle()-like interface, then I think it would
> benefit the performance of the NIC drivers. I consulted the PAE and
> got an answer that bcopy is typically about 10-15% of a NIC TX
> workload.

This is a good area to investigate. I don't believe a new
DDI interface is the way to approach it though (e.g.
ddi_dma_mblk_bind_handle()). It should be an option that is
done for you in gld, i.e. gld gives you a list of cookies
that fit within your dma constraints (dma_attr).
I like the idea of gld giving a list of cookies to the driver. This would make driver writers' lives easier.

Then how does gld get those cookies? gld still needs to do a DMA bind on the mblks, so caching PFNs and using a ddi_dma_mblk_bind_handle()-like interface is one way to approach it.
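For reference, a minimal sketch of the per-fragment bind work being
discussed, assuming the driver has pre-allocated one DMA handle per
fragment against its dma_attr. The function name and the handle/cookie
array arguments are illustrative only; ddi_dma_addr_bind_handle(),
ddi_dma_nextcookie() and ddi_dma_unbind_handle() are the real DDI
calls. This is the loop that a ddi_dma_mblk_bind_handle()-style
interface, or a gld-provided cookie list, would take off the driver's
hands:

#include <sys/types.h>
#include <sys/stream.h>
#include <sys/strsun.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/*
 * Bind every fragment of an mblk chain and collect the resulting
 * cookies.  Each fragment needs its own (pre-allocated) handle, and
 * each bind walks the page tables for that fragment's pages again.
 */
static int
bind_mblk_chain(ddi_dma_handle_t *handles, uint_t nhandles, mblk_t *mp,
    ddi_dma_cookie_t *cookies, uint_t max_cookies, uint_t *ncookies)
{
        ddi_dma_cookie_t cookie;
        uint_t ccount, nbound = 0, total = 0;
        mblk_t *bp;

        for (bp = mp; bp != NULL; bp = bp->b_cont) {
                if (MBLKL(bp) == 0)
                        continue;
                if (nbound == nhandles)
                        goto fail;

                if (ddi_dma_addr_bind_handle(handles[nbound], NULL,
                    (caddr_t)bp->b_rptr, MBLKL(bp),
                    DDI_DMA_WRITE | DDI_DMA_STREAMING, DDI_DMA_DONTWAIT,
                    NULL, &cookie, &ccount) != DDI_DMA_MAPPED)
                        goto fail;
                nbound++;

                /* a successful bind returns at least one cookie */
                for (;;) {
                        if (total == max_cookies)
                                goto fail;
                        cookies[total++] = cookie;
                        if (--ccount == 0)
                                break;
                        ddi_dma_nextcookie(handles[nbound - 1], &cookie);
                }
        }
        *ncookies = total;
        return (DDI_SUCCESS);

fail:
        while (nbound > 0)
                (void) ddi_dma_unbind_handle(handles[--nbound]);
        return (DDI_FAILURE);
}

The matching ddi_dma_unbind_handle() calls on tx completion, and the
sync/ordering details, are left out of the sketch.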

There's more to it than just caching a PFN though. The real
solution is to bring the rx and tx code path optimizations
around buffer management into a common piece of code (I know
it sounds blue sky :-), but it's what's needed long term).
It's more important to bring all the NICs up to a consistent
level. I would expect different code paths for both different
platforms and different NIC properties.
Are there any existing docs on this project? I am interested and would like to have a look.

Thanks,
Brian

hat_getpfnum() on x86 is not cheap. You're not trying to improve
performance as much as reduce CPU overhead. Secondly,
assuming that bcopy will always be as fast as it is
today is a bad assumption. As you can see with the Niagara
family, and as you will (I predict) see in x86 as soon as
they increase their thread/core count per socket to
>= 16-32ish, the per "thread/core" copy bandwidth is
lower since the H/W needs to provide fair memory/bus
bandwidth to all of the "threads/cores". On Niagara today,
this requires multiple rx copy threads for 10G NICs.
These are the types of optimizations which need to be in
a common piece of code so that all NICs can take advantage
of them without having to hard-code platform knowledge
in each one.
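To make the PFN-caching idea concrete, here is a minimal sketch, under
the assumption of a page-aligned, long-lived kernel buffer, of paying
the hat_getpfnum() cost once at setup and reusing the cached PFNs at
transmit time. The pfn_cache structure and function names are made up
for illustration, and a production driver would normally stay within
the ddi_dma_* framework rather than call the HAT layer directly:

#include <sys/types.h>
#include <sys/param.h>
#include <sys/kmem.h>
#include <vm/hat.h>
#include <vm/as.h>

/* illustrative only: one cached PFN per page of a long-lived buffer */
typedef struct pfn_cache {
        caddr_t pc_base;        /* kernel virtual base (page aligned) */
        size_t  pc_len;         /* buffer length in bytes */
        uint_t  pc_npages;
        pfn_t   *pc_pfns;
} pfn_cache_t;

/* setup time: do the expensive VA -> PFN lookups exactly once */
static void
pfn_cache_init(pfn_cache_t *pc, caddr_t base, size_t len)
{
        uint_t i;

        pc->pc_base = base;
        pc->pc_len = len;
        pc->pc_npages = (uint_t)btopr(len);
        pc->pc_pfns = kmem_alloc(pc->pc_npages * sizeof (pfn_t), KM_SLEEP);

        for (i = 0; i < pc->pc_npages; i++)
                pc->pc_pfns[i] = hat_getpfnum(kas.a_hat, base + ptob(i));
}

/* transmit time: physical address without another hat_getpfnum() call */
static uint64_t
pfn_cache_paddr(pfn_cache_t *pc, caddr_t addr)
{
        uint_t idx = (uint_t)((addr - pc->pc_base) >> PAGESHIFT);

        return (((uint64_t)pc->pc_pfns[idx] << PAGESHIFT) |
            ((uintptr_t)addr & PAGEOFFSET));
}

This only works for buffers the driver owns for their whole lifetime;
for arbitrary mblks handed down by the stack the translation would
have to be captured when the data is first mapped, which is what
storing the PFN info in the mblk would buy.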

When a significantly different new platform comes out, there
should be only one piece of code which has to be optimized,
instead of having to touch every NIC driver again. Obviously
that is only a goal :-). Not 100% achievable, but a close-enough
solution is sufficient.



Garrett D'Amore wrote:
Actually, IOMMU resources are a bigger issue.
With bcopy, the pre-allocated DMA buffer also occupies IOMMU entries; without bcopy, more IOMMU entries are needed and they are allocated on the fly. So do you mean there may not be enough IOMMU entries? Please clarify.

Without bcopy, you might have to allocate more IOMMU entries. It's a
bigger problem on the rx path when you do loanup and buffer recycling
(using esballoc), but even on the tx side, if you have a packet that
is spread across multiple pages (or chained mblks even!), then you
might need more IOMMU entries. And usually you still have the IOMMU
entries for bcopy, because you *really* want to bcopy small packets
unless you want to have terrible small-packet performance.
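A minimal sketch of the tx dispatch decision being described, assuming
a per-packet tx buffer whose DMA memory was allocated and bound once
at attach time. The TX_BCOPY_THRESH value, the tx_buf layout and the
function names are illustrative; msgsize(), mcopymsg() and
ddi_dma_addr_bind_handle() are the real interfaces, and only the
single-fragment bind case is shown:

#include <sys/types.h>
#include <sys/stream.h>
#include <sys/strsun.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

#define TX_BCOPY_THRESH 512     /* illustrative; tuned per NIC/platform */

typedef struct tx_buf {
        caddr_t                 tb_kaddr;       /* ddi_dma_mem_alloc()'d */
        ddi_dma_handle_t        tb_dmah;        /* bound once at attach */
        ddi_dma_cookie_t        tb_cookie;      /* cached cookie */
} tx_buf_t;

static int
tx_send(tx_buf_t *tb, ddi_dma_handle_t bind_dmah, mblk_t *mp,
    ddi_dma_cookie_t *cookiep, uint_t *ccountp)
{
        size_t len = msgsize(mp);

        if (len <= TX_BCOPY_THRESH) {
                /*
                 * Copy path: the destination buffer was bound at attach
                 * time, so any IOMMU entry for it already exists and the
                 * cached cookie goes straight into the descriptor.
                 */
                mcopymsg(mp, tb->tb_kaddr);     /* copies data, frees mp */
                *cookiep = tb->tb_cookie;
                cookiep->dmac_size = len;       /* only the bytes copied */
                *ccountp = 1;
                return (DDI_SUCCESS);
        }

        /*
         * Bind path: map the packet's own pages on the fly.  This is
         * where additional IOMMU entries get consumed per packet; mp
         * must be held until tx completion and then unbound.
         */
        if (ddi_dma_addr_bind_handle(bind_dmah, NULL, (caddr_t)mp->b_rptr,
            MBLKL(mp), DDI_DMA_WRITE | DDI_DMA_STREAMING, DDI_DMA_DONTWAIT,
            NULL, cookiep, ccountp) != DDI_DMA_MAPPED)
                return (DDI_FAILURE);

        return (DDI_SUCCESS);
}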

Any changes should address both the IOMMU and non-IOMMU cases.
You *cannot* assume that FORCE_PHYSICAL will always work
or that x86 will never use an IOMMU. There are some
folks who will take the performance penalty for the extra
memory protection (I am not one of them :-) ).

If the device is using an IOMMU, one suggestion for optimizing
the tx code path is to have a ddi_dma_alloc_handle()-like
interface to which you can pass the max buf size. This could
then allocate the DMA handle, reserve IOMMU space, and cache
the IOMMU address in the DMA handle. This would work well for
NICs since their buffers are small. It would take the place of
the old dvma_reserve interfaces.
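A purely hypothetical prototype for what such an interface might look
like; nothing with this name or signature exists in the DDI today, it
is only a sketch of the suggestion above (maximum transfer size known
at handle allocation, IOMMU space reserved up front and cached in the
handle, much as the old dvma_reserve()/dvma_kaddr_load() pair did on
SPARC):

#include <sys/ddi.h>
#include <sys/sunddi.h>

/*
 * HYPOTHETICAL -- not an existing DDI routine.  Allocate a DMA handle
 * whose maximum transfer size is known now, so the implementation can
 * reserve IOMMU space once and cache the IOMMU address in the handle;
 * later binds against this handle would only update the IOMMU mappings
 * in place instead of allocating entries per packet.
 */
int ddi_dma_alloc_handle_premapped(dev_info_t *dip, ddi_dma_attr_t *attrp,
    size_t max_xfer_size,               /* upper bound for any future bind */
    int (*waitfp)(caddr_t), caddr_t arg,
    ddi_dma_handle_t *handlep);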

Again, this is another place where the code should
be common for all NICs. For TX, all they should
have to care about is a cookie list to send that
fits within their dma constraints. The
common code should take the IOMMU (or lack of one),
platform, etc., into consideration.
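Under that model, the per-NIC part of tx reduces to something like the
sketch below: walk a cookie list that already satisfies the driver's
ddi_dma_attr and load the hardware descriptors. The tx_desc layout,
flag and function name are hypothetical stand-ins for whatever a
particular NIC defines; ddi_dma_cookie_t and its dmac_laddress and
dmac_size fields are the real parts:

#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

typedef struct tx_desc {                /* hypothetical NIC descriptor */
        uint64_t td_paddr;
        uint32_t td_len;
        uint32_t td_flags;
} tx_desc_t;

#define TD_FLAG_EOP     0x1             /* hypothetical end-of-packet bit */

static void
tx_load_descriptors(tx_desc_t *ring, uint_t start, uint_t ring_size,
    const ddi_dma_cookie_t *cookies, uint_t ncookies)
{
        uint_t i;

        for (i = 0; i < ncookies; i++) {
                tx_desc_t *td = &ring[(start + i) % ring_size];

                td->td_paddr = cookies[i].dmac_laddress;
                td->td_len = (uint32_t)cookies[i].dmac_size;
                td->td_flags = (i == ncookies - 1) ? TD_FLAG_EOP : 0;
        }
}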



MRJ






