Andrew Gallatin wrote:
Garrett D'Amore wrote:
But IMO, we're probably optimizing the wrong part of the stack here.
bcopy is 10% of the performance hit, according to RPE. What about
the other 90%?
The right part of the stack has already been optimized; it is called
TSO (or LSO on Solaris).
<..>
I'd rather avoid continuing to grossly complicate device drivers with
DMA details unless there is a significant benefit to doing so. Right
now, for Ethernet, I'm not sure there is. (Again, jumbo frames
change the trade-off a lot, primarily because they eliminate most
of the other overhead, so that bcopy dominates.)
For typical traffic, on typical segments, you can't use jumbo frames,
so spending all your effort trying to make DMA work faster is
probably not the best use of your energy.
Yes, but you can use LSO. LSO effectively turns *every* bulk
data transmission into a super-duper jumbo frame, where 64KB
is passed down to the driver in one shot. So, I'm sorry, but
you cannot sweep the "jumbo frame" case under the rug anymore.
Yes, for LSO or LRO, better mapping would be good. Most NIC drivers
don't support this, though. (10GbE ones are the notable exception.)
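
To put the 64KB figure in perspective, here is a rough sketch of the
arithmetic. The function name lso_segments and the MSS values (1460
bytes for a 1500-byte MTU, 8960 bytes for a 9000-byte MTU, both
assuming IPv4 and TCP headers without options) are illustrative
assumptions, not taken from any driver or from the Solaris stack.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: how many wire segments the NIC would cut from a
 * single LSO send of `payload` bytes, given a maximum segment size `mss`.
 * This is plain ceiling division; it models nothing about the hardware.
 */
static size_t
lso_segments(size_t payload, size_t mss)
{
	assert(mss > 0);	/* guard against division by zero */
	return ((payload + mss - 1) / mss);
}
```

Under those assumptions, lso_segments(64 * 1024, 1460) comes to 45 and
lso_segments(64 * 1024, 8960) comes to 8. Without LSO the driver would
see on the order of 45 separate sends for the same data; with LSO it
sees one send of 64KB, which is why the per-buffer copy or mapping cost
starts to look like the jumbo frame case.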
-Garrett
Drew
_______________________________________________
driver-discuss mailing list
driver-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/driver-discuss