I agree with Dana's sentiments here.

I'll add one thing from my own experience: the pain and suffering of
dealing with scatter/gather is not justified on modern systems with
ordinary-sized ethernet frames. In fact, just copying frames (you can
use mcopymsg()) into a contiguous preallocated buffer is more efficient
than trying to worry about scatter/gather. (Scatter/gather is only
really applicable when you map buffers directly without copying them
anyway.)

The simplifications in the code that come with mcopymsg() generally pay
huge dividends in debug and support, as well as in performance. (The
per-packet processing overhead drops significantly, because the hot code
path is much smaller and has far fewer branches to evaluate.)
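
To make the copy approach concrete, here is a minimal userspace sketch of
flattening a fragment chain into one contiguous, preallocated buffer. It
is not driver code: struct frag, chain_size(), flatten_chain() and
TX_BUF_SIZE are hypothetical names that only imitate the b_rptr, b_wptr
and b_cont fields of an mblk_t. In a real driver mcopymsg() does the copy,
and as I understand it also frees the message, which this sketch does not.

```c
#include <stddef.h>
#include <string.h>

#define TX_BUF_SIZE 1600  /* assumed per-frame buffer size, as in this thread */

/* Hypothetical stand-in for the b_rptr/b_wptr/b_cont fields of an mblk_t. */
struct frag {
    const unsigned char *rptr;  /* first valid byte */
    const unsigned char *wptr;  /* one past the last valid byte */
    struct frag *cont;          /* next fragment of the same frame, or NULL */
};

/* Total payload length across the chain (similar in spirit to msgdsize()). */
size_t
chain_size(const struct frag *mp)
{
    size_t len = 0;

    for (; mp != NULL; mp = mp->cont)
        len += (size_t)(mp->wptr - mp->rptr);
    return (len);
}

/*
 * Copy every fragment into one contiguous buffer.
 * Returns the frame length, or 0 if the frame does not fit.
 */
size_t
flatten_chain(const struct frag *mp, unsigned char *buf, size_t buflen)
{
    size_t off = 0;

    if (chain_size(mp) > buflen)
        return (0);
    for (; mp != NULL; mp = mp->cont) {
        size_t n = (size_t)(mp->wptr - mp->rptr);

        memcpy(buf + off, mp->rptr, n);
        off += n;
    }
    return (off);
}
```

The transmit path then needs only one length check and one descriptor per
frame. In a driver the destination would presumably be a DMA buffer
allocated at attach time rather than an ordinary array.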

The benefit is large enough that most of the latest Sun drivers have
abandoned direct mapping in favor of bcopy, at least for frame sizes of
ordinary ethernet MTUs.

The situation might be different for jumbo frames, but I doubt that the
hardware we're dealing with here can handle frames larger than 2K.

-- Garrett

Dana H. Myers wrote:
> Steven Stallion wrote:
>> Garrett D'Amore wrote:
>>> In a word: Yes.
>>>
>>> I don't know enough about the chipset, but I'd probably allocate more
>>> than 2, even, since you might be passed down a packet chain and don't
>>> want to have to putbq() the packets (or have GLDv3 do so for you).
>>> Although, pcnet isn't going to be fast in any case. :-) You have room
>>> for 40 1600-byte frames (not including descriptor overhead) -- I'd
>>> configure 16 frames in each direction -- experience with other NIC
>>> drivers has shown that such a configuration works well. If that's too
>>> many, maybe do 16 rx and 8 tx.
>>
>> That makes much more sense. The chipset uses 256-byte alignment, so
>> there could be a pretty fair bit of overhead for smaller packet chains.
>>
>> That said, would it make sense to use scatter/gather on the tx side, or
>> should I just treat it as a typical ring?
> Network traffic generally breaks down into two classes - big
> (payload-bearing) frames and small (ack-bearing) frames. The role
> of a host has a lot to do with the breakdown of big vs. small frames,
> but you really don't want to make drivers require tweaking to work
> acceptably in any given role. When hardware gives me the option of
> scatter/gather, I'll use it, though I'm much happier with 64-byte
> chunks rather than 256-byte chunks.
>
> I once wrote a driver for the DP8390 for an experimental project,
> probably 12 years ago. The DP8390 doesn't offer the ability to do
> scatter/gather on transmit - the NIC requires a single contiguous
> frame for transmission. Correct? I'd keep two maxframe-sized
> transmit buffers, then.
>
> The DP8390 NIC, and those derived from it, had a double-handful of
> very annoying bugs; while the details don't spring to mind, I remember
> it was painful to get any decent performance out of the NIC. A
> quick search with Google is likely to shed some light on this.
>
> Given my recollections of the DP8390, I'd probably spend less
> time worrying about performance and more time worrying about stability
> ;-)
>
> Cheers,
> Dana
>
_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss