On 27.09.2010 18:09, Julian Elischer wrote:
On 9/27/10 6:14 AM, Andre Oppermann wrote:
On 27.09.2010 15:18, Luigi Rizzo wrote:
On Mon, Sep 27, 2010 at 02:55:45PM +0200, Andre Oppermann wrote:
...
my idea was to have an extra field in the mbuf to tell how much room
should be reserved/used for metadata (such as mtags) after
the payload area so you don't need to change the allocator, and
possibly can even modify this on an existing mbuf.
Almost always mbufs have spare room (e.g. incoming pkts have all
data in the cluster and a mostly empty mdata; outgoing pkts, except
for rare cases, tend to be in a similar situation).
So this approach would allow taking an already allocated
mbuf and putting the mtag in the spare area after the data.
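
To make the idea concrete, here is a rough userland sketch, not kernel code. struct toy_pkt, the meta_reserve field and both functions are made-up names for illustration only; the real struct mbuf looks different, and alignment of the metadata area is ignored here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BUFSIZE 2048	/* stand-in for a 2K cluster */

/* Hypothetical, simplified packet buffer; NOT the real struct mbuf. */
struct toy_pkt {
	char	buf[TOY_BUFSIZE];	/* storage area */
	size_t	data_off;		/* offset of the first payload byte */
	size_t	len;			/* payload length */
	size_t	meta_reserve;		/* bytes at the end of buf kept for metadata */
};

/* Free bytes between the end of the payload and the reserved metadata area. */
size_t
toy_trailingspace(const struct toy_pkt *m)
{
	assert(m->data_off + m->len + m->meta_reserve <= TOY_BUFSIZE);
	return (TOY_BUFSIZE - m->meta_reserve - (m->data_off + m->len));
}

/*
 * Grow the reservation on an already allocated buffer.  Returns a pointer
 * to the newly reserved area, or NULL if the spare room is too small and
 * the caller has to allocate the metadata some other way.
 */
void *
toy_meta_reserve(struct toy_pkt *m, size_t size)
{
	if (size > toy_trailingspace(m))
		return (NULL);
	m->meta_reserve += size;
	return (m->buf + TOY_BUFSIZE - m->meta_reserve);
}
```

The point of the sketch is that the allocator is untouched: only the trailing-space calculation has to know about the extra field, which is also why every consumer that computes trailing space on its own would need checking.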

For incoming data this approach could work, as usually 2K mbuf clusters
are used and they have trailing space available, or rather the normal
mbuf referencing the cluster has its own data section unused.

If trailing space is to be used, M_TRAILINGSPACE() needs modification
and a full tree audit is required to make sure that all mbuf consumers are
correctly using it and not some private version that directly assumes certain
mbuf sizes, etc. A lot of work.

For locally generated mbufs and socket buffers we try to use the mbufs to
their maximal extent. When the socket buffer data is packetized it is normally
referenced rather than copied, so we get a normal mbuf with its data portion
unused. So that could work.

A complication is the m_tag_free() field and function, which puts the memory
deallocation into the hands of the mtag user. That means all mtag consumers
have to be made aware of mbuf-provided storage, which must not be returned
directly to the memory allocator (malloc/UMA).
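
A small userland sketch of why the free hook matters. All names here (struct toy_tag, toy_tag_free_heap, toy_tag_free_inline and so on) are hypothetical and only loosely modeled on the per-tag free function pointer in struct m_tag. The counters exist only so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified tag header; NOT the real struct m_tag. */
struct toy_tag {
	size_t	len;				/* payload bytes after the header */
	void	(*tag_free)(struct toy_tag *);	/* how to release this tag */
};

int toy_heap_frees;	/* tags handed back to malloc */
int toy_inline_frees;	/* tags living in caller-provided storage */

/* Heap-backed tag: storage goes back to the allocator. */
void
toy_tag_free_heap(struct toy_tag *t)
{
	toy_heap_frees++;
	free(t);
}

/* Tag in provided storage: nothing must be returned to the allocator. */
void
toy_tag_free_inline(struct toy_tag *t)
{
	(void)t;
	toy_inline_frees++;
}

struct toy_tag *
toy_tag_alloc_heap(size_t len)
{
	struct toy_tag *t = malloc(sizeof(*t) + len);

	if (t != NULL) {
		t->len = len;
		t->tag_free = toy_tag_free_heap;
	}
	return (t);
}

/* 'storage' must be suitably aligned and at least sizeof(struct toy_tag) + len. */
struct toy_tag *
toy_tag_init_inline(void *storage, size_t len)
{
	struct toy_tag *t = storage;

	t->len = len;
	t->tag_free = toy_tag_free_inline;
	return (t);
}

/* Generic delete path: always goes through the per-tag hook. */
void
toy_tag_delete(struct toy_tag *t)
{
	assert(t->tag_free != NULL);
	t->tag_free(t);
}
```

As long as every consumer deletes tags through the hook, the two kinds can coexist. A consumer that installs its own free function and assumes the tag came from malloc is exactly the case that would have to be found and fixed.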

So the only way I realistically see is to make use of the mbuf's unused
data portion when it has external storage attached. This should probably
cover about 98% of all cases. The rest has to malloc() the mtag storage
as usual.
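
Roughly, the storage decision could look like the following userland sketch. struct toy_hdr_mbuf, TOY_MLEN, scratch_used and toy_tag_storage() are invented for illustration, and the size of the internal data area is an assumption, not the real value.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MLEN 200	/* assumed size of the mbuf's internal data area */

/* Hypothetical, simplified mbuf header; NOT the real struct mbuf. */
struct toy_hdr_mbuf {
	bool	has_ext;	/* payload lives in an external cluster */
	size_t	scratch_used;	/* bytes of the internal area given to tags */
	alignas(max_align_t) char internal[TOY_MLEN];
};

/*
 * Return storage for a tag of 'size' bytes.  If the payload is in external
 * storage and the otherwise unused internal data area still has room, carve
 * the tag out of it; in every other case fall back to malloc().
 * '*in_mbuf' tells the caller which case applied, so it can pick the
 * matching free routine.
 */
void *
toy_tag_storage(struct toy_hdr_mbuf *m, size_t size, bool *in_mbuf)
{
	/* Round up so that consecutive tags stay aligned. */
	size_t a = alignof(max_align_t);
	size_t need = (size + a - 1) / a * a;

	if (m->has_ext && need <= TOY_MLEN - m->scratch_used) {
		void *p = m->internal + m->scratch_used;

		m->scratch_used += need;
		*in_mbuf = true;
		return (p);
	}
	*in_mbuf = false;
	return (malloc(size));
}
```

Tags carved from the internal area go away with the mbuf itself, so only the malloc() path needs an explicit free.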

so it wouldn't be bad -- i cannot judge the numbers, but definitely
it would work for all incoming traffic, plus all tcp data packets
(as the payload is in the cluster), plus all pure acks (which are small),
plus all UDP above some 200 bytes...

Yes, about that.

I could whip up a prototype for review in the next weeks.

I seem to remember that jeffr had already something done in Perforce.

That's a more general overhaul of the way mbuf's are structured and
allocated with UMA. I'm not sure it provides for the mtag issue. Will
check though.

I'd like to see if we can go over his stuff and any other suggested changes
before 9.0 and see if we can agree on a change for 9.0.

Jeff, we discussed this a year ago.. do you still have your suggested changes?

In other recent communication Jeff indicated he would revisit the mbuf/UMA
situation around the end of this year.

--
Andre
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[email protected]"
