On 22/02/19(Fri) 15:01, David Gwynne wrote:
> On Thu, Feb 21, 2019 at 04:29:27PM -0300, Martin Pieuchot wrote:
> > On 21/02/19(Thu) 14:19, David Gwynne wrote:
> > > right now we add vlan_input as a possible input handler on the parent
> > > interface, and if the packet is for a vlan we take it and pretend we
> > > received it on the vlan interface by calling if_input against that mbuf.
> > > 
> > > as mpi notes, the if input queue stuff looks like a lot of work,
> > > especially for a single packet. my opinion is that we got away with
> > > the if input stuff we've done to try and encourage an mpsafe network
> > > stack because we amortised the cost of it over many packets off the
> > > hardware ring. vlan does it a packet at a time.
> > > 
> > > this moves the handling of vlan packets back into ether_input by
> > > calling vlan_input directly on packets that are either marked as vlan
> > > tagged or have a vlan ethertype. note that we have to do that anyway,
> > > this just makes it explicit.
> > > 
> > > vlan_input is then tweaked to implement all the important bits of if
> > > input. part of what if input does is count the packets. because vlan
> > > already has per cpu counters for bypassing queues on output, we can use
> > > them again for input from any cpu. if i ever get round to making a
> > > driver handle multiple rx rings this means we can rx vlan packets
> > > concurrently, they don't get serialised to a single if input q.
> > > 
> > > finally, hrvoje popovski has tested this diff and gets a significant
> > > bump with it. on a machine that can forward 1100Kpps without vlan, it
> > > goes from 790Kpps with vlan to 870Kpps. On a box that can do 730Kpps
> > > without vlans, it goes from 550Kpps with vlan to 840Kpps. We're
> > > still trying to figure that last one out, but it does appear to be
> > > faster.
> > > 
> > > thoughts? ok?
> > 
> > Why do we need to move stuff to ether_input() if all we want is to
> > bypass ifiq_input()?  Isn't a 3 line diff enough^^ ?
> 
> Fair point. It turns out it's not quite three lines, but it's still
> smaller.

I'm unhappy to see the bpf & packet magic reappear in pseudo-drivers.

This is going to spread in every pseudo-driver, no?  So why not keep
it in the new API?   Should we document if_input() vs if_input_one()?
Should we assert that if_input_one() is only called from a network
thread?  If yes, should we pick a better name?
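
To make the question concrete, here is a rough userland sketch of what
keeping the counting and bpf tap inside the API could look like.  It is
not kernel code and not taken from any diff: every name ending in
_sketch, the struct layouts and the in_net_thread_sketch flag are made
up for illustration, and the real mbuf, bpf and counters interfaces are
deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userland stand-ins; none of these are the real kernel structures. */
struct mbuf_sketch {
	size_t len;		/* packet length in bytes */
};

struct ifnet_sketch {
	uint64_t ipackets;	/* packets counted on input */
	uint64_t ibytes;	/* bytes counted on input */
	unsigned int bpf_taps;	/* times the (fake) bpf tap ran */
	int bpf_attached;	/* non-zero if a listener is attached */
	unsigned int delivered;	/* packets handed to the stack */
};

/* Hypothetical flag: set while running in a network thread. */
static int in_net_thread_sketch;

/*
 * Hypothetical single-packet input path: the counting and the bpf tap
 * live here, so a pseudo-driver only has to call this one function.
 */
static void
if_input_one_sketch(struct ifnet_sketch *ifp, struct mbuf_sketch *m)
{
	/* Assumed contract: callers already run in a network thread. */
	assert(in_net_thread_sketch);

	ifp->ipackets++;
	ifp->ibytes += m->len;

	if (ifp->bpf_attached)
		ifp->bpf_taps++;

	ifp->delivered++;
}

/* What a pseudo-driver's input routine would shrink to. */
static void
vlan_input_sketch(struct ifnet_sketch *vlan_ifp, struct mbuf_sketch *m)
{
	/* Tag stripping and vlan interface lookup omitted. */
	if_input_one_sketch(vlan_ifp, m);
}
```

With something of that shape the pseudo-driver never touches bpf or the
counters itself, and the assertion documents the calling context.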
