On 22/02/19(Fri) 15:01, David Gwynne wrote:
> On Thu, Feb 21, 2019 at 04:29:27PM -0300, Martin Pieuchot wrote:
> > On 21/02/19(Thu) 14:19, David Gwynne wrote:
> > > right now we add vlan_input as a possible input handler on the parent
> > > interface, and if the packet is for a vlan we take it and pretend we
> > > received it on the vlan interface by calling if_input against that mbuf.
> > >
> > > as mpi notes, the if input queue stuff looks like a lot of work,
> > > especially for a single packet. my opinion is that we got away with
> > > the if input stuff we've done to try and encourage an mpsafe network
> > > stack because we amortised the cost of it over many packets off the
> > > hardware ring. vlan does it a packet at a time.
> > >
> > > this moves the handling of vlan packets back into ether_input by
> > > calling vlan_input directly on packets that are either marked as vlan
> > > tagged or have a vlan ethertype. note that we have to do that anyway,
> > > this just makes it explicit.
> > >
> > > vlan_input is then tweaked to implement all the important bits of if
> > > input. part of what if input does is count the packets. because vlan
> > > already has per cpu counters for bypassing queues on output, we can use
> > > them again for input from any cpu. if i ever get round to making a
> > > driver handle multiple rx rings this means we can rx vlan packets
> > > concurrently, they don't get serialised to a single if input q.
> > >
> > > finally, hrvoje popovski has tested this diff and get's a significant
> > > bump with it. on a machine that can forward 1100Kpps without vlan, it
> > > goes from 790Kpps with vlan to 870Kpps. On a box that can do 730Kpps
> > > without vlans, it goes from 550Kpps with vlan to 840Kpps. We're
> > > still trying to figure that last one out, but it does appear to be
> > > faster.
> > >
> > > thoughts? ok?
> >
> > Why do we need to move stuff to ether_input() if all we want is to
> > bypass ifiq_input()?  Isn't a 3 line diff enough^^ ?
>
> Fair point. It turns out it's not quite three lines, but it's still
> smaller.

I'm unhappy to see the bpf & packet magic reappear in pseudo-drivers.
This is going to spread to every pseudo-driver, no?  So why not keep
it in the new API?

Should we document if_input() vs if_input_one()?  Should we assert that
if_input_one() is only called from a network thread?  If yes, should we
pick a better name?
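
To make the question concrete, here is a rough userland sketch of what
keeping the accounting and the bpf tap inside the shared helper could look
like.  Everything in it is made up for illustration: struct pkt,
struct ifnet_sketch, in_net_thread, if_input_one_sketch() and
vlan_input_sketch() are hypothetical stand-ins, not the kernel's types or
the functions in the diff, and the bpf tap is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the real kernel types. */
struct pkt {
	size_t		len;
};

struct ifnet_sketch {
	uint64_t	ipackets;	/* input packet counter */
	uint64_t	ibytes;		/* input byte counter */
	bool		bpf_attached;	/* is a bpf listener present? */
	uint64_t	bpf_taps;	/* packets the (fake) tap saw */
	uint64_t	delivered;	/* packets handed up the stack */
};

/* Stand-in for "are we running in a network thread?" (assumption). */
static bool in_net_thread;

/*
 * Shared single-packet input helper: the counting and the bpf tap
 * live here once, instead of being copied into each pseudo-driver.
 */
static void
if_input_one_sketch(struct ifnet_sketch *ifp, struct pkt *p)
{
	/* The context assertion asked about above. */
	assert(in_net_thread);

	ifp->ipackets++;
	ifp->ibytes += p->len;

	if (ifp->bpf_attached)
		ifp->bpf_taps++;

	ifp->delivered++;
}

/* A pseudo-driver's input path then shrinks to one call. */
static void
vlan_input_sketch(struct ifnet_sketch *vlanif, struct pkt *p)
{
	if_input_one_sketch(vlanif, p);
}
```

With that shape a pseudo-driver never touches bpf or the counters
directly, and the assertion documents the calling context in one place.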
