Re: Another attempt to bypass ifqs on interfaces
On Sun, Dec 23, 2018 at 01:53:48PM -0200, Martin Pieuchot wrote: > On 20/12/18(Thu) 11:35, David Gwynne wrote: > > My last attempt in "let etherip(4) output directly to the ip stack" > > didn't go so well. This one at least doesn't break bridge(4) on > > Ethernet. > > It breaks it on mpw(4). I argue that mpw(4) is broken anyway, but this would make it worse, yes. > > This diff effectively turns if_enqueue into a per interface function > > pointer. Things still call if_enqueue(), which does some pf stuff, and then > > it calls ifp->if_enqueue. By default, that function pointer is > > if_enqueue_ifq() which puts the packet on the interface send queues. > > > > Ethernet interfaces intercept the call to if_enqueue_ifq(). They get set > > up to call ether_enqueue, which does the bridge stuff and then calls > > ac->ac_enqueue, which shadows the ifp->if_enqueue functionality. > > Why do we need two abstraction layers? It feels weird to have > pseudo-drivers rely on a pointer in 'struct arpcom', because > they don't necessarily need this structure. If a pseudo-driver pretends to be Ethernet it will have arpcom. The main reason to have two layers for Ethernet interfaces is to take bridge(4) knowledge out of the generic interface code. > > This takes Ethernet bridge(4) knowledge out of the generic interface > > code. Non-Ethernet things that want to bypass ifqs can override > > ifp->if_enqueue. Ethernet things can override ac->ac_enqueue. > > Why take it out of the generic? We spent a lot of energy putting it > there, now you're taking the opposite direction. So let's spread > NBRIDGE > 1 all over the stack again? The bridge stuff would just be in if_ethersubr.c. Hopefully. if_isconnected() would be harder to pull apart. > > > This diff includes a change to vlan(4) to take advantage of this. It > > overrides ac_enqueue with vlan_enqueue(). So a packet sent out a vlan > > interface will go into if_enqueue(), which will call ether_enqueue() > > because ether_ifattach set that up. ether_enqueue() will call > > ac->ac_enqueue which is vlan_enqueue on vlan interfaces, which will > > eventually call if_enqueue() on the parent interface. > > I like the idea behind vlan_transmit(). > > > There are some things I'd like to try if this goes in. eg, it may be > > possible to have ether_enqueue simple call ac->ac_enqueue, and let > > bridge(4) swap it out with bridge_enqueue code. Doing that now would > > complicate the diff though. > > Why? What's the gain? Avoiding a conditional in a hotpath? > > Anyway, how's this look, esp relative to the previous diff? > > It looks better but I'm afraid this diff still breaks stuff, I'd > suggest you to keep your fix simple. I'm afraid with all the > abstractions that you're introducing. They look like premature > optimizations to me. Fair enough. > One more comment below. > > > Index: net/if.c > > === > > RCS file: /cvs/src/sys/net/if.c,v > > retrieving revision 1.569 > > diff -u -p -r1.569 if.c > > --- net/if.c11 Dec 2018 22:08:57 - 1.569 > > +++ net/if.c19 Dec 2018 06:48:37 - > > @@ -683,34 +685,33 @@ if_qstart_compat(struct ifqueue *ifq) > > int > > if_enqueue(struct ifnet *ifp, struct mbuf *m) > > { > > - unsigned int idx; > > - struct ifqueue *ifq; > > - int error; > > - > > #if NPF > 0 > > if (m->m_pkthdr.pf.delay > 0) > > return (pf_delay_pkt(m, ifp->if_index)); > > -#endif > > - > > -#if NBRIDGE > 0 > > - if (ifp->if_bridgeport && (m->m_flags & M_PROTO1) == 0) { > > - KERNEL_LOCK(); > > - error = bridge_output(ifp, m, NULL, NULL); > > - KERNEL_UNLOCK(); > > - return (error); > > - } > > -#endif > > > > -#if NPF > 0 > > pf_pkt_addr_changed(m); > > #endif /* NPF > 0 */ > > Are you sure calling pf_pkt_addr_changed() before bridge_output() won't > create any unwanted side effect? In any case I'd suggest keeping this > separated, to be able to revert it easily :D bridge calls if_enqueue to send a packet out, so it'll get pf_pkt_addr_changed() called on the new interface either way. However, policy via pf on bridge would be affected. Unpreoptimising the code gets us back to the original order though. How's this look? Index: if.c === RCS file: /cvs/src/sys/net/if.c,v retrieving revision 1.569 diff -u -p -r1.569 if.c --- if.c11 Dec 2018 22:08:57 - 1.569 +++ if.c30 Dec 2018 02:56:06 - @@ -632,6 +632,8 @@ if_attach_common(struct ifnet *ifp) if (ifp->if_rtrequest == NULL) ifp->if_rtrequest = if_rtrequest_dummy; + if (ifp->if_enqueue == NULL) + ifp->if_enqueue = if_enqueue_ifq; ifp->if_llprio = IFQ_DEFPRIO; SRPL_INIT(>if_inputs); @@ -683,10 +685,6 @@ if_qstart_compat(struct ifqueue *ifq) int if_enqueue(struct ifnet *ifp, struct mbuf
Re: Another attempt to bypass ifqs on interfaces
On 20/12/18(Thu) 11:35, David Gwynne wrote: > My last attempt in "let etherip(4) output directly to the ip stack" > didn't go so well. This one at least doesn't break bridge(4) on > Ethernet. It breaks it on mpw(4). > This diff effectively turns if_enqueue into a per interface function > pointer. Things still call if_enqueue(), which does some pf stuff, and then > it calls ifp->if_enqueue. By default, that function pointer is > if_enqueue_ifq() which puts the packet on the interface send queues. > > Ethernet interfaces intercept the call to if_enqueue_ifq(). They get set > up to call ether_enqueue, which does the bridge stuff and then calls > ac->ac_enqueue, which shadows the ifp->if_enqueue functionality. Why do we need two abstraction layers? It feels weird to have pseudo-drivers rely on a pointer in 'struct arpcom', because they don't necessarily need this structure. > This takes Ethernet bridge(4) knowledge out of the generic interface > code. Non-Ethernet things that want to bypass ifqs can override > ifp->if_enqueue. Ethernet things can override ac->ac_enqueue. Why take it out of the generic? We spent a lot of energy putting it there, now you're taking the opposite direction. So let's spread NBRIDGE > 1 all over the stack again? > This diff includes a change to vlan(4) to take advantage of this. It > overrides ac_enqueue with vlan_enqueue(). So a packet sent out a vlan > interface will go into if_enqueue(), which will call ether_enqueue() > because ether_ifattach set that up. ether_enqueue() will call > ac->ac_enqueue which is vlan_enqueue on vlan interfaces, which will > eventually call if_enqueue() on the parent interface. I like the idea behind vlan_transmit(). > > There are some things I'd like to try if this goes in. eg, it may be > possible to have ether_enqueue simple call ac->ac_enqueue, and let > bridge(4) swap it out with bridge_enqueue code. Doing that now would > complicate the diff though. Why? What's the gain? > Anyway, how's this look, esp relative to the previous diff? It looks better but I'm afraid this diff still breaks stuff, I'd suggest you to keep your fix simple. I'm afraid with all the abstractions that you're introducing. They look like premature optimizations to me. One more comment below. > Index: net/if.c > === > RCS file: /cvs/src/sys/net/if.c,v > retrieving revision 1.569 > diff -u -p -r1.569 if.c > --- net/if.c 11 Dec 2018 22:08:57 - 1.569 > +++ net/if.c 19 Dec 2018 06:48:37 - > @@ -683,34 +685,33 @@ if_qstart_compat(struct ifqueue *ifq) > int > if_enqueue(struct ifnet *ifp, struct mbuf *m) > { > - unsigned int idx; > - struct ifqueue *ifq; > - int error; > - > #if NPF > 0 > if (m->m_pkthdr.pf.delay > 0) > return (pf_delay_pkt(m, ifp->if_index)); > -#endif > - > -#if NBRIDGE > 0 > - if (ifp->if_bridgeport && (m->m_flags & M_PROTO1) == 0) { > - KERNEL_LOCK(); > - error = bridge_output(ifp, m, NULL, NULL); > - KERNEL_UNLOCK(); > - return (error); > - } > -#endif > > -#if NPF > 0 > pf_pkt_addr_changed(m); > #endif /* NPF > 0 */ Are you sure calling pf_pkt_addr_changed() before bridge_output() won't create any unwanted side effect? In any case I'd suggest keeping this separated, to be able to revert it easily :D
Another attempt to bypass ifqs on interfaces
My last attempt in "let etherip(4) output directly to the ip stack" didn't go so well. This one at least doesn't break bridge(4) on Ethernet. This diff effectively turns if_enqueue into a per interface function pointer. Things still call if_enqueue(), which does some pf stuff, and then it calls ifp->if_enqueue. By default, that function pointer is if_enqueue_ifq() which puts the packet on the interface send queues. Ethernet interfaces intercept the call to if_enqueue_ifq(). They get set up to call ether_enqueue, which does the bridge stuff and then calls ac->ac_enqueue, which shadows the ifp->if_enqueue functionality. This takes Ethernet bridge(4) knowledge out of the generic interface code. Non-Ethernet things that want to bypass ifqs can override ifp->if_enqueue. Ethernet things can override ac->ac_enqueue. This diff includes a change to vlan(4) to take advantage of this. It overrides ac_enqueue with vlan_enqueue(). So a packet sent out a vlan interface will go into if_enqueue(), which will call ether_enqueue() because ether_ifattach set that up. ether_enqueue() will call ac->ac_enqueue which is vlan_enqueue on vlan interfaces, which will eventually call if_enqueue() on the parent interface. There are some things I'd like to try if this goes in. eg, it may be possible to have ether_enqueue simple call ac->ac_enqueue, and let bridge(4) swap it out with bridge_enqueue code. Doing that now would complicate the diff though. Anyway, how's this look, esp relative to the previous diff? Index: net/if.c === RCS file: /cvs/src/sys/net/if.c,v retrieving revision 1.569 diff -u -p -r1.569 if.c --- net/if.c11 Dec 2018 22:08:57 - 1.569 +++ net/if.c19 Dec 2018 06:48:37 - @@ -632,6 +632,8 @@ if_attach_common(struct ifnet *ifp) if (ifp->if_rtrequest == NULL) ifp->if_rtrequest = if_rtrequest_dummy; + if (ifp->if_enqueue == NULL) + ifp->if_enqueue = if_enqueue_ifq; ifp->if_llprio = IFQ_DEFPRIO; SRPL_INIT(>if_inputs); @@ -683,34 +685,33 @@ if_qstart_compat(struct ifqueue *ifq) int if_enqueue(struct ifnet *ifp, struct mbuf *m) { - unsigned int idx; - struct ifqueue *ifq; - int error; - #if NPF > 0 if (m->m_pkthdr.pf.delay > 0) return (pf_delay_pkt(m, ifp->if_index)); -#endif - -#if NBRIDGE > 0 - if (ifp->if_bridgeport && (m->m_flags & M_PROTO1) == 0) { - KERNEL_LOCK(); - error = bridge_output(ifp, m, NULL, NULL); - KERNEL_UNLOCK(); - return (error); - } -#endif -#if NPF > 0 pf_pkt_addr_changed(m); #endif /* NPF > 0 */ - /* -* use the operations on the first ifq to pick which of the array -* gets this mbuf. -*/ - idx = ifq_idx(>if_snd, ifp->if_nifqs, m); - ifq = ifp->if_ifqs[idx]; + return ((*ifp->if_enqueue)(ifp, m)); +} + +int +if_enqueue_ifq(struct ifnet *ifp, struct mbuf *m) +{ + struct ifqueue *ifq = >if_snd; + int error; + + if (ifp->if_nifqs > 1) { + unsigned int idx; + + /* +* use the operations on the first ifq to pick which of +* the array gets this mbuf. +*/ + + idx = ifq_idx(>if_snd, ifp->if_nifqs, m); + ifq = ifp->if_ifqs[idx]; + } error = ifq_enqueue(ifq, m); if (error) Index: net/if_ethersubr.c === RCS file: /cvs/src/sys/net/if_ethersubr.c,v retrieving revision 1.255 diff -u -p -r1.255 if_ethersubr.c --- net/if_ethersubr.c 12 Dec 2018 05:38:26 - 1.255 +++ net/if_ethersubr.c 19 Dec 2018 06:48:37 - @@ -121,6 +121,13 @@ didn't get a copy, you may request one f #include #endif /* MPLS */ +#include "bridge.h" +#if NBRIDGE > 0 +#include +#endif + +static int ether_enqueue(struct ifnet *, struct mbuf *); + u_int8_t etherbroadcastaddr[ETHER_ADDR_LEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; u_int8_t etheranyaddr[ETHER_ADDR_LEN] = @@ -335,6 +342,26 @@ ether_output(struct ifnet *ifp, struct m return (if_enqueue(ifp, m)); } +static int +ether_enqueue(struct ifnet *ifp, struct mbuf *m) +{ + struct arpcom *ac = (struct arpcom *)ifp; + +#if NBRIDGE > 0 + if (ifp->if_bridgeport && (m->m_flags & M_PROTO1) == 0) { + int error; + + KERNEL_LOCK(); + error = bridge_output(ifp, m, NULL, NULL); + KERNEL_UNLOCK(); + + return (error); + } +#endif + + return ((*ac->ac_enqueue)(ifp, m)); +} + /* * Process a received Ethernet packet; * the packet is in the mbuf chain m without @@ -518,6 +545,10 @@ ether_ifattach(struct ifnet *ifp) if (ifp->if_hardmtu == 0) ifp->if_hardmtu = ETHERMTU; + + ifp->if_enqueue = ether_enqueue; + if