Re: Another attempt to bypass ifqs on interfaces

2019-01-01 Thread David Gwynne
On Sun, Dec 23, 2018 at 01:53:48PM -0200, Martin Pieuchot wrote:
> On 20/12/18(Thu) 11:35, David Gwynne wrote:
> > My last attempt in "let etherip(4) output directly to the ip stack"
> > didn't go so well. This one at least doesn't break bridge(4) on
> > Ethernet.
> 
> It breaks it on mpw(4).

I argue that mpw(4) is broken anyway, but this would make it worse, yes.

> > This diff effectively turns if_enqueue into a per interface function
> > pointer. Things still call if_enqueue(), which does some pf stuff, and then
> > it calls ifp->if_enqueue. By default, that function pointer is
> > if_enqueue_ifq() which puts the packet on the interface send queues.
> > 
> > Ethernet interfaces intercept the call to if_enqueue_ifq(). They get set
> > up to call ether_enqueue, which does the bridge stuff and then calls
> > ac->ac_enqueue, which shadows the ifp->if_enqueue functionality.
> 
> Why do we need two abstraction layers?  It feels weird to have
> pseudo-drivers rely on a pointer in 'struct arpcom', because
> they don't necessarily need this structure.

If a pseudo-driver pretends to be Ethernet it will have arpcom. The main
reason to have two layers for Ethernet interfaces is to take bridge(4)
knowledge out of the generic interface code.

> > This takes Ethernet bridge(4) knowledge out of the generic interface
> > code. Non-Ethernet things that want to bypass ifqs can override
> > ifp->if_enqueue. Ethernet things can override ac->ac_enqueue.
> 
> Why take it out of the generic?  We spent a lot of energy putting it
> there, now you're taking the opposite direction.  So let's spread
> NBRIDGE > 1 all over the stack again?

The bridge stuff would just be in if_ethersubr.c. Hopefully.
if_isconnected() would be harder to pull apart.

> 
> > This diff includes a change to vlan(4) to take advantage of this. It
> > overrides ac_enqueue with vlan_enqueue(). So a packet sent out a vlan
> > interface will go into if_enqueue(), which will call ether_enqueue()
> > because ether_ifattach set that up. ether_enqueue() will call
> > ac->ac_enqueue which is vlan_enqueue on vlan interfaces, which will
> > eventually call if_enqueue() on the parent interface.
> 
> I like the idea behind vlan_transmit().  
> 
> > There are some things I'd like to try if this goes in. eg, it may be
> > possible to have ether_enqueue simple call ac->ac_enqueue, and let
> > bridge(4) swap it out with bridge_enqueue code. Doing that now would
> > complicate the diff though.
> 
> Why?  What's the gain?

Avoiding a conditional in a hotpath?

> > Anyway, how's this look, esp relative to the previous diff?
> 
> It looks better but I'm afraid this diff still breaks stuff, I'd
> suggest you to keep your fix simple.  I'm afraid with all the
> abstractions that you're introducing.  They look like premature
> optimizations to me.

Fair enough.

> One more comment below.
> 
> > Index: net/if.c
> > ===
> > RCS file: /cvs/src/sys/net/if.c,v
> > retrieving revision 1.569
> > diff -u -p -r1.569 if.c
> > --- net/if.c11 Dec 2018 22:08:57 -  1.569
> > +++ net/if.c19 Dec 2018 06:48:37 -
> > @@ -683,34 +685,33 @@ if_qstart_compat(struct ifqueue *ifq)
> >  int
> >  if_enqueue(struct ifnet *ifp, struct mbuf *m)
> >  {
> > -   unsigned int idx;
> > -   struct ifqueue *ifq;
> > -   int error;
> > -
> >  #if NPF > 0
> > if (m->m_pkthdr.pf.delay > 0)
> > return (pf_delay_pkt(m, ifp->if_index));
> > -#endif
> > -
> > -#if NBRIDGE > 0
> > -   if (ifp->if_bridgeport && (m->m_flags & M_PROTO1) == 0) {
> > -   KERNEL_LOCK();
> > -   error = bridge_output(ifp, m, NULL, NULL);
> > -   KERNEL_UNLOCK();
> > -   return (error);
> > -   }
> > -#endif
> >  
> > -#if NPF > 0
> > pf_pkt_addr_changed(m);
> >  #endif /* NPF > 0 */
> 
> Are you sure calling pf_pkt_addr_changed() before bridge_output() won't
> create any unwanted side effect?  In any case I'd suggest keeping this
> separated, to be able to revert it easily :D

bridge calls if_enqueue to send a packet out, so it'll get
pf_pkt_addr_changed() called on the new interface either way. However,
policy via pf on bridge would be affected.

Unpreoptimising the code gets us back to the original order though.

How's this look?

Index: if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.569
diff -u -p -r1.569 if.c
--- if.c11 Dec 2018 22:08:57 -  1.569
+++ if.c30 Dec 2018 02:56:06 -
@@ -632,6 +632,8 @@ if_attach_common(struct ifnet *ifp)
 
if (ifp->if_rtrequest == NULL)
ifp->if_rtrequest = if_rtrequest_dummy;
+   if (ifp->if_enqueue == NULL)
+   ifp->if_enqueue = if_enqueue_ifq;
ifp->if_llprio = IFQ_DEFPRIO;
 
SRPL_INIT(>if_inputs);
@@ -683,10 +685,6 @@ if_qstart_compat(struct ifqueue *ifq)
 int
 if_enqueue(struct ifnet *ifp, struct mbuf 

Re: Another attempt to bypass ifqs on interfaces

2018-12-23 Thread Martin Pieuchot
On 20/12/18(Thu) 11:35, David Gwynne wrote:
> My last attempt in "let etherip(4) output directly to the ip stack"
> didn't go so well. This one at least doesn't break bridge(4) on
> Ethernet.

It breaks it on mpw(4).

> This diff effectively turns if_enqueue into a per interface function
> pointer. Things still call if_enqueue(), which does some pf stuff, and then
> it calls ifp->if_enqueue. By default, that function pointer is
> if_enqueue_ifq() which puts the packet on the interface send queues.
> 
> Ethernet interfaces intercept the call to if_enqueue_ifq(). They get set
> up to call ether_enqueue, which does the bridge stuff and then calls
> ac->ac_enqueue, which shadows the ifp->if_enqueue functionality.

Why do we need two abstraction layers?  It feels weird to have
pseudo-drivers rely on a pointer in 'struct arpcom', because
they don't necessarily need this structure.

> This takes Ethernet bridge(4) knowledge out of the generic interface
> code. Non-Ethernet things that want to bypass ifqs can override
> ifp->if_enqueue. Ethernet things can override ac->ac_enqueue.

Why take it out of the generic?  We spent a lot of energy putting it
there, now you're taking the opposite direction.  So let's spread
NBRIDGE > 1 all over the stack again?

> This diff includes a change to vlan(4) to take advantage of this. It
> overrides ac_enqueue with vlan_enqueue(). So a packet sent out a vlan
> interface will go into if_enqueue(), which will call ether_enqueue()
> because ether_ifattach set that up. ether_enqueue() will call
> ac->ac_enqueue which is vlan_enqueue on vlan interfaces, which will
> eventually call if_enqueue() on the parent interface.

I like the idea behind vlan_transmit().  

> 
> There are some things I'd like to try if this goes in. eg, it may be
> possible to have ether_enqueue simple call ac->ac_enqueue, and let
> bridge(4) swap it out with bridge_enqueue code. Doing that now would
> complicate the diff though.

Why?  What's the gain?

> Anyway, how's this look, esp relative to the previous diff?

It looks better but I'm afraid this diff still breaks stuff, I'd
suggest you to keep your fix simple.  I'm afraid with all the
abstractions that you're introducing.  They look like premature
optimizations to me.

One more comment below.

> Index: net/if.c
> ===
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.569
> diff -u -p -r1.569 if.c
> --- net/if.c  11 Dec 2018 22:08:57 -  1.569
> +++ net/if.c  19 Dec 2018 06:48:37 -
> @@ -683,34 +685,33 @@ if_qstart_compat(struct ifqueue *ifq)
>  int
>  if_enqueue(struct ifnet *ifp, struct mbuf *m)
>  {
> - unsigned int idx;
> - struct ifqueue *ifq;
> - int error;
> -
>  #if NPF > 0
>   if (m->m_pkthdr.pf.delay > 0)
>   return (pf_delay_pkt(m, ifp->if_index));
> -#endif
> -
> -#if NBRIDGE > 0
> - if (ifp->if_bridgeport && (m->m_flags & M_PROTO1) == 0) {
> - KERNEL_LOCK();
> - error = bridge_output(ifp, m, NULL, NULL);
> - KERNEL_UNLOCK();
> - return (error);
> - }
> -#endif
>  
> -#if NPF > 0
>   pf_pkt_addr_changed(m);
>  #endif   /* NPF > 0 */

Are you sure calling pf_pkt_addr_changed() before bridge_output() won't
create any unwanted side effect?  In any case I'd suggest keeping this
separated, to be able to revert it easily :D



Another attempt to bypass ifqs on interfaces

2018-12-19 Thread David Gwynne
My last attempt in "let etherip(4) output directly to the ip stack"
didn't go so well. This one at least doesn't break bridge(4) on
Ethernet.

This diff effectively turns if_enqueue into a per interface function
pointer. Things still call if_enqueue(), which does some pf stuff, and then
it calls ifp->if_enqueue. By default, that function pointer is
if_enqueue_ifq() which puts the packet on the interface send queues.

Ethernet interfaces intercept the call to if_enqueue_ifq(). They get set
up to call ether_enqueue, which does the bridge stuff and then calls
ac->ac_enqueue, which shadows the ifp->if_enqueue functionality.

This takes Ethernet bridge(4) knowledge out of the generic interface
code. Non-Ethernet things that want to bypass ifqs can override
ifp->if_enqueue. Ethernet things can override ac->ac_enqueue.

This diff includes a change to vlan(4) to take advantage of this. It
overrides ac_enqueue with vlan_enqueue(). So a packet sent out a vlan
interface will go into if_enqueue(), which will call ether_enqueue()
because ether_ifattach set that up. ether_enqueue() will call
ac->ac_enqueue which is vlan_enqueue on vlan interfaces, which will
eventually call if_enqueue() on the parent interface.

There are some things I'd like to try if this goes in. eg, it may be
possible to have ether_enqueue simple call ac->ac_enqueue, and let
bridge(4) swap it out with bridge_enqueue code. Doing that now would
complicate the diff though.

Anyway, how's this look, esp relative to the previous diff?

Index: net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.569
diff -u -p -r1.569 if.c
--- net/if.c11 Dec 2018 22:08:57 -  1.569
+++ net/if.c19 Dec 2018 06:48:37 -
@@ -632,6 +632,8 @@ if_attach_common(struct ifnet *ifp)
 
if (ifp->if_rtrequest == NULL)
ifp->if_rtrequest = if_rtrequest_dummy;
+   if (ifp->if_enqueue == NULL)
+   ifp->if_enqueue = if_enqueue_ifq;
ifp->if_llprio = IFQ_DEFPRIO;
 
SRPL_INIT(>if_inputs);
@@ -683,34 +685,33 @@ if_qstart_compat(struct ifqueue *ifq)
 int
 if_enqueue(struct ifnet *ifp, struct mbuf *m)
 {
-   unsigned int idx;
-   struct ifqueue *ifq;
-   int error;
-
 #if NPF > 0
if (m->m_pkthdr.pf.delay > 0)
return (pf_delay_pkt(m, ifp->if_index));
-#endif
-
-#if NBRIDGE > 0
-   if (ifp->if_bridgeport && (m->m_flags & M_PROTO1) == 0) {
-   KERNEL_LOCK();
-   error = bridge_output(ifp, m, NULL, NULL);
-   KERNEL_UNLOCK();
-   return (error);
-   }
-#endif
 
-#if NPF > 0
pf_pkt_addr_changed(m);
 #endif /* NPF > 0 */
 
-   /*
-* use the operations on the first ifq to pick which of the array
-* gets this mbuf.
-*/
-   idx = ifq_idx(>if_snd, ifp->if_nifqs, m);
-   ifq = ifp->if_ifqs[idx];
+   return ((*ifp->if_enqueue)(ifp, m));
+}
+
+int
+if_enqueue_ifq(struct ifnet *ifp, struct mbuf *m)
+{
+   struct ifqueue *ifq = >if_snd;
+   int error;
+
+   if (ifp->if_nifqs > 1) {
+   unsigned int idx;
+
+   /*
+* use the operations on the first ifq to pick which of
+* the array gets this mbuf.
+*/
+
+   idx = ifq_idx(>if_snd, ifp->if_nifqs, m);
+   ifq = ifp->if_ifqs[idx];
+   }
 
error = ifq_enqueue(ifq, m);
if (error)
Index: net/if_ethersubr.c
===
RCS file: /cvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.255
diff -u -p -r1.255 if_ethersubr.c
--- net/if_ethersubr.c  12 Dec 2018 05:38:26 -  1.255
+++ net/if_ethersubr.c  19 Dec 2018 06:48:37 -
@@ -121,6 +121,13 @@ didn't get a copy, you may request one f
 #include 
 #endif /* MPLS */
 
+#include "bridge.h"
+#if NBRIDGE > 0
+#include 
+#endif
+
+static int ether_enqueue(struct ifnet *, struct mbuf *);
+
 u_int8_t etherbroadcastaddr[ETHER_ADDR_LEN] =
 { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
 u_int8_t etheranyaddr[ETHER_ADDR_LEN] =
@@ -335,6 +342,26 @@ ether_output(struct ifnet *ifp, struct m
return (if_enqueue(ifp, m));
 }
 
+static int
+ether_enqueue(struct ifnet *ifp, struct mbuf *m)
+{
+   struct arpcom *ac = (struct arpcom *)ifp;
+
+#if NBRIDGE > 0
+   if (ifp->if_bridgeport && (m->m_flags & M_PROTO1) == 0) {
+   int error;
+
+   KERNEL_LOCK();
+   error = bridge_output(ifp, m, NULL, NULL);
+   KERNEL_UNLOCK();
+
+   return (error);
+   }
+#endif
+
+   return ((*ac->ac_enqueue)(ifp, m));
+}
+
 /*
  * Process a received Ethernet packet;
  * the packet is in the mbuf chain m without
@@ -518,6 +545,10 @@ ether_ifattach(struct ifnet *ifp)
 
if (ifp->if_hardmtu == 0)
ifp->if_hardmtu = ETHERMTU;
+
+   ifp->if_enqueue = ether_enqueue;
+   if