On Wed, May 11, 2022 at 11:01:21AM +0200, Alexandr Nedvedicky wrote:
> Hello Hrvoje,
> 
> thank you for testing.
> On Wed, May 11, 2022 at 10:40:28AM +0200, Hrvoje Popovski wrote:
> > On 10.5.2022. 22:55, Alexander Bluhm wrote:
> > > Yes.  It is similar.
> > > 
> > > I have read the whole mail thread and the final fix got commited.
> > > But it looks incomplete, pf is still sleeping.
> > > 
> > > Hrvoje, can you run the tests again that triggered the panics a
> > > year ago?
> > 
> > Hi,
> > 
> > year ago panics was triggered when veb or tpmr bridged traffic. I've
> > tried that right now and I couldn't trigger that panics.
> > If I put vport and route traffic over veb than I can trigger panic with
> > or without vlans as child-iface for veb.
> > For panic i need to have pf enabled and to run
> > ifconfig veb destroy or ifconfig vlan destroy and sh netstart in loop.
> > 
> > 
> > this is panic without vlans
> > 
> > ddb{4}> show all locks
> > CPU 4:
> > exclusive sched_lock &sched_lock r = 0 (0xffffffff824819c0)
> > #0  witness_lock+0x311
> > #1  sleep_setup+0xa5
> > #2  rw_enter+0x211
> > #3  pf_test+0xcf0
> > #4  ip_output+0x6b7
> > #5  ip_forward+0x2da
> > #6  ip_input_if+0x353
> > #7  ipv4_input+0x39
> > #8  ether_input+0x3ad
> > #9  vport_if_enqueue+0x19
> > #10 veb_port_input+0x529
> 
>     I suspect we deal with broadcast and this time things start
>     to go downhill here in veb_broadcast():
> 
>  875         smr_read_enter();
>  876         SMR_TAILQ_FOREACH(tp, &sc->sc_ports.l_list, p_entry) {
>  877                 if (rp == tp || (rp->p_protected & tp->p_protected)) {
>  878                         /*
>  879                          * don't let Ethernet packets hairpin or
>  880                          * move between ports in the same protected
>  881                          * domain(s).
>  882                          */
>  883                         continue;
>  884                 }
>  885 
>  886                 ifp0 = tp->p_ifp0;
>  887                 if (!ISSET(ifp0->if_flags, IFF_RUNNING)) {
>  888                         /* don't waste time */
>  889                         continue;
>  890                 }
>  891 
>  892                 if (!ISSET(tp->p_bif_flags, IFBIF_DISCOVER) &&
>  893                     !ISSET(m0->m_flags, M_BCAST | M_MCAST)) {
>  894                         /* don't flood unknown unicast */
>  895                         continue;
>  896                 }
>  897 
>  898                 if (veb_rule_filter(tp, VEB_RULE_LIST_OUT, m0, src, dst))
>  899                         continue;
>  900 
>  901                 m = m_dup_pkt(m0, max_linkhdr + ETHER_ALIGN, M_NOWAIT);
>  902                 if (m == NULL) {
>  903                         /* XXX count error? */
>  904                         continue;
>  905                 }
>  906 
>  907                 (*tp->p_enqueue)(ifp0, m); /* XXX count error */
>  908         }
>  909         smr_read_leave();
> 
>     the veb_broadcast() is being called from veb_port_input():
> 1082 
> 1083                 /* unknown unicast address */
> 1084         } else {
> 1085                 SET(m->m_flags, ETH64_IS_BROADCAST(dst) ? M_BCAST : 
> M_MCAST);
> 1086         }
> 1087 
> 1088         veb_broadcast(sc, p, m, src, dst);
> 1089         return (NULL);
> 
>     I think I've spotted the place yesterday night in code, but could not
>     figure out how to wire things up to trigger the panic. Fortunately you've
>     managed to figure it out.
> 
> I think vport_if_enqueue() is a trap/land mine here. The function should 
> really
> be queueing a packet instead of dispatching it right away.
> 

I think the diff below may be enough to fix this issue. It drops the SMR
critical secition around the enqueue operation but uses a reference on the
port insteadt to ensure that the device can't be removed during the
enqueue. Once the enqueue is finished we enter the SMR critical section
again and drop the reference.

To make it clear that the SMR_TAILQ remains intact while a refcount is
held I moved refcnt_finalize() above SMR_TAILQ_REMOVE_LOCKED(). This is
not strictly needed since the next pointer remains valid up until the
smr_barrier() call but I find this a bit easier to understand.
First make sure nobody else holds a reference then remove the port from
the list.

I currently have no test setup to verify this but I hope someone else can
give this a spin.
-- 
:wq Claudio

Index: if_veb.c
===================================================================
RCS file: /cvs/src/sys/net/if_veb.c,v
retrieving revision 1.25
diff -u -p -r1.25 if_veb.c
--- if_veb.c    4 Jan 2022 06:32:39 -0000       1.25
+++ if_veb.c    12 May 2022 12:06:28 -0000
@@ -365,7 +365,11 @@ veb_span(struct veb_softc *sc, struct mb
                        continue;
                }
 
+               veb_eb_brport_take(p);
+               smr_read_leave();
                if_enqueue(ifp0, m); /* XXX count error */
+               smr_read_enter();
+               veb_eb_brport_rele(p);
        }
        smr_read_leave();
 }
@@ -904,7 +908,12 @@ veb_broadcast(struct veb_softc *sc, stru
                        continue;
                }
 
+
+               veb_eb_brport_take(tp);
+               smr_read_leave();
                (*tp->p_enqueue)(ifp0, m); /* XXX count error */
+               smr_read_enter();
+               veb_eb_brport_rele(tp);
        }
        smr_read_leave();
 
@@ -1934,10 +1943,11 @@ veb_p_dtor(struct veb_softc *sc, struct 
 
                port_list = &sc->sc_ports;
        }
+       refcnt_finalize(&p->p_refs, "vebpdtor");
+
        SMR_TAILQ_REMOVE_LOCKED(&port_list->l_list, p, p_entry);
        port_list->l_count--;
 
-       refcnt_finalize(&p->p_refs, "vebpdtor");
        smr_barrier();
 
        veb_rule_list_free(TAILQ_FIRST(&p->p_vrl));

Reply via email to