Re: [External] : Re: 7.1-Current crash with NET_TASKQ 4 and veb interface
On Thu, May 12, 2022 at 08:07:09PM +0200, Hrvoje Popovski wrote: > On 12.5.2022. 20:04, Hrvoje Popovski wrote: > > On 12.5.2022. 16:22, Hrvoje Popovski wrote: > >> On 12.5.2022. 14:48, Claudio Jeker wrote: > >>> I think the diff below may be enough to fix this issue. It drops the SMR > >>> critical secition around the enqueue operation but uses a reference on the > >>> port insteadt to ensure that the device can't be removed during the > >>> enqueue. Once the enqueue is finished we enter the SMR critical section > >>> again and drop the reference. > >>> > >>> To make it clear that the SMR_TAILQ remains intact while a refcount is > >>> held I moved refcnt_finalize() above SMR_TAILQ_REMOVE_LOCKED(). This is > >>> not strictly needed since the next pointer remains valid up until the > >>> smr_barrier() call but I find this a bit easier to understand. > >>> First make sure nobody else holds a reference then remove the port from > >>> the list. > >>> > >>> I currently have no test setup to verify this but I hope someone else can > >>> give this a spin. > >> Hi, > >> > >> for now veb seems stable and i can't panic box although it should, but > >> please give me few more hours to torture it properly. > > > > > > I can trigger panic in veb with this diff. > > > > Thank you .. > > > > > > I can't trigger ... :))) sorry .. sorry i'm late to the party. can you try this diff? this diff replaces the list of ports with an array/map of ports. the map takes references to all the ports, so the forwarding paths just have to hold a reference to the map to be able to use all the ports. the forwarding path uses smr to get hold of a map, takes a map ref, and then leaves the smr crit section before iterating over the map and pushing packets. this means we should only take and release a single refcnt when we're pushing packets out any number of ports. if no span ports are configured, then there's no span port map and we don't try and take a ref, we can just return early. we also only take and release a single refcnt when we forward the actual packet. forwarding to a single port provided by an etherbridge lookup already takes/releases the single port ref. if it falls through that for unknown unicast or broadcast/multicast, then it's a single refcnt for the current map of all ports. Index: if_veb.c === RCS file: /cvs/src/sys/net/if_veb.c,v retrieving revision 1.25 diff -u -p -r1.25 if_veb.c --- if_veb.c4 Jan 2022 06:32:39 - 1.25 +++ if_veb.c13 May 2022 02:01:43 - @@ -139,13 +139,13 @@ struct veb_port { struct veb_rule_list p_vr_list[2]; #define VEB_RULE_LIST_OUT 0 #define VEB_RULE_LIST_IN 1 - - SMR_TAILQ_ENTRY(veb_port)p_entry; }; struct veb_ports { - SMR_TAILQ_HEAD(, veb_port) l_list; - unsigned int l_count; + struct refcntm_refs; + unsigned int m_count; + + /* followed by an array of veb_port pointers */ }; struct veb_softc { @@ -155,8 +155,8 @@ struct veb_softc { struct etherbridge sc_eb; struct rwlocksc_rule_lock; - struct veb_ports sc_ports; - struct veb_ports sc_spans; + struct veb_ports*sc_ports; + struct veb_ports*sc_spans; }; #define DPRINTF(_sc, fmt...)do { \ @@ -184,8 +184,25 @@ static int veb_p_ioctl(struct ifnet *, u static int veb_p_output(struct ifnet *, struct mbuf *, struct sockaddr *, struct rtentry *); -static voidveb_p_dtor(struct veb_softc *, struct veb_port *, - const char *); +static inline size_t +veb_ports_size(unsigned int n) +{ + /* use of _ALIGN is inspired by CMSGs */ + return _ALIGN(sizeof(struct veb_ports)) + + n * sizeof(struct veb_port *); +} + +static inline struct veb_port ** +veb_ports_array(struct veb_ports *m) +{ + return (struct veb_port **)((caddr_t)m + _ALIGN(sizeof(*m))); +} + +static voidveb_ports_free(struct veb_ports *); + +static voidveb_p_unlink(struct veb_softc *, struct veb_port *); +static voidveb_p_fini(struct veb_port *); +static voidveb_p_dtor(struct veb_softc *, struct veb_port *); static int veb_add_port(struct veb_softc *, const struct ifbreq *, unsigned int); static int veb_del_port(struct veb_softc *, @@ -271,8 +288,8 @@ veb_clone_create(struct if_clone *ifc, i return (ENOMEM); rw_init(>sc_rule_lock, "vebrlk"); - SMR_TAILQ_INIT(>sc_ports.l_list); - SMR_TAILQ_INIT(>sc_spans.l_list); + sc->sc_ports = NULL; + sc->sc_spans = NULL; ifp = >sc_if; @@ -314,7 +331,10 @@ static int veb_clone_destroy(struct ifnet *ifp) { struct veb_softc *sc = ifp->if_softc; - struct
Re: [External] : Re: 7.1-Current crash with NET_TASKQ 4 and veb interface
On 12.5.2022. 20:04, Hrvoje Popovski wrote: > On 12.5.2022. 16:22, Hrvoje Popovski wrote: >> On 12.5.2022. 14:48, Claudio Jeker wrote: >>> I think the diff below may be enough to fix this issue. It drops the SMR >>> critical secition around the enqueue operation but uses a reference on the >>> port insteadt to ensure that the device can't be removed during the >>> enqueue. Once the enqueue is finished we enter the SMR critical section >>> again and drop the reference. >>> >>> To make it clear that the SMR_TAILQ remains intact while a refcount is >>> held I moved refcnt_finalize() above SMR_TAILQ_REMOVE_LOCKED(). This is >>> not strictly needed since the next pointer remains valid up until the >>> smr_barrier() call but I find this a bit easier to understand. >>> First make sure nobody else holds a reference then remove the port from >>> the list. >>> >>> I currently have no test setup to verify this but I hope someone else can >>> give this a spin. >> Hi, >> >> for now veb seems stable and i can't panic box although it should, but >> please give me few more hours to torture it properly. > > > I can trigger panic in veb with this diff. > > Thank you .. > > I can't trigger ... :))) sorry ..
Re: [External] : Re: 7.1-Current crash with NET_TASKQ 4 and veb interface
On 12.5.2022. 16:22, Hrvoje Popovski wrote: > On 12.5.2022. 14:48, Claudio Jeker wrote: >> I think the diff below may be enough to fix this issue. It drops the SMR >> critical secition around the enqueue operation but uses a reference on the >> port insteadt to ensure that the device can't be removed during the >> enqueue. Once the enqueue is finished we enter the SMR critical section >> again and drop the reference. >> >> To make it clear that the SMR_TAILQ remains intact while a refcount is >> held I moved refcnt_finalize() above SMR_TAILQ_REMOVE_LOCKED(). This is >> not strictly needed since the next pointer remains valid up until the >> smr_barrier() call but I find this a bit easier to understand. >> First make sure nobody else holds a reference then remove the port from >> the list. >> >> I currently have no test setup to verify this but I hope someone else can >> give this a spin. > Hi, > > for now veb seems stable and i can't panic box although it should, but > please give me few more hours to torture it properly. I can trigger panic in veb with this diff. Thank you ..
Re: [External] : Re: 7.1-Current crash with NET_TASKQ 4 and veb interface
Hello, > > I think the diff below may be enough to fix this issue. It drops the SMR > critical secition around the enqueue operation but uses a reference on the > port insteadt to ensure that the device can't be removed during the > enqueue. Once the enqueue is finished we enter the SMR critical section > again and drop the reference. > > To make it clear that the SMR_TAILQ remains intact while a refcount is > held I moved refcnt_finalize() above SMR_TAILQ_REMOVE_LOCKED(). This is > not strictly needed since the next pointer remains valid up until the > smr_barrier() call but I find this a bit easier to understand. > First make sure nobody else holds a reference then remove the port from > the list. > > I currently have no test setup to verify this but I hope someone else can > give this a spin. I agree this diff is far better than mine version. I have not realized we can actually do a dance with smr_read_enter()/smr_read_leave(). I believe it should work and fix a problem. Here is my OK (if my OK counts here). OK sashan > -- > :wq Claudio > > Index: if_veb.c > === > RCS file: /cvs/src/sys/net/if_veb.c,v > retrieving revision 1.25 > diff -u -p -r1.25 if_veb.c > --- if_veb.c 4 Jan 2022 06:32:39 - 1.25 > +++ if_veb.c 12 May 2022 12:06:28 - > @@ -365,7 +365,11 @@ veb_span(struct veb_softc *sc, struct mb > continue; > } > > + veb_eb_brport_take(p); > + smr_read_leave(); > if_enqueue(ifp0, m); /* XXX count error */ > + smr_read_enter(); > + veb_eb_brport_rele(p); > } > smr_read_leave(); > } > @@ -904,7 +908,12 @@ veb_broadcast(struct veb_softc *sc, stru > continue; > } > > + > + veb_eb_brport_take(tp); > + smr_read_leave(); > (*tp->p_enqueue)(ifp0, m); /* XXX count error */ > + smr_read_enter(); > + veb_eb_brport_rele(tp); > } > smr_read_leave(); > > @@ -1934,10 +1943,11 @@ veb_p_dtor(struct veb_softc *sc, struct > > port_list = >sc_ports; > } > + refcnt_finalize(>p_refs, "vebpdtor"); > + > SMR_TAILQ_REMOVE_LOCKED(_list->l_list, p, p_entry); > port_list->l_count--; > > - refcnt_finalize(>p_refs, "vebpdtor"); > smr_barrier(); > > veb_rule_list_free(TAILQ_FIRST(>p_vrl));
Re: [External] : Re: 7.1-Current crash with NET_TASKQ 4 and veb interface
On 12.5.2022. 14:48, Claudio Jeker wrote: > I think the diff below may be enough to fix this issue. It drops the SMR > critical secition around the enqueue operation but uses a reference on the > port insteadt to ensure that the device can't be removed during the > enqueue. Once the enqueue is finished we enter the SMR critical section > again and drop the reference. > > To make it clear that the SMR_TAILQ remains intact while a refcount is > held I moved refcnt_finalize() above SMR_TAILQ_REMOVE_LOCKED(). This is > not strictly needed since the next pointer remains valid up until the > smr_barrier() call but I find this a bit easier to understand. > First make sure nobody else holds a reference then remove the port from > the list. > > I currently have no test setup to verify this but I hope someone else can > give this a spin. Hi, for now veb seems stable and i can't panic box although it should, but please give me few more hours to torture it properly. I'm doing this in loop ifconfig veb1 destroy sh /etc/netstart ifconfig veb0 destroy sh /etc/netstart ifconfig vport1 destroy sh /etc/netstart ifconfig vport0 destroy sh /etc/netstart my config veb1: flags=a843 index 25 llprio 3 groups: veb ix1 flags=3 port 2 ifpriority 0 ifcost 0 vport1 flags=3 port 27 ifpriority 0 ifcost 0 veb0: flags=a843 index 26 llprio 3 groups: veb ix0 flags=3 port 1 ifpriority 0 ifcost 0 vport0 flags=3 port 28 ifpriority 0 ifcost 0 ix2 flags=100 vport1: flags=8943 mtu 1500 lladdr ec:f4:bb:da:f7:fa index 27 priority 0 llprio 3 groups: vport inet 192.168.111.11 netmask 0xff00 broadcast 192.168.111.255 vport0: flags=8943 mtu 1500 lladdr ec:f4:bb:da:f7:f8 index 28 priority 0 llprio 3 groups: vport inet 192.168.100.11 netmask 0xff00 broadcast 192.168.100.255
Re: [External] : Re: 7.1-Current crash with NET_TASKQ 4 and veb interface
On Wed, May 11, 2022 at 11:01:21AM +0200, Alexandr Nedvedicky wrote: > Hello Hrvoje, > > thank you for testing. > On Wed, May 11, 2022 at 10:40:28AM +0200, Hrvoje Popovski wrote: > > On 10.5.2022. 22:55, Alexander Bluhm wrote: > > > Yes. It is similar. > > > > > > I have read the whole mail thread and the final fix got commited. > > > But it looks incomplete, pf is still sleeping. > > > > > > Hrvoje, can you run the tests again that triggered the panics a > > > year ago? > > > > Hi, > > > > year ago panics was triggered when veb or tpmr bridged traffic. I've > > tried that right now and I couldn't trigger that panics. > > If I put vport and route traffic over veb than I can trigger panic with > > or without vlans as child-iface for veb. > > For panic i need to have pf enabled and to run > > ifconfig veb destroy or ifconfig vlan destroy and sh netstart in loop. > > > > > > this is panic without vlans > > > > ddb{4}> show all locks > > CPU 4: > > exclusive sched_lock _lock r = 0 (0x824819c0) > > #0 witness_lock+0x311 > > #1 sleep_setup+0xa5 > > #2 rw_enter+0x211 > > #3 pf_test+0xcf0 > > #4 ip_output+0x6b7 > > #5 ip_forward+0x2da > > #6 ip_input_if+0x353 > > #7 ipv4_input+0x39 > > #8 ether_input+0x3ad > > #9 vport_if_enqueue+0x19 > > #10 veb_port_input+0x529 > > I suspect we deal with broadcast and this time things start > to go downhill here in veb_broadcast(): > > 875 smr_read_enter(); > 876 SMR_TAILQ_FOREACH(tp, >sc_ports.l_list, p_entry) { > 877 if (rp == tp || (rp->p_protected & tp->p_protected)) { > 878 /* > 879 * don't let Ethernet packets hairpin or > 880 * move between ports in the same protected > 881 * domain(s). > 882 */ > 883 continue; > 884 } > 885 > 886 ifp0 = tp->p_ifp0; > 887 if (!ISSET(ifp0->if_flags, IFF_RUNNING)) { > 888 /* don't waste time */ > 889 continue; > 890 } > 891 > 892 if (!ISSET(tp->p_bif_flags, IFBIF_DISCOVER) && > 893 !ISSET(m0->m_flags, M_BCAST | M_MCAST)) { > 894 /* don't flood unknown unicast */ > 895 continue; > 896 } > 897 > 898 if (veb_rule_filter(tp, VEB_RULE_LIST_OUT, m0, src, dst)) > 899 continue; > 900 > 901 m = m_dup_pkt(m0, max_linkhdr + ETHER_ALIGN, M_NOWAIT); > 902 if (m == NULL) { > 903 /* XXX count error? */ > 904 continue; > 905 } > 906 > 907 (*tp->p_enqueue)(ifp0, m); /* XXX count error */ > 908 } > 909 smr_read_leave(); > > the veb_broadcast() is being called from veb_port_input(): > 1082 > 1083 /* unknown unicast address */ > 1084 } else { > 1085 SET(m->m_flags, ETH64_IS_BROADCAST(dst) ? M_BCAST : > M_MCAST); > 1086 } > 1087 > 1088 veb_broadcast(sc, p, m, src, dst); > 1089 return (NULL); > > I think I've spotted the place yesterday night in code, but could not > figure out how to wire things up to trigger the panic. Fortunately you've > managed to figure it out. > > I think vport_if_enqueue() is a trap/land mine here. The function should > really > be queueing a packet instead of dispatching it right away. > I think the diff below may be enough to fix this issue. It drops the SMR critical secition around the enqueue operation but uses a reference on the port insteadt to ensure that the device can't be removed during the enqueue. Once the enqueue is finished we enter the SMR critical section again and drop the reference. To make it clear that the SMR_TAILQ remains intact while a refcount is held I moved refcnt_finalize() above SMR_TAILQ_REMOVE_LOCKED(). This is not strictly needed since the next pointer remains valid up until the smr_barrier() call but I find this a bit easier to understand. First make sure nobody else holds a reference then remove the port from the list. I currently have no test setup to verify this but I hope someone else can give this a spin. -- :wq Claudio Index: if_veb.c === RCS file: /cvs/src/sys/net/if_veb.c,v retrieving revision 1.25 diff -u -p -r1.25 if_veb.c --- if_veb.c4 Jan 2022 06:32:39 - 1.25 +++ if_veb.c12 May 2022 12:06:28 - @@ -365,7 +365,11 @@ veb_span(struct veb_softc *sc, struct mb continue; } + veb_eb_brport_take(p); + smr_read_leave(); if_enqueue(ifp0, m); /* XXX count error */ + smr_read_enter(); +