On Thu, Aug 6, 2020 at 8:11 AM Vitaliy Makkoveev <m...@openbsd.org> wrote:
>
> On Thu, Aug 06, 2020 at 01:25:14PM +0200, Martin Pieuchot wrote:
> > On 05/08/20(Wed) 12:50, Vitaliy Makkoveev wrote:
> > > pipex(4) and pppx(4) are ready to become a little bit more MP capable.
> > > Diff below moves pppx(4) related `ifnet' out of KERNEL_LOCK().
> >
> > Nice, one comment below.
> >
> > > Index: sys/net/if_pppx.c
> > >
> > > [skip]
> > >
> > > +           NET_LOCK();
> > >             pipex_ppp_output(m, pxi->pxi_session, proto);
> > > +           NET_UNLOCK();
> >
> > This means the lock is taken and released for every packet.  It would be
> > better to grab it outside the loop.
>
> Ok, fixed.
>
> Index: sys/net/if_pppx.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if_pppx.c,v
> retrieving revision 1.98
> diff -u -p -r1.98 if_pppx.c
> --- sys/net/if_pppx.c   28 Jul 2020 09:53:36 -0000      1.98
> +++ sys/net/if_pppx.c   6 Aug 2020 11:54:44 -0000
> @@ -191,7 +191,7 @@ int         pppx_set_session_descr(struct pppx_
>                     struct pipex_session_descr_req *);
>
>  void           pppx_if_destroy(struct pppx_dev *, struct pppx_if *);
> -void           pppx_if_start(struct ifnet *);
> +void           pppx_if_qstart(struct ifqueue *);
>  int            pppx_if_output(struct ifnet *, struct mbuf *,
>                     struct sockaddr *, struct rtentry *);
>  int            pppx_if_ioctl(struct ifnet *, u_long, caddr_t);
> @@ -683,13 +683,12 @@ pppx_add_session(struct pppx_dev *pxd, s
>         snprintf(ifp->if_xname, sizeof(ifp->if_xname), "%s%d", "pppx", unit);
>         ifp->if_mtu = req->pr_peer_mru; /* XXX */
>         ifp->if_flags = IFF_POINTOPOINT | IFF_MULTICAST | IFF_UP;
> -       ifp->if_xflags = IFXF_CLONED;
> -       ifp->if_start = pppx_if_start;
> +       ifp->if_xflags = IFXF_CLONED | IFXF_MPSAFE;
> +       ifp->if_qstart = pppx_if_qstart;
>         ifp->if_output = pppx_if_output;
>         ifp->if_ioctl = pppx_if_ioctl;
>         ifp->if_rtrequest = p2p_rtrequest;
>         ifp->if_type = IFT_PPP;
> -       ifq_set_maxlen(&ifp->if_snd, 1);
>         ifp->if_softc = pxi;
>         /* ifp->if_rdomain = req->pr_rdomain; */
>
> @@ -864,21 +863,15 @@ pppx_if_destroy(struct pppx_dev *pxd, st
>  }
>
>  void
> -pppx_if_start(struct ifnet *ifp)
> +pppx_if_qstart(struct ifqueue *ifq)
>  {
> +       struct ifnet *ifp = ifq->ifq_if;
>         struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
>         struct mbuf *m;
>         int proto;
>
> -       if (!ISSET(ifp->if_flags, IFF_RUNNING))
> -               return;
> -
> -       for (;;) {
> -               m = ifq_dequeue(&ifp->if_snd);
> -
> -               if (m == NULL)
> -                       break;
> -
> +       NET_LOCK();
> +       while ((m = ifq_dequeue(ifq)) != NULL) {
>                 proto = *mtod(m, int *);
>                 m_adj(m, sizeof(proto));
>
> @@ -887,6 +880,7 @@ pppx_if_start(struct ifnet *ifp)
>
>                 pipex_ppp_output(m, pxi->pxi_session, proto);
>         }
> +       NET_UNLOCK();
>  }
>
>  int
>

Hey,

I think moving things out of locks while there are known race conditions
in the code is harmful. I was candidly asking polite questions because I
trigger those races sometimes, but apparently the work in progress must
be kept secret.

Because it is a work in progress, reading the code may just lead to
"yeah, we know that, we are fixing it", but I think having known issues
like that in the network stack hurts the project.
I had not seen OpenBSD drop into DDB> outside of some odd drivers or
snapshots since around 3.5, but it has happened much more recently.

Currently I'm asking myself why every call to if_hooks_run() in if.c is
surrounded by NET_LOCK() and then takes another mutex with
mtx_enter(&if_hooks_mtx), yet then calls (*func)(arg), while some of the
teardown paths behind those hook functions are not protected by any lock.

For example here, in a *_detach routine such as lacp_detach(), which runs
under neither NET_LOCK() nor if_hooks_mtx:

int
lacp_detach(struct trunk_softc *sc)
{
    struct lacp_softc *lsc = LACP_SOFTC(sc);

    KASSERT(TAILQ_EMPTY(&lsc->lsc_aggregators));
    KASSERT(lsc->lsc_active_aggregator == NULL);

    sc->tr_psc = NULL;

The most enthusiastic users would suffer a lot from a failing PPP driver
(not me). I understand the work is complex (because, as usual, OpenBSD is
trying to do the right thing).

Maybe a define such as 'MP_NET_STACK' could enable or disable these MP
optimizations, so users could easily switch between the experimental and
the stable paths for this complex task. Validating races is not an easy
job, especially on modern processors.
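
Purely as an illustration of that idea (MP_NET_STACK is a name I made up,
and this is a sketch, not a proposal for the actual diff), the switch in
pppx_add_session() could look roughly like this, reusing the old and new
start routines from the diff above:

#ifdef MP_NET_STACK
        /* MP-safe path: the start routine runs without KERNEL_LOCK(). */
        ifp->if_xflags = IFXF_CLONED | IFXF_MPSAFE;
        ifp->if_qstart = pppx_if_qstart;
#else
        /* Conservative path: keep the old single-packet start routine. */
        ifp->if_xflags = IFXF_CLONED;
        ifp->if_start = pppx_if_start;
        ifq_set_maxlen(&ifp->if_snd, 1);
#endif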

Thanks for the great MP optimization work.

Have a good weekend y'all.
