On Wed, Jul 27, 2022 at 08:53:38PM +0200, Mark Kettenis wrote:
> > Date: Wed, 27 Jul 2022 00:11:19 +0200
> > From: Alexander Bluhm <[email protected]>
> >
> > On Tue, Jul 26, 2022 at 11:19:27PM +0200, Mark Kettenis wrote:
> > > > Date: Tue, 26 Jul 2022 18:11:01 +0200
> > > > From: Alexander Bluhm <[email protected]>
> > > >
> > > > On Fri, Jul 22, 2022 at 06:13:04PM +0200, Alexander Bluhm wrote:
> > > > > But I am fine with committing ifmedia, gem(4) and bge(4) now. Then
> > > > > we can decide on a per driver basis. As long kernel lock is not
> > > > > removed from the ifmedia layer, this diff is not strictly necessary.
> > > > > I want to commit it anyway to show how MP in ifmedia should work.
> > > >
> > > > This is the part for gem(4) and bge(4). I have tested them.
> > > > ifmedia_current() is MP safe and provides the data which all the
> > > > drivers need.
> > > >
> > > > After commiting this, I can look for more hardware to test.
> > > >
> > > > ok?
> > >
> > > I still don't understand what this fixes. Most drivers create a list
> > > of media when the PHY attaches during autoconf. That list never
> > > changes, which means that the pointer to the ifmedia struct remains
> > > valid until the driver detaches.
> > >
> > > There is of course a race when you use ifconfig to change the media
> > > type. But your new interface doesn't really fix that race. Code that
> > > uses ifmedia_current() may still end up with the old information and
> > > use that at a point where that information is no longer accurate.
> >
> > It does not fix a real bug. The diff tries to keep internal data
> > structures within a file, instead of spreading them over all drivers.
> > MP work is easier if each struct field has a clear lock that is
> > responsible for it. 70 drivers that should change it only during
> > attach is not that clear.
>
> But making data structures "mpsafe" is not a goal in itself. The way
> the data is used must be safe as well. If you don't think that
> through, you run the risk of adding mutexes that don't actually help.
> And this is starting to smell like one of those to me.
>
> > The ifmedia_list is protected by mutex. ifm_cur is a pointer to
> > an element of the list. It would be natural to protect it with the
> > same mutex. ifm_cur is changed by ifmedia_set() and ifmedia_ioctl()
> > and ifmedia_delete_instance().
> >
> > Look at bnxt_hwrm_port_phy_qcfg(). There the list is changed by
> > bnxt_media_status() which is called from ifmedia_ioctl().
> > ixgbe_handle_msf() changes the list during SPF interrupt. But I
> > have to admit that these drivers do not access ifm_cur.
> >
> > Why do you refuse to make the ifmedia layer MP safe by itself?
> > It is just one function and a mutex. I cannot see harm.
>
> I looked a bit at what NetBSD does. I don't think what they do is a
> good idea (in particular having pointers to locks). But they actually
> hold a lock across the callbacks. Which made me realize that if you
> don't do that you don't really fix the race between changes detected
> by the PHY and manual changes through ioctl. And I think the
> callbacks can sleep. So using a mutex may simply be wrong.
>
>
NetBSD uses spin mutex by default for ifmedia protection. But it could
use driver's mutex for that purpose. That's why they have pointer to
lock.
I don't like to keep lock while callback called. It introduces lock
inconsistency like we have with (*if_qstart)() callbacks. May be we
should pass the new media to the media change callback directly, like
below:
if (ifm->new_ifm_change_cb)
error = (*ifm->new_ifm_change_cb)(ifm, ifp, match,
newmedia);
else {
oldentry = ifm->ifm_cur;
oldmedia = ifm->ifm_media;
/* ... */
error = (*ifm->ifm_change_cb)(ifp);
if (error && error != ENETRESET) {
mtx_enter(&ifmedia_mtx);
if (ifm->ifm_cur == match) {
ifm->ifm_cur = oldentry;
ifm->ifm_cur = oldentry;
}
mtx_leave(&ifmedia_mtx);
}
}
This keeps the media change atomicy, and leaves `ifm' protection to the
driver, but without pointers to locks and callback calls with lock held.