> On 6 Jul 2020, at 12:17, Martin Pieuchot <m...@openbsd.org> wrote:
> 
> On 01/07/20(Wed) 00:02, Vitaliy Makkoveev wrote:
>> On Tue, Jun 30, 2020 at 03:48:22PM +0300, Vitaliy Makkoveev wrote:
>>> On Tue, Jun 30, 2020 at 12:08:03PM +0200, Martin Pieuchot wrote:
>>>> On 29/06/20(Mon) 11:59, Vitaliy Makkoveev wrote:
>>>>> [...] 
>>>>> I reworked tool for reproduce. Now I avoided fork()/exec() route and it
>>>>> takes couple of minutes to take panic on 4 cores. Also some screenshots
>>>>> attached.
>>>> 
>>>> Setting kern.pool_debug=2 makes the race reproducible in seconds.
>> 
>> Unfortunately you will catch splassert() caused by kern/sched_bsd.c:304.
>> malloc() will call yield() while we are holding NET_LOCK(). I attached
>> screenshot with splassertion to this mail.
> 
> With kern.splassert < 3 it is fine. 
> 
>>>> Could you turn this test into something committable in regress/?  We can
>>>> link it to the build once a fix is committed.
>>>> 
>>> 
>>> We have 3 races with cloned interfaces:
>>> 1. if_clone_create() vs if_clone_create()
>>> 2. if_clone_destroy() vs if_clone_destroy()
>>> 3. if_clone_destroy() vs the rest of stack
>>> 
>>> It makes sences to commit unified test to regress/, so I suggest to wait
>>> a little.
>> 
>> The another solution.
>> 
>> Diff below introduces per-`ifc' serialization for if_clone_create() and
>> if_clone_destroy(). There is no index bitmap anymore.
> 
> I like the simplification.  More comments below:
> 
>> +/*
>> + * Lock a clone network interface.
>> + */
>> +int
>> +if_clone_lock(struct if_clone *ifc)
>> +{
>> +    int error;
>> +
>> +    rw_enter_write(&ifc->ifc_lock);
>> +
>> +    while (ifc->ifc_flags & IFC_CREATE_LOCKED) {
>> +            ifc->ifc_flags |= IFC_CREATE_LOCKWAIT;
>> +            error = rwsleep_nsec(&ifc->ifc_flags, &ifc->ifc_lock,
>> +                PWAIT|PCATCH, "ifclk", INFSLP);
>> +            if(error != 0) {
>> +                    ifc->ifc_flags &= ~IFC_CREATE_LOCKWAIT;
>> +                    rw_exit_write(&ifc->ifc_lock);
>> +                    return error;
>> +            }
>> +    }
>> +    ifc->ifc_flags |= IFC_CREATE_LOCKED;
>> +    ifc->ifc_flags &= ~IFC_CREATE_LOCKWAIT;
>> +
>> +    rw_exit_write(&ifc->ifc_lock);
>> +    
>> +    return 0;
>> +}
> 
> This is like re-implementing a rwlock but loosing the debugging ability of
> WITNESS.

The reason to do this is to avoid call `ifc_create’ with rwlock held.
We have unique sleep points for each underlaying routine for `ifc_create’,
so this "rwlock reimplementation" looks better. Also this lock is used in
one place only and impact of loosing debugging ability is not such big.

> 
> I also don't see any reason for having a per-ifc lock.  If, at least one
> of the problems, is a double insert in `ifnet' then we should be able to
> assert that a lock is held when doing such assertion.

This race breaks ifunit() not `if_list’. I mean LIST_*() operation are
not broken because `le_{prev,next}’ are valid, but list is inconsistent of 
course.

Since only "ifconfig clonerA0 create& ifconfig clonerA0 create” will
break, I see no reason to deny simultaneous execution of
“ifconfig clonerA0 create& ifconfig clonerB0 create”.

Let this lock be per `ifc’ not global.

> 
> Assertions and documentation are more important than preventing races
> because they allow to build awareness and elegant solutions instead of
> hacking diffs until stuff work without knowing why.
> 
> There are two cases where `ifp' are inserted into `ifnet':
> 1. by autoconf during boot or hotplug
> 2. by cloning ioctl
> 
> In the second case it is always about pseudo-devices.  So the assertion
> should be conditional like:
> 
>       if (ISSET(ifp->if_xflags, IFXF_CLONED))
>               rw_assert_wrlock(&ifc_lock);
> 
> In other words this fixes serializes insertions/removal on the global
> list `ifnet', the KERNEL_LOCK() being still required for reading it.
> 
> Is there any other data structure which ends up being protected by this
> approach and could be documented?

We should be sure there is no multiple `ifnet’s in `if_list’ with the same
`if_xname’. And the assertion you proposed looks not obvious here.
Assertion like below looks more reasonable but introduces performance
impact.

---- cut begin ----
void
if_attach(struct ifnet *ifp)
{
        if_attach_common(ifp);
        NET_LOCK();
        KASSERT(ifunit(ifp->if_xname) == NULL);
        TAILQ_INSERT_TAIL(&ifnet, ifp, if_list);
        if_attachsetup(ifp);
        NET_UNLOCK();
}
---- cut end ----

I guess the commentary within if_clone_create() is the best solution.
Something like this: “Deny simultaneous execution to prevent multiple
creation of interfaces with the same name”.

Reply via email to