On Wed, Apr 08, 2020 at 11:35:06AM +0200, Martin Pieuchot wrote:
> On 08/04/20(Wed) 12:11, Vitaliy Makkoveev wrote:
> > On Wed, Apr 08, 2020 at 09:51:45AM +0200, Martin Pieuchot wrote:
> > [...] 
> > As I see (pseudo code):
> > [...] 
> > So, I fixed this issue :)
> 
> We fix what we see.  Bugs are always in the unseen :)  This mess is
> clearly complicated.  Mixing 3 different locks in different orders
> with different pieces of code is complex.
They are not in different orders. The order always is:

KERNEL_LOCK();
NET_LOCK();
lock(pppx_ifs_lk);
unlock(pppx_ifs_lk);
NET_UNLOCK();
KERNEL_UNLOCK();

we are not holding pppx_ifs_lk while we releasing NET_LOCK()

and while we dance with NET_LOCK in pppx_add_session() and
pppx_del_session() order is:

KERNEL_LOCK();
NET_LOCK();
ppp_xxx_session()
{
        lock(pppx_ifs_lk);
        unlock(pppx_ifs_lk);

        NET_UNLOCK();
        KERNEL_UNLOCK() /* probably (1) */
        /* play with netif */
        KERNEL_LOCK() /* only if (1) */
        NET_LOCK();

        lock(pppx_ifs_lk);
        unlock(pppx_ifs_lk);
}
NET_UNLOCK();
KERNEL_UNLOCK();

concurency as I see is:

we don't release
------------------------------------
thread a                    thread b
KERNEL_LOCK();              try to grab KERNEL_LOCK();
NET_LOCK();

lock(pppx_ifs_lk)
unlock(pppx_ifs_lk)
NET_UNLOCK();
/* play with netif */
NET_LOCK();

NET_UNLOCK();
KERNEL_UNLOCK();
                            KERNEL_LOCK();
------------------------------------
if we release once
------------------------------------
thread a                    thread b
KERNEL_LOCK();              try to grab KERNEL_LOCK();
NET_LOCK();

lock(pppx_ifs_lk)
unlock(pppx_ifs_lk)
NET_UNLOCK();
KERNEL_UNLOCK();
/* play with netif */       KERNEL_LOCK();
try to grab KERNEL_LOCK();  NET_LOCK();
                            lock(pppx_ifs_lk);
                            unlock(pppx_ifs_lk);
                            NET_UNLOCK();
                            /* play with netif */
                            NET_LOCK();
                            KERNEL_UNLOCK();
KERNEL_LOCK();
NET_LOCK();

NET_UNLOCK();
KERNEL_UNLOCK();
------------------------------------
or if we release twice
------------------------------------
thread a                    thread b
KERNEL_LOCK();              try to grab KERNEL_LOCK();
NET_LOCK();

lock(pppx_ifs_lk)
unlock(pppx_ifs_lk)
NET_UNLOCK();
KERNEL_UNLOCK();
/* play with netif */       KERNEL_LOCK();
try to grab KERNEL_LOCK();  NET_LOCK();
                            lock(pppx_ifs_lk);
                            unlock(pppx_ifs_lk);
                            NET_UNLOCK();
                            KERNEL_UNLOCK();
KERNEL_LOCK()               /* play with netif */
NET_LOCK()                  try to grab KERNEL_LOCK();


lock(pppx_ifs_lk)
unlock(pppx_ifs_lk)

NET_UNLOCK();
KERNEL_UNLOCK();            KERNEL_LOCK()
                            NET_LOCK()
------------------------------------

> 
> So let's ask this question again: what is the NET_LOCK() protecting
> in this code?  Is it needed?  What is the `pppx_ifs_lk' protecting?
> From whom?  Are they needed or do they introduce more problem than they
> are solving?  We see the problem of context switch breaking atomicity
> of code, the simplest way to get rid of this problem is to get rid of
> the locks, or at least reduce their number.
NET_LOCK() should be held around pipex(4) access. It's not required to
hold it for pppx(4) related things. Really, since pipexintr() is under
KERNEL_LOCK() too, we can kill NET_LOCK() from in all pipex(4)/pppx(4)
code except, iirc, ipv[46]_input() and direct 'ifp->if_ipackets++' and
similar. 

> 
> So does one see how useless is `pppx_ifs_lk' since all the code it
> "protects" is executed under KERNEL_LOCK() and doesn't contain any
> context switch?
it's useless :) someone added it for future, but I dont see any reason
to kill it. It's fine.

> 
> Historically the NET_LOCK() has been "pushed" under all pseudo-device
> ioctl(2).  This was a mechanical change until somebody figures out if
> it is needed or not.  That's the real question.  So let's ask it :o)
>
It's paifully to touch this pre-MP era code :) Big NET_LOCK() here is
not needed. 

Reply via email to