On Wed, Apr 08, 2020 at 11:35:06AM +0200, Martin Pieuchot wrote: > On 08/04/20(Wed) 12:11, Vitaliy Makkoveev wrote: > > On Wed, Apr 08, 2020 at 09:51:45AM +0200, Martin Pieuchot wrote: > > [...] > > As I see (pseudo code): > > [...] > > So, I fixed this issue :) > > We fix what we see. Bugs are always in the unseen :) This mess is > clearly complicated. Mixing 3 different locks in different orders > with different pieces of code is complex. They are not in different orders. The order always is:
KERNEL_LOCK(); NET_LOCK(); lock(pppx_ifs_lk); unlock(pppx_ifs_lk); NET_UNLOCK(); KERNEL_UNLOCK(); we are not holding pppx_ifs_lk while we releasing NET_LOCK() and while we dance with NET_LOCK in pppx_add_session() and pppx_del_session() order is: KERNEL_LOCK(); NET_LOCK(); ppp_xxx_session() { lock(pppx_ifs_lk); unlock(pppx_ifs_lk); NET_UNLOCK(); KERNEL_UNLOCK() /* probably (1) */ /* play with netif */ KERNEL_LOCK() /* only if (1) */ NET_LOCK(); lock(pppx_ifs_lk); unlock(pppx_ifs_lk); } NET_UNLOCK(); KERNEL_UNLOCK(); concurency as I see is: we don't release ------------------------------------ thread a thread b KERNEL_LOCK(); try to grab KERNEL_LOCK(); NET_LOCK(); lock(pppx_ifs_lk) unlock(pppx_ifs_lk) NET_UNLOCK(); /* play with netif */ NET_LOCK(); NET_UNLOCK(); KERNEL_UNLOCK(); KERNEL_LOCK(); ------------------------------------ if we release once ------------------------------------ thread a thread b KERNEL_LOCK(); try to grab KERNEL_LOCK(); NET_LOCK(); lock(pppx_ifs_lk) unlock(pppx_ifs_lk) NET_UNLOCK(); KERNEL_UNLOCK(); /* play with netif */ KERNEL_LOCK(); try to grab KERNEL_LOCK(); NET_LOCK(); lock(pppx_ifs_lk); unlock(pppx_ifs_lk); NET_UNLOCK(); /* play with netif */ NET_LOCK(); KERNEL_UNLOCK(); KERNEL_LOCK(); NET_LOCK(); NET_UNLOCK(); KERNEL_UNLOCK(); ------------------------------------ or if we release twice ------------------------------------ thread a thread b KERNEL_LOCK(); try to grab KERNEL_LOCK(); NET_LOCK(); lock(pppx_ifs_lk) unlock(pppx_ifs_lk) NET_UNLOCK(); KERNEL_UNLOCK(); /* play with netif */ KERNEL_LOCK(); try to grab KERNEL_LOCK(); NET_LOCK(); lock(pppx_ifs_lk); unlock(pppx_ifs_lk); NET_UNLOCK(); KERNEL_UNLOCK(); KERNEL_LOCK() /* play with netif */ NET_LOCK() try to grab KERNEL_LOCK(); lock(pppx_ifs_lk) unlock(pppx_ifs_lk) NET_UNLOCK(); KERNEL_UNLOCK(); KERNEL_LOCK() NET_LOCK() ------------------------------------ > > So let's ask this question again: what is the NET_LOCK() protecting > in this code? Is it needed? What is the `pppx_ifs_lk' protecting? > From whom? Are they needed or do they introduce more problem than they > are solving? We see the problem of context switch breaking atomicity > of code, the simplest way to get rid of this problem is to get rid of > the locks, or at least reduce their number. NET_LOCK() should be held around pipex(4) access. It's not required to hold it for pppx(4) related things. Really, since pipexintr() is under KERNEL_LOCK() too, we can kill NET_LOCK() from in all pipex(4)/pppx(4) code except, iirc, ipv[46]_input() and direct 'ifp->if_ipackets++' and similar. > > So does one see how useless is `pppx_ifs_lk' since all the code it > "protects" is executed under KERNEL_LOCK() and doesn't contain any > context switch? it's useless :) someone added it for future, but I dont see any reason to kill it. It's fine. > > Historically the NET_LOCK() has been "pushed" under all pseudo-device > ioctl(2). This was a mechanical change until somebody figures out if > it is needed or not. That's the real question. So let's ask it :o) > It's paifully to touch this pre-MP era code :) Big NET_LOCK() here is not needed.