On Fri, May 29, 2020 at 09:09:22AM +0200, Martin Pieuchot wrote: > On 28/05/20(Thu) 15:27, Vitaliy Makkoveev wrote: > > On Thu, May 28, 2020 at 12:26:39PM +0200, Martin Pieuchot wrote: > > > On 27/05/20(Wed) 11:54, Vitaliy Makkoveev wrote: > > > > pipex(4) is simultaneously protected by KERNEL_LOCK() and NET_LOCK() and > > > > the last is not required. I guess to start remove NET_LOCK(). Diff below > > > > drops NET_LOCK() in pipex_ioctl() and underlaying routines. At least > > > > this helps to kill unlock/lock mess in pppx_add_session() and > > > > pppx_if_destroy(). > > > > > > Getting rid of the NET_LOCK() might indeed help to untangle the current > > > locking mess. However we should aim towards removing the KERNEL_LOCK() > > > from, at least, the packet processing path starting with pipexintr(). > > > > I guess we can made `pipexoutq' processing NET_LOCK() free. Also > > `pipexinq' processing requires a little amount of places where NET_LOCK() > > is required. So we can implement special locks for pipex(4). > > After second though, it seems that the easiest way forward is to protect > `pipex_session_list' by the NET_LOCK(). > > The rational is that this global list is dereferenced in pipexintr() and > its members are compared to the content of `ph_cookie'. > > There's currently no mechanism to ensure the reference saved in `ph_cookie' > is still valid. That means what ensures the pointer is kind-of correct > is the NET_LOCK(). I'm saying "kind-of" because comparing pointers is > questionable, especially if sessions can be destroyed without purging > `pipexoutq' or `pipexinq'. > > Since the KERNEL_LOCK() is not always held when the network stack runs, > `ph_cookie' can be modified by a CPU without holding it. So what > actually protects the data structures here is the NET_LOCK(). > > Does that make sense? >
This time nothing protects `ph_cookie'. I already pointed this issue. pipex_ppp_enqueue() (net/pipex.c:737) is the only place where we set `ph_cookie'. But between pipex_ppp_enqueue() and pipexintr() calls we do context switch where we can kill session. There are some cases how we can do it: 1. by pipex_ioctl() with PIPEXDSESSION command. We call pipex_close_session(), it sets sessions state to PIPEX_STATE_CLOSED and it's all. pipex_timer() will delete this session by pipex_destroy_session() but only if `pipexinq' and `pipexoutq' are empty. This check is required to be sure there is no `mbuf' with `ph_cookie' pointed to destroyed session. This is the only safe way to destroy session and the only pppac(4) do this but not always. 2. by pipex_iface_stop(). We travel through all sessions and call pipex_destroy_session() with our sessions. We don't check `pipexinq' and `pipexoutq' state. So, hypothetically, we can have `mbuf' with `ph_cookie' pointed to session we killing and this is potencial use-after-free. pppac(4) do this by pppacclose() which calls pipex_iface_fini() which calls pipex_iface_stop(). Hypothetically, npppd(8) can be killed while it has a lot of active connections and some of them can be referenced by `pipexinq' or `pipexoutq'. I guess in normal shutdown npppd(8) does pipex_ioctl() with PIPEXDSESSION command as described in (1) before close correspondind /dev/pppac and there is no sessions while we do pppacclose(). 3. by pppx_if_destroy(). pppx_if has reference to it's own pipex(4) session and this reference will be killed by pppx_if_destroy() without `pipexinq' and `pipexoutq' check. Also potential use-after-free. And this is the only way how pppx(4) can destroy session. We always setting (and destroying) and procesing `ph_cookie' in different threads. So we will always drop lock which protects the whole pipex(4) layer. I guess the only way to be sure the data referenced by `ph_cookie' is still valid is to implement reference counters to pipex(4) session and rework pipex(4) session destruction.