[PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt
When client on the host tries to connect(SOCK_STREAM, O_NONBLOCK) to the server on the guest, there will be a panic on a ThunderX2 (armv8a server): [ 463.718844] Unable to handle kernel NULL pointer dereference at virtual address [ 463.718848] Mem abort info: [ 463.718849] ESR = 0x9644 [ 463.718852] EC = 0x25: DABT (current EL), IL = 32 bits [ 463.718853] SET = 0, FnV = 0 [ 463.718854] EA = 0, S1PTW = 0 [ 463.718855] Data abort info: [ 463.718856] ISV = 0, ISS = 0x0044 [ 463.718857] CM = 0, WnR = 1 [ 463.718859] user pgtable: 4k pages, 48-bit VAs, pgdp=008f6f6e9000 [ 463.718861] [] pgd= [ 463.718866] Internal error: Oops: 9644 [#1] SMP [...] [ 463.718977] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G O 5.7.0-rc7+ #139 [ 463.718980] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018 [ 463.718982] pstate: 6049 (nZCv daif +PAN -UAO) [ 463.718995] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common] [ 463.718999] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common] [ 463.719000] sp : 80002dbe3c40 [...] [ 463.719025] Call trace: [ 463.719030] virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common] [ 463.719034] vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock] [ 463.719041] vhost_worker+0x100/0x1a0 [vhost] [ 463.719048] kthread+0x128/0x130 [ 463.719052] ret_from_fork+0x10/0x18 The race condition is as follows: Task1Task2 == __sock_release virtio_transport_recv_pkt __vsock_release vsock_find_bound_socket (found sk) lock_sock_nested vsock_remove_sock sock_orphan sk_set_socket(sk, NULL) sk->sk_shutdown = SHUTDOWN_MASK ... release_sock lock_sock virtio_transport_recv_connecting sk->sk_socket->state (panic!) The root cause is that vsock_find_bound_socket can't hold the lock_sock, so there is a small race window between vsock_find_bound_socket() and lock_sock(). If __vsock_release() is running in another task, sk->sk_socket will be set to NULL inadvertently. Thus check the data structure member “sk_shutdown” (suggested by Stefano) after a call of the function “lock_sock” since this field is set to “SHUTDOWN_MASK” under the protection of “lock_sock_nested”. Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko") Signed-off-by: Jia He Cc: sta...@vger.kernel.org Cc: Asias He Reviewed-by: Stefano Garzarella --- v4: refine the commit msg (from Markus) net/vmw_vsock/virtio_transport_common.c | 8 1 file changed, 8 insertions(+) diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 69efc891885f..0edda1edf988 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -1132,6 +1132,14 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, lock_sock(sk); + /* Check if sk has been released before lock_sock */ + if (sk->sk_shutdown == SHUTDOWN_MASK) { + (void)virtio_transport_reset_no_sock(t, pkt); + release_sock(sk); + sock_put(sk); + goto free_pkt; + } + /* Update CID in case it has changed after a transport reset event */ vsk->local_addr.svm_cid = dst.svm_cid; -- 2.17.1
RE: [PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt
Hi Stefano > -Original Message- > From: Stefano Garzarella > Sent: Friday, May 29, 2020 10:11 PM > To: Justin He > Cc: Stefan Hajnoczi ; David S. Miller > ; Jakub Kicinski ; > k...@vger.kernel.org; virtualizat...@lists.linux-foundation.org; > net...@vger.kernel.org; linux-kernel@vger.kernel.org; Kaly Xin > ; sta...@vger.kernel.org > Subject: Re: [PATCH] virtio_vsock: Fix race condition in > virtio_transport_recv_pkt > > Hi Jia, > thanks for the patch! I have some comments. > > On Fri, May 29, 2020 at 09:31:23PM +0800, Jia He wrote: > > When client tries to connect(SOCK_STREAM) the server in the guest with > NONBLOCK > > mode, there will be a panic on a ThunderX2 (armv8a server): > > [ 463.718844][ T5040] Unable to handle kernel NULL pointer dereference at > virtual address > > [ 463.718848][ T5040] Mem abort info: > > [ 463.718849][ T5040] ESR = 0x9644 > > [ 463.718852][ T5040] EC = 0x25: DABT (current EL), IL = 32 bits > > [ 463.718853][ T5040] SET = 0, FnV = 0 > > [ 463.718854][ T5040] EA = 0, S1PTW = 0 > > [ 463.718855][ T5040] Data abort info: > > [ 463.718856][ T5040] ISV = 0, ISS = 0x0044 > > [ 463.718857][ T5040] CM = 0, WnR = 1 > > [ 463.718859][ T5040] user pgtable: 4k pages, 48-bit VAs, > pgdp=008f6f6e9000 > > [ 463.718861][ T5040] [] pgd= > > [ 463.718866][ T5040] Internal error: Oops: 9644 [#1] SMP > > [...] > > [ 463.718977][ T5040] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G > O 5.7.0-rc7+ #139 > > [ 463.718980][ T5040] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, > BIOS F06 09/25/2018 > > [ 463.718982][ T5040] pstate: 6049 (nZCv daif +PAN -UAO) > > [ 463.718995][ T5040] pc : virtio_transport_recv_pkt+0x4c8/0xd40 > [vmw_vsock_virtio_transport_common] > > [ 463.718999][ T5040] lr : virtio_transport_recv_pkt+0x1fc/0xd40 > [vmw_vsock_virtio_transport_common] > > [ 463.719000][ T5040] sp : 80002dbe3c40 > > [...] > > [ 463.719025][ T5040] Call trace: > > [ 463.719030][ T5040] virtio_transport_recv_pkt+0x4c8/0xd40 > [vmw_vsock_virtio_transport_common] > > [ 463.719034][ T5040] vhost_vsock_handle_tx_kick+0x360/0x408 > [vhost_vsock] > > [ 463.719041][ T5040] vhost_worker+0x100/0x1a0 [vhost] > > [ 463.719048][ T5040] kthread+0x128/0x130 > > [ 463.719052][ T5040] ret_from_fork+0x10/0x18 > > > > The race condition as follows: > > Task1Task2 > > == > > __sock_release virtio_transport_recv_pkt > > __vsock_release vsock_find_bound_socket (found) > > lock_sock_nested > > vsock_remove_sock > > sock_orphan > > sk_set_socket(sk, NULL) > > ... > > release_sock > > lock_sock > >virtio_transport_recv_connecting > > sk->sk_socket->state (panic) > > > > This fixes it by checking vsk again whether it is in bound/connected table. > > > > Signed-off-by: Jia He > > Cc: sta...@vger.kernel.org > > --- > > net/vmw_vsock/virtio_transport_common.c | 11 +++ > > 1 file changed, 11 insertions(+) > > > > diff --git a/net/vmw_vsock/virtio_transport_common.c > b/net/vmw_vsock/virtio_transport_common.c > > index 69efc891885f..0dbd6a45f0ed 100644 > > --- a/net/vmw_vsock/virtio_transport_common.c > > +++ b/net/vmw_vsock/virtio_transport_common.c > > @@ -1132,6 +1132,17 @@ void virtio_transport_recv_pkt(struct > virtio_transport *t, > > > > lock_sock(sk); > > > > +/* Check it again if vsk is removed by vsock_remove_sock */ > > +spin_lock_bh(_table_lock); > > +if (!__vsock_in_bound_table(vsk) > && !__vsock_in_connected_table(vsk)) { > > +spin_unlock_bh(_table_lock); > > +(void)virtio_transport_reset_no_sock(t, pkt); > > +release_sock(sk); > > +sock_put(sk); > > +goto free_pkt; > > +} > > +spin_unlock_bh(_table_lock); > > + > > As an a simpler alternative, can we check the sk_shutdown or the socket > state without check again both bound and connected tables? > > This is a data path, so we should take it faster. > > I mean something like this: > > if (sk->sk_shutdown == SHUTDOWN_MASK) { > ... > } > Thanks for the suggestion, I verified it worked fine. And it is a more lightweight checking than mine. I will send v2 with above change -- Cheers, Justin (Jia He) > or > > if (sock_flag(sk, SOCK_DEAD
Re: [PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt
Hi Jia, thanks for the patch! I have some comments. On Fri, May 29, 2020 at 09:31:23PM +0800, Jia He wrote: > When client tries to connect(SOCK_STREAM) the server in the guest with > NONBLOCK > mode, there will be a panic on a ThunderX2 (armv8a server): > [ 463.718844][ T5040] Unable to handle kernel NULL pointer dereference at > virtual address > [ 463.718848][ T5040] Mem abort info: > [ 463.718849][ T5040] ESR = 0x9644 > [ 463.718852][ T5040] EC = 0x25: DABT (current EL), IL = 32 bits > [ 463.718853][ T5040] SET = 0, FnV = 0 > [ 463.718854][ T5040] EA = 0, S1PTW = 0 > [ 463.718855][ T5040] Data abort info: > [ 463.718856][ T5040] ISV = 0, ISS = 0x0044 > [ 463.718857][ T5040] CM = 0, WnR = 1 > [ 463.718859][ T5040] user pgtable: 4k pages, 48-bit VAs, > pgdp=008f6f6e9000 > [ 463.718861][ T5040] [] pgd= > [ 463.718866][ T5040] Internal error: Oops: 9644 [#1] SMP > [...] > [ 463.718977][ T5040] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G > O 5.7.0-rc7+ #139 > [ 463.718980][ T5040] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS > F06 09/25/2018 > [ 463.718982][ T5040] pstate: 6049 (nZCv daif +PAN -UAO) > [ 463.718995][ T5040] pc : virtio_transport_recv_pkt+0x4c8/0xd40 > [vmw_vsock_virtio_transport_common] > [ 463.718999][ T5040] lr : virtio_transport_recv_pkt+0x1fc/0xd40 > [vmw_vsock_virtio_transport_common] > [ 463.719000][ T5040] sp : 80002dbe3c40 > [...] > [ 463.719025][ T5040] Call trace: > [ 463.719030][ T5040] virtio_transport_recv_pkt+0x4c8/0xd40 > [vmw_vsock_virtio_transport_common] > [ 463.719034][ T5040] vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock] > [ 463.719041][ T5040] vhost_worker+0x100/0x1a0 [vhost] > [ 463.719048][ T5040] kthread+0x128/0x130 > [ 463.719052][ T5040] ret_from_fork+0x10/0x18 > > The race condition as follows: > Task1Task2 > == > __sock_release virtio_transport_recv_pkt > __vsock_release vsock_find_bound_socket (found) > lock_sock_nested > vsock_remove_sock > sock_orphan > sk_set_socket(sk, NULL) > ... > release_sock > lock_sock >virtio_transport_recv_connecting > sk->sk_socket->state (panic) > > This fixes it by checking vsk again whether it is in bound/connected table. > > Signed-off-by: Jia He > Cc: sta...@vger.kernel.org > --- > net/vmw_vsock/virtio_transport_common.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/net/vmw_vsock/virtio_transport_common.c > b/net/vmw_vsock/virtio_transport_common.c > index 69efc891885f..0dbd6a45f0ed 100644 > --- a/net/vmw_vsock/virtio_transport_common.c > +++ b/net/vmw_vsock/virtio_transport_common.c > @@ -1132,6 +1132,17 @@ void virtio_transport_recv_pkt(struct virtio_transport > *t, > > lock_sock(sk); > > + /* Check it again if vsk is removed by vsock_remove_sock */ > + spin_lock_bh(_table_lock); > + if (!__vsock_in_bound_table(vsk) && !__vsock_in_connected_table(vsk)) { > + spin_unlock_bh(_table_lock); > + (void)virtio_transport_reset_no_sock(t, pkt); > + release_sock(sk); > + sock_put(sk); > + goto free_pkt; > + } > + spin_unlock_bh(_table_lock); > + As an a simpler alternative, can we check the sk_shutdown or the socket state without check again both bound and connected tables? This is a data path, so we should take it faster. I mean something like this: if (sk->sk_shutdown == SHUTDOWN_MASK) { ... } or if (sock_flag(sk, SOCK_DEAD)) { ... } I prefer the first option, but I think also the second option should work. Thanks, Stefano > /* Update CID in case it has changed after a transport reset event */ > vsk->local_addr.svm_cid = dst.svm_cid; > > -- > 2.17.1 >
[PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt
When client tries to connect(SOCK_STREAM) the server in the guest with NONBLOCK mode, there will be a panic on a ThunderX2 (armv8a server): [ 463.718844][ T5040] Unable to handle kernel NULL pointer dereference at virtual address [ 463.718848][ T5040] Mem abort info: [ 463.718849][ T5040] ESR = 0x9644 [ 463.718852][ T5040] EC = 0x25: DABT (current EL), IL = 32 bits [ 463.718853][ T5040] SET = 0, FnV = 0 [ 463.718854][ T5040] EA = 0, S1PTW = 0 [ 463.718855][ T5040] Data abort info: [ 463.718856][ T5040] ISV = 0, ISS = 0x0044 [ 463.718857][ T5040] CM = 0, WnR = 1 [ 463.718859][ T5040] user pgtable: 4k pages, 48-bit VAs, pgdp=008f6f6e9000 [ 463.718861][ T5040] [] pgd= [ 463.718866][ T5040] Internal error: Oops: 9644 [#1] SMP [...] [ 463.718977][ T5040] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G O 5.7.0-rc7+ #139 [ 463.718980][ T5040] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018 [ 463.718982][ T5040] pstate: 6049 (nZCv daif +PAN -UAO) [ 463.718995][ T5040] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common] [ 463.718999][ T5040] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common] [ 463.719000][ T5040] sp : 80002dbe3c40 [...] [ 463.719025][ T5040] Call trace: [ 463.719030][ T5040] virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common] [ 463.719034][ T5040] vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock] [ 463.719041][ T5040] vhost_worker+0x100/0x1a0 [vhost] [ 463.719048][ T5040] kthread+0x128/0x130 [ 463.719052][ T5040] ret_from_fork+0x10/0x18 The race condition as follows: Task1Task2 == __sock_release virtio_transport_recv_pkt __vsock_release vsock_find_bound_socket (found) lock_sock_nested vsock_remove_sock sock_orphan sk_set_socket(sk, NULL) ... release_sock lock_sock virtio_transport_recv_connecting sk->sk_socket->state (panic) This fixes it by checking vsk again whether it is in bound/connected table. Signed-off-by: Jia He Cc: sta...@vger.kernel.org --- net/vmw_vsock/virtio_transport_common.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 69efc891885f..0dbd6a45f0ed 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -1132,6 +1132,17 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, lock_sock(sk); + /* Check it again if vsk is removed by vsock_remove_sock */ + spin_lock_bh(_table_lock); + if (!__vsock_in_bound_table(vsk) && !__vsock_in_connected_table(vsk)) { + spin_unlock_bh(_table_lock); + (void)virtio_transport_reset_no_sock(t, pkt); + release_sock(sk); + sock_put(sk); + goto free_pkt; + } + spin_unlock_bh(_table_lock); + /* Update CID in case it has changed after a transport reset event */ vsk->local_addr.svm_cid = dst.svm_cid; -- 2.17.1