RE: [PATCH net v2] vsock: Fix a lockdep warning in __vsock_release()

2019-09-30 Thread Dexuan Cui
> From: Stefano Garzarella 
> Sent: Monday, September 30, 2019 6:51 AM
> ... 
> Feel free to add:
> 
> Tested-by: Stefano Garzarella 

Thanks, Stefano!

I'll post a v3 with your suggestion "lock_sock_nested(sk, level);".
It does look better than v2 to me. :-)

Thanks,
-- Dexuan


Re: [PATCH net v2] vsock: Fix a lockdep warning in __vsock_release()

2019-09-30 Thread Stefano Garzarella
On Fri, Sep 27, 2019 at 05:37:20AM +, Dexuan Cui wrote:
> > From: linux-hyperv-ow...@vger.kernel.org
> >  On Behalf Of Stefano Garzarella
> > Sent: Thursday, September 26, 2019 12:48 AM
> > 
> > Hi Dexuan,
> > 
> > On Thu, Sep 26, 2019 at 01:11:27AM +, Dexuan Cui wrote:
> > > ...
> > > NOTE: I only tested the code on Hyper-V. I can not test the code for
> > > virtio socket, as I don't have a KVM host. :-( Sorry.
> > >
> > > @Stefan, @Stefano: please review & test the patch for virtio socket,
> > > and let me know if the patch breaks anything. Thanks!
> > 
> > Comment below, I'll test it ASAP!
> 
> Stefano, Thank you!
> 
> BTW, this is how I tested the patch:
> 1. write a socket server program in the guest. The program calls listen()
> and then calls sleep(1 seconds). Note: accept() is not called.
> 
> 2. create some connections to the server program in the guest.
> 
> 3. kill the server program by Ctrl+C, and "dmesg" will show the scary
> call-trace, if the kernel is built with 
>   CONFIG_LOCKDEP=y
>   CONFIG_LOCKDEP_SUPPORT=y
> 
> 4. Apply the patch, do the same test and we should no longer see the 
> call-trace.
> 

Hi Dexuan,
I tested on virtio socket and it works as expected!

With your patch applied I don't have issues and call-trace. Without
the patch I have a very similar call-trace (as expected):

WARNING: possible recursive locking detected
5.3.0-vsock #17 Not tainted

python3/872 is trying to acquire lock:
88802b650110 (sk_lock-AF_VSOCK){+.+.}, at: 
virtio_transport_release+0x34/0x330 [vmw_vsock_virtio_transport_common]

but task is already holding lock:
88803597ce10 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x3f/0x130 
[vsock]

other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(sk_lock-AF_VSOCK);
  lock(sk_lock-AF_VSOCK);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by python3/872:
 #0: 88802c957180 (>s_type->i_mutex_key#8){+.+.}, at: 
__sock_release+0x2d/0xb0
 #1: 88803597ce10 (sk_lock-AF_VSOCK){+.+.}, at: 
__vsock_release+0x3f/0x130 [vsock]

stack backtrace:
CPU: 0 PID: 872 Comm: python3 Not tainted 5.3.0-vsock #17
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 
04/01/2014
Call Trace:
 dump_stack+0x85/0xc0
 __lock_acquire.cold+0xad/0x22b
 lock_acquire+0xc4/0x1a0
 ? virtio_transport_release+0x34/0x330 [vmw_vsock_virtio_transport_common]
 lock_sock_nested+0x5d/0x80
 ? virtio_transport_release+0x34/0x330 [vmw_vsock_virtio_transport_common]
 virtio_transport_release+0x34/0x330 [vmw_vsock_virtio_transport_common]
 ? mark_held_locks+0x49/0x70
 ? _raw_spin_unlock_irqrestore+0x44/0x60
 __vsock_release+0x2d/0x130 [vsock]
 __vsock_release+0xb9/0x130 [vsock]
 vsock_release+0x12/0x30 [vsock]
 __sock_release+0x3d/0xb0
 sock_close+0x14/0x20
 __fput+0xc1/0x250
 task_work_run+0x93/0xb0
 exit_to_usermode_loop+0xd3/0xe0
 syscall_return_slowpath+0x205/0x310
 entry_SYSCALL_64_after_hwframe+0x49/0xbe


Feel free to add:

Tested-by: Stefano Garzarella 


Re: [PATCH net v2] vsock: Fix a lockdep warning in __vsock_release()

2019-09-27 Thread Stefano Garzarella
On Fri, Sep 27, 2019 at 05:37:20AM +, Dexuan Cui wrote:
> > From: linux-hyperv-ow...@vger.kernel.org
> >  On Behalf Of Stefano Garzarella
> > Sent: Thursday, September 26, 2019 12:48 AM
> > 
> > Hi Dexuan,
> > 
> > On Thu, Sep 26, 2019 at 01:11:27AM +, Dexuan Cui wrote:
> > > ...
> > > NOTE: I only tested the code on Hyper-V. I can not test the code for
> > > virtio socket, as I don't have a KVM host. :-( Sorry.
> > >
> > > @Stefan, @Stefano: please review & test the patch for virtio socket,
> > > and let me know if the patch breaks anything. Thanks!
> > 
> > Comment below, I'll test it ASAP!
> 
> Stefano, Thank you!
> 
> BTW, this is how I tested the patch:
> 1. write a socket server program in the guest. The program calls listen()
> and then calls sleep(1 seconds). Note: accept() is not called.
> 
> 2. create some connections to the server program in the guest.
> 
> 3. kill the server program by Ctrl+C, and "dmesg" will show the scary
> call-trace, if the kernel is built with 
>   CONFIG_LOCKDEP=y
>   CONFIG_LOCKDEP_SUPPORT=y
> 
> 4. Apply the patch, do the same test and we should no longer see the 
> call-trace.

Thanks very useful! I'll follow these steps!

> 
> > > - lock_sock(sk);
> > > + /* When "level" is 2, use the nested version to avoid the
> > > +  * warning "possible recursive locking detected".
> > > +  */
> > > + if (level == 1)
> > > + lock_sock(sk);
> > 
> > Since lock_sock() calls lock_sock_nested(sk, 0), could we use directly
> > lock_sock_nested(sk, level) with level = 0 in vsock_release() and
> > level = SINGLE_DEPTH_NESTING here in the while loop?
> > 
> > Thanks,
> > Stefano
> 
> IMHO it's better to make the lock usage more explicit, as the patch does.
> 
> lock_sock_nested(sk, level) or lock_sock_nested(sk, 0) seems a little
> odd to me. But I'm open to your suggestion: if any of the network
> maintainers, e.g. davem, also agrees with you, I'll change the code 
> as you suggested. :-)

Sure!

Just to be clear, I'm proposing this (plus the changes in the transports
and yours useful comments):

--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -638,7 +638,7 @@ struct sock *__vsock_create(struct net *net,
 }
 EXPORT_SYMBOL_GPL(__vsock_create);

-static void __vsock_release(struct sock *sk)
+static void __vsock_release(struct sock *sk, int level)
 {
if (sk) {
struct sk_buff *skb;
@@ -650,7 +650,7 @@ static void __vsock_release(struct sock *sk)

transport->release(vsk);

-   lock_sock(sk);
+   lock_sock_nested(sk, level);
sock_orphan(sk);
sk->sk_shutdown = SHUTDOWN_MASK;

@@ -659,7 +659,7 @@ static void __vsock_release(struct sock *sk)

/* Clean up any sockets that never were accepted. */
while ((pending = vsock_dequeue_accept(sk)) != NULL) {
-   __vsock_release(pending);
+   __vsock_release(pending, SINGLE_DEPTH_NESTING);
sock_put(pending);
}

@@ -708,7 +708,7 @@ EXPORT_SYMBOL_GPL(vsock_stream_has_space);

 static int vsock_release(struct socket *sock)
 {
-   __vsock_release(sock->sk);
+   __vsock_release(sock->sk, 0);
sock->sk = NULL;
sock->state = SS_FREE;

Thanks,
Stefano


RE: [PATCH net v2] vsock: Fix a lockdep warning in __vsock_release()

2019-09-26 Thread Dexuan Cui
> From: linux-hyperv-ow...@vger.kernel.org
>  On Behalf Of Stefano Garzarella
> Sent: Thursday, September 26, 2019 12:48 AM
> 
> Hi Dexuan,
> 
> On Thu, Sep 26, 2019 at 01:11:27AM +, Dexuan Cui wrote:
> > ...
> > NOTE: I only tested the code on Hyper-V. I can not test the code for
> > virtio socket, as I don't have a KVM host. :-( Sorry.
> >
> > @Stefan, @Stefano: please review & test the patch for virtio socket,
> > and let me know if the patch breaks anything. Thanks!
> 
> Comment below, I'll test it ASAP!

Stefano, Thank you!

BTW, this is how I tested the patch:
1. write a socket server program in the guest. The program calls listen()
and then calls sleep(1 seconds). Note: accept() is not called.

2. create some connections to the server program in the guest.

3. kill the server program by Ctrl+C, and "dmesg" will show the scary
call-trace, if the kernel is built with 
CONFIG_LOCKDEP=y
CONFIG_LOCKDEP_SUPPORT=y

4. Apply the patch, do the same test and we should no longer see the call-trace.

> > -   lock_sock(sk);
> > +   /* When "level" is 2, use the nested version to avoid the
> > +* warning "possible recursive locking detected".
> > +*/
> > +   if (level == 1)
> > +   lock_sock(sk);
> 
> Since lock_sock() calls lock_sock_nested(sk, 0), could we use directly
> lock_sock_nested(sk, level) with level = 0 in vsock_release() and
> level = SINGLE_DEPTH_NESTING here in the while loop?
> 
> Thanks,
> Stefano

IMHO it's better to make the lock usage more explicit, as the patch does.

lock_sock_nested(sk, level) or lock_sock_nested(sk, 0) seems a little
odd to me. But I'm open to your suggestion: if any of the network
maintainers, e.g. davem, also agrees with you, I'll change the code 
as you suggested. :-)

Thanks,
-- Dexuan


Re: [PATCH net v2] vsock: Fix a lockdep warning in __vsock_release()

2019-09-26 Thread Stefano Garzarella
Hi Dexuan,

On Thu, Sep 26, 2019 at 01:11:27AM +, Dexuan Cui wrote:
> Lockdep is unhappy if two locks from the same class are held.
> 
> Fix the below warning for hyperv and virtio sockets (vmci socket code
> doesn't have the issue) by using lock_sock_nested() when __vsock_release()
> is called recursively:
> 
> 
> WARNING: possible recursive locking detected
> 5.3.0+ #1 Not tainted
> 
> server/1795 is trying to acquire lock:
> 8880c5158990 (sk_lock-AF_VSOCK){+.+.}, at: hvs_release+0x10/0x120 
> [hv_sock]
> 
> but task is already holding lock:
> 8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 
> [vsock]
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>CPU0
>
>   lock(sk_lock-AF_VSOCK);
>   lock(sk_lock-AF_VSOCK);
> 
>  *** DEADLOCK ***
> 
>  May be due to missing lock nesting notation
> 
> 2 locks held by server/1795:
>  #0: 8880c5d05ff8 (>s_type->i_mutex_key#10){+.+.}, at: 
> __sock_release+0x2d/0xa0
>  #1: 8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 
> [vsock]
> 
> stack backtrace:
> CPU: 5 PID: 1795 Comm: server Not tainted 5.3.0+ #1
> Call Trace:
>  dump_stack+0x67/0x90
>  __lock_acquire.cold.67+0xd2/0x20b
>  lock_acquire+0xb5/0x1c0
>  lock_sock_nested+0x6d/0x90
>  hvs_release+0x10/0x120 [hv_sock]
>  __vsock_release+0x24/0xf0 [vsock]
>  __vsock_release+0xa0/0xf0 [vsock]
>  vsock_release+0x12/0x30 [vsock]
>  __sock_release+0x37/0xa0
>  sock_close+0x14/0x20
>  __fput+0xc1/0x250
>  task_work_run+0x98/0xc0
>  do_exit+0x344/0xc60
>  do_group_exit+0x47/0xb0
>  get_signal+0x15c/0xc50
>  do_signal+0x30/0x720
>  exit_to_usermode_loop+0x50/0xa0
>  do_syscall_64+0x24e/0x270
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x7f4184e85f31
> 
> Signed-off-by: Dexuan Cui 
> ---
> 
> NOTE: I only tested the code on Hyper-V. I can not test the code for
> virtio socket, as I don't have a KVM host. :-( Sorry.
> 
> @Stefan, @Stefano: please review & test the patch for virtio socket,
> and let me know if the patch breaks anything. Thanks!

Comment below, I'll test it ASAP!

> 
> Changes in v2:
>   Avoid the duplication of code in v1: https://lkml.org/lkml/2019/8/19/1361
>   Also fix virtio socket code.
> 
>  net/vmw_vsock/af_vsock.c| 19 +++
>  net/vmw_vsock/hyperv_transport.c|  2 +-
>  net/vmw_vsock/virtio_transport_common.c |  2 +-
>  3 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index ab47bf3ab66e..dbae4373cbab 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -638,8 +638,10 @@ struct sock *__vsock_create(struct net *net,
>  }
>  EXPORT_SYMBOL_GPL(__vsock_create);
>  
> -static void __vsock_release(struct sock *sk)
> +static void __vsock_release(struct sock *sk, int level)
>  {
> + WARN_ON(level != 1 && level != 2);
> +
>   if (sk) {
>   struct sk_buff *skb;
>   struct sock *pending;
> @@ -648,9 +650,18 @@ static void __vsock_release(struct sock *sk)
>   vsk = vsock_sk(sk);
>   pending = NULL; /* Compiler warning. */
>  
> + /* The release call is supposed to use lock_sock_nested()
> +  * rather than lock_sock(), if a sock lock should be acquired.
> +  */
>   transport->release(vsk);
>  
> - lock_sock(sk);
> + /* When "level" is 2, use the nested version to avoid the
> +  * warning "possible recursive locking detected".
> +  */
> + if (level == 1)
> + lock_sock(sk);

Since lock_sock() calls lock_sock_nested(sk, 0), could we use directly
lock_sock_nested(sk, level) with level = 0 in vsock_release() and
level = SINGLE_DEPTH_NESTING here in the while loop?

> + else
> + lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
>   sock_orphan(sk);
>   sk->sk_shutdown = SHUTDOWN_MASK;
>  
> @@ -659,7 +670,7 @@ static void __vsock_release(struct sock *sk)
>  
>   /* Clean up any sockets that never were accepted. */
>   while ((pending = vsock_dequeue_accept(sk)) != NULL) {
> - __vsock_release(pending);
> + __vsock_release(pending, 2);
>   sock_put(pending);
>   }
>  
> @@ -708,7 +719,7 @@ EXPORT_SYMBOL_GPL(vsock_stream_has_space);
>  
>  static int vsock_release(struct socket *sock)
>  {
> - __vsock_release(sock->sk);
> + __vsock_release(sock->sk, 1);
>   sock->sk = NULL;
>   sock->state = SS_FREE;
>  
> diff --git a/net/vmw_vsock/hyperv_transport.c 
> b/net/vmw_vsock/hyperv_transport.c
> index 261521d286d6..c443db7af8d4 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -559,7 +559,7 @@ static