Re: Deadlock between bind and splice

2015-11-23 Thread Dmitry Vyukov
On Tue, Nov 10, 2015 at 3:59 AM, Al Viro wrote:
> On Tue, Nov 10, 2015 at 02:38:54AM +0000, Al Viro wrote:
>> On Fri, Nov 06, 2015 at 07:42:15AM -0800, Eric Dumazet wrote:
>>
>> > Thank you for this report.
>> >
>> > pipe is part of fs, not net ;)
>>
>> AF_UNIX bind() vs. socketpair() interplay, OTOH...
>
> FWIW, BSD folks unlock the socket for the duration of mknod - mark it as
> "somebody's trying to bind it" to avoid the fun with racing double bind(),
> but that's about it.  Tempting, to be honest...
>
> BTW, why does unix_autobind() do allocation under ->readlock?  The allocation
> will be normally used - that if (u->addr) return; part is just dealing with
> an unlikely race, as far as I can see...


Hello,

This is still happening periodically for me. Is there a proposed fix?
I could test it.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Deadlock between bind and splice

2015-11-23 Thread Hannes Frederic Sowa
On Mon, Nov 23, 2015, at 09:32, Dmitry Vyukov wrote:
> On Tue, Nov 10, 2015 at 3:59 AM, Al Viro wrote:
> > On Tue, Nov 10, 2015 at 02:38:54AM +0000, Al Viro wrote:
> >> On Fri, Nov 06, 2015 at 07:42:15AM -0800, Eric Dumazet wrote:
> >>
> >> > Thank you for this report.
> >> >
> >> > pipe is part of fs, not net ;)
> >>
> >> AF_UNIX bind() vs. socketpair() interplay, OTOH...
> >
> > FWIW, BSD folks unlock the socket for the duration of mknod - mark it as
> > "somebody's trying to bind it" to avoid the fun with racing double bind(),
> > but that's about it.  Tempting, to be honest...
> >
> > BTW, why does unix_autobind() do allocation under ->readlock?  The allocation
> > will be normally used - that if (u->addr) return; part is just dealing with
> > an unlikely race, as far as I can see...
> 
> 
> Hello,
> 
> This is still happening periodically for me. Is there a proposed fix?
> I could test it.

No, we currently have no fix for that report. :/


Re: Deadlock between bind and splice

2015-11-09 Thread Al Viro
On Fri, Nov 06, 2015 at 07:42:15AM -0800, Eric Dumazet wrote:

> Thank you for this report.
> 
> pipe is part of fs, not net ;)

AF_UNIX bind() vs. socketpair() interplay, OTOH...


Re: Deadlock between bind and splice

2015-11-09 Thread Al Viro
On Tue, Nov 10, 2015 at 02:38:54AM +0000, Al Viro wrote:
> On Fri, Nov 06, 2015 at 07:42:15AM -0800, Eric Dumazet wrote:
> 
> > Thank you for this report.
> > 
> > pipe is part of fs, not net ;)
> 
> AF_UNIX bind() vs. socketpair() interplay, OTOH...

FWIW, BSD folks unlock the socket for the duration of mknod - mark it as
"somebody's trying to bind it" to avoid the fun with racing double bind(),
but that's about it.  Tempting, to be honest...

BTW, why does unix_autobind() do allocation under ->readlock?  The allocation
will be normally used - that if (u->addr) return; part is just dealing with
an unlikely race, as far as I can see...
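
A rough userspace model of the BSD-style approach described above, in Python rather than kernel C. Everything here is hypothetical (the `Sock` class, the `binding` flag, the `mknod` callback); it only illustrates the idea of claiming the bind under the lock and then dropping the lock for the slow filesystem step, and is not how unix_bind() is or should be written.

```python
import threading


class Sock:
    """Toy stand-in for a unix socket; all names are hypothetical."""

    def __init__(self):
        self.lock = threading.Lock()  # plays the role of ->readlock
        self.addr = None
        self.binding = False          # "somebody's trying to bind it"


def bind(sock, path, mknod):
    # Short critical section: check the state and claim the bind.
    with sock.lock:
        if sock.addr is not None or sock.binding:
            return False              # already bound, or a racing bind()
        sock.binding = True

    # The filesystem step runs with the socket lock dropped, so nobody
    # waiting for the socket lock is stuck behind the filesystem.
    created = False
    try:
        mknod(path)
        created = True
    finally:
        with sock.lock:
            if created:
                sock.addr = path
            sock.binding = False
    return True
```

The same shape would apply to the autobind question: do the allocation first, then take the lock only to check for the unlikely "already has an address" race and publish the result.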


Re: Deadlock between bind and splice

2015-11-09 Thread Al Viro
On Fri, Nov 06, 2015 at 01:58:27PM +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I am on revision d1e41ff11941784f469f17795a4d9425c2eb4b7a (Nov 5) and
> seeing the following lockdep reports. I don't have exact reproducer
> program as it is caused by several independent programs (state
> accumulated in kernel across invocations); if the report is not enough
> I can try to cook a reproducer.
> 
> Thanks.
> 
> [ INFO: possible circular locking dependency detected ]
> 4.3.0+ #30 Not tainted
> ---
> a.out/9972 is trying to acquire lock:
>  (&pipe->mutex/1){+.+.+.}, at: [< inline >] pipe_lock_nested
> fs/pipe.c:59
>  (&pipe->mutex/1){+.+.+.}, at: []
> pipe_lock+0x56/0x70 fs/pipe.c:67
> 
> but task is already holding lock:
>  (sb_writers#5){.+.+.+}, at: []
> __sb_start_write+0xec/0x130 fs/super.c:1198
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #2
[AF_UNIX bind() does sb_start_write() while holding unix_sock locked]

> -> #1
[splice() to AF_UNIX socket is trying to lock unix_sock while holding
the pipe locked]

> -> #0 (&pipe->mutex/1){+.+.+.}:
[splice() to regular file is locking the pipe under sb_start_write()]

Cute...  The first impression is that in #1 you need the socket to
be connected, or it won't even reach that attempt to lock unix_sock,
while bind() on the same sucker ought to bugger off before getting
around to touching the filesystem, so it looks like a false positive,
but... socketpair() yields a connected socket and AFAICS there's
nothing in unix_bind() to bugger off on such.

So the scenario ought to be:
(a while ago) A: socketpair()
B: splice() from a pipe to /mnt/regular_file
    does sb_start_write() on /mnt
C: try to freeze /mnt
    wait for B to finish with /mnt
A: bind() try to bind our socket to /mnt/new_socket_name
    lock our socket, see it not bound yet
    decide that it needs to create something in /mnt
    try to do sb_start_write() on /mnt, block (it's
    waiting for C).
D: splice() from the same pipe to our socket
    lock the pipe, see that socket is connected
    try to lock the socket, block waiting for A
B:  get around to actually feeding a chunk from
    pipe to file, try to lock the pipe.  Deadlock.

Locking the socket is interruptible, though, so killing D will
untangle that mess - it's not quite a hopeless deadlock.
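
The scenario above can be written down as a wait-for graph between the four tasks, which makes the cycle (and the effect of killing D) easy to check. This is only a sketch in Python; the `find_cycle` helper and the `waits_for` map are made up for illustration and have nothing to do with how lockdep itself works.

```python
def find_cycle(waits_for):
    """Return one cycle in a {task: task_it_waits_for} map, or None."""
    for start in waits_for:
        seen = []
        node = start
        # Follow the chain until it ends or revisits a task.
        while node in waits_for and node not in seen:
            seen.append(node)
            node = waits_for[node]
        if node in seen:
            return seen[seen.index(node):]
    return None


waits_for = {
    "A": "C",  # bind(): holds the socket, blocked in sb_start_write() by the freeze
    "C": "B",  # freeze: waits for B to finish with /mnt
    "B": "D",  # splice to file: did sb_start_write(), wants the pipe lock
    "D": "A",  # splice to socket: holds the pipe, wants the socket lock
}

print(find_cycle(waits_for))

# Killing D (its wait on the socket is interruptible) removes its edge.
without_d = {task: dep for task, dep in waits_for.items() if task != "D"}
print(find_cycle(without_d))
```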

Deadlock or not, should bind() actually work on connected sockets?
AFAICS, socketpair() is the only way for it to happen...
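
For anyone who wants to check what their kernel does with bind() on a socketpair() end, a minimal userspace probe might look like the following. It is a sketch in Python using only the standard library; the function name and the temporary path are arbitrary, and it makes no claim about what the result should be.

```python
import os
import socket
import tempfile


def bind_connected_socket():
    """Try bind() on one end of a socketpair(); return (ok, error)."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # The temporary directory (and any socket file created in it)
    # is removed when the block exits.
    with a, b, tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "new_socket_name")
        try:
            a.bind(path)
        except OSError as exc:
            return False, exc
        return True, None


if __name__ == "__main__":
    ok, err = bind_connected_socket()
    print("bind on connected socket:", "succeeded" if ok else f"failed ({err})")
```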


Re: Deadlock between bind and splice

2015-11-06 Thread Eric Dumazet
On Fri, 2015-11-06 at 13:58 +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I am on revision d1e41ff11941784f469f17795a4d9425c2eb4b7a (Nov 5) and
> seeing the following lockdep reports. I don't have exact reproducer
> program as it is caused by several independent programs (state
> accumulated in kernel across invocations); if the report is not enough
> I can try to cook a reproducer.
> 
> Thanks.
> 
> [ INFO: possible circular locking dependency detected ]
> 4.3.0+ #30 Not tainted
> ---
> a.out/9972 is trying to acquire lock:
>  (&pipe->mutex/1){+.+.+.}, at: [< inline >] pipe_lock_nested
> fs/pipe.c:59
>  (&pipe->mutex/1){+.+.+.}, at: []
> pipe_lock+0x56/0x70 fs/pipe.c:67
> 
> but task is already holding lock:
>  (sb_writers#5){.+.+.+}, at: []
> __sb_start_write+0xec/0x130 fs/super.c:1198
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #2 (sb_writers#5){.+.+.+}:
>[] lock_acquire+0x16d/0x2f0
> kernel/locking/lockdep.c:3585
>[] percpu_down_read+0x3c/0xa0
> kernel/locking/percpu-rwsem.c:73
>[] __sb_start_write+0xec/0x130 fs/super.c:1198
>[< inline >] sb_start_write include/linux/fs.h:1449
>[] mnt_want_write+0x3f/0xb0 fs/namespace.c:386
>[] filename_create+0x106/0x450 fs/namei.c:3425
>[] kern_path_create+0x33/0x40 fs/namei.c:3471
>[< inline >] unix_mknod net/unix/af_unix.c:849
>[] unix_bind+0x41b/0xa10 net/unix/af_unix.c:917
>[] SYSC_bind+0x1ea/0x250 net/socket.c:1383
>[] SyS_bind+0x24/0x30 net/socket.c:1369
>[] entry_SYSCALL_64_fastpath+0x31/0x9a
> arch/x86/entry/entry_64.S:187
> 
> -> #1 (&u->readlock){+.+.+.}:
>[] lock_acquire+0x16d/0x2f0
> kernel/locking/lockdep.c:3585
>[< inline >] __mutex_lock_common kernel/locking/mutex.c:518
>[] mutex_lock_interruptible_nested+0xa9/0xa30
> kernel/locking/mutex.c:647
>[] unix_stream_sendpage+0x23c/0x700
> net/unix/af_unix.c:1768
>[] kernel_sendpage+0x90/0xe0 net/socket.c:3278
>[] sock_sendpage+0xa5/0xd0 net/socket.c:765
>[] pipe_to_sendpage+0x26a/0x320 fs/splice.c:720
>[< inline >] splice_from_pipe_feed fs/splice.c:772
>[] __splice_from_pipe+0x268/0x740 fs/splice.c:889
>[] splice_from_pipe+0xf7/0x140 fs/splice.c:924
>[] generic_splice_sendpage+0x40/0x50 fs/splice.c:1097
>[< inline >] do_splice_from fs/splice.c:1116
>[< inline >] do_splice fs/splice.c:1392
>[< inline >] SYSC_splice fs/splice.c:1695
>[] SyS_splice+0x845/0x17c0 fs/splice.c:1678
>[] entry_SYSCALL_64_fastpath+0x31/0x9a
> arch/x86/entry/entry_64.S:187
> 
> -> #0 (&pipe->mutex/1){+.+.+.}:
>[< inline >] check_prev_add kernel/locking/lockdep.c:1853
>[< inline >] check_prevs_add kernel/locking/lockdep.c:1958
>[< inline >] validate_chain kernel/locking/lockdep.c:2144
>[] __lock_acquire+0x36d9/0x40e0
> kernel/locking/lockdep.c:3206
>[] lock_acquire+0x16d/0x2f0
> kernel/locking/lockdep.c:3585
>[< inline >] __mutex_lock_common kernel/locking/mutex.c:518
>[] mutex_lock_nested+0x9c/0x8f0
> kernel/locking/mutex.c:618
>[< inline >] pipe_lock_nested fs/pipe.c:59
>[] pipe_lock+0x56/0x70 fs/pipe.c:67
>[] iter_file_splice_write+0x199/0xb20 fs/splice.c:962
>[< inline >] do_splice_from fs/splice.c:1116
>[< inline >] do_splice fs/splice.c:1392
>[< inline >] SYSC_splice fs/splice.c:1695
>[] SyS_splice+0x845/0x17c0 fs/splice.c:1678
>[] entry_SYSCALL_64_fastpath+0x31/0x9a
> arch/x86/entry/entry_64.S:187
> 
> other info that might help us debug this:
> 
> Chain exists of:
>   &pipe->mutex/1 --> &u->readlock --> sb_writers#5
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(sb_writers#5);
>                                lock(&u->readlock);
>                                lock(sb_writers#5);
>   lock(&pipe->mutex/1);
> 
>  *** DEADLOCK ***
> 
> 1 lock held by a.out/9972:
>  #0:  (sb_writers#5){.+.+.+}, at: []
> __sb_start_write+0xec/0x130 fs/super.c:1198
> 
> stack backtrace:
> CPU: 1 PID: 9972 Comm: a.out Not tainted 4.3.0+ #30
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>   88003d777938 81aad406 846046a0
>  84606860 846086c0 88003d777980 811ec511
>  88003d777a80 3cf79640 88003cf79df0 88003cf79e12
> Call Trace:
>  [< inline >] __dump_stack lib/dump_stack.c:15
>  [] dump_stack+0x68/0x92 lib/dump_stack.c:50
>  [] print_circular_bug+0x2d1/0x390
> kernel/locking/lockdep.c:1226
>  [< inline >] check_prev_add kernel/locking/lockdep.c:1853
>  [< inline >] check_prevs_add 

Deadlock between bind and splice

2015-11-06 Thread Dmitry Vyukov
Hello,

I am on revision d1e41ff11941784f469f17795a4d9425c2eb4b7a (Nov 5) and
seeing the following lockdep reports. I don't have exact reproducer
program as it is caused by several independent programs (state
accumulated in kernel across invocations); if the report is not enough
I can try to cook a reproducer.

Thanks.

[ INFO: possible circular locking dependency detected ]
4.3.0+ #30 Not tainted
---
a.out/9972 is trying to acquire lock:
 (&pipe->mutex/1){+.+.+.}, at: [< inline >] pipe_lock_nested
fs/pipe.c:59
 (&pipe->mutex/1){+.+.+.}, at: []
pipe_lock+0x56/0x70 fs/pipe.c:67

but task is already holding lock:
 (sb_writers#5){.+.+.+}, at: []
__sb_start_write+0xec/0x130 fs/super.c:1198

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (sb_writers#5){.+.+.+}:
   [] lock_acquire+0x16d/0x2f0
kernel/locking/lockdep.c:3585
   [] percpu_down_read+0x3c/0xa0
kernel/locking/percpu-rwsem.c:73
   [] __sb_start_write+0xec/0x130 fs/super.c:1198
   [< inline >] sb_start_write include/linux/fs.h:1449
   [] mnt_want_write+0x3f/0xb0 fs/namespace.c:386
   [] filename_create+0x106/0x450 fs/namei.c:3425
   [] kern_path_create+0x33/0x40 fs/namei.c:3471
   [< inline >] unix_mknod net/unix/af_unix.c:849
   [] unix_bind+0x41b/0xa10 net/unix/af_unix.c:917
   [] SYSC_bind+0x1ea/0x250 net/socket.c:1383
   [] SyS_bind+0x24/0x30 net/socket.c:1369
   [] entry_SYSCALL_64_fastpath+0x31/0x9a
arch/x86/entry/entry_64.S:187

-> #1 (&u->readlock){+.+.+.}:
   [] lock_acquire+0x16d/0x2f0
kernel/locking/lockdep.c:3585
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_interruptible_nested+0xa9/0xa30
kernel/locking/mutex.c:647
   [] unix_stream_sendpage+0x23c/0x700
net/unix/af_unix.c:1768
   [] kernel_sendpage+0x90/0xe0 net/socket.c:3278
   [] sock_sendpage+0xa5/0xd0 net/socket.c:765
   [] pipe_to_sendpage+0x26a/0x320 fs/splice.c:720
   [< inline >] splice_from_pipe_feed fs/splice.c:772
   [] __splice_from_pipe+0x268/0x740 fs/splice.c:889
   [] splice_from_pipe+0xf7/0x140 fs/splice.c:924
   [] generic_splice_sendpage+0x40/0x50 fs/splice.c:1097
   [< inline >] do_splice_from fs/splice.c:1116
   [< inline >] do_splice fs/splice.c:1392
   [< inline >] SYSC_splice fs/splice.c:1695
   [] SyS_splice+0x845/0x17c0 fs/splice.c:1678
   [] entry_SYSCALL_64_fastpath+0x31/0x9a
arch/x86/entry/entry_64.S:187

-> #0 (&pipe->mutex/1){+.+.+.}:
   [< inline >] check_prev_add kernel/locking/lockdep.c:1853
   [< inline >] check_prevs_add kernel/locking/lockdep.c:1958
   [< inline >] validate_chain kernel/locking/lockdep.c:2144
   [] __lock_acquire+0x36d9/0x40e0
kernel/locking/lockdep.c:3206
   [] lock_acquire+0x16d/0x2f0
kernel/locking/lockdep.c:3585
   [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
   [] mutex_lock_nested+0x9c/0x8f0
kernel/locking/mutex.c:618
   [< inline >] pipe_lock_nested fs/pipe.c:59
   [] pipe_lock+0x56/0x70 fs/pipe.c:67
   [] iter_file_splice_write+0x199/0xb20 fs/splice.c:962
   [< inline >] do_splice_from fs/splice.c:1116
   [< inline >] do_splice fs/splice.c:1392
   [< inline >] SYSC_splice fs/splice.c:1695
   [] SyS_splice+0x845/0x17c0 fs/splice.c:1678
   [] entry_SYSCALL_64_fastpath+0x31/0x9a
arch/x86/entry/entry_64.S:187

other info that might help us debug this:

Chain exists of:
  &pipe->mutex/1 --> &u->readlock --> sb_writers#5

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(sb_writers#5);
                               lock(&u->readlock);
                               lock(sb_writers#5);
  lock(&pipe->mutex/1);

 *** DEADLOCK ***

1 lock held by a.out/9972:
 #0:  (sb_writers#5){.+.+.+}, at: []
__sb_start_write+0xec/0x130 fs/super.c:1198

stack backtrace:
CPU: 1 PID: 9972 Comm: a.out Not tainted 4.3.0+ #30
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  88003d777938 81aad406 846046a0
 84606860 846086c0 88003d777980 811ec511
 88003d777a80 3cf79640 88003cf79df0 88003cf79e12
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:15
 [] dump_stack+0x68/0x92 lib/dump_stack.c:50
 [] print_circular_bug+0x2d1/0x390
kernel/locking/lockdep.c:1226
 [< inline >] check_prev_add kernel/locking/lockdep.c:1853
 [< inline >] check_prevs_add kernel/locking/lockdep.c:1958
 [< inline >] validate_chain kernel/locking/lockdep.c:2144
 [] __lock_acquire+0x36d9/0x40e0 kernel/locking/lockdep.c:3206
 [] lock_acquire+0x16d/0x2f0 kernel/locking/lockdep.c:3585
 [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
 []