Re: BUG at net/sctp/socket.c:7425
On 29.01.2017 13:40, Marcelo Ricardo Leitner wrote:
> On Sun, Jan 29, 2017 at 03:35:31AM +0300, Alexander Popov wrote:
>> Hello,
>>
>> I'm running the syzkaller fuzzer for v4.10-rc4
>> (0aa0313f9d576affd7747cc3f179feb097d28990)
>> and have such a crash in sctp code:
>>
> ...
>>
>> Unfortunately, I didn't manage to get a C program reproducing the crash
>> (looks like race).
>> However, I stably hit it on my setup - so I can help fixing the issue.
>>
>> The crash happens here:
>> 	/* Let another process have a go. Since we are going
>> 	 * to sleep anyway.
>> 	 */
>> 	release_sock(sk);
>> 	current_timeo = schedule_timeout(current_timeo);
>>> 	BUG_ON(sk != asoc->base.sk);
>> 	lock_sock(sk);
>>
>> I've added some debugging output and see, that the original value of
>> asoc->base.sk is
>> changed to the address of another struct sock, which appeared in
>> sctp_endpoint_init()
>> shortly before the crash.
>
> You need some threading for this to happen. asoc->base.sk will change
> if you peeloff the association.
> It seems you had one thread waiting for some sndbuf to be available on a
> sendmsg() call and another thread did a peeloff on the association that
> the first thread was using.
> Yeah I think this will reproduce it.
> And in this case, it's probably better if we just return -EPIPE as the
> association doesn't exist in that socket anymore instead of the BUG_ON.

Thanks for your reply and patch, Marcelo.

I've checked your explanation and agree with it. The situation looks like this:

...
[ 55.719561] sctp_endpoint_init: sk 88006718c8c0
[ 55.721158] sctp_association_init: asoc 880059e96818, base.sk = 88006718c8c0
...
[ 56.144070] sctp_wait_for_sndbuf: asoc:880059e96818, timeo:9223372036854775807, msg_len:24
[ 56.148650] sctp_endpoint_init: sk 880068bca480
[ 56.149216] sctp_sock_migrate: asoc 880059e96818 from oldsk 88006718c8c0 to newsk 880068bca480
[ 56.150442] sctp_assoc_migrate: asoc 880059e96818 to newsk 880068bca480
[ 56.168827] crash!!! asoc 880059e96818: sk 88006718c8c0 != base.sk 880068bca480
[ 56.169801] [ cut here ]
[ 56.170151] kernel BUG at net/sctp/socket.c:7433!
...

> ---8<---
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 26a514269b92..e9870aead88b 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -6838,7 +6838,8 @@ static int sctp_wait_for_sndbuf(struct sctp_association
> *asoc, long *timeo_p,
> 		 */
> 		sctp_release_sock(sk);
> 		current_timeo = schedule_timeout(current_timeo);
> -		BUG_ON(sk != asoc->base.sk);
> +		if (sk != asoc->base.sk)
> +			goto do_error;
> 		sctp_lock_sock(sk);
>
> 		*timeo_p = current_timeo;

Tested your fix.

Acked-by: Alexander Popov

Re: BUG at net/sctp/socket.c:7425
On Sun, Jan 29, 2017 at 03:35:31AM +0300, Alexander Popov wrote:
> Hello,
>
> I'm running the syzkaller fuzzer for v4.10-rc4
> (0aa0313f9d576affd7747cc3f179feb097d28990)
> and have such a crash in sctp code:
>
...
>
> Unfortunately, I didn't manage to get a C program reproducing the crash
> (looks like race).
> However, I stably hit it on my setup - so I can help fixing the issue.
>
> The crash happens here:
> 	/* Let another process have a go. Since we are going
> 	 * to sleep anyway.
> 	 */
> 	release_sock(sk);
> 	current_timeo = schedule_timeout(current_timeo);
> > 	BUG_ON(sk != asoc->base.sk);
> 	lock_sock(sk);
>
> I've added some debugging output and see, that the original value of
> asoc->base.sk is
> changed to the address of another struct sock, which appeared in
> sctp_endpoint_init()
> shortly before the crash.

You need some threading for this to happen. asoc->base.sk will change
if you peeloff the association.
It seems you had one thread waiting for some sndbuf to be available on a
sendmsg() call and another thread did a peeloff on the association that
the first thread was using.
Yeah I think this will reproduce it.
And in this case, it's probably better if we just return -EPIPE as the
association doesn't exist in that socket anymore instead of the BUG_ON.

  Marcelo

---8<---

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 26a514269b92..e9870aead88b 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -6838,7 +6838,8 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
 		 */
 		sctp_release_sock(sk);
 		current_timeo = schedule_timeout(current_timeo);
-		BUG_ON(sk != asoc->base.sk);
+		if (sk != asoc->base.sk)
+			goto do_error;
 		sctp_lock_sock(sk);

 		*timeo_p = current_timeo;

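The interleaving described in this message can be modeled in userspace. What follows is a minimal single-threaded sketch, not kernel code. The names `model_sock`, `model_assoc`, `model_peeloff()` and `model_wait_once()` are hypothetical stand-ins. A hook takes the place of `schedule_timeout()` and represents whatever another thread does while the waiter sleeps without the socket lock. Mapping the `do_error` path to -EPIPE is an assumption taken from the explanation above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct model_sock {
	int locked;			/* 1 while the "socket lock" is held */
};

struct model_assoc {
	struct model_sock *base_sk;	/* socket that currently owns the asoc */
};

/* Stands in for schedule_timeout(): work done by other threads meanwhile. */
typedef void (*sleep_hook)(struct model_assoc *asoc, void *arg);

/* Models another thread peeling the association off to a new socket. */
static void model_peeloff(struct model_assoc *asoc, void *newsk)
{
	asoc->base_sk = newsk;
}

/* Models one sleep/wake cycle of the wait loop with the proposed check. */
static int model_wait_once(struct model_assoc *asoc, sleep_hook hook, void *arg)
{
	struct model_sock *sk = asoc->base_sk;	/* cached before sleeping */

	assert(sk->locked);		/* caller is expected to hold the lock */

	sk->locked = 0;			/* release_sock(sk) */
	if (hook)
		hook(asoc, arg);	/* sleep: other threads may run here */

	/*
	 * The association may have migrated while the lock was dropped.
	 * Report an error instead of asserting, as the diff proposes.
	 * As in the diff, this path is taken before re-locking sk.
	 */
	if (sk != asoc->base_sk)
		return -EPIPE;		/* assumed result of the do_error path */

	sk->locked = 1;			/* lock_sock(sk) */
	return 0;
}
```

Calling `model_wait_once(&asoc, NULL, NULL)` models an uneventful wakeup and returns 0 with the socket locked again. Passing `model_peeloff` together with a second `model_sock` models the sequence in the report: the cached `sk` no longer matches `asoc->base_sk`, and the call returns -EPIPE where the original code would have hit the BUG_ON.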
BUG at net/sctp/socket.c:7425
Hello,

I'm running the syzkaller fuzzer for v4.10-rc4
(0aa0313f9d576affd7747cc3f179feb097d28990)
and have such a crash in sctp code:

[ 38.423932] [ cut here ]
[ 38.424298] kernel BUG at net/sctp/socket.c:7425!
[ 38.424583] invalid opcode: [#1] SMP KASAN
[ 38.424839] Dumping ftrace buffer:
[ 38.425031] (ftrace buffer empty)
[ 38.425232] Modules linked in: sctp libcrc32c snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_intel8x0 snd_ens1370 snd_ac97_codec gameport snd_rawmidi snd_hwdep snd_seq_device ac97_bus snd_pcm hid_generic joydev usbmouse snd_timer psmouse usbhid e1000 snd hid parport_pc i2c_piix4 soundcore serio_raw parport input_leds pcspkr floppy evbug mac_hid
[ 38.427058] CPU: 0 PID: 1930 Comm: syz-executor12 Not tainted 4.10.0-rc4+ #2
[ 38.427457] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 38.427999] task: 88006945ca00 task.stack: 880053e4
[ 38.428364] RIP: 0010:sctp_sendmsg+0x29b3/0x3030 [sctp]
[ 38.428719] RSP: 0018:880053e478f8 EFLAGS: 00010297
[ 38.429062] RAX: 88006945ca00 RBX: 880048d148c0 RCX:
[ 38.429636] RDX: RSI: RDI: 88006d022c88
[ 38.430051] RBP: 880053e47b70 R08: 0560 R09: 88007ffda680
[ 38.430473] R10: 000a R11: 1d400032be05 R12: dc00
[ 38.430915] R13: 880048d148c0 R14: R15: 880059ad9160
[ 38.431390] FS: 7f984a645700() GS:88006d00() knlGS:
[ 38.431979] CS: 0010 DS: ES: CR0: 80050033
[ 38.432405] CR2: 20005fe0 CR3: 6400a000 CR4: 06f0
[ 38.432827] DR0: DR1: DR2:
[ 38.433253] DR3: DR6: fffe0ff0 DR7: 0400
[ 38.433765] Call Trace:
[ 38.433938] ? sctp_id2assoc+0x330/0x330 [sctp]
[ 38.434245] ? wake_atomic_t_function+0x2b0/0x2b0
[ 38.434545] inet_sendmsg+0x128/0x3a0
[ 38.434758] ? inet_recvmsg+0x420/0x420
[ 38.434983] sock_sendmsg+0xcf/0x110
[ 38.435192] sock_write_iter+0x222/0x3c0
[ 38.435421] ? sock_sendmsg+0x110/0x110
[ 38.435644] ? iov_iter_init+0xaf/0x1d0
[ 38.435867] __vfs_write+0x3cb/0x640
[ 38.436075] ? do_iter_readv_writev+0x4c0/0x4c0
[ 38.436338] ? apparmor_file_permission+0x27/0x30
[ 38.436618] ? rw_verify_area+0xea/0x2b0
[ 38.436853] vfs_write+0x175/0x4e0
[ 38.437053] SyS_write+0xd8/0x1b0
[ 38.437283] ? SyS_read+0x1b0/0x1b0
[ 38.437522] entry_SYSCALL_64_fastpath+0x1e/0xad
[ 38.437820] RIP: 0033:0x44f869
[ 38.438013] RSP: 002b:7f984a644b58 EFLAGS: 0212 ORIG_RAX: 0001
[ 38.438464] RAX: ffda RBX: 7f984a645700 RCX: 0044f869
[ 38.438886] RDX: 0018 RSI: 20ac4fe8 RDI: 0004
[ 38.439305] RBP: 7ffe1d7be490 R08: R09:
[ 38.439712] R10: R11: 0212 R12:
[ 38.440145] R13: 7ffe1d7be40f R14: 7f984a6459c0 R15:
[ 38.440563] Code: c7 c7 10 1a 5c a0 e8 4d fb 76 e1 c6 44 24 68 01 e9 a2 f2 ff ff e8 be 34 e1 e0 8b 9c 24 98 00 00 00 e9 06 fd ff ff e8 ad 34 e1 e0 <0f> 0b e8 a6 34 e1 e0 4c 8b 4c 24 78 4c 8b 44 24 68 4c 89 f9 48
[ 38.441881] RIP: sctp_sendmsg+0x29b3/0x3030 [sctp] RSP: 880053e478f8
[ 38.442341] ---[ end trace c704b04c884389c0 ]---
[ 38.442634] Kernel panic - not syncing: Fatal exception
[ 38.443084] Dumping ftrace buffer:
[ 38.443335] (ftrace buffer empty)
[ 38.443590] Kernel Offset: disabled

Unfortunately, I didn't manage to get a C program reproducing the crash
(looks like race).
However, I stably hit it on my setup - so I can help fixing the issue.

The crash happens here:
	/* Let another process have a go. Since we are going
	 * to sleep anyway.
	 */
	release_sock(sk);
	current_timeo = schedule_timeout(current_timeo);
> 	BUG_ON(sk != asoc->base.sk);
	lock_sock(sk);

I've added some debugging output and see, that the original value of
asoc->base.sk is
changed to the address of another struct sock, which appeared in
sctp_endpoint_init()
shortly before the crash.

Hope for some assistance.

Best regards,
Alexander