On Sun, Mar 11, 2018 at 3:04 AM, Neil Horman <nhor...@tuxdriver.com> wrote:
> On Sun, Mar 11, 2018 at 12:22:32AM +0800, Xin Long wrote:
>> On Sat, Mar 10, 2018 at 9:13 PM, Neil Horman <nhor...@tuxdriver.com> wrote:
>> > On Sat, Mar 10, 2018 at 03:58:04PM +0800, Xin Long wrote:
>> >> On Sat, Mar 10, 2018 at 6:08 AM, Neil Horman <nhor...@tuxdriver.com> 
>> >> wrote:
>> >> > On Fri, Mar 09, 2018 at 12:59:06PM -0800, syzbot wrote:
>> >> >> Hello,
>> >> >>
>> >> >> syzbot hit the following crash on net-next commit
>> >> >> fd372a7a9e5e9d8011a0222d10edd3523abcd3b1 (Thu Mar 8 19:43:48 2018 
>> >> >> +0000)
>> >> >> Merge tag 'mlx5-updates-2018-02-28-2' of
>> >> >> git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
>> >> >>
>> >> >> So far this crash happened 2 times on net-next.
>> >> >> C reproducer is attached.
>> >> >> syzkaller reproducer is attached.
>> >> >> Raw console output is attached.
>> >> >> compiler: gcc (GCC) 7.1.1 20170620
>> >> >> .config is attached.
>> >> >>
>> >> >> IMPORTANT: if you fix the bug, please add the following tag to the 
>> >> >> commit:
>> >> >> Reported-by: syzbot+a4e4112c3aff00c8c...@syzkaller.appspotmail.com
>> >> >> It will help syzbot understand when the bug is fixed. See footer for
>> >> >> details.
>> >> >> If you forward the report, please keep this part and the footer.
>> >> >>
>> >> >> IPVS: ftp: loaded support on port[0] = 21
>> >> >> IPVS: ftp: loaded support on port[0] = 21
>> >> >> IPVS: ftp: loaded support on port[0] = 21
>> >> >> IPVS: ftp: loaded support on port[0] = 21
>> >> >> ==================================================================
>> >> >> BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
>> >> >> net/sctp/associola.c:332
>> >> >> Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
>> >> >>
>> >> >> CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
>> >> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> >> >> Google 01/01/2011
>> >> >> Call Trace:
>> >> >>  __dump_stack lib/dump_stack.c:17 [inline]
>> >> >>  dump_stack+0x194/0x24d lib/dump_stack.c:53
>> >> >>  print_address_description+0x73/0x250 mm/kasan/report.c:256
>> >> >>  kasan_report_error mm/kasan/report.c:354 [inline]
>> >> >>  kasan_report+0x23c/0x360 mm/kasan/report.c:412
>> >> >>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>> >> >>  sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
>> >> >>  sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
>> >> >>  inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
>> >> >>  sock_sendmsg_nosec net/socket.c:629 [inline]
>> >> >>  sock_sendmsg+0xca/0x110 net/socket.c:639
>> >> >>  SYSC_sendto+0x361/0x5c0 net/socket.c:1748
>> >> >>  SyS_sendto+0x40/0x50 net/socket.c:1716
>> >> >>  do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>> >> >>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> >> >> RIP: 0033:0x446d09
>> >> >> RSP: 002b:00007f5dbac21da8 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
>> >> >> RAX: ffffffffffffffda RBX: 00000000006e29fc RCX: 0000000000446d09
>> >> >> RDX: 0000000000000001 RSI: 0000000020000340 RDI: 0000000000000003
>> >> >> RBP: 00000000006e29f8 R08: 00000000204d9000 R09: 000000000000001c
>> >> >> R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
>> >> >> R13: 00007fff7b26fb1f R14: 00007f5dbac229c0 R15: 00000000006e2b60
>> >> >>
>> >> > I think we have a corner case with 
>> >> > a0ff660058b88d12625a783ce9e5c1371c87951f
>> >> > here.  If a peeloff event happens during a wait for sendbuf space, 
>> >> > EPIPE will be
>> >> > returned, and the code path appears to call sctp_association_put twice, 
>> >> > leading
>> >> > to the use after free situation.  I'll write a patch this weekend
>> >> Hi, Neil, you're right.
>> >>
>> >> I didn't expect peeloff can be done on a NEW asoc, as peeloff needs
>> >> assoc_id, which can only be set when connecting has started.
>> >>
>> >> But I realized that:
>> >> f84af33 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg
>> >>
>> >> moved sctp_primitive_ASSOCIATE(connecting) before sctp_wait_for_sndbuf
>> >> (snd buffer waiting). It means peeloff can be done on a NEW asoc.
>> >> So you may want to move it back.
>> >>
>> > I agree with the root cause, but I'm not sure I agree with just moving the
>> > wait_for_sndbuf call back above the call to associate.  I'm not sure I like
>> > relying on placing a call in a spcific order solely to avoid an error 
>> > condition
>> > that might legitimately occur.  I think would rather check the return code 
>> > at
>> > the call site for the complete set of conditions for which we should not 
>> > free
>> > the association.  Something like this:
>> >
>> > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>> > index 7d3476a4860d..a68846d2b0ef 100644
>> > --- a/net/sctp/socket.c
>> > +++ b/net/sctp/socket.c
>> > @@ -2071,8 +2071,9 @@ static int sctp_sendmsg(struct sock *sk, struct 
>> > msghdr *msg, size_t msg_len)
>> >
>> >         /* Send msg to the asoc */
>> >         err = sctp_sendmsg_to_asoc(asoc, msg, msg_len, transport, sinfo);
>> > -       if (err < 0 && err != -ESRCH && new)
>> > -               sctp_association_free(asoc);
>> > +       if ((err != -ESRCH) && (err != -EPIPE))
>> > +               if (err < 0 && new)
>> > +                       sctp_association_free(asoc);
>> >
>> >  out_unlock:
>> >         release_sock(sk);
>> >
>> > Which I think also avoids the noted conflict.
>> >
>> > Thoughts?
>> If sctp_association_free is called for general asoc, yes, I agree with you.
>> But it's actually only for NEW asoc, a special case, not worth a extra check.
>> 'err != -ESRCH' is already kind of ugly there (I couldn't find a nicer way 
>> :D),
>> I don't hope there will be more like that.
>>
> I agree with you on the uglyness aspect of the return code check, but I really
> don't like the idea of placing the call to wait_for_sndbuf to guarantee a 
> given
> error code isn't returned, It just feels rickety to me.
I understand, let's not count this moving back as the official fix
for this, but only for the compatibility :-)

Then we can start a new one for improving it later as you said below,
FYI, we've tried to pass 'new' parameter into sctp_sendmsg_to_asoc
or use &asoc as the parameter instead. But it seems not good, and
also sctp_association_free will have to be done in sctp_sendmsg_to_asoc,
which looks worse.

So yes, pls think about it, you may have a better idea on this.

Thanks.

>
>> And also, I think I moved sctp_primitive_ASSOCIATE before
>> sctp_wait_for_sndbuf by accident in that patch(Sorry), which may
>> already change some behavior. We have to move it back.
>>
> It seemed to me you did that specifically to enable the check for peeloff
> events.  That is to say, if you check the socket write space first on a new
> association, and then  get a peeloff event, the write space might have changed
> (though I suppose in fairness, we will always be operating the old socket, so
> the write space will only grow larger).  Either way there should be nothing
> really wrong with checking write space after we preform the association.
>
> But I suppose that moving it back is the safer approach.  It would be nice to
> eliminate that odd error check as well if we can, but I'll start with just
> returning the positioning to the way it was, and look to improve it later.  
> I'll
> post the patch once I confirm the build works
>
> Neil
>

Reply via email to