On Thu, Apr 13, 2017 at 11:31 AM, Joe Stringer <j...@ovn.org> wrote:
> On 6 April 2017 at 17:18, Andy Zhou <az...@ovn.org> wrote:
>> From: Eric Dumazet <eduma...@google.com>
>>
>> Upstream commit:
>> ipv6: orphan skbs in reassembly unit
>>
>> Andrey reported a use-after-free in IPv6 stack.
>>
>> Issue here is that we free the socket while it still has skb
>> in TX path and in some queues.
>>
>> It happens here because IPv6 reassembly unit messes skb->truesize,
>> breaking skb_set_owner_w() badly.
>>
>> We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag:
>> Always orphan skbs inside ip_defrag()")
>> Acked-by: Joe Stringer <j...@ovn.org>
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
>> Read of size 8 at addr ffff880062da0060 by task a.out/4140
>>
>> page:ffffea00018b6800 count:1 mapcount:0 mapping: (null)
>> index:0x0 compound_mapcount: 0
>> flags: 0x100000000008100(slab|head)
>> raw: 0100000000008100 0000000000000000 0000000000000000 0000000180130013
>> raw: dead000000000100 dead000000000200 ffff88006741f140 0000000000000000
>> page dumped because: kasan: bad access detected
>>
>> CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
>> 01/01/2011
>> Call Trace:
>> __dump_stack lib/dump_stack.c:15
>> dump_stack+0x292/0x398 lib/dump_stack.c:51
>> describe_address mm/kasan/report.c:262
>> kasan_report_error+0x121/0x560 mm/kasan/report.c:370
>> kasan_report mm/kasan/report.c:392
>> __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
>> sock_flag ./arch/x86/include/asm/bitops.h:324
>> sock_wfree+0x118/0x120 net/core/sock.c:1631
>> skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
>> skb_release_all+0x15/0x60 net/core/skbuff.c:668
>> __kfree_skb+0x15/0x20 net/core/skbuff.c:684
>> kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
>> inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
>> inet_frag_put ./include/net/inet_frag.h:133
>> nf_ct_frag6_gather+0x1125/0x38b0
>> net/ipv6/netfilter/nf_conntrack_reasm.c:617
>> ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
>> nf_hook_entry_hookfn ./include/linux/netfilter.h:102
>> nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
>> nf_hook ./include/linux/netfilter.h:212
>> __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
>> ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
>> ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
>> ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
>> rawv6_push_pending_frames net/ipv6/raw.c:613
>> rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
>> inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
>> sock_sendmsg_nosec net/socket.c:635
>> sock_sendmsg+0xca/0x110 net/socket.c:645
>> sock_write_iter+0x326/0x620 net/socket.c:848
>> new_sync_write fs/read_write.c:499
>> __vfs_write+0x483/0x760 fs/read_write.c:512
>> vfs_write+0x187/0x530 fs/read_write.c:560
>> SYSC_write fs/read_write.c:607
>> SyS_write+0xfb/0x230 fs/read_write.c:599
>> entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
>> RIP: 0033:0x7ff26e6f5b79
>> RSP: 002b:00007ff268e0ed98 EFLAGS: 00000206 ORIG_RAX: 0000000000000001
>> RAX: ffffffffffffffda RBX: 00007ff268e0f9c0 RCX: 00007ff26e6f5b79
>> RDX: 0000000000000010 RSI: 0000000020f50fe1 RDI: 0000000000000003
>> RBP: 00007ff26ebc1220 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
>> R13: 00007ff268e0f9c0 R14: 00007ff26efec040 R15: 0000000000000003
>>
>> The buggy address belongs to the object at ffff880062da0000
>> which belongs to the cache RAWv6 of size 1504
>> The buggy address ffff880062da0060 is located 96 bytes inside
>> of 1504-byte region [ffff880062da0000, ffff880062da05e0)
>>
>> Freed by task 4113:
>> save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>> save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>> set_track mm/kasan/kasan.c:514
>> kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>> slab_free_hook mm/slub.c:1352
>> slab_free_freelist_hook mm/slub.c:1374
>> slab_free mm/slub.c:2951
>> kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
>> sk_prot_free net/core/sock.c:1377
>> __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
>> sk_destruct+0x47/0x80 net/core/sock.c:1460
>> __sk_free+0x57/0x230 net/core/sock.c:1468
>> sk_free+0x23/0x30 net/core/sock.c:1479
>> sock_put ./include/net/sock.h:1638
>> sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
>> rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
>> inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
>> inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
>> sock_release+0x8d/0x1e0 net/socket.c:599
>> sock_close+0x16/0x20 net/socket.c:1063
>> __fput+0x332/0x7f0 fs/file_table.c:208
>> ____fput+0x15/0x20 fs/file_table.c:244
>> task_work_run+0x19b/0x270 kernel/task_work.c:116
>> exit_task_work ./include/linux/task_work.h:21
>> do_exit+0x186b/0x2800 kernel/exit.c:839
>> do_group_exit+0x149/0x420 kernel/exit.c:943
>> SYSC_exit_group kernel/exit.c:954
>> SyS_exit_group+0x1d/0x20 kernel/exit.c:952
>> entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
>>
>> Allocated by task 4115:
>> save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>> save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>> set_track mm/kasan/kasan.c:514
>> kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
>> kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
>> slab_post_alloc_hook mm/slab.h:432
>> slab_alloc_node mm/slub.c:2708
>> slab_alloc mm/slub.c:2716
>> kmem_cache_alloc+0x1af/0x250 mm/slub.c:2721
>> sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334
>> sk_alloc+0x105/0x1010 net/core/sock.c:1396
>> inet6_create+0x44d/0x1150 net/ipv6/af_inet6.c:183
>> __sock_create+0x4f6/0x880 net/socket.c:1199
>> sock_create net/socket.c:1239
>> SYSC_socket net/socket.c:1269
>> SyS_socket+0xf9/0x230 net/socket.c:1249
>> entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
>>
>> Memory state around the buggy address:
>> ffff880062d9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ffff880062d9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> >ffff880062da0000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ^
>> ffff880062da0080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ffff880062da0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ==================================================================
>>
>> Reported-by: Andrey Konovalov <andreyk...@google.com>
>> Signed-off-by: Eric Dumazet <eduma...@google.com>
>> Signed-off-by: David S. Miller <da...@davemloft.net>
>>
>> This patch is a bugfix, and will be progressively backported to earlier
>> kernels. If it is backported to any kernel 4.5 through 4.10, then users
>> use that updated kernel with the OVS kernel module prior to this patch, it
>> could cause a crash. The compat code here resolves such issues.
>>
>> Upstream: 48cac18ecf1d ("ipv6: orphan skbs in reassembly unit")
>> Signed-off-by: Joe Stringer <j...@ovn.org>
>> Signed-off-by: Andy Zhou <az...@ovn.org>
>> ---
>> AUTHORS.rst | 1 +
>> acinclude.m4 | 4 ----
>> datapath/conntrack.c | 1 -
>> .../linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h | 11
>> +++++++----
>> datapath/linux/compat/nf_conntrack_reasm.c | 9 ++++++++-
>> 5 files changed, 16 insertions(+), 10 deletions(-)
>>
>> diff --git a/AUTHORS.rst b/AUTHORS.rst
>> index 03196c15d115..cb3b9b0710b0 100644
>> --- a/AUTHORS.rst
>> +++ b/AUTHORS.rst
>> @@ -116,6 +116,7 @@ Aymerich Edward edward.aymer...@hpe.com
>> Edward Tomasz Napierała tr...@freebsd.org
>> Eitan Eliahu elia...@vmware.com
>> Eohyung Lee liquidnu...@gmail.com
>> +Eric Dumazet eduma...@google.com
>> Eric Garver e...@erig.me
>> Eric Sesterhenn eric.sesterh...@lsexperts.de
>> Ethan J. Jackson e...@eecs.berkeley.edu
>> diff --git a/acinclude.m4 b/acinclude.m4
>> index 744d8f89525c..a7aaf48b5d2f 100644
>> --- a/acinclude.m4
>> +++ b/acinclude.m4
>> @@ -548,10 +548,6 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
>> [OVS_DEFINE([HAVE_NF_CONNLABELS_GET_TAKES_BIT])])
>> OVS_FIND_FIELD_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_labels.h],
>> [nf_conn_labels], [words])
>> - OVS_GREP_IFELSE([$KSRC/include/net/netfilter/ipv6/nf_defrag_ipv6.h],
>> - [nf_ct_frag6_consume_orig])
>> - OVS_GREP_IFELSE([$KSRC/include/net/netfilter/ipv6/nf_defrag_ipv6.h],
>> - [nf_ct_frag6_output])
>> OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_nat.h],
>> [nf_ct_nat_ext_add])
>> OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_nat.h],
>> [nf_nat_alloc_null_binding])
>> OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_seqadj.h],
>> [nf_ct_seq_adjust])
>> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
>> index a47525355534..015e15f8600c 100644
>> --- a/datapath/conntrack.c
>> +++ b/datapath/conntrack.c
>> @@ -519,7 +519,6 @@ static int handle_fragments(struct net *net, struct
>> sw_flow_key *key,
>> } else if (key->eth.type == htons(ETH_P_IPV6)) {
>> enum ip6_defrag_users user = IP6_DEFRAG_CONNTRACK_IN + zone;
>>
>> - skb_orphan(skb);
>> memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
>> err = nf_ct_frag6_gather(net, skb, user);
>> if (err) {
>> diff --git
>> a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
>> b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
>> index c65e7f2feb03..2ab6c0aa79a1 100644
>> --- a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
>> +++ b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
>> @@ -5,11 +5,14 @@
>> #include_next <net/netfilter/ipv6/nf_defrag_ipv6.h>
>>
>> /* Upstream commit 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free
>> clone
>> - * operations") changed the semantics of nf_ct_frag6_gather(), so we
>> backport
>> - * it for all prior kernels.
>> + * operations") changed the semantics of nf_ct_frag6_gather(), so we need
>> + * to backport for all prior kernels, i.e. kernel < 4.5.0.
>> + *
>> + * Upstream commit 48cac18ecf1d ("ipv6: orphan skbs in reassembly unit")
>> fixes
>> + * a bug that requires all kernels prior to this fix, i.e. kernel < 4.11.0
>> + * to be backported.
>> */
>> -#if defined(HAVE_NF_CT_FRAG6_CONSUME_ORIG) || \
>> - defined(HAVE_NF_CT_FRAG6_OUTPUT)
>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(4,11,0)
>
> I lament to see us extending backport wholesale like this to a bunch
> of newer kernels, but I don't really see a better way. Thanks for
> investigating this.

FWIW, I looked at this code between version 4.2 and 4.11 -- It seems
safe. In principle, I share your concern about wholesale change.

>
>> diff --git a/datapath/linux/compat/nf_conntrack_reasm.c
>> b/datapath/linux/compat/nf_conntrack_reasm.c
>> index e633443d16e0..6cf4ef434cbb 100644
>> --- a/datapath/linux/compat/nf_conntrack_reasm.c
>> +++ b/datapath/linux/compat/nf_conntrack_reasm.c
>> @@ -498,7 +498,8 @@ find_prev_fhdr(struct sk_buff *skb, u8 *prevhdrp, int
>> *prevhoff, int *fhoff)
>> return 0;
>> }
>>
>> -int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user)
>> +#undef nf_ct_frag6_gather
>> +static int nf_ct_frag6_gather__(struct net *net, struct sk_buff *skb, u32
>> user)
>
> We can drop the #undef, and change the function name to
> rpl_nf_ct_frag6_gather().

Sure.

>
>> {
>> struct net_device *dev = skb->dev;
>> int fhoff, nhoff, ret;
>> @@ -530,6 +531,7 @@ int nf_ct_frag6_gather(struct net *net, struct sk_buff
>> *skb, u32 user)
>> local_bh_enable();
>> #endif
>>
>> + skb_orphan(skb);
>> fq = fq_find(net, fhdr->identification, user, &hdr->saddr,
>> &hdr->daddr,
>> ip6_frag_ecn(hdr));
>> if (fq == NULL)
>> @@ -557,6 +559,11 @@ out_unlock:
>> return ret;
>> }
>>
>> +int rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user)
>> +{
>> + return nf_ct_frag6_gather__(net, skb, user);
>> +}
>> +
>
> This function doesn't seem to serve any purpose, this should compile
> fine with the change I suggest above, and drop this hunk.

O.K.
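
For readers following the rename discussed in this thread, here is a rough userspace sketch of how the suggested shape could look: the backported function is defined directly under the rpl_ name, and a version-gated macro points callers at it, so neither the #undef nor a wrapper is needed. This is not the patch itself. The macro redirect in the compat header is an assumption (the quoted hunks do not show it), and the struct definitions, the owned_by_socket field, the 4.9 build target and handle_fragments_sketch() are all made up for illustration.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel version macros. The 4.9 build
 * target is a made-up value chosen so that the compat branch is taken. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#define LINUX_VERSION_CODE KERNEL_VERSION(4, 9, 0)

/* Toy types; the real struct net and struct sk_buff are far larger. */
struct net { int unused; };
struct sk_buff { int owned_by_socket; };

/* Compat header part (assumed): on kernels older than 4.11, redirect
 * the upstream name to the replacement. */
#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 11, 0)
int rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb,
                           unsigned int user);
#define nf_ct_frag6_gather rpl_nf_ct_frag6_gather
#endif

/* Compat .c part: the backport is defined under the rpl_ name itself,
 * so no #undef and no forwarding wrapper are required. */
int rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb,
                           unsigned int user)
{
    (void)net;
    (void)user;
    assert(skb != 0);
    skb->owned_by_socket = 0;  /* models the effect of skb_orphan(skb) */
    return 0;
}

/* Hypothetical caller: it keeps using the upstream name and the macro
 * resolves the call to the replacement. */
int handle_fragments_sketch(struct net *net, struct sk_buff *skb)
{
    return nf_ct_frag6_gather(net, skb, 0);
}
```

With this layout the orphaning happens inside the gather function, which is why the caller in the sketch no longer needs its own skb_orphan() step.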