Re: KASAN: use-after-free Read in rds_tcp_dev_event
On 11/7/17 12:28 PM, syzbot wrote: Hello, syzkaller hit the following crash on 287683d027a3ff83feb6c7044430c79881664ecf git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master compiler: gcc (GCC) 7.1.1 20170620 .config is attached Raw console output is attached. == BUG: KASAN: use-after-free in rds_tcp_kill_sock net/rds/tcp.c:530 [inline] BUG: KASAN: use-after-free in rds_tcp_dev_event+0xc01/0xc90 net/rds/tcp.c:568 Read of size 8 at addr 8801cd879200 by task kworker/u4:3/147 CPU: 0 PID: 147 Comm: kworker/u4:3 Not tainted 4.14.0-rc7+ #156 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:52 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x25b/0x340 mm/kasan/report.c:409 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430 rds_tcp_kill_sock net/rds/tcp.c:530 [inline] rds_tcp_dev_event+0xc01/0xc90 net/rds/tcp.c:568 The issue here is that we are trying to reference a network namespace (struct net *) that is long gone (i.e., L532 below -- c_net is the culprit). 528 spin_lock_irq(_tcp_conn_lock); 529 list_for_each_entry_safe(tc, _tc, _tcp_conn_list, t_tcp_node) { 530 struct net *c_net = tc->t_cpath->cp_conn->c_net; 531 532 if (net != c_net || !tc->t_sock) 533 continue; 534 if (!list_has_conn(_list, tc->t_cpath->cp_conn)) 535 list_move_tail(>t_tcp_node, _list); 536 } 537 spin_unlock_irq(_tcp_conn_lock); 538 list_for_each_entry_safe(tc, _tc, _list, t_tcp_node) { 539 rds_tcp_conn_paths_destroy(tc->t_cpath->cp_conn); 540 rds_conn_destroy(tc->t_cpath->cp_conn); 541 } When a network namespace is deleted, devices within that namespace are unregistered and removed one by one. RDS is notified about this event through rds_tcp_dev_event() callback. When the loopback device is removed from the namespace, the above RDS callback function destroys all the RDS connections in that namespace. The loop@L529 above walks through each of the rds_tcp connection in the global list (rds_tcp_conn_list) to see if that connection belongs to the namespace in question. It collects all such connections and destroys them (L538-540). However, it leaves behind some of the rds_tcp connections that shared the same underlying RDS connection (L534 and 535). These connections with pointer to stale network namespace are left behind in the global list. When the 2nd network namespace is deleted, we will hit the above stale pointer and hit UAF panic. I think we should move away from global list to a per-namespace list. The global list are used only in two places (both of which are per-namespace operations): - to destroy all the RDS connections belonging to a namespace when the network namespace is being deleted. - to reset all the RDS connections when socket parameters for a namespace are modified using sysctl Thanks, ~Girish
Re: KASAN: use-after-free Read in rds_tcp_dev_event
On 11/7/17 12:28 PM, syzbot wrote: Hello, syzkaller hit the following crash on 287683d027a3ff83feb6c7044430c79881664ecf git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master compiler: gcc (GCC) 7.1.1 20170620 .config is attached Raw console output is attached. == BUG: KASAN: use-after-free in rds_tcp_kill_sock net/rds/tcp.c:530 [inline] BUG: KASAN: use-after-free in rds_tcp_dev_event+0xc01/0xc90 net/rds/tcp.c:568 Read of size 8 at addr 8801cd879200 by task kworker/u4:3/147 CPU: 0 PID: 147 Comm: kworker/u4:3 Not tainted 4.14.0-rc7+ #156 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:52 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x25b/0x340 mm/kasan/report.c:409 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430 rds_tcp_kill_sock net/rds/tcp.c:530 [inline] rds_tcp_dev_event+0xc01/0xc90 net/rds/tcp.c:568 The issue here is that we are trying to reference a network namespace (struct net *) that is long gone (i.e., L532 below -- c_net is the culprit). 528 spin_lock_irq(_tcp_conn_lock); 529 list_for_each_entry_safe(tc, _tc, _tcp_conn_list, t_tcp_node) { 530 struct net *c_net = tc->t_cpath->cp_conn->c_net; 531 532 if (net != c_net || !tc->t_sock) 533 continue; 534 if (!list_has_conn(_list, tc->t_cpath->cp_conn)) 535 list_move_tail(>t_tcp_node, _list); 536 } 537 spin_unlock_irq(_tcp_conn_lock); 538 list_for_each_entry_safe(tc, _tc, _list, t_tcp_node) { 539 rds_tcp_conn_paths_destroy(tc->t_cpath->cp_conn); 540 rds_conn_destroy(tc->t_cpath->cp_conn); 541 } When a network namespace is deleted, devices within that namespace are unregistered and removed one by one. RDS is notified about this event through rds_tcp_dev_event() callback. When the loopback device is removed from the namespace, the above RDS callback function destroys all the RDS connections in that namespace. The loop@L529 above walks through each of the rds_tcp connection in the global list (rds_tcp_conn_list) to see if that connection belongs to the namespace in question. It collects all such connections and destroys them (L538-540). However, it leaves behind some of the rds_tcp connections that shared the same underlying RDS connection (L534 and 535). These connections with pointer to stale network namespace are left behind in the global list. When the 2nd network namespace is deleted, we will hit the above stale pointer and hit UAF panic. I think we should move away from global list to a per-namespace list. The global list are used only in two places (both of which are per-namespace operations): - to destroy all the RDS connections belonging to a namespace when the network namespace is being deleted. - to reset all the RDS connections when socket parameters for a namespace are modified using sysctl Thanks, ~Girish
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On 11/8/17 10:34 PM, Cong Wang wrote: On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wuwrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet support to see if that is the source of these issues? I don't know if anyone on the Intel Wired Lan team is testing with that enabled so if we can eliminate that as a possible cause that would be useful. Sure and thank you for the suggestion! It looks disabling DECNET still triggers the vlan_device_event BUG. However when looking at the dmesgs, I find another warning just before the vlan_device_event BUG. Not sure if it's related one or independent now-fixed issue. Those decnet symbols are probably noises. Right. This is a 32-bit Kernel compiled with CONFIG_PREEMPT=y (I am guessing that this has exposed some lock bug). Also, VLAN (8021q) is compiled into the kernel, so it registers a vlan_device_event() callback on boot. There may not be a VLAN device per-se. Upon receiving NETDEV_DOWN event, we are calling vlan_vid_del(dev, htons(ETH_P_8021Q), 0); which in turn calls call_rcu() to queue vlan_info_free_rcu() to be called at some point. This free function frees the array[] (vlan_info.vlan_grp.vn_devices_array). My guess is that vlan_info_free_rcu() is being called first and then the array[] is being accessed in vlan_device_event(). The netifd daemon in OpenWRT is marking the interface down and that is why it is generating NETDEV_DOWN event. And it uses ioctl(SIOCSIFFLAGS, ~IFF_UP) on a AF_UNIX socket. This results in a call to dev_ifsioc() in the kernel with only rtnl_lock() held and it is not in RCU read critical section. ~Girish How do you reproduce it? And what is your setup? Vlan device on top of your eth0 (e1000)?
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On 11/8/17 10:34 PM, Cong Wang wrote: On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu wrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet support to see if that is the source of these issues? I don't know if anyone on the Intel Wired Lan team is testing with that enabled so if we can eliminate that as a possible cause that would be useful. Sure and thank you for the suggestion! It looks disabling DECNET still triggers the vlan_device_event BUG. However when looking at the dmesgs, I find another warning just before the vlan_device_event BUG. Not sure if it's related one or independent now-fixed issue. Those decnet symbols are probably noises. Right. This is a 32-bit Kernel compiled with CONFIG_PREEMPT=y (I am guessing that this has exposed some lock bug). Also, VLAN (8021q) is compiled into the kernel, so it registers a vlan_device_event() callback on boot. There may not be a VLAN device per-se. Upon receiving NETDEV_DOWN event, we are calling vlan_vid_del(dev, htons(ETH_P_8021Q), 0); which in turn calls call_rcu() to queue vlan_info_free_rcu() to be called at some point. This free function frees the array[] (vlan_info.vlan_grp.vn_devices_array). My guess is that vlan_info_free_rcu() is being called first and then the array[] is being accessed in vlan_device_event(). The netifd daemon in OpenWRT is marking the interface down and that is why it is generating NETDEV_DOWN event. And it uses ioctl(SIOCSIFFLAGS, ~IFF_UP) on a AF_UNIX socket. This results in a call to dev_ifsioc() in the kernel with only rtnl_lock() held and it is not in RCU read critical section. ~Girish How do you reproduce it? And what is your setup? Vlan device on top of your eth0 (e1000)?
Re: [PATCH] drivers: net: wimax: i2400m: i2400m-usb: Use time_after for time comparison
On 5/8/17 2:26 PM, David Miller wrote: From: Karim EshapaDate: Mon, 8 May 2017 18:58:02 +0200 diff --git a/drivers/net/wimax/i2400m/i2400m-usb.h b/drivers/net/wimax/i2400m/i2400m-usb.h index 649ecad..6fc941c 100644 --- a/drivers/net/wimax/i2400m/i2400m-usb.h +++ b/drivers/net/wimax/i2400m/i2400m-usb.h @@ -131,7 +131,7 @@ static inline int edc_inc(struct edc *edc, u16 max_err, u16 timeframe) unsigned long now; now = jiffies; - if (now - edc->timestart > timeframe) { + if (time_after(now - edc->timestart, (unsigned long)timeframe)) { This is far from correct. time_after() compares two "jiffies" timestamp values. The second argument here is not a jiffies timestamp value. Perhaps, what is needed here is: + if (time_after(jiffies, edc->timestart + timeframe)) { where in 'timeframe' is set in HZ in all the callers I checked (for the most part it is set to EDC_ERROR_TIMEFRAME which is 1HZ). ~Girish
Re: [PATCH] drivers: net: wimax: i2400m: i2400m-usb: Use time_after for time comparison
On 5/8/17 2:26 PM, David Miller wrote: From: Karim Eshapa Date: Mon, 8 May 2017 18:58:02 +0200 diff --git a/drivers/net/wimax/i2400m/i2400m-usb.h b/drivers/net/wimax/i2400m/i2400m-usb.h index 649ecad..6fc941c 100644 --- a/drivers/net/wimax/i2400m/i2400m-usb.h +++ b/drivers/net/wimax/i2400m/i2400m-usb.h @@ -131,7 +131,7 @@ static inline int edc_inc(struct edc *edc, u16 max_err, u16 timeframe) unsigned long now; now = jiffies; - if (now - edc->timestart > timeframe) { + if (time_after(now - edc->timestart, (unsigned long)timeframe)) { This is far from correct. time_after() compares two "jiffies" timestamp values. The second argument here is not a jiffies timestamp value. Perhaps, what is needed here is: + if (time_after(jiffies, edc->timestart + timeframe)) { where in 'timeframe' is set in HZ in all the callers I checked (for the most part it is set to EDC_ERROR_TIMEFRAME which is 1HZ). ~Girish