Re: [PATCH net] cls_u32: fix use after free in u32_destroy_key()

2018-02-02 Thread Paolo Abeni
On Fri, 2018-02-02 at 14:20 +0100, Paolo Abeni wrote:
> Li Shuang reported an Oops with cls_u32 due to an use-after-free
> in u32_destroy_key(). The use-after-free can be triggered with:
> 
> dev=lo
> tc qdisc add dev $dev root handle 1: htb default 10
> tc filter add dev $dev parent 1: prio 5 handle 1: protocol ip u32 divisor 256
> tc filter add dev $dev protocol ip parent 1: prio 5 u32 ht 800:: match ip dst\
>  10.0.0.0/8 hashkey mask 0xff00 at 16 link 1:
> tc qdisc del dev $dev root
> 
> Which causes the following kasan splat:
> 
>  ==
>  BUG: KASAN: use-after-free in u32_destroy_key.constprop.21+0x117/0x140 
> [cls_u32]
>  Read of size 4 at addr 881b83dae618 by task kworker/u48:5/571
> 
>  CPU: 17 PID: 571 Comm: kworker/u48:5 Not tainted 4.15.0+ #87
>  Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
>  Workqueue: tc_filter_workqueue u32_delete_key_freepf_work [cls_u32]
>  Call Trace:
>   dump_stack+0xd6/0x182
>   ? dma_virt_map_sg+0x22e/0x22e
>   print_address_description+0x73/0x290
>   kasan_report+0x277/0x360
>   ? u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
>   u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
>   u32_delete_key_freepf_work+0x1c/0x30 [cls_u32]
>   process_one_work+0xae0/0x1c80
>   ? sched_clock+0x5/0x10
>   ? pwq_dec_nr_in_flight+0x3c0/0x3c0
>   ? _raw_spin_unlock_irq+0x29/0x40
>   ? trace_hardirqs_on_caller+0x381/0x570
>   ? _raw_spin_unlock_irq+0x29/0x40
>   ? finish_task_switch+0x1e5/0x760
>   ? finish_task_switch+0x208/0x760
>   ? preempt_notifier_dec+0x20/0x20
>   ? __schedule+0x839/0x1ee0
>   ? check_noncircular+0x20/0x20
>   ? firmware_map_remove+0x73/0x73
>   ? find_held_lock+0x39/0x1c0
>   ? worker_thread+0x434/0x1820
>   ? lock_contended+0xee0/0xee0
>   ? lock_release+0x1100/0x1100
>   ? init_rescuer.part.16+0x150/0x150
>   ? retint_kernel+0x10/0x10
>   worker_thread+0x216/0x1820
>   ? process_one_work+0x1c80/0x1c80
>   ? lock_acquire+0x1a5/0x540
>   ? lock_downgrade+0x6b0/0x6b0
>   ? sched_clock+0x5/0x10
>   ? lock_release+0x1100/0x1100
>   ? compat_start_thread+0x80/0x80
>   ? do_raw_spin_trylock+0x190/0x190
>   ? _raw_spin_unlock_irq+0x29/0x40
>   ? trace_hardirqs_on_caller+0x381/0x570
>   ? _raw_spin_unlock_irq+0x29/0x40
>   ? finish_task_switch+0x1e5/0x760
>   ? finish_task_switch+0x208/0x760
>   ? preempt_notifier_dec+0x20/0x20
>   ? __schedule+0x839/0x1ee0
>   ? kmem_cache_alloc_trace+0x143/0x320
>   ? firmware_map_remove+0x73/0x73
>   ? sched_clock+0x5/0x10
>   ? sched_clock_cpu+0x18/0x170
>   ? find_held_lock+0x39/0x1c0
>   ? schedule+0xf3/0x3b0
>   ? lock_downgrade+0x6b0/0x6b0
>   ? __schedule+0x1ee0/0x1ee0
>   ? do_wait_intr_irq+0x340/0x340
>   ? do_raw_spin_trylock+0x190/0x190
>   ? _raw_spin_unlock_irqrestore+0x32/0x60
>   ? process_one_work+0x1c80/0x1c80
>   ? process_one_work+0x1c80/0x1c80
>   kthread+0x312/0x3d0
>   ? kthread_create_worker_on_cpu+0xc0/0xc0
>   ret_from_fork+0x3a/0x50
> 
>  Allocated by task 1688:
>   kasan_kmalloc+0xa0/0xd0
>   __kmalloc+0x162/0x380
>   u32_change+0x1220/0x3c9e [cls_u32]
>   tc_ctl_tfilter+0x1ba6/0x2f80
>   rtnetlink_rcv_msg+0x4f0/0x9d0
>   netlink_rcv_skb+0x124/0x320
>   netlink_unicast+0x430/0x600
>   netlink_sendmsg+0x8fa/0xd60
>   sock_sendmsg+0xb1/0xe0
>   ___sys_sendmsg+0x678/0x980
>   __sys_sendmsg+0xc4/0x210
>   do_syscall_64+0x232/0x7f0
>   return_from_SYSCALL_64+0x0/0x75
> 
>  Freed by task 112:
>   kasan_slab_free+0x71/0xc0
>   kfree+0x114/0x320
>   rcu_process_callbacks+0xc3f/0x1600
>   __do_softirq+0x2bf/0xc06
> 
>  The buggy address belongs to the object at 881b83dae600
>   which belongs to the cache kmalloc-4096 of size 4096
>  The buggy address is located 24 bytes inside of
>   4096-byte region [881b83dae600, 881b83daf600)
>  The buggy address belongs to the page:
>  page:ea006e0f6a00 count:1 mapcount:0 mapping:  (null) index:0x0 
> compound_mapcount: 0
>  flags: 0x17c0008100(slab|head)
>  raw: 0017c0008100   000100070007
>  raw: dead0100 dead0200 880187c0e600 
>  page dumped because: kasan: bad access detected
> 
>  Memory state around the buggy address:
>   881b83dae500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>   881b83dae580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  >881b83dae600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ^
>   881b83dae680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>   881b83dae700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ==
> 
> The problem is that the htnode is freed before the linked knodes and the
> latter will try to access the first at u32_destroy_key() time.
> This change addresses the issue using the htnode refcnt to guarantee
> the correct free order. While at it also add a pedantic and possibly
> unneeded RCU annotation, 

[PATCH net] cls_u32: fix use after free in u32_destroy_key()

2018-02-02 Thread Paolo Abeni
Li Shuang reported an Oops with cls_u32 due to an use-after-free
in u32_destroy_key(). The use-after-free can be triggered with:

dev=lo
tc qdisc add dev $dev root handle 1: htb default 10
tc filter add dev $dev parent 1: prio 5 handle 1: protocol ip u32 divisor 256
tc filter add dev $dev protocol ip parent 1: prio 5 u32 ht 800:: match ip dst\
 10.0.0.0/8 hashkey mask 0xff00 at 16 link 1:
tc qdisc del dev $dev root

Which causes the following kasan splat:

 ==
 BUG: KASAN: use-after-free in u32_destroy_key.constprop.21+0x117/0x140 
[cls_u32]
 Read of size 4 at addr 881b83dae618 by task kworker/u48:5/571

 CPU: 17 PID: 571 Comm: kworker/u48:5 Not tainted 4.15.0+ #87
 Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
 Workqueue: tc_filter_workqueue u32_delete_key_freepf_work [cls_u32]
 Call Trace:
  dump_stack+0xd6/0x182
  ? dma_virt_map_sg+0x22e/0x22e
  print_address_description+0x73/0x290
  kasan_report+0x277/0x360
  ? u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
  u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
  u32_delete_key_freepf_work+0x1c/0x30 [cls_u32]
  process_one_work+0xae0/0x1c80
  ? sched_clock+0x5/0x10
  ? pwq_dec_nr_in_flight+0x3c0/0x3c0
  ? _raw_spin_unlock_irq+0x29/0x40
  ? trace_hardirqs_on_caller+0x381/0x570
  ? _raw_spin_unlock_irq+0x29/0x40
  ? finish_task_switch+0x1e5/0x760
  ? finish_task_switch+0x208/0x760
  ? preempt_notifier_dec+0x20/0x20
  ? __schedule+0x839/0x1ee0
  ? check_noncircular+0x20/0x20
  ? firmware_map_remove+0x73/0x73
  ? find_held_lock+0x39/0x1c0
  ? worker_thread+0x434/0x1820
  ? lock_contended+0xee0/0xee0
  ? lock_release+0x1100/0x1100
  ? init_rescuer.part.16+0x150/0x150
  ? retint_kernel+0x10/0x10
  worker_thread+0x216/0x1820
  ? process_one_work+0x1c80/0x1c80
  ? lock_acquire+0x1a5/0x540
  ? lock_downgrade+0x6b0/0x6b0
  ? sched_clock+0x5/0x10
  ? lock_release+0x1100/0x1100
  ? compat_start_thread+0x80/0x80
  ? do_raw_spin_trylock+0x190/0x190
  ? _raw_spin_unlock_irq+0x29/0x40
  ? trace_hardirqs_on_caller+0x381/0x570
  ? _raw_spin_unlock_irq+0x29/0x40
  ? finish_task_switch+0x1e5/0x760
  ? finish_task_switch+0x208/0x760
  ? preempt_notifier_dec+0x20/0x20
  ? __schedule+0x839/0x1ee0
  ? kmem_cache_alloc_trace+0x143/0x320
  ? firmware_map_remove+0x73/0x73
  ? sched_clock+0x5/0x10
  ? sched_clock_cpu+0x18/0x170
  ? find_held_lock+0x39/0x1c0
  ? schedule+0xf3/0x3b0
  ? lock_downgrade+0x6b0/0x6b0
  ? __schedule+0x1ee0/0x1ee0
  ? do_wait_intr_irq+0x340/0x340
  ? do_raw_spin_trylock+0x190/0x190
  ? _raw_spin_unlock_irqrestore+0x32/0x60
  ? process_one_work+0x1c80/0x1c80
  ? process_one_work+0x1c80/0x1c80
  kthread+0x312/0x3d0
  ? kthread_create_worker_on_cpu+0xc0/0xc0
  ret_from_fork+0x3a/0x50

 Allocated by task 1688:
  kasan_kmalloc+0xa0/0xd0
  __kmalloc+0x162/0x380
  u32_change+0x1220/0x3c9e [cls_u32]
  tc_ctl_tfilter+0x1ba6/0x2f80
  rtnetlink_rcv_msg+0x4f0/0x9d0
  netlink_rcv_skb+0x124/0x320
  netlink_unicast+0x430/0x600
  netlink_sendmsg+0x8fa/0xd60
  sock_sendmsg+0xb1/0xe0
  ___sys_sendmsg+0x678/0x980
  __sys_sendmsg+0xc4/0x210
  do_syscall_64+0x232/0x7f0
  return_from_SYSCALL_64+0x0/0x75

 Freed by task 112:
  kasan_slab_free+0x71/0xc0
  kfree+0x114/0x320
  rcu_process_callbacks+0xc3f/0x1600
  __do_softirq+0x2bf/0xc06

 The buggy address belongs to the object at 881b83dae600
  which belongs to the cache kmalloc-4096 of size 4096
 The buggy address is located 24 bytes inside of
  4096-byte region [881b83dae600, 881b83daf600)
 The buggy address belongs to the page:
 page:ea006e0f6a00 count:1 mapcount:0 mapping:  (null) index:0x0 
compound_mapcount: 0
 flags: 0x17c0008100(slab|head)
 raw: 0017c0008100   000100070007
 raw: dead0100 dead0200 880187c0e600 
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  881b83dae500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  881b83dae580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 >881b83dae600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ^
  881b83dae680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  881b83dae700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ==

The problem is that the htnode is freed before the linked knodes and the
latter will try to access the first at u32_destroy_key() time.
This change addresses the issue using the htnode refcnt to guarantee
the correct free order. While at it also add a pedantic and possibly
unneeded RCU annotation, to keep sparse happy.

Reported-by: Li Shuang 
Fixes: c0d378ef1266 ("net_sched: use tcf_queue_work() in u32 filter")
Signed-off-by: Paolo Abeni 
---
 net/sched/cls_u32.c | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git