(Cc'ing netdev and Willem)

On Wed, Mar 28, 2018 at 12:03 PM, Byoungyoung Lee
<byoungyo...@purdue.edu> wrote:
> Another crash patterns observed: race between (setsockopt$packet_int)
> and (bind$packet).
>
> ------------------------------
> [  357.731597] kernel BUG at
> /home/blee/project/race-fuzzer/kernels/kernel_v4.16-rc3/net/packet/af_packet.c:3107!
> [  357.733382] invalid opcode: 0000 [#1] SMP KASAN
> [  357.734017] Modules linked in:
> [  357.734662] CPU: 1 PID: 3871 Comm: repro.exe Not tainted 4.16.0-rc3 #1
> [  357.735791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> [  357.737434] RIP: 0010:packet_do_bind+0x88d/0x950
> [  357.738121] RSP: 0018:ffff8800b2787b08 EFLAGS: 00010293
> [  357.738906] RAX: ffff8800b2fdc780 RBX: ffff880234358cc0 RCX: 
> ffffffff838b244c
> [  357.739905] RDX: 0000000000000000 RSI: ffffffff838b257d RDI: 
> 0000000000000001
> [  357.741315] RBP: ffff8800b2787c10 R08: ffff8800b2fdc780 R09: 
> 0000000000000000
> [  357.743055] R10: 0000000000000001 R11: 0000000000000000 R12: 
> ffff88023352ecc0
> [  357.744744] R13: 0000000000000000 R14: 0000000000000001 R15: 
> 0000000000001d00
> [  357.746377] FS:  00007f4b43733700(0000) GS:ffff8800b8b00000(0000)
> knlGS:0000000000000000
> [  357.749599] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  357.752096] CR2: 0000000020058000 CR3: 00000002334b8000 CR4: 
> 00000000000006e0
> [  357.755045] Call Trace:
> [  357.755822]  ? compat_packet_setsockopt+0x100/0x100
> [  357.757324]  ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
> [  357.758810]  packet_bind+0xa2/0xe0
> [  357.759640]  SYSC_bind+0x279/0x2f0
> [  357.760364]  ? move_addr_to_kernel.part.19+0xc0/0xc0
> [  357.761491]  ? __handle_mm_fault+0x25d0/0x25d0
> [  357.762449]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
> [  357.763663]  ? __do_page_fault+0x417/0xba0
> [  357.764569]  ? vmalloc_fault+0x910/0x910
> [  357.765405]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
> [  357.766525]  ? mark_held_locks+0x25/0xb0
> [  357.767336]  ? SyS_socketpair+0x4a0/0x4a0
> [  357.768182]  SyS_bind+0x24/0x30
> [  357.768851]  do_syscall_64+0x209/0x5d0
> [  357.769650]  ? syscall_return_slowpath+0x3e0/0x3e0
> [  357.770665]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
> [  357.771779]  ? syscall_return_slowpath+0x260/0x3e0
> [  357.772748]  ? mark_held_locks+0x25/0xb0
> [  357.773581]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
> [  357.774720]  ? retint_user+0x18/0x18
> [  357.775493]  ? trace_hardirqs_off_caller+0xb5/0x120
> [  357.776567]  ? trace_hardirqs_off_thunk+0x1a/0x1c
> [  357.777512]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> [  357.778508] RIP: 0033:0x4503a9
> [  357.779156] RSP: 002b:00007f4b43732ce8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000031
> [  357.780737] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 
> 00000000004503a9
> [  357.782169] RDX: 0000000000000014 RSI: 0000000020058000 RDI: 
> 0000000000000003
> [  357.783710] RBP: 00007f4b43732d10 R08: 0000000000000000 R09: 
> 0000000000000000
> [  357.785202] R10: 0000000000000000 R11: 0000000000000246 R12: 
> 0000000000000000
> [  357.786664] R13: 0000000000000000 R14: 00007f4b437339c0 R15: 
> 00007f4b43733700
> [  357.788210] Code: c0 fd 48 c7 c2 00 c8 d9 84 be ab 02 00 00 48 c7
> c7 60 c8 d9 84 c6 05 e7 a2 48 02 01 e8 3f 17 af fd e9 60 fb ff ff e8
> 43 b3 c0 fd <0f> 0b e8 3c b3 c0 fd 48 8b bd 20 ff ff ff e8 60 1e e7 fd
> 4c 89
> [  357.792260] RIP: packet_do_bind+0x88d/0x950 RSP: ffff8800b2787b08
> [  357.793698] ---[ end trace 0c5a2539f0247369 ]---
> [  357.794696] Kernel panic - not syncing: Fatal exception
> [  357.795918] Kernel Offset: disabled
> [  357.796614] Rebooting in 86400 seconds..
>
> On Wed, Mar 28, 2018 at 1:19 AM, Byoungyoung Lee <byoungyo...@purdue.edu> 
> wrote:
>> We report the crash: WARNING in refcount_dec
>>
>> This crash has been found in v4.16-rc3 using RaceFuzzer (a modified
>> version of Syzkaller), which we describe more at the end of this
>> report. Our analysis shows that the race occurs when invoking two
>> syscalls concurrently, (setsockopt$packet_int) and
>> (setsockopt$packet_rx_ring).
>>
>> C repro code : 
>> https://kiwi.cs.purdue.edu/static/race-fuzzer/repro-refcount_dec.c
>> kernel config: 
>> https://kiwi.cs.purdue.edu/static/race-fuzzer/kernel-config-v4.16-rc3


I tried your reproducer, no luck here.



>>
>> ---------------------------------------
>> [  305.838560] refcount_t: decrement hit 0; leaking memory.
>> [  305.839669] WARNING: CPU: 0 PID: 3867 at
>> /home/blee/project/race-fuzzer/kernels/kernel_v4.16-rc3/lib/refcount.c:228
>> refcount_dec+0x62/0x70
>> [  305.841441] Modules linked in:
>> [  305.841883] CPU: 0 PID: 3867 Comm: repro.exe Not tainted 4.16.0-rc3 #1
>> [  305.842803] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
>> [  305.844345] RIP: 0010:refcount_dec+0x62/0x70
>> [  305.845005] RSP: 0018:ffff880224d374f8 EFLAGS: 00010286
>> [  305.845802] RAX: 000000000000002c RBX: 0000000000000000 RCX: 
>> ffffffff81538991
>> [  305.846768] RDX: 0000000000000000 RSI: ffffffff813cd761 RDI: 
>> 0000000000000005
>> [  305.847748] RBP: ffff880224d37500 R08: ffff88023169a440 R09: 
>> 0000000000000000
>> [  305.848748] R10: 0000000000000000 R11: 0000000000000000 R12: 
>> ffff88023473ad40
>> [  305.849738] R13: ffff88023473b368 R14: 0000000000000001 R15: 
>> 0000000000000000
>> [  305.850733] FS:  0000000000c6e940(0000) GS:ffff8800b8a00000(0000)
>> knlGS:0000000000000000
>> [  305.851837] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  305.852652] CR2: 00007fb120571db8 CR3: 0000000005422000 CR4: 
>> 00000000000006f0
>> [  305.853619] Call Trace:
>> [  305.854086]  __unregister_prot_hook+0x15f/0x1d0
>> [  305.854722]  packet_release+0x77a/0x7a0
>> [  305.855335]  ? mark_held_locks+0x25/0xb0
>> [  305.855883]  ? packet_lookup_frame+0x110/0x110
>> [  305.856576]  ? __lock_is_held+0x39/0xc0
>> [  305.857109]  ? note_gp_changes+0x300/0x300
>> [  305.857745]  ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
>> [  305.858548]  ? locks_remove_file+0x31b/0x420
>> [  305.859138]  ? fcntl_setlk+0xad0/0xad0
>> [  305.859743]  ? trace_event_raw_event_sched_switch+0x320/0x320
>> [  305.860534]  ? fsnotify_first_mark+0x2c0/0x2c0
>> [  305.861234]  sock_release+0x53/0xe0
>> [  305.861711]  ? sock_alloc_file+0x2c0/0x2c0
>> [  305.862334]  sock_close+0x16/0x20
>> [  305.862801]  __fput+0x246/0x4e0
>> [  305.863310]  ? fput+0x130/0x130
>> [  305.863743]  ? trace_event_raw_event_sched_switch+0x320/0x320
>> [  305.864604]  ____fput+0x15/0x20
>> [  305.865046]  task_work_run+0x1a5/0x200
>> [  305.865636]  ? kmem_cache_free+0x25c/0x2d0
>> [  305.866194]  ? task_work_cancel+0x1a0/0x1a0
>> [  305.866829]  ? switch_task_namespaces+0x9e/0xb0
>> [  305.867458]  do_exit+0xacf/0x10d0
>> [  305.868023]  ? mm_update_next_owner+0x650/0x650
>> [  305.868642]  ? __pv_queued_spin_lock_slowpath+0xbf0/0xbf0
>> [  305.869427]  ? trace_hardirqs_on_caller+0x136/0x2a0
>> [  305.870102]  ? trace_hardirqs_on+0xd/0x10
>> [  305.870701]  ? wake_up_new_task+0x41c/0x650
>> [  305.871324]  ? to_ratio+0x20/0x20
>> [  305.871816]  ? lock_release+0x530/0x530
>> [  305.872341]  ? trace_event_raw_event_sched_switch+0x320/0x320
>> [  305.873161]  ? match_held_lock+0x7e/0x360
>> [  305.873777]  ? trace_hardirqs_off+0x10/0x10
>> [  305.874359]  ? put_pid+0x111/0x140
>> [  305.874897]  ? task_active_pid_ns+0x70/0x70
>> [  305.875500]  ? find_held_lock+0xca/0xf0
>> [  305.876118]  ? do_group_exit+0x1f9/0x260
>> [  305.876650]  ? lock_downgrade+0x380/0x380
>> [  305.877297]  ? task_clear_jobctl_pending+0xb5/0xd0
>> [  305.877951]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>> [  305.878725]  ? do_raw_spin_unlock+0x112/0x1a0
>> [  305.879309]  ? do_raw_spin_trylock+0x100/0x100
>> [  305.879969]  ? mark_held_locks+0x25/0xb0
>> [  305.880505]  ? force_sig+0x30/0x30
>> [  305.881054]  ? _raw_spin_unlock_irq+0x27/0x50
>> [  305.881671]  ? trace_hardirqs_on_caller+0x136/0x2a0
>> [  305.882412]  do_group_exit+0xfb/0x260
>> [  305.882945]  ? SyS_exit+0x30/0x30
>> [  305.883442]  ? find_mergeable_anon_vma+0x90/0x90
>> [  305.884103]  ? SyS_read+0x1c0/0x1c0
>> [  305.884785]  ? mark_held_locks+0x25/0xb0
>> [  305.885503]  ? do_syscall_64+0xb2/0x5d0
>> [  305.886093]  ? do_group_exit+0x260/0x260
>> [  305.886741]  SyS_exit_group+0x1d/0x20
>> [  305.887255]  do_syscall_64+0x209/0x5d0
>> [  305.887888]  ? syscall_return_slowpath+0x3e0/0x3e0
>> [  305.888611]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>> [  305.889420]  ? syscall_return_slowpath+0x260/0x3e0
>> [  305.890188]  ? mark_held_locks+0x25/0xb0
>> [  305.890724]  ? entry_SYSCALL_64_after_hwframe+0x52/0xb7
>> [  305.891556]  ? trace_hardirqs_off_caller+0xb5/0x120
>> [  305.892265]  ? trace_hardirqs_off_thunk+0x1a/0x1c
>> [  305.892939]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> [  305.893676] RIP: 0033:0x44d989
>> [  305.894100] RSP: 002b:00000000007fff38 EFLAGS: 00000202 ORIG_RAX:
>> 00000000000000e7
>> [  305.895158] RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 
>> 000000000044d989
>> [  305.896174] RDX: 00007fb120d739c0 RSI: 00007fb120572700 RDI: 
>> 0000000000000001
>> [  305.897161] RBP: 00000000007fff60 R08: 00007fb120572700 R09: 
>> 0000000000000000
>> [  305.898128] R10: 0000000000000000 R11: 0000000000000202 R12: 
>> 0000000000000000
>> [  305.899464] R13: 000000000040d270 R14: 000000000040d300 R15: 
>> 0000000000000000
>> [  305.900823] Code: b6 1d 81 9a ef 03 31 ff 89 de e8 ca a3 67 ff 84
>> db 75 df e8 f1 a2 67 ff 48 c7 c7 60 8f 83 84 c6 05 61 9a ef 03 01 e8
>> ee 5f 49 ff <0f> 0b eb c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41
>> 57 49
>> [  305.904324] ---[ end trace 360c084b02d93021 ]---
>> [  305.919117] ------------[ cut here ]------------
>> [  305.920120] refcount_t: underflow; use-after-free.
>> [  305.921335] WARNING: CPU: 0 PID: 3867 at
>> /home/blee/project/race-fuzzer/kernels/kernel_v4.16-rc3/lib/refcount.c:187
>> refcount_sub_and_test+0x1ec/0x200
>> [  305.923927] Modules linked in:
>> [  305.924611] CPU: 0 PID: 3867 Comm: repro.exe Tainted: G        W
>>     4.16.0-rc3 #1
>> [  305.925987] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
>> [  305.928119] RIP: 0010:refcount_sub_and_test+0x1ec/0x200
>> [  305.929124] RSP: 0018:ffff880224d374a0 EFLAGS: 00010282
>> [  305.930161] RAX: 0000000000000026 RBX: 0000000000000000 RCX: 
>> ffffffff813c9644
>> [  305.931504] RDX: 0000000000000000 RSI: ffffffff813cd761 RDI: 
>> ffff880224d37018
>> [  305.932942] RBP: ffff880224d37538 R08: ffff88023169a440 R09: 
>> 0000000000000000
>> [  305.934365] R10: 0000000000000000 R11: 0000000000000000 R12: 
>> 00000000ffffffff
>> [  305.935734] R13: ffff88023473adc0 R14: 0000000000000001 R15: 
>> 1ffff100449a6e96
>> [  305.937114] FS:  0000000000c6e940(0000) GS:ffff8800b8a00000(0000)
>> knlGS:0000000000000000
>> [  305.938668] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  305.939768] CR2: 00007fb120571db8 CR3: 0000000005422000 CR4: 
>> 00000000000006f0
>> [  305.941212] Call Trace:
>> [  305.941689]  ? refcount_inc+0x70/0x70
>> [  305.942216]  ? skb_dequeue+0xa5/0xc0
>> [  305.942713]  refcount_dec_and_test+0x1a/0x20
>> [  305.943295]  packet_release+0x702/0x7a0
>> [  305.943816]  ? mark_held_locks+0x25/0xb0
>> [  305.944378]  ? packet_lookup_frame+0x110/0x110
>> [  305.945021]  ? __lock_is_held+0x39/0xc0
>> [  305.945561]  ? note_gp_changes+0x300/0x300
>> [  305.946132]  ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
>> [  305.946866]  ? locks_remove_file+0x31b/0x420
>> [  305.947464]  ? fcntl_setlk+0xad0/0xad0
>> [  305.948000]  ? trace_event_raw_event_sched_switch+0x320/0x320
>> [  305.948781]  ? fsnotify_first_mark+0x2c0/0x2c0
>> [  305.949386]  sock_release+0x53/0xe0
>> [  305.949866]  ? sock_alloc_file+0x2c0/0x2c0
>> [  305.950437]  sock_close+0x16/0x20
>> [  305.950906]  __fput+0x246/0x4e0
>> [  305.951360]  ? fput+0x130/0x130
>> [  305.951807]  ? trace_event_raw_event_sched_switch+0x320/0x320
>> [  305.952620]  ____fput+0x15/0x20
>> [  305.953071]  task_work_run+0x1a5/0x200
>> [  305.953585]  ? kmem_cache_free+0x25c/0x2d0
>> [  305.954143]  ? task_work_cancel+0x1a0/0x1a0
>> [  305.954714]  ? switch_task_namespaces+0x9e/0xb0
>> [  305.955334]  do_exit+0xacf/0x10d0
>> [  305.955801]  ? mm_update_next_owner+0x650/0x650
>> [  305.956431]  ? __pv_queued_spin_lock_slowpath+0xbf0/0xbf0
>> [  305.957157]  ? trace_hardirqs_on_caller+0x136/0x2a0
>> [  305.957811]  ? trace_hardirqs_on+0xd/0x10
>> [  305.958360]  ? wake_up_new_task+0x41c/0x650
>> [  305.958937]  ? to_ratio+0x20/0x20
>> [  305.959391]  ? lock_release+0x530/0x530
>> [  305.959924]  ? trace_event_raw_event_sched_switch+0x320/0x320
>> [  305.960693]  ? match_held_lock+0x7e/0x360
>> [  305.961244]  ? trace_hardirqs_off+0x10/0x10
>> [  305.961810]  ? put_pid+0x111/0x140
>> [  305.962277]  ? task_active_pid_ns+0x70/0x70
>> [  305.962862]  ? find_held_lock+0xca/0xf0
>> [  305.963396]  ? do_group_exit+0x1f9/0x260
>> [  305.963933]  ? lock_downgrade+0x380/0x380
>> [  305.964508]  ? task_clear_jobctl_pending+0xb5/0xd0
>> [  305.965147]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>> [  305.965871]  ? do_raw_spin_unlock+0x112/0x1a0
>> [  305.966459]  ? do_raw_spin_trylock+0x100/0x100
>> [  305.967060]  ? mark_held_locks+0x25/0xb0
>> [  305.967592]  ? force_sig+0x30/0x30
>> [  305.968135]  ? _raw_spin_unlock_irq+0x27/0x50
>> [  305.968741]  ? trace_hardirqs_on_caller+0x136/0x2a0
>> [  305.969470]  do_group_exit+0xfb/0x260
>> [  305.969987]  ? SyS_exit+0x30/0x30
>> [  305.970505]  ? find_mergeable_anon_vma+0x90/0x90
>> [  305.971126]  ? SyS_read+0x1c0/0x1c0
>> [  305.971718]  ? mark_held_locks+0x25/0xb0
>> [  305.972259]  ? do_syscall_64+0xb2/0x5d0
>> [  305.972843]  ? do_group_exit+0x260/0x260
>> [  305.973374]  SyS_exit_group+0x1d/0x20
>> [  305.973932]  do_syscall_64+0x209/0x5d0
>> [  305.974452]  ? syscall_return_slowpath+0x3e0/0x3e0
>> [  305.975149]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>> [  305.975941]  ? syscall_return_slowpath+0x260/0x3e0
>> [  305.976669]  ? mark_held_locks+0x25/0xb0
>> [  305.977206]  ? entry_SYSCALL_64_after_hwframe+0x52/0xb7
>> [  305.977978]  ? trace_hardirqs_off_caller+0xb5/0x120
>> [  305.978690]  ? trace_hardirqs_off_thunk+0x1a/0x1c
>> [  305.979381]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> [  305.980114] RIP: 0033:0x44d989
>> [  305.980531] RSP: 002b:00000000007fff38 EFLAGS: 00000202 ORIG_RAX:
>> 00000000000000e7
>> [  305.981664] RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 
>> 000000000044d989
>> [  305.982655] RDX: 00007fb120d739c0 RSI: 00007fb120572700 RDI: 
>> 0000000000000001
>> [  305.983654] RBP: 00000000007fff60 R08: 00007fb120572700 R09: 
>> 0000000000000000
>> [  305.984656] R10: 0000000000000000 R11: 0000000000000202 R12: 
>> 0000000000000000
>> [  305.985707] R13: 000000000040d270 R14: 000000000040d300 R15: 
>> 0000000000000000
>> [  305.986724] Code: b6 1d 18 9b ef 03 31 ff 89 de e8 60 a4 67 ff 84
>> db 75 1a e8 87 a3 67 ff 48 c7 c7 00 8f 83 84 c6 05 f8 9a ef 03 01 e8
>> 84 60 49 ff <0f> 0b 31 db e9 2b ff ff ff 66 66 2e 0f 1f 84 00 00 00 00
>> 00 55
>> [  305.990106] ---[ end trace 360c084b02d93022 ]---
>> [  305.998636] IPVS: ftp: loaded support on port[0] = 21
>> ---------------------------------------
>>
>> = About RaceFuzzer
>>
>> RaceFuzzer is a customized version of Syzkaller, specifically tailored
>> to find race condition bugs in the Linux kernel. While we leverage
>> many different technique, the notable feature of RaceFuzzer is in
>> leveraging a custom hypervisor (QEMU/KVM) to interleave the
>> scheduling. In particular, we modified the hypervisor to intentionally
>> stall a per-core execution, which is similar to supporting per-core
>> breakpoint functionality. This allows RaceFuzzer to force the kernel
>> to deterministically trigger racy condition (which may rarely happen
>> in practice due to randomness in scheduling).
>>
>> RaceFuzzer's C repro always pinpoints two racy syscalls. Since C
>> repro's scheduling synchronization should be performed at the user
>> space, its reproducibility is limited (reproduction may take from 1
>> second to 10 minutes (or even more), depending on a bug). This is
>> because, while RaceFuzzer precisely interleaves the scheduling at the
>> kernel's instruction level when finding this bug, C repro cannot fully
>> utilize such a feature. Please disregard all code related to
>> "should_hypercall" in the C repro, as this is only for our debugging
>> purposes using our own hypervisor.

Reply via email to