Re: [PATCH] drm/vkms: Fix soft lockup.
On Wed, Jul 29, 2020 at 02:23:15AM +, Xu Qiang wrote:
> A soft deadlock occurs when call hrtimer_cancel in softirq context:
>
> a) The main frequency of the machine is very slow
> b) output->period_ns is very small, even only 1 ns
>
> The problem can be solved in the following way: Setting a hrtimer exit flag
> in the vkms_disable_vblank function and checking the flag in the
> vkms_vblank_simulate function. If the flag is set, the hrtimer is not added
> to hrtimer queue again,the hrtimer_cancel function can exit quickly
> without deadlock.
>
> watchdog: BUG: soft lockup - CPU#2 stuck for 134s! [syz-executor.2:18027]
> Modules linked in:
> CPU: 2 PID: 18027 Comm: syz-executor.2 Tainted: GW
> 5.8.0-rc2-csan #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> RIP: 0010:csd_lock_wait kernel/smp.c:108 [inline]
> RIP: 0010:smp_call_function_many_cond+0x40c/0x470 kernel/smp.c:555
> RSP: 0018:c900028d3a88 EFLAGS: 0297
> RAX: 0002 RBX: 888237cacc80 RCX: 813240fc
> RDX: 0001 RSI: RDI: 0005
> RBP: 0001 R08: 8881f497d000 R09:
> R10: R11: 0100 R12: 888237cf6fc0
> R13: 0003 R14: 888237cacc88 R15: 0001
> FS: 7fa6314a9700() GS:888237c8() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 00513c80 CR3: 0001e3df3002 CR4: 000626e0
> Call Trace:
> smp_call_function_many kernel/smp.c:577 [inline]
> smp_call_function+0x40/0x80 kernel/smp.c:599
> on_each_cpu+0x2a/0xc0 kernel/smp.c:717
> __purge_vmap_area_lazy+0x7c/0xbc0 mm/vmalloc.c:1367
> _vm_unmap_aliases.part.0+0x126/0x180 mm/vmalloc.c:1800
> _vm_unmap_aliases mm/vmalloc.c:1769 [inline]
> vm_unmap_aliases+0x2f/0x40 mm/vmalloc.c:1823
> change_page_attr_set_clr+0x10a/0x4a0 arch/x86/mm/pat/set_memory.c:1732
> change_page_attr_clear arch/x86/mm/pat/set_memory.c:1789 [inline]
> set_memory_ro+0x2b/0x40 arch/x86/mm/pat/set_memory.c:1935
> bpf_jit_binary_lock_ro include/linux/filter.h:815 [inline]
> bpf_int_jit_compile+0x54f/0x663 arch/x86/net/bpf_jit_comp.c:1929
> bpf_prog_select_runtime+0x1cc/0x2c0 kernel/bpf/core.c:1807
> bpf_migrate_filter net/core/filter.c:1264 [inline]
> bpf_prepare_filter net/core/filter.c:1312 [inline]
> bpf_prepare_filter+0x518/0x620 net/core/filter.c:1278
> __get_filter+0x107/0x160 net/core/filter.c:1481
> sk_attach_filter+0x19/0xa0 net/core/filter.c:1496
> sock_setsockopt+0x1208/0x1230 net/core/sock.c:1080
> __sys_setsockopt+0x248/0x270 net/socket.c:2123
> __do_sys_setsockopt net/socket.c:2143 [inline]
> __se_sys_setsockopt net/socket.c:2140 [inline]
> __x64_sys_setsockopt+0x22/0x30 net/socket.c:2140
> do_syscall_64+0x48/0xb0 arch/x86/entry/common.c:359
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> Sending NMI from CPU 2 to CPUs 0-1,3-5:
> NMI backtrace for cpu 4 skipped: idling at native_safe_halt+0xe/0x10
> arch/x86/include/asm/irqflags.h:60
> NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x10
> arch/x86/include/asm/irqflags.h:60
> NMI backtrace for cpu 1
> NMI backtrace for cpu 3
> CPU: 3 PID: 0 Comm: swapper/3 Tainted: GW 5.8.0-rc2-csan #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:26 [inline]
> RIP: 0010:preempt_count_sub+0xa/0x90 kernel/sched/core.c:3874
> RSP: 0018:c912cdd0 EFLAGS: 0046
> RAX: 0001 RBX: 88822ecb8fb0 RCX:
> RDX: RSI: 0086 RDI: 0001
> RBP: 888237d5edc0 R08: 888236ddc000 R09:
> R10: R11: R12:
> R13: 0086 R14: 888237d5edc0 R15: 85bc6d60
> FS: () GS:888237cc() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 7f35cc47adb8 CR3: 05a23006 CR4: 000626e0
> Call Trace:
>
> __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
> _raw_spin_unlock_irqrestore+0x2f/0x50 kernel/locking/spinlock.c:191
> unlock_hrtimer_base kernel/time/hrtimer.c:898 [inline]
> hrtimer_try_to_cancel kernel/time/hrtimer.c:1171 [inline]
> hrtimer_try_to_cancel+0xbd/0x1b0 kernel/time/hrtimer.c:1151
> hrtimer_cancel+0x13/0x40 kernel/time/hrtimer.c:1278
> __disable_vblank drivers/gpu/drm/drm_vblank.c:429 [inline]
> drm_vblank_disable_and_save+0x122/0x140 drivers/gpu/drm/drm_vblank.c:470
> vblank_disable_fn+0x96/0xa0 drivers/gpu/drm/drm_vblank.c:487
> call_timer_fn+0x3a/0x230 kernel/time/timer.c:1404
> expire_timers kernel/time/timer.c:1449 [inline]
> __run_timers kernel/time/timer.c:1773 [inline]
> __run_timers kernel/time/timer.c:1740 [inline]
> run_timer_softirq+0x2b8/0x840
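To make the exit-flag idea in the quoted description easier to reason about, here is a toy userspace model in Python. It is only a sketch under stated assumptions, not the vkms or hrtimer code: `ToyTimer`, `cancel`, `callback_cost_ns`, `check_stop_flag` and the retry limit are all hypothetical names invented for illustration, and the model assumes that a callback costing at least one full period is found "running" on every cancel attempt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyTimer:
    """Hypothetical stand-in for a self-rearming timer; not a kernel structure."""
    period_ns: int
    callback_cost_ns: int      # assumed cost of one callback run on a slow CPU
    check_stop_flag: bool      # True models the proposed exit-flag check
    stop_requested: bool = False
    armed: bool = True

    def run_callback(self) -> None:
        # With the flag honoured, the callback declines to re-arm
        # (analogous to returning HRTIMER_NORESTART).
        if self.check_stop_flag and self.stop_requested:
            self.armed = False
        # Otherwise the timer stays armed (analogous to HRTIMER_RESTART).

    def callback_busy(self) -> bool:
        # Assumption of this model: if one run costs at least a period,
        # the next expiry is already due when the run ends, so a canceller
        # always finds the callback running while the timer is armed.
        return self.armed and self.callback_cost_ns >= self.period_ns


def cancel(timer: ToyTimer, max_attempts: int = 1000) -> Optional[int]:
    """Return the number of attempts needed to cancel, or None if we gave up."""
    timer.stop_requested = True
    for attempt in range(1, max_attempts + 1):
        if not timer.callback_busy():
            timer.armed = False
            return attempt
        # Callback is running: wait for it to finish, then retry.
        timer.run_callback()
    return None
```

In this model a timer with `period_ns=1` and a slow callback never becomes cancellable without the flag (the loop gives up), whereas with the flag the first callback run after the request stops re-arming and the next attempt succeeds. A timer with a realistic period is cancelled on the first attempt either way.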
[PATCH] drm/vkms: Fix soft lockup.
A soft lockup can occur when hrtimer_cancel is called in softirq context
under the following conditions:

a) The clock frequency of the machine is very low
b) output->period_ns is very small, even only 1 ns

The problem can be solved in the following way: set an hrtimer exit flag
in the vkms_disable_vblank function and check the flag in the
vkms_vblank_simulate function. If the flag is set, the hrtimer is not added
to the hrtimer queue again, so the hrtimer_cancel function can exit quickly
without deadlock.

watchdog: BUG: soft lockup - CPU#2 stuck for 134s! [syz-executor.2:18027]
Modules linked in:
CPU: 2 PID: 18027 Comm: syz-executor.2 Tainted: GW 5.8.0-rc2-csan #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:csd_lock_wait kernel/smp.c:108 [inline]
RIP: 0010:smp_call_function_many_cond+0x40c/0x470 kernel/smp.c:555
RSP: 0018:c900028d3a88 EFLAGS: 0297
RAX: 0002 RBX: 888237cacc80 RCX: 813240fc
RDX: 0001 RSI: RDI: 0005
RBP: 0001 R08: 8881f497d000 R09:
R10: R11: 0100 R12: 888237cf6fc0
R13: 0003 R14: 888237cacc88 R15: 0001
FS: 7fa6314a9700() GS:888237c8() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 00513c80 CR3: 0001e3df3002 CR4: 000626e0
Call Trace:
smp_call_function_many kernel/smp.c:577 [inline]
smp_call_function+0x40/0x80 kernel/smp.c:599
on_each_cpu+0x2a/0xc0 kernel/smp.c:717
__purge_vmap_area_lazy+0x7c/0xbc0 mm/vmalloc.c:1367
_vm_unmap_aliases.part.0+0x126/0x180 mm/vmalloc.c:1800
_vm_unmap_aliases mm/vmalloc.c:1769 [inline]
vm_unmap_aliases+0x2f/0x40 mm/vmalloc.c:1823
change_page_attr_set_clr+0x10a/0x4a0 arch/x86/mm/pat/set_memory.c:1732
change_page_attr_clear arch/x86/mm/pat/set_memory.c:1789 [inline]
set_memory_ro+0x2b/0x40 arch/x86/mm/pat/set_memory.c:1935
bpf_jit_binary_lock_ro include/linux/filter.h:815 [inline]
bpf_int_jit_compile+0x54f/0x663 arch/x86/net/bpf_jit_comp.c:1929
bpf_prog_select_runtime+0x1cc/0x2c0 kernel/bpf/core.c:1807
bpf_migrate_filter net/core/filter.c:1264 [inline]
bpf_prepare_filter net/core/filter.c:1312 [inline]
bpf_prepare_filter+0x518/0x620 net/core/filter.c:1278
__get_filter+0x107/0x160 net/core/filter.c:1481
sk_attach_filter+0x19/0xa0 net/core/filter.c:1496
sock_setsockopt+0x1208/0x1230 net/core/sock.c:1080
__sys_setsockopt+0x248/0x270 net/socket.c:2123
__do_sys_setsockopt net/socket.c:2143 [inline]
__se_sys_setsockopt net/socket.c:2140 [inline]
__x64_sys_setsockopt+0x22/0x30 net/socket.c:2140
do_syscall_64+0x48/0xb0 arch/x86/entry/common.c:359
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Sending NMI from CPU 2 to CPUs 0-1,3-5:
NMI backtrace for cpu 4 skipped: idling at native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:60
NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:60
NMI backtrace for cpu 1
NMI backtrace for cpu 3
CPU: 3 PID: 0 Comm: swapper/3 Tainted: GW 5.8.0-rc2-csan #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:26 [inline]
RIP: 0010:preempt_count_sub+0xa/0x90 kernel/sched/core.c:3874
RSP: 0018:c912cdd0 EFLAGS: 0046
RAX: 0001 RBX: 88822ecb8fb0 RCX:
RDX: RSI: 0086 RDI: 0001
RBP: 888237d5edc0 R08: 888236ddc000 R09:
R10: R11: R12:
R13: 0086 R14: 888237d5edc0 R15: 85bc6d60
FS: () GS:888237cc() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 7f35cc47adb8 CR3: 05a23006 CR4: 000626e0
Call Trace:
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
_raw_spin_unlock_irqrestore+0x2f/0x50 kernel/locking/spinlock.c:191
unlock_hrtimer_base kernel/time/hrtimer.c:898 [inline]
hrtimer_try_to_cancel kernel/time/hrtimer.c:1171 [inline]
hrtimer_try_to_cancel+0xbd/0x1b0 kernel/time/hrtimer.c:1151
hrtimer_cancel+0x13/0x40 kernel/time/hrtimer.c:1278
__disable_vblank drivers/gpu/drm/drm_vblank.c:429 [inline]
drm_vblank_disable_and_save+0x122/0x140 drivers/gpu/drm/drm_vblank.c:470
vblank_disable_fn+0x96/0xa0 drivers/gpu/drm/drm_vblank.c:487
call_timer_fn+0x3a/0x230 kernel/time/timer.c:1404
expire_timers kernel/time/timer.c:1449 [inline]
__run_timers kernel/time/timer.c:1773 [inline]
__run_timers kernel/time/timer.c:1740 [inline]
run_timer_softirq+0x2b8/0x840 kernel/time/timer.c:1786
__do_softirq+0x118/0x344 kernel/softirq.c:292
asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:711
__run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
run_on_irqstack_cond