On Tue, 2025-05-27 at 15:37 +0200, Nam Cao wrote:
> On Wed, May 14, 2025 at 10:43:14AM +0200, Gabriele Monaco wrote:
> > Add a per-cpu monitor as part of the sched model:
> > * opid: operations with preemption and irq disabled
> > Monitor to ensure wakeup and need_resched occur with irq and
> > preemption disabled or in irq handlers.
>
> This monitor reports some warnings:
>
> $ perf record -e rv:error_opid --call-graph dwarf -a -- ./stress-epoll
> (stress-epoll program from
> https://github.com/rouming/test-tools/blob/master/stress-epoll.c)
>
Thanks for trying it out, and good to know about this stressor.
Unfortunately the cause is a bit hard to tell from these stack traces,
but it's very likely a problem in the model.
I have a few ideas about where it could be, but I believe it's something
visible only on a physical machine (I haven't tested much on x86 bare
metal, only in a VM).
You're running on bare metal, right?
> $ perf script
> stress-epoll 315 [003] 527.674724: rv:error_opid: event preempt_disable not expected in the state preempt_disabled
>     ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
>     ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
>     ffffffff9fdfba0d handle_preempt_disable+0x3d ([kernel.kallsyms])
>     ffffffff9fdd32d0 __traceiter_preempt_disable+0x30 ([kernel.kallsyms])
>     ffffffff9fdd38fe trace_preempt_off+0x4e ([kernel.kallsyms])
>     ffffffff9fee6c1c vfs_write+0x12c ([kernel.kallsyms])
>     ffffffff9fee7128 ksys_write+0x68 ([kernel.kallsyms])
>     ffffffffa0bdbd92 do_syscall_64+0xb2 ([kernel.kallsyms])
>     ffffffff9fa00130 entry_SYSCALL_64_after_hwframe+0x77 ([kernel.kallsyms])
>     f833f __GI___libc_write+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
>     f833f __GI___libc_write+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
>     1937 thread_work+0x47 (/root/test-tools/stress-epoll)
>     891f4 start_thread+0x304 (/usr/lib/x86_64-linux-gnu/libc.so.6)
>     10989b clone3+0x2b (/usr/lib/x86_64-linux-gnu/libc.so.6)
>
> stress-epoll 318 [002] 527.674759: rv:error_opid: event preempt_disable not expected in the state disabled
>     ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
>     ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
>     ffffffff9fdfba0d handle_preempt_disable+0x3d ([kernel.kallsyms])
>     ffffffff9fdd32d0 __traceiter_preempt_disable+0x30 ([kernel.kallsyms])
>     ffffffff9fdd38fe trace_preempt_off+0x4e ([kernel.kallsyms])
>     ffffffffa0bec1aa _raw_spin_lock_irq+0x1a ([kernel.kallsyms])
>     ffffffff9ff4fe73 eventfd_write+0x63 ([kernel.kallsyms])
>     ffffffff9fee6be5 vfs_write+0xf5 ([kernel.kallsyms])
>     ffffffff9fee7128 ksys_write+0x68 ([kernel.kallsyms])
>     ffffffffa0bdbd92 do_syscall_64+0xb2 ([kernel.kallsyms])
>     ffffffff9fa00130 entry_SYSCALL_64_after_hwframe+0x77 ([kernel.kallsyms])
>     f833f __GI___libc_write+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
>     f833f __GI___libc_write+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
>     1937 thread_work+0x47 (/root/test-tools/stress-epoll)
>     891f4 start_thread+0x304 (/usr/lib/x86_64-linux-gnu/libc.so.6)
>     10989b clone3+0x2b (/usr/lib/x86_64-linux-gnu/libc.so.6)
>
> I'm not sure what I'm looking at here. Do you think these are kernel
> bugs, or is the monitor missing some corner cases?
>
As said, it's likely a missing corner case. I believe it has to do with
IRQs (which is what makes this monitor more complex than it would
otherwise be).
Thanks for the pointers, I'll try to reproduce it this way.
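
To make the error format in the traces concrete, here is a toy Python
sketch of how a deterministic automaton ends up reporting "event X not
expected in the state Y". It is not the opid model: only the event
preempt_disable and the states preempt_disabled and disabled come from
the traces above. Every other name, the whole transition table, and the
choice to stay in the current state after an error are assumptions made
up for illustration.

```python
# Toy deterministic automaton, NOT the real opid model.
# Only "preempt_disable", "preempt_disabled" and "disabled" appear in the
# traces; all other names and every transition below are hypothetical.
TRANSITIONS = {
    ("enabled", "preempt_disable"): "preempt_disabled",
    ("enabled", "irq_disable"): "irq_disabled",
    ("preempt_disabled", "preempt_enable"): "enabled",
    ("preempt_disabled", "irq_disable"): "disabled",
    ("irq_disabled", "irq_enable"): "enabled",
    ("irq_disabled", "preempt_disable"): "disabled",
    ("disabled", "irq_enable"): "preempt_disabled",
    ("disabled", "preempt_enable"): "irq_disabled",
}


def run(events, state="enabled"):
    """Feed events to the automaton; return the final state and the errors."""
    errors = []
    for event in events:
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            # No edge for this (state, event) pair: report it.
            errors.append(f"event {event} not expected in the state {state}")
            continue  # toy choice: stay in the current state
        state = next_state
    return state, errors


if __name__ == "__main__":
    for trace in (
        ["preempt_disable", "preempt_disable"],
        ["irq_disable", "preempt_disable", "preempt_disable"],
    ):
        print(run(trace))
```

In this toy table, a second preempt_disable with no preempt_enable in
between has no outgoing edge, so it is flagged. An event the model never
sees (or an edge the model lacks) would produce the same kind of report
without any kernel bug being involved.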
Gabriele