On 2025/9/1 23:56, Masami Hiramatsu (Google) wrote:
On Fri, 29 Aug 2025 08:26:04 -0400
Steven Rostedt <rost...@goodmis.org> wrote:
[ Adding arm64 maintainers ]
On Fri, 29 Aug 2025 16:29:07 +0800
Luo Gengkun <luogeng...@huaweicloud.com> wrote:
On 2025/8/20 1:50, Steven Rostedt wrote:
On Tue, 19 Aug 2025 10:51:52 +0000
Luo Gengkun <luogeng...@huaweicloud.com> wrote:
Both tracing_mark_write and tracing_mark_raw_write call
__copy_from_user_inatomic while preemption is disabled. But in some
cases, __copy_from_user_inatomic may trigger a page fault and subtly
end up calling schedule(). If the task is then migrated to another
CPU, the following warning will
Wait! What?
__copy_from_user_inatomic() is allowed to be called from inside atomic context.
Hence the name it has. How the hell can it sleep? If it does, it's totally
broken!
Now, I'm not against using nofault() as it is better named, but I want to
know why you are suggesting this change. Did you actually trigger a bug here?
Yes, I triggered this bug on arm64.
And I still think this is an arm64 bug.
I think it could be.
be triggered:

    if (RB_WARN_ON(cpu_buffer,
                   !local_read(&cpu_buffer->committing)))
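For context, this check lives in the ring buffer commit path. A simplified
paraphrase of kernel/trace/ring_buffer.c (function names and details vary by
kernel version): the reserve side bumps a per-CPU nesting counter that the
commit side expects to find non-zero on the *same* cpu_buffer:

    static void rb_start_commit(struct ring_buffer_per_cpu *cpu_buffer)
    {
            local_inc(&cpu_buffer->committing);
            local_inc(&cpu_buffer->commits);
    }

    static void rb_end_commit(struct ring_buffer_per_cpu *cpu_buffer)
    {
            /* Fires if this cpu_buffer never saw the matching
             * rb_start_commit(), e.g. after a migration between
             * reserve and commit. */
            if (RB_WARN_ON(cpu_buffer,
                           !local_read(&cpu_buffer->committing)))
                    return;
            ...
    }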
The following example illustrates the issue:
You've missed an important part.
process flow                                         CPU
---------------------------------------------------------------------
tracing_mark_raw_write():                            cpu:0
    ...
    ring_buffer_lock_reserve():                      cpu:0
        ...
        preempt_disable_notrace();  --> this is unlocked by
                                        ring_buffer_unlock_commit()
        cpu = raw_smp_processor_id()                 cpu:0
        cpu_buffer = buffer->buffers[cpu]            cpu:0
        ...
    ...
    __copy_from_user_inatomic():                     cpu:0
So this is called with preemption disabled.
        ...
        # page fault
        do_mem_abort():                              cpu:0
Sounds to me like arm64's __copy_from_user_inatomic() may be broken.
            ...
            # calls schedule()
            schedule()                               cpu:0
If this does not check the preempt count, it is a problem.
Maybe arm64 needs to do fixup and abort instead of do_mem_abort()?
My kernel was built without CONFIG_PREEMPT_COUNT, so preempt_disable()
does nothing more than act as a compiler barrier. In this case, the
check in schedule() passes. Perhaps this is another issue?
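For reference, the generic helpers look roughly like this, paraphrased from
include/linux/uaccess.h and include/linux/preempt.h (details vary by
version). The fault handler only takes the non-sleeping fixup path when
faulthandler_disabled() is true, and without CONFIG_PREEMPT_COUNT the
in_atomic() half of that test cannot see a preempt_disable() section:

    static inline bool faulthandler_disabled(void)
    {
            return pagefault_disabled() || in_atomic();
    }

    #ifdef CONFIG_PREEMPT_COUNT
    #define preempt_disable()               \
    do {                                    \
            preempt_count_inc();            \
            barrier();                      \
    } while (0)
    #else
    /* just a compiler barrier: in_atomic() stays 0, so a fault taken
     * inside the section may happily sleep */
    #define preempt_disable()       barrier()
    #endif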
            ...
            # the task is rescheduled to cpu1
__buffer_unlock_commit():                            cpu:1
    ...
    ring_buffer_unlock_commit():                     cpu:1
        ...
        cpu = raw_smp_processor_id()                 cpu:1
        cpu_buffer = buffer->buffers[cpu]            cpu:1
        preempt_enable_notrace();  <-- here we enable preemption again.
As shown above, the process reads the CPU id twice and the two values
are not the same.
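Condensed into code, the two lookups pair up like this (a hypothetical
condensed view, not the literal kernel source):

    /* reserve side, runs on cpu 0 */
    preempt_disable_notrace();
    cpu_buffer = buffer->buffers[raw_smp_processor_id()]; /* buffers[0] */
    local_inc(&cpu_buffer->committing);

    /* ... __copy_from_user_inatomic() faults, schedule() migrates the
     * task to cpu 1 (preempt_disable() was a no-op here) ... */

    /* commit side, now runs on cpu 1 */
    cpu_buffer = buffer->buffers[raw_smp_processor_id()]; /* buffers[1] */
    /* buffers[1]->committing is 0 -> RB_WARN_ON fires */
    preempt_enable_notrace();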
To fix this problem, use copy_from_user_nofault() instead of
__copy_from_user_inatomic(): the former performs access_ok() and
disables page faults around the copy.
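For reference, copy_from_user_nofault() looks roughly like this in recent
kernels (simplified from mm/maccess.c, details vary by version). Note that
it is the pagefault_disable()/pagefault_enable() pair, not the access_ok()
check, that keeps the fault handler from sleeping, and it works even
without CONFIG_PREEMPT_COUNT because it uses a separate per-task counter:

    long copy_from_user_nofault(void *dst, const void __user *src,
                                size_t size)
    {
            long ret = -EFAULT;

            if (access_ok(src, size)) {
                    /* makes faulthandler_disabled() true: a fault now
                     * takes the fixup path and the copy returns the
                     * number of uncopied bytes instead of sleeping */
                    pagefault_disable();
                    ret = __copy_from_user_inatomic(dst, src, size);
                    pagefault_enable();
            }
            if (ret)
                    return -EFAULT;
            return 0;
    }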
Fixes: 656c7f0d2d2b ("tracing: Replace kmap with copy_from_user() in trace_marker writing")
The above commit was introduced in 2016; copy_from_user_nofault() was
introduced in 2020. I don't think this would be the fix for that kernel.
So no, I'm not taking this patch. If you see __copy_from_user_inatomic()
sleeping, its users are not the issue. That function is.
BTW, the biggest difference between __copy_from_user() and
__copy_from_user_inatomic() is `might_fault()` and `should_fail_usercopy()`.
The latter is fault injection, so we can ignore it. But since
`might_fault()` is NOT in __copy_from_user_inatomic(), is it designed
not to cause a fault, as Steve said?
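For comparison, a simplified sketch of the two (paraphrased from
include/linux/uaccess.h; hardening and instrumentation hooks trimmed,
details vary by version):

    static __always_inline unsigned long
    __copy_from_user(void *to, const void __user *from, unsigned long n)
    {
            might_fault();  /* warns (with CONFIG_DEBUG_ATOMIC_SLEEP)
                             * if called where sleeping is not allowed */
            if (should_fail_usercopy())
                    return n;
            instrument_copy_from_user(to, from, n);
            return raw_copy_from_user(to, from, n);
    }

    static __always_inline unsigned long
    __copy_from_user_inatomic(void *to, const void __user *from,
                              unsigned long n)
    {
            /* no might_fault(): the caller is expected to run with
             * page faults disabled, so any fault is fixed up rather
             * than slept on */
            instrument_copy_from_user(to, from, n);
            return raw_copy_from_user(to, from, n);
    }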
Thank you,