On Wed, Jul 02, 2025 at 03:05:06PM +0200, Alexandre Ghiti wrote: > Hi Yao, > > On 7/2/25 12:50, Yao Zi wrote: > > On Tue, Jul 01, 2025 at 02:27:32PM +0200, Alexandre Ghiti wrote: > > > Hi Yao, > > > > > > On 7/1/25 08:41, Yao Zi wrote: > > > > Linux v6.16 built with dynamic ftrace randomly oops or triggers > > > > ftrace_bug() on Sophgo SG2042 when booting systemd-based userspace, > > ... > > > > > > Not sure either reverting the commits or fixing them up is a better > > > > idea, but anyway the fatal first issue shouidn't go into the stable > > > > release. > > > Let's fix this, we were expecting issues with dynamic ftrace :) > > > > > > So the following diff fixes all the issues you mentioned (not the first > > > crash though, I'll let you test and see if it works better, I don't have > > > this board): > > Thanks for the fix! I've tested it with both QEMU and SG2042, it does > > fix the lockdep failures as well as the boot time crash on SG2042. The > > boot-time crash is caused by the race so will disappear as long as we > > fix the race. > > > > > diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c > > > index 4c6c24380cfd9..97ced537aa1e0 100644 > > > --- a/arch/riscv/kernel/ftrace.c > > > +++ b/arch/riscv/kernel/ftrace.c > > > @@ -14,6 +14,16 @@ > > > #include <asm/text-patching.h> > > > > > > #ifdef CONFIG_DYNAMIC_FTRACE > > > +void ftrace_arch_code_modify_prepare(void) > > > +{ > > > + mutex_lock(&text_mutex); > > > +} > > > + > > > +void ftrace_arch_code_modify_post_process(void) > > > +{ > > > + mutex_unlock(&text_mutex); > > > +} > > > + > > > unsigned long ftrace_call_adjust(unsigned long addr) > > > { > > > if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS)) > > > @@ -29,10 +39,8 @@ unsigned long arch_ftrace_get_symaddr(unsigned long > > > fentry_ip) > > > > > > void arch_ftrace_update_code(int command) > > > { > > > - mutex_lock(&text_mutex); > > > command |= FTRACE_MAY_SLEEP; > > > ftrace_modify_all_code(command); > > > - mutex_unlock(&text_mutex); > > > flush_icache_all(); > > > } > > > > > > @@ -149,16 +157,17 @@ int ftrace_init_nop(struct module *mod, struct > > > dyn_ftrace *rec) > > > unsigned int nops[2], offset; > > > int ret; > > > > > > + mutex_lock(&text_mutex); > > Besides using the guard API, could we swap the order between > > ftrace_rec_set_nop_ops() and calculation of the nops array? This shrinks > > the critical region a little. > > > If you don't mind, I won't, I don't like initializing stuff which could > never be used in case of error.
Yes, I don't mind it. > > > > > With or without the change, here's my tag, > > > > Tested-by: Yao Zi <zi...@disroot.org> > > > > and also > > > > Reported-by: Han Gao <rabenda...@gmail.com> > > Reported-by: Vivian Wang <wangruik...@iscas.ac.cn> > > > > for their first-hand report of boot-time crash and analysis for the > > first lock issue. > > > I'll add all those tags in the patch I'll send today (or tomorrow if the CI > is slow). Is there any update about the fix patch? It'll be nice to get the problem fixed soon. > Thanks again for the great bug report, really appreciated. > > Alex Best regards, Yao Zi