On Tue, Oct 1, 2024 at 10:53 AM Alexandre Ghiti <[email protected]> wrote: > > Hi Jason, > > On 25/09/2024 16:32, Jason Montleon wrote: > > On Tue, Sep 24, 2024 at 1:36 PM Kees Cook <[email protected]> wrote: > >> > >> > >> On September 24, 2024 8:58:57 AM PDT, Jason Montleon <[email protected]> > >> wrote: > >>> On Sun, Sep 22, 2024 at 6:38 PM Kees Cook <[email protected]> wrote: > >>>> Can you try this patch? It should avoid using the "WARN" infrastructure > >>>> (if that is the source of blocking boot), but should still provide some > >>>> detail about what tripped it up (via the "regular" pr_*() logging). And > >>>> if it boots, can you look at the log to find what it reports for the > >>>> "Wanted to write ..." line(s)? > >>>> > >>>> Thanks! > >>>> > >>>> -Kees > >>>> > >>>> > >>>> diff --git a/include/linux/fortify-string.h > >>>> b/include/linux/fortify-string.h > >>>> index 0d99bf11d260..469a3439959d 100644 > >>>> --- a/include/linux/fortify-string.h > >>>> +++ b/include/linux/fortify-string.h > >>>> @@ -549,7 +549,8 @@ __FORTIFY_INLINE bool > >>>> fortify_memcpy_chk(__kernel_size_t size, > >>>> const size_t q_size, > >>>> const size_t p_size_field, > >>>> const size_t q_size_field, > >>>> - const u8 func) > >>>> + const u8 func, > >>>> + const char *field) > >>>> { > >>>> if (__builtin_constant_p(size)) { > >>>> /* > >>>> @@ -610,8 +611,13 @@ __FORTIFY_INLINE bool > >>>> fortify_memcpy_chk(__kernel_size_t size, > >>>> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832 > >>>> */ > >>>> if (p_size_field != SIZE_MAX && > >>>> - p_size != p_size_field && p_size_field < size) > >>>> - return true; > >>>> + p_size != p_size_field && p_size_field < size) { > >>>> + if (p_size_field == 0) > >>>> + pr_warn("Wanted to write %zu to a 0-sized > >>>> destination: %s\n", > >>>> + size, field); > >>>> + else > >>>> + return true; > >>>> + } > >>>> > >>>> return false; > >>>> } > >>>> @@ -625,7 +631,8 @@ __FORTIFY_INLINE bool > >>>> fortify_memcpy_chk(__kernel_size_t size, > >>>> const size_t __q_size_field = (q_size_field); > >>>> \ > >>>> fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size, > >>>> \ > >>>> __q_size, __p_size_field, > >>>> \ > >>>> - __q_size_field, FORTIFY_FUNC_ > >>>> ##op), \ > >>>> + __q_size_field, FORTIFY_FUNC_ ##op,\ > >>>> + #p " at " FILE_LINE), \ > >>>> #op ": detected field-spanning write (size %zu) of > >>>> single %s (size %zu)\n", \ > >>>> __fortify_size, > >>>> \ > >>>> "field \"" #p "\" at " FILE_LINE, > >>>> \ > >>>> > >>>> > >>>> -- > >>>> Kees Cook > >>>> > >>> Hi Kees, > >>> Sorry for the delay in responding. Your patch also caused the system > >>> to hang booting, but I took it as inspiration to try some things. > >>> TL;DR I see these two print several times: > >>> > >>> Wanted to write 8 to a 0-sized destination: oldptr at > >>> arch/riscv/errata/thead/errata.c:185 > >>> Wanted to write 28 to a 0-sized destination: oldptr at > >>> arch/riscv/errata/thead/errata.c:185 > >>> > >>> Since I suspected CONFIG_ERRATA_THEAD_MAE which relies on > >>> CONFIG_RISCV_ALTERNATIVE_EARLY I suspected it is just too early for > >>> pr_*, etc. I tried trace_printk but that didn't produce any traces, > >>> and then I found a comment which makes me think we're too early for > >>> that too. > >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/alternative.c?h=v6.11#n214 > >>> > >>> I didn't want to give up so I wrote a simple module to figure out how > >>> to use sbi_ecall and then incorporated that idea into your patch, > >>> although even that was not straightforward: > >>> https://gist.github.com/jmontleon/f3536ad70dc92bed992139ad5b266d5c > >> Oh nice work! Probably something riscv-specific needs to be wired up to > >> the printk routines to do what you have there until printk is fully > >> available But that's a separate issue, I guess. > >> > >>> Full output: > >>> https://gist.github.com/jmontleon/823facff38e28b717f153e38adcfd7fe > >> Does the system boot with your patch? > > Yes. It boots fully. It also booted with trace_printk, it just didn't > > produce a trace, which again I think is expected at that stage. It > > seems as long as there are no 'normal' print attempts, pr_()*, etc. it > > works. > > > Hmmm I don't see what in your patch would fix your issue. > > I sent a patch > (https://lore.kernel.org/linux-riscv/[email protected]/) > that was merged in v6.11-rc7 which fixes something similar, could that > be what fixed your problem? >
Hi Alex, Thank you for having a look. Although I narrowed it down to starting with 6.11-rc1, the first time I encountered this was actually with an rc7 Fedora build. http://riscv.rocks/koji/buildinfo?buildID=338433 I also did a build from upstream 6.11.1 last night and still see the problem trying to boot with it this morning. > The problem with this early code is that the MMU is not enabled yet and > if your kernel is relocatable, the relocations are set "virtually" and > then without the MMU, a call to an external function results in a > trap...Which happens way too often, I have a few ideas to fix that, I > need to find time to tackle this though. > > I don't know FORTIFY_SOURCE, but I guess this could cause a call to an > external function and then cause the issue I describe above right? We > already remove all instrumentations from this early code so maybe we > should remove FORTIFY_SOURCE too? > What you are describing here does not seem unreasonable to me, but I simply do not know enough to say one way or the other. Is there something I can do to collect more information to determine if this is the cause? Thank you again for having a look, Jason > Thanks for digging into this, > > Alex > > > > > >> I would guess that this line at arch/riscv/errata/thead/errata.c:168: > >> > >> void *oldptr, *altptr; > >> > >> Should be: > >> > >> u8 *oldptr, *altptr; > > I tried this, but it didn't appear to make a difference, I will try > > some more today, though I admit to being out of my element. > > > > > > > >> But maybe __ALT_PTR needs adjustment too? Hmm. I would have expected void > >> * to have a size of SIZE_MAX here, though, so something else might be > >> confusing the check. > >> > >> > >> -- > >> Kees Cook > >> > > > > _______________________________________________ > > linux-riscv mailing list > > [email protected] > > http://lists.infradead.org/mailman/listinfo/linux-riscv >
