Hi Jason,

On 02/10/2024 17:14, Jason Montleon wrote:
On Tue, Oct 1, 2024 at 10:53 AM Alexandre Ghiti <[email protected]> wrote:
Hi Jason,

On 25/09/2024 16:32, Jason Montleon wrote:
On Tue, Sep 24, 2024 at 1:36 PM Kees Cook <[email protected]> wrote:

On September 24, 2024 8:58:57 AM PDT, Jason Montleon <[email protected]> 
wrote:
On Sun, Sep 22, 2024 at 6:38 PM Kees Cook <[email protected]> wrote:
Can you try this patch? It should avoid using the "WARN" infrastructure
(if that is the source of blocking boot), but should still provide some
detail about what tripped it up (via the "regular" pr_*() logging). And
if it boots, can you look at the log to find what it reports for the
"Wanted to write ..." line(s)?

Thanks!

-Kees


diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
index 0d99bf11d260..469a3439959d 100644
--- a/include/linux/fortify-string.h
+++ b/include/linux/fortify-string.h
@@ -549,7 +549,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t 
size,
                                           const size_t q_size,
                                           const size_t p_size_field,
                                           const size_t q_size_field,
-                                        const u8 func)
+                                        const u8 func,
+                                        const char *field)
   {
          if (__builtin_constant_p(size)) {
                  /*
@@ -610,8 +611,13 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t 
size,
           * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
           */
          if (p_size_field != SIZE_MAX &&
-           p_size != p_size_field && p_size_field < size)
-               return true;
+           p_size != p_size_field && p_size_field < size) {
+               if (p_size_field == 0)
+                       pr_warn("Wanted to write %zu to a 0-sized destination: 
%s\n",
+                               size, field);
+               else
+                       return true;
+       }

          return false;
   }
@@ -625,7 +631,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t 
size,
          const size_t __q_size_field = (q_size_field);                   \
          fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size,  \
                                       __q_size, __p_size_field,          \
-                                    __q_size_field, FORTIFY_FUNC_ ##op), \
+                                    __q_size_field, FORTIFY_FUNC_ ##op,\
+                                    #p " at " FILE_LINE), \
                    #op ": detected field-spanning write (size %zu) of single %s 
(size %zu)\n", \
                    __fortify_size,                                       \
                    "field \"" #p "\" at " FILE_LINE,                     \


--
Kees Cook

Hi Kees,
Sorry for the delay in responding. Your patch also caused the system
to hang booting, but I took it as inspiration to try some things.
TL;DR I see these two print several times:

Wanted to write 8 to a 0-sized destination: oldptr at
arch/riscv/errata/thead/errata.c:185
Wanted to write 28 to a 0-sized destination: oldptr at
arch/riscv/errata/thead/errata.c:185

Since I suspected CONFIG_ERRATA_THEAD_MAE which relies on
CONFIG_RISCV_ALTERNATIVE_EARLY I suspected it is just too early for
pr_*, etc. I tried trace_printk but that didn't produce any traces,
and then I found a comment which makes me think we're too early for
that too.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/alternative.c?h=v6.11#n214

I didn't want to give up so I wrote a simple module to figure out how
to use sbi_ecall and then incorporated that idea into your patch,
although even that was not straightforward:
https://gist.github.com/jmontleon/f3536ad70dc92bed992139ad5b266d5c
Oh nice work! Probably something riscv-specific needs to be wired up to the 
printk routines to do what you have there until printk is fully available  But 
that's a separate issue, I guess.

Full output: https://gist.github.com/jmontleon/823facff38e28b717f153e38adcfd7fe
Does the system boot with your patch?
Yes. It boots fully. It also booted with trace_printk, it just didn't
produce a trace, which again I think is expected at that stage. It
seems as long as there are no 'normal' print attempts, pr_()*, etc. it
works.

Hmmm I don't see what in your patch would fix your issue.

I sent a patch
(https://lore.kernel.org/linux-riscv/[email protected]/)
that was merged in v6.11-rc7 which fixes something similar, could that
be what fixed your problem?

Hi Alex, Thank you for having a look.

Although I narrowed it down to starting with 6.11-rc1, the first time
I encountered this was actually with an rc7 Fedora build.
http://riscv.rocks/koji/buildinfo?buildID=338433

I also did a build from upstream 6.11.1 last night and still see the
problem trying to boot with it this morning.

The problem with this early code is that the MMU is not enabled yet and
if your kernel is relocatable, the relocations are set "virtually" and
then without the MMU, a call to an external function results in a
trap...Which happens way too often, I have a few ideas to fix that, I
need to find time to tackle this though.

I don't know FORTIFY_SOURCE, but I guess this could cause a call to an
external function and then cause the issue I describe above right? We
already remove all instrumentations from this early code so maybe we
should remove FORTIFY_SOURCE too?


So I was able to reproduce the issue on qemu by adding a few tweaks, and indeed we trap in __warn_printk() on a virtual address but MMU is not enabled yet.

The following diff though allows me to pass this failure but I can't get much further in the boot since the tweaks I added won't allow it, can you give the following a try?

diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
index 8a27394851233..4913f3b3f198c 100644
--- a/arch/riscv/errata/Makefile
+++ b/arch/riscv/errata/Makefile
@@ -2,6 +2,10 @@ ifdef CONFIG_RELOCATABLE
 KBUILD_CFLAGS += -fno-pie
 endif

+ifdef CONFIG_RISCV_ALTERNATIVE_EARLY
+KBUILD_CFLAGS += -D__NO_FORTIFY
+endif
+
 obj-$(CONFIG_ERRATA_ANDES) += andes/
 obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
 obj-$(CONFIG_ERRATA_THEAD) += thead/


Thanks,

Alex



What you are describing here does not seem unreasonable to me, but I
simply do not know enough to say one way or the other.
Is there something I can do to collect more information to determine
if this is the cause?

Thank you again for having a look,
Jason

Thanks for digging into this,

Alex


I would guess that this line at arch/riscv/errata/thead/errata.c:168:

void *oldptr, *altptr;

Should be:

u8 *oldptr, *altptr;
I tried this, but it didn't appear to make a difference, I will try
some more today, though I admit to being out of my element.



But maybe __ALT_PTR needs adjustment too? Hmm. I would have expected void * to 
have a size of SIZE_MAX here, though, so something else might be confusing the 
check.


--
Kees Cook

_______________________________________________
linux-riscv mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/linux-riscv

_______________________________________________
linux-riscv mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/linux-riscv

Reply via email to