Re: STM32H7 crash

raiden00pl Mon, 09 Feb 2026 00:07:19 -0800

> I wonder if any of the stack size issues can be traced back to the removal
> of lazy FPU? I remember having stack size issues with the H743 and someone
> mentioned that lazy FPU can reduce the stack usage. There was an issue
> referenced not long ago by raiden about the removal of lazy FPU if I can
> find it again. Maybe it predates the 11.0 release and that's what broke
> this configuration?


Most likely not. Lazy FPU was disabled by default and only a few boards
enabled it.
Increased stack usage is a natural consequence of system development: new
features and better POSIX compatibility. If you want to avoid this, set
CONFIG_DEFAULT_SMALL=y in your config. Then the chance that memory usage
will increase in the future is smaller, but it still exists.

Also, some of the features have probably never been tested, and the default
stack size is simply too small to cover all use cases. When I was developing
the STM32H7 port I never thought to run the "time ls" command, so it's quite
possible that it never worked with the default stack sizes :)
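
For anyone unfamiliar with how stack coloration finds cases like this, here
is a minimal host-side sketch in C. It is only an illustration of the idea,
not the NuttX implementation: the fill pattern, the function names and the
simulated 224 bytes of usage are all assumptions made for the example. The
stack is painted with a known pattern before the task runs, and later the
surviving pattern is counted from the low end to get the high-water mark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill pattern; the value is an assumption made for this sketch. */
#define SKETCH_STACK_COLOR 0xdeadbeefu

/* Paint every word of a not-yet-used stack region with the pattern. */
static void stack_color(uint32_t *base, size_t nwords)
{
  assert(base != NULL);
  for (size_t i = 0; i < nwords; i++)
    {
      base[i] = SKETCH_STACK_COLOR;
    }
}

/* High-water mark of a descending stack, in bytes: words are consumed from
 * the top (highest address) down, so count the still-painted words from the
 * lowest address up and subtract them from the total.
 */
static size_t stack_used(const uint32_t *base, size_t nwords)
{
  size_t untouched = 0;

  assert(base != NULL);
  while (untouched < nwords && base[untouched] == SKETCH_STACK_COLOR)
    {
      untouched++;
    }

  return (nwords - untouched) * sizeof(uint32_t);
}

/* Simulate a 2000-byte stack on which a task pushed 56 words (224 bytes). */
static size_t demo_stack_usage(void)
{
  uint32_t stack[500];

  stack_color(stack, 500);
  for (size_t i = 500 - 56; i < 500; i++)
    {
      stack[i] = 0;
    }

  return stack_used(stack, 500);
}
```

Calling demo_stack_usage() from a small main() gives 224. A result equal to
the full stack size means no painted word survived, so the stack was used up
completely and may well have overflowed.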


On Mon, 9 Feb 2026 at 08:29, Roberto Bucher <[email protected]> wrote:

> I did some tests with my nucleo-H743-ZI2 using nucleo-h745zi:pysim_cm7
> without problems
>
> Bye
>
> Roberto
>
> On 2/7/26 11:31 AM, [email protected] wrote:
> > You can compile with the CONFIG_STACK_COLORATION option and see the stack
> > usage in the crashdump.
> >
> > Michal
> >
> > On Fri, 2026-02-06 at 20:45 -0500, Peter Barada wrote:
> >> Haven't tried yet (personally I feel I should know _why_ it happens). Is
> >> there a config for compiling in stack checking on function entry?
> >>
> >> On 2/6/26 20:22, Matteo Golin wrote:
> >>> Hmmm, if the problem goes that far back it may not be worth triaging
> >>> that way. Things have probably diverged so much since then. No luck
> >>> with the stack increase?
> >>>
> >>> Matteo
> >>>
> >>> On Fri, Feb 6, 2026, 8:18 PM Peter Barada <[email protected]> wrote:
> >>>
> >>>      Matteo,
> >>>
> >>>      I'm walking back release points and have had to change board
> >>>      configuration names (to nucleo-h743zi), rename nuttx-apps to apps,
> >>>      and am still seeing the fault in the release/11.0 branch.
> >>>
> >>>      I'm trying to go back further but wondering if I'll find a bisect
> >>>      start point...
> >>>
> >>>      On 2/6/26 17:05, Matteo Golin wrote:
> >>>>      Hi Peter,
> >>>>
> >>>>      My approach is kind of a headache since bisecting over an area
> >>>>      where apps and NuttX are not always in sync is a major limitation
> >>>>      of the split repo. My approach is usually:
> >>>>
> >>>>      - Start the bisect in kernel
> >>>>      - Check the commit date of the current HEAD
> >>>>      - Check out to a commit of the same/similar date in apps
> >>>>      - Build
> >>>>      - Mentally note if this commit was good or bad based on the
> >>>>      results of running the image
> >>>>      - make distclean (avoids artifacts carrying over between
> >>>>      bisections and breaking everything)
> >>>>      - Mark commit good or bad with git bisect
> >>>>
> >>>>      Then basically repeat this until bisecting is finished. It sucks
> >>>>      and I did suggest a script in /tools/ to try and automate most of
> >>>>      this, but I never got around to writing it.
> >>>>
> >>>>      I would suggest you start by checking for the issue on a stable
> >>>>      release (i.e. 12.12.0) to see if that's a good commit you can
> >>>>      start from. Usually those releases have a higher degree of
> >>>>      testing because everyone who voted for the release ran some
> >>>>      images on their hardware.
> >>>>
> >>>>      That's honestly a lot of work but you never know if it'll end up
> >>>>      being faster than trying to triage with logs!
> >>>>
> >>>>      Matteo
> >>>>
> >>>>      On Fri, Feb 6, 2026, 4:50 PM Nathan Hartman
> >>>>      <[email protected]> wrote:
> >>>>
> >>>>          First place I would look: is the stack overflowing? (You
> >>>>          could try enabling some of the stack debugging features.)
> >>>>
> >>>>          On Fri, Feb 6, 2026 at 4:34 PM Peter Barada
> >>>>          <[email protected]> wrote:
> >>>>
> >>>>              Matteo,
> >>>>
> >>>>              I don't know if this was working before, but if you can
> >>>>              suggest a good starting point I can cycle through git
> >>>>              bisect to narrow down to the failing commit.  What's the
> >>>>              best approach to using git bisect across multiple repos
> >>>>              (since changes in nuttx may require matching changes in
> >>>>              nuttx-apps, and the two need to be kept in sync at each
> >>>>              build point)?
> >>>>
> >>>>              As an aside, I also have a nucleo-f446re board, and 'time
> >>>>              ls' works fine there.
> >>>>
> >>>>              Further, does anyone have GDB scripts that make it easier
> >>>>              to decipher NuttX structures from memory (e.g. dump
> >>>>              task/semaphore lists, etc)? I've started cobbling snippets
> >>>>              but figure I'd ask before reinventing the wheel.
> >>>>
> >>>>
> >>>>              On 2/6/26 16:12, Matteo Golin wrote:
> >>>>              > Hi Peter,
> >>>>              >
> >>>>              > If you happen to know that this was working before on an
> >>>>              > older NuttX version, you could use git bisect to narrow
> >>>>              > down the breaking commit. Then the issue might be clearer.
> >>>>              >
> >>>>              > Best,
> >>>>              > Matteo
> >>>>              >
> >>>>              > On Fri, Feb 6, 2026, 4:09 PM Peter Barada
> >>>>              <[email protected]> wrote:
> >>>>              >
> >>>>              >     I have an STM32 Nucleo-h753zi board and configured a
> >>>>              >     build for nucleo-h743zi2:nsh (which is the closest
> >>>>              >     board/chip; the stm32h753zi is the same as the
> >>>>              >     stm32h743zi but the h753zi includes crypto
> >>>>              >     acceleration hardware).
> >>>>              >
> >>>>              >     Build works, but if I boot and try 'time ls' nuttx
> >>>>              >     faults:
> >>>>              >
> >>>>              >     nsh> uname -a
> >>>>              >     NuttX 0.0.0 9ecfff0833 Feb  6 2026 15:45:28 arm nucleo-h743zi2
> >>>>              >     nsh> time ls
> >>>>              >     /:
> >>>>              >       dev/
> >>>>              >
> >>>>              >     0.00dump_assert_info: Current Version: NuttX  0.0.0 9ecfff0833 Feb  6 2026 15:45:28 arm
> >>>>              >     dump_assert_info: Assertion failed panic: at file: :0 task: <noname> process: <noname> 0x800c9fd
> >>>>              >     up_dump_register: R0: 0801e624 R1: 0000000a R2: 00000050  R3: 0000000a
> >>>>              >     up_dump_register: R4: 00000001 R5: 240000e4 R6: 00000000  FP: 00000000
> >>>>              >     up_dump_register: R8: 00000000 SB: 00000000 SL: 00000000 R11: 00000000
> >>>>              >     up_dump_register: IP: 00000000 SP: 38000c08 LR: 080059db  PC: 08005984
> >>>>              >     up_dump_register: xPSR: 41000000 BASEPRI: 00000000 CONTROL: 00000000
> >>>>              >     up_dump_register: EXC_RETURN: ffffffe9
> >>>>              >     dump_stackinfo: User Stack:
> >>>>              >     dump_stackinfo:   base: 0x38000518
> >>>>              >     dump_stackinfo:   size: 00002000
> >>>>              >     dump_stackinfo:     sp: 0x38000c08
> >>>>              >     stack_dump: 0x38000be8: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >>>>              >     stack_dump: 0x38000c08: 0000000a 0801e624 0801e624 38000200 38000fac 00000000 0801e624 080172c1
> >>>>              >     stack_dump: 0x38000c28: 00000000 0801e624 38000200 38000158 00000000 00000000 38000fac 0800caa1
> >>>>              >     stack_dump: 0x38000c48: 00000000 0800cc77 0801e624 000002fc 38000500 00000001 00000001 38000cf0
> >>>>              >     stack_dump: 0x38000c68: 38000cf0 00000008 38000200 00000000 00000000 0800ca79 38000500 00000001
> >>>>              >     stack_dump: 0x38000c88: 00000064 38000cf0 00000064 0800ca33 38000500 00000001 00000064 00000000
> >>>>              >     stack_dump: 0x38000ca8: 00000000 08009325 00000000 38000500 00000001 0800c9fd 00000000 080052f1
> >>>>              >     stack_dump: 0x38000cc8: 00000000 38000500 00000000 38000158 00000001 00000001 00000000 00000000
> >>>>              >     stack_dump: 0x38000ce8: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >>>>              >     dump_tasks:    PID GROUP PRI POLICY  TYPE    NPX STATE  EVENT SIGMASK          STACKBASE  STACKSIZE  COMMAND
> >>>>              >     dump_task:       0     0   0 FIFO  Kthread - Ready 0000000000000000 0x240018b0      1000  <noname>
> >>>>              >     dump_task:       1     1 100 RR  Task    - Running 0000000000000000 0x38000518      2000  <noname>
> >>>>              >
> >>>>              >     Wondering if anyone has run across this before? Backtrace shows:
> >>>>              >
> >>>>              >     Program received signal SIGTRAP, Trace/breakpoint trap.
> >>>>              >     exception_common () at armv7-m/arm_exception.S:127
> >>>>              >     127             mrs             r0, ipsr /* R0=exception number */
> >>>>              >     where
> >>>>              >     #0  exception_common () at armv7-m/arm_exception.S:127
> >>>>              >     #1  <signal handler called>
> >>>>              >     #2  0x08005984 in env_cmpname (pszname=0x801e624 "PS1",
> >>>>              >          peqname=0xa <error: Cannot access memory at address 0xa>)
> >>>>              >          at environ/env_findvar.c:50
> >>>>              >     #3  0x080059da in env_findvar (group=0x38000200, pname=0x801e624 "PS1")
> >>>>              >          at environ/env_findvar.c:105
> >>>>              >     #4  0x080172c0 in getenv (name=0x801e624 "PS1") at environ/env_getenv.c:89
> >>>>              >     #5  0x0800caa0 in nsh_update_prompt () at nsh_prompt.c:77
> >>>>              >     #6  0x0800cc76 in nsh_session (pstate=0x38000cf0, login=1, argc=1,
> >>>>              >          argv=0x38000500) at nsh_session.c:249
> >>>>              >     #7  0x0800ca78 in nsh_consolemain (argc=1, argv=0x38000500)
> >>>>              >          at nsh_consolemain.c:77
> >>>>              >     #8  0x0800ca32 in nsh_main (argc=1, argv=0x38000500) at nsh_main.c:76
> >>>>              >     #9  0x08009324 in nxtask_startup (entrypt=0x800c9fd <nsh_main>, argc=1,
> >>>>              >          argv=0x38000500) at sched/task_startup.c:72
> >>>>              >     #10 0x080052f0 in nxtask_start () at task/task_start.c:104
> >>>>              >     #11 0x00000000 in ?? ()
> >>>>              >
> >>>>              >     Scratching the surface shows that env_findvar() is
> >>>>              >     called with a group pointer of 0x38000200, and
> >>>>              >     group->tg_envp is 0x380004b8, both of which are
> >>>>              >     reasonable. But *group->tg_envp is 0xA.  Further, if I
> >>>>              >     "watch *(int*)0x380004b8" in GDB, I see it is getting
> >>>>              >     overwritten by up_serialout() invoked from
> >>>>              >     stm32_serial.c::up_send.
> >>>>              >
> >>>>              >     Any suggestions on how I can best track this down further?
> >>>>              >
> >>>>              >     Thanks in advance!
> >>>>              >
> >>>>              >     --
> >>>>              >     Peter Barada
> >>>>              > [email protected]
> >>>>              >
> >>>>              --
> >>>>              Peter Barada
> >>>>              [email protected]
> >>>>
> >>>      --
> >>>      Peter Barada
> >>>      [email protected]
> >>>
>
>
