Re: STM32H7 crash

Nathan Hartman Sat, 07 Feb 2026 20:54:54 -0800

Yeah, it's usually the stack, but does anyone know why it needs to be
enlarged now? Is something using more stack than before?


On Sat, Feb 7, 2026 at 5:28 PM Peter Barada <[email protected]> wrote:

> Cranking up CONFIG_INIT_STACKSIZE to 3072 fixes the issue.
>
> I tried enabling STACK_COLORATION, STACK_USAGE, and ARMV7M_STACKTRACE
> while leaving INIT_STACKSIZE at 2048, hoping to debug using
> STM32CubeIDE, but when I try "time ls" the GDB session is lost (which
> seems strange).
>
> If I then enable ARMV7M_STACKCHECK_BREAKPOINT, GDB stops when it detects
> the stack overflow, and I can get a call stack to understand why, but I
> can't continue (to show the dump).
>
> Finally, after enabling ARCH_STACKDUMP, ARMV7M_STACKCHECK,
> SCHED_BACKTRACE, STACK_COLORATION, and STACK_USAGE, disabling
> STACKCHECK_BREAKPOINT, and setting ARCH_INTERRUPTSTACK=2048 and
> ARCH_STACKDUMP_MAX_LENGTH=1024, I get a full dump when it detects the
> stack overflow.
>
> Thanks for the help!
>
>
> On 2/7/26 03:25, raiden00pl wrote:
> > hi, this is a 100% stack issue. Increase all stack sizes to at least 4092.
> > Another option is to enable full optimisation with CONFIG_DEBUG_FULLOPT=y,
> > should also help.
> >
> > quick tip: about 80% of crashes in NuttX are stack issues, the first thing
> > you always do when such crashes occur is to increase all stack sizes :)
> >
> > On Sat, 7 Feb 2026 at 04:02, Matteo Golin <[email protected]> wrote:
> >
> >> I am not familiar enough, but there should be an option for stack canaries.
> >> I haven't had much luck with that configuration, and I imagine that your
> >> DEBUGASSERT will trigger before stack smashing is detected.
> >>
> >> Matteo
> >>
> >> On Fri, Feb 6, 2026, 8:45 PM Peter Barada <[email protected]> wrote:
> >>
> >>> Haven't tried yet (personally feel I should know _why_ it happens) - is
> >>> there a config for compiling in stack checking on function entry?
> >>> On 2/6/26 20:22, Matteo Golin wrote:
> >>>
> >>> Hmmm, if the problem goes that far back it may not be worth triaging that
> >>> way. Things have probably diverged so much since then. No luck with the
> >>> stack increase?
> >>>
> >>> Matteo
> >>>
> >>> On Fri, Feb 6, 2026, 8:18 PM Peter Barada <[email protected]> wrote:
> >>>> Matteo,
> >>>>
> >>>> I'm walking back release points and have had to change board
> >>>> configuration names (to nucleo-h743zi), rename nuttx-apps to apps, and am
> >>>> still seeing the fault in the release/11.0 branch.
> >>>>
> >>>> I'm trying to go back further but wondering if I'll find a bisect start
> >>>> point...
> >>>> On 2/6/26 17:05, Matteo Golin wrote:
> >>>>
> >>>> Hi Peter,
> >>>>
> >>>> My approach is kind of a headache, since bisecting over a range where apps
> >>>> and NuttX are not always in sync is a major limitation of the split repo.
> >>>> My approach is usually:
> >>>>
> >>>> - Start the bisect in kernel
> >>>> - Check the commit date of the current HEAD
> >>>> - Check out a commit of the same/similar date in apps
> >>>> - Build
> >>>> - Mentally note if this commit was good or bad based on the results of
> >>>> running the image
> >>>> - make distclean (avoids artifacts carrying over between bisections and
> >>>> breaking everything)
> >>>> - Mark commit good or bad with git bisect
> >>>>
> >>>> Then basically repeat this until bisecting is finished. It sucks, and I
> >>>> did suggest a script in /tools/ to try and automate most of this, but I
> >>>> never got around to writing it.
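
A rough sketch of the date-matching step from the list above, in Python. This
is not the script proposed for /tools/; the function names, the "nuttx" and
"apps" directory names, and the "master" branch name are all assumptions made
for illustration. Building, running the image, make distclean, and marking the
commit with git bisect are left as manual steps.

```python
import subprocess


def git(repo, *args):
    """Run a git command inside `repo` and return its stripped stdout."""
    result = subprocess.run(["git", "-C", repo, *args],
                            check=True, capture_output=True, text=True)
    return result.stdout.strip()


def pick_apps_commit(kernel_ts, apps_commits):
    """Return the newest apps commit that is not newer than the kernel commit.

    kernel_ts is a Unix timestamp; apps_commits is a list of
    (sha, unix_timestamp) pairs in any order. If every apps commit is
    newer than the kernel commit, the oldest apps commit is returned.
    """
    ordered = sorted(apps_commits, key=lambda commit: commit[1])
    chosen = ordered[0][0]
    for sha, ts in ordered:
        if ts <= kernel_ts:
            chosen = sha
    return chosen


def sync_apps(kernel_repo="nuttx", apps_repo="apps", apps_branch="master"):
    """Check out the apps commit closest in date to the kernel HEAD.

    The repository paths and branch name are hypothetical defaults.
    """
    # %ct is the committer date as a Unix timestamp.
    kernel_ts = int(git(kernel_repo, "log", "-1", "--format=%ct", "HEAD"))
    commits = []
    for line in git(apps_repo, "log", "--format=%H %ct", apps_branch).splitlines():
        sha, ts = line.split()
        commits.append((sha, int(ts)))
    sha = pick_apps_commit(kernel_ts, commits)
    git(apps_repo, "checkout", "--detach", sha)
    return sha
```

Calling sync_apps() after each "git bisect good" or "git bisect bad" in the
kernel tree would move apps to a commit of a similar date before rebuilding.
Matching by committer date is only a heuristic, so some bisect points may
still fail to build.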
> >>>>
> >>>> I would suggest you start by checking for the issue on a stable release
> >>>> (e.g. 12.12.0) to see if that's a good commit you can start from. Usually
> >>>> those releases have a higher degree of testing because everyone who voted
> >>>> for the release ran some images on their hardware.
> >>>>
> >>>> That's honestly a lot of work but you never know if it'll end up being
> >>>> faster than trying to triage with logs!
> >>>>
> >>>> Matteo
> >>>>
> >>>> On Fri, Feb 6, 2026, 4:50 PM Nathan Hartman <[email protected]> wrote:
> >>>>
> >>>>> First place I would look: is the stack overflowing? (You could try
> >>>>> enabling some of the stack debugging features.)
> >>>>>
> >>>>> On Fri, Feb 6, 2026 at 4:34 PM Peter Barada <[email protected]> wrote:
> >>>>>
> >>>>>> Matteo,
> >>>>>>
> >>>>>> I don't know if this was working before, but if you can suggest a good
> >>>>>> starting point I can cycle through git bisect to narrow down to the
> >>>>>> failing commit. What's the best approach to using git bisect across
> >>>>>> multiple repos (since changes in nuttx may require matching changes in
> >>>>>> nuttx-apps, and the two need to stay in sync at each build point)?
> >>>>>>
> >>>>>> As an aside, I also have a nucleo-f446re board, and 'time ls' works fine
> >>>>>> there.
> >>>>>>
> >>>>>> Further, does anyone have GDB scripts that make it easier to decipher
> >>>>>> NuttX structures from memory (e.g. dump task/semaphore lists, etc.)?
> >>>>>> I've started cobbling snippets together, but figured I'd ask before
> >>>>>> reinventing the wheel.
> >>>>>>
> >>>>>>
> >>>>>> On 2/6/26 16:12, Matteo Golin wrote:
> >>>>>>> Hi Peter,
> >>>>>>>
> >>>>>>> If you happen to know that this was working before on an older NuttX
> >>>>>>> version, you could use git bisect to narrow down the breaking commit.
> >>>>>>> Then the issue might be clearer.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Matteo
> >>>>>>>
> >>>>>>> On Fri, Feb 6, 2026, 4:09 PM Peter Barada <[email protected]> wrote:
> >>>>>>>      I have an STM32 Nucleo-h753zi board and configured a build for
> >>>>>>>      nucleo-h743zi2:nsh (which is the closest board/chip; the
> >>>>>>>      stm32h753zi is the same as the stm32h743zi, but the h753zi
> >>>>>>>      includes crypto acceleration hardware).
> >>>>>>>      The build works, but if I boot and try 'time ls', NuttX faults:
> >>>>>>>
> >>>>>>>      nsh> uname -a
> >>>>>>>      NuttX 0.0.0 9ecfff0833 Feb  6 2026 15:45:28 arm nucleo-h743zi2
> >>>>>>>      nsh> time ls
> >>>>>>>      /:
> >>>>>>>        dev/
> >>>>>>>
> >>>>>>>      0.00dump_assert_info: Current Version: NuttX  0.0.0 9ecfff0833 Feb  6 2026 15:45:28 arm
> >>>>>>>      dump_assert_info: Assertion failed panic: at file: :0 task: <noname> process: <noname> 0x800c9fd
> >>>>>>>      up_dump_register: R0: 0801e624 R1: 0000000a R2: 00000050  R3: 0000000a
> >>>>>>>      up_dump_register: R4: 00000001 R5: 240000e4 R6: 00000000  FP: 00000000
> >>>>>>>      up_dump_register: R8: 00000000 SB: 00000000 SL: 00000000 R11: 00000000
> >>>>>>>      up_dump_register: IP: 00000000 SP: 38000c08 LR: 080059db  PC: 08005984
> >>>>>>>      up_dump_register: xPSR: 41000000 BASEPRI: 00000000 CONTROL: 00000000
> >>>>>>>      up_dump_register: EXC_RETURN: ffffffe9
> >>>>>>>      dump_stackinfo: User Stack:
> >>>>>>>      dump_stackinfo:   base: 0x38000518
> >>>>>>>      dump_stackinfo:   size: 00002000
> >>>>>>>      dump_stackinfo:     sp: 0x38000c08
> >>>>>>>      stack_dump: 0x38000be8: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >>>>>>>      stack_dump: 0x38000c08: 0000000a 0801e624 0801e624 38000200 38000fac 00000000 0801e624 080172c1
> >>>>>>>      stack_dump: 0x38000c28: 00000000 0801e624 38000200 38000158 00000000 00000000 38000fac 0800caa1
> >>>>>>>      stack_dump: 0x38000c48: 00000000 0800cc77 0801e624 000002fc 38000500 00000001 00000001 38000cf0
> >>>>>>>      stack_dump: 0x38000c68: 38000cf0 00000008 38000200 00000000 00000000 0800ca79 38000500 00000001
> >>>>>>>      stack_dump: 0x38000c88: 00000064 38000cf0 00000064 0800ca33 38000500 00000001 00000064 00000000
> >>>>>>>      stack_dump: 0x38000ca8: 00000000 08009325 00000000 38000500 00000001 0800c9fd 00000000 080052f1
> >>>>>>>      stack_dump: 0x38000cc8: 00000000 38000500 00000000 38000158 00000001 00000001 00000000 00000000
> >>>>>>>      stack_dump: 0x38000ce8: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >>>>>>>      dump_tasks:    PID GROUP PRI POLICY   TYPE    NPX STATE  EVENT   SIGMASK          STACKBASE  STACKSIZE   COMMAND
> >>>>>>>      dump_task:       0     0   0 FIFO     Kthread -   Ready 0000000000000000 0x240018b0      1000   <noname>
> >>>>>>>      dump_task:       1     1 100 RR       Task    -   Running 0000000000000000 0x38000518      2000   <noname> ��]���&
> >>>>>>>
> >>>>>>>      Wondering if anyone has run across this before?  Backtrace shows:
> >>>>>>>      Program received signal SIGTRAP, Trace/breakpoint trap.
> >>>>>>>      exception_common () at armv7-m/arm_exception.S:127
> >>>>>>>      127             mrs             r0, ipsr           /* R0=exception number */
> >>>>>>>      where
> >>>>>>>      #0  exception_common () at armv7-m/arm_exception.S:127
> >>>>>>>      #1  <signal handler called>
> >>>>>>>      #2  0x08005984 in env_cmpname (pszname=0x801e624 "PS1",
> >>>>>>>           peqname=0xa <error: Cannot access memory at address 0xa>)
> >>>>>>>           at environ/env_findvar.c:50
> >>>>>>>      #3  0x080059da in env_findvar (group=0x38000200, pname=0x801e624 "PS1")
> >>>>>>>           at environ/env_findvar.c:105
> >>>>>>>      #4  0x080172c0 in getenv (name=0x801e624 "PS1") at environ/env_getenv.c:89
> >>>>>>>      #5  0x0800caa0 in nsh_update_prompt () at nsh_prompt.c:77
> >>>>>>>      #6  0x0800cc76 in nsh_session (pstate=0x38000cf0, login=1, argc=1,
> >>>>>>>           argv=0x38000500) at nsh_session.c:249
> >>>>>>>      #7  0x0800ca78 in nsh_consolemain (argc=1, argv=0x38000500)
> >>>>>>>           at nsh_consolemain.c:77
> >>>>>>>      #8  0x0800ca32 in nsh_main (argc=1, argv=0x38000500) at nsh_main.c:76
> >>>>>>>      #9  0x08009324 in nxtask_startup (entrypt=0x800c9fd <nsh_main>, argc=1,
> >>>>>>>           argv=0x38000500) at sched/task_startup.c:72
> >>>>>>>      #10 0x080052f0 in nxtask_start () at task/task_start.c:104
> >>>>>>>      #11 0x00000000 in ?? ()
> >>>>>>>
> >>>>>>>      Scratching the surface shows that env_findvar() is called with a
> >>>>>>>      group pointer of 0x38000200, and group->tg_envp is 0x380004b8,
> >>>>>>>      both of which are reasonable. But *group->tg_envp is 0xA.
> >>>>>>>      Further, if I "watch *(int*)0x380004b8" in GDB, I see it is
> >>>>>>>      getting overwritten by up_serialout() invoked from
> >>>>>>>      stm32_serial.c::up_send.
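
The addresses in the dump and in the GDB session above can be put side by side
with a little arithmetic. A minimal sketch in Python, assuming the "size"
field printed by dump_stackinfo is decimal bytes (the dump_tasks STACKSIZE
column shows the same 2000, and the stack dump ends at the matching address);
the function and key names are made up for illustration.

```python
# Values copied from the dump_stackinfo output and the GDB session above.
STACK_BASE = 0x38000518   # dump_stackinfo "base" (lowest stack address)
STACK_SIZE = 2000         # dump_stackinfo "size", ASSUMED to be decimal bytes
SP_AT_FAULT = 0x38000C08  # dump_stackinfo "sp"
TG_ENVP = 0x380004B8      # group->tg_envp, the word watched in GDB


def stack_report(base, size, sp, watched):
    """Relate a watched address to a downward-growing stack region."""
    top = base + size
    return {
        "top": top,                          # first address above the stack
        "in_use": top - sp,                  # bytes in use at the fault
        "free": sp - base,                   # bytes left before hitting base
        "watched_below_base": base - watched # gap between base and watched word
    }


report = stack_report(STACK_BASE, STACK_SIZE, SP_AT_FAULT, TG_ENVP)
for name, value in report.items():
    print(f"{name:>20}: {value:#x} ({value})")
```

Under that assumption the stack top is 0x38000ce8, only 224 bytes are in use
at the moment of the fault, and the corrupted word sits just 96 bytes below
the stack base. Since the stack grows downward, a deeper call chain that
overran the base by about 100 bytes would land on tg_envp. That fits a stack
overflow, but the arithmetic alone does not prove it.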
> >>>>>>>
> >>>>>>>      Any suggestions on how I can best track this down further?
> >>>>>>>
> >>>>>>>      Thanks in advance!
> >>>>>>>
> >>>>>>>      --
> >>>>>>>      Peter Barada
> >>>>>>>      [email protected]
> >>>>>>>
> >>>>>> --
> >>>>>> Peter Barada
> >>>>>> [email protected]
> >>>>>>
> >>>>> --
> >>>> Peter [email protected]
> >>>>
> >>>> --
> >>> Peter [email protected]
> >>>
> >>>
> --
> Peter Barada
> [email protected]
>
>
