It’s easiest to inspect these addresses with gdb :)

arm-none-eabi-gdb bin/targets/……. .elf

and then start feeding those addresses to see which ones look likely to be part of the call chain.

x/i 0x0003b4d8
x/i 0x000246a7
x/i 0x0003b4d8
etc.

> On Aug 31, 2018, at 3:30 PM, Aditya Xavier <[email protected]>
> wrote:
>
> Am really bad at GDB. Also it’s like a rabbit hole :)
>
> I ported over my application with the git version of Mynewt-core, and enabled
> OS_CRASH_STACKTRACE.
>
> With it enabled, the following is the dump.
>
> #mesh-onoff STATUS: Sent !
> Action Received over MESH Length :- 14
> 000486 Unhandled interrupt (3), exception sp 0x2000aba0
> 000486 r0:0xcf0f98cb r1:0x5c5a76b3 r2:0x681af5c8 r3:0xb1334673
> 000486 r4:0x2000ac68 r5:0x00000007 r6:0x00000000 r7:0x200008a9
> 000486 r8:0x2000acf0 r9:0x00012101 r10:0xd7229882 r11:0xd929b3bb
> 000486 r12:0x7e3cdeb8 lr:0x2266a80b pc:0x59d8de5b psr:0xe8eb9828
> 000486 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x00040000
> 000486 BFAR:0xe000ed38 MMFAR:0xe000ed34
> 000486 task:DECODE_TASK
> 000486 0x2000abec: 0x0003b4d8
> 000486 0x2000abf4: 0x000246a7
> 000486 0x2000ac04: 0x0003b4d8
> 000486 0x2000ac0c: 0x0002488d
> 000486 0x2000ac4c: 0x00012101
> 000486 0x2000ad0c: 0x0000c1e7
> 000486 0x2000ad1c: 0x0000c1e7
> 000486 0x2000ad2c: 0x0000c211
> 000486 0x2000ad30: 0x0003ad44
> 000486 0x2000ad3c: 0x00013023
> 000486 0x2000ad58: 0x000238e1
> 000486 0x2000ad60: 0x00037f81
> 000486 0x2000ad6c: 0x00023a79
> 000486 0x2000ad70: 0x00039b80
> 000486 0x2000ad74: 0x00039b7f
> 000486 0x2000ad84: 0x00023587
> 000486 0x2000ada8: 0x000087cd
> 000486 0x2000adc4: 0x0000d51d
> 000486 0x2000adc8: 0x0000d51c
> 000486 0x2000add8: 0x000398cd
> 000486 0x2000ade4: 0x000087e9
> 000486 0x2000ae08: 0x00010001
> 000486 0x2000ae0c: 0x0001c239
> 000486 0x2000ae10: 0x0003b35c
> 000486 0x2000ae1c: 0x00020001
> 000486 0x2000ae20: 0x0001c38d
> 000486 0x2000ae30: 0x00030001
> 000486 0x2000ae34: 0x0001c509
> 000486 0x2000ae48: 0x0001c38d
> 000486 0x2000ae5c: 0x0001c509
> 000486 0x2000ae70: 0x0001c239
> 000486 0x2000ae74: 0x0003b37c
> 000486 0x2000ae84: 0x0001c38d
> 000486 0x2000ae98: 0x0001c509
> 000486 0x2000aeac: 0x0001c54d
> 000486 0x2000aec0: 0x0001c239
> 000486 0x2000aec4: 0x0003ba28
> 000486 0x2000aed4: 0x0001c38d
> 000486 0x2000aee8: 0x0001c509
> 000486 0x2000aefc: 0x0001c38d
> 000486 0x2000af10: 0x0001c509
> 000486 0x2000af24: 0x0001c54d
> 000486 0x2000af38: 0x0001c38d
> 000486 0x2000af4c: 0x0001c509
> 000486 0x2000af60: 0x0001c38d
> 000486 0x2000af74: 0x0001c509
> 000486 0x2000af88: 0x0001c54d
> 000486 0x2000af9c: 0x0001c38d
> 000486 0x2000afb0: 0x0001c509
>
>
>> On 31-Aug-2018, at 5:21 PM, marko kiiskila <[email protected]> wrote:
>>
>> Some suggestions (inline).
>>
>>> On Aug 31, 2018, at 2:32 PM, Aditya Xavier <[email protected]>
>>> wrote:
>>>
>>> Gosh, this doesn’t make much sense to me :(
>>>
>>> (gdb) monitor go
>>> (gdb) monitor reset
>>> Resetting target
>>> (gdb) c
>>> Continuing.
>>>
>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>> hal_system_reset () at
>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> 50 asm("bkpt");
>>> (gdb) bt
>>> #0 hal_system_reset () at
>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at
>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>> #2 0x0000da56 in os_default_irq_asm () at
>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>> #3 <signal handler called>
>>> #4 0x00000000 in ?? ()
>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>> (gdb) frame 1
>>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at
>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>> 170 hal_system_reset();
>>> (gdb) p/x *tf
>>> $1 = {ef = 0x2000abd0, r4 = 0x1b000000, r5 = 0x2000acc0, r6 = 0x2000aca0,
>>> r7 = 0x7, r8 = 0x0, r9 = 0x200008a9, r10 = 0x2000ad28, r11 = 0x11d91, lr =
>>> 0xfffffffd}
>>> (gdb) p/x *tf->ef
>>> $2 = {r0 = 0xd7229882, r1 = 0xd929b3bb, r2 = 0xcf0f98cb, r3 = 0x5c5a76b3,
>>> r12 = 0x681af5c8, lr = 0xb1334673, pc = 0x7e3cdeb8, psr = 0x2266a80b}
>>> (gdb) x/32x 0xd7229882
>>> 0xd7229882: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd7229892: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298a2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298b2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298c2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298d2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298e2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> 0xd72298f2: 0x00000000 0x00000000 0x00000000 0x00000000
>>> (gdb) x/32x 0x2000abd0
>>> 0x2000abd0: 0xd7229882 0xd929b3bb 0xcf0f98cb 0x5c5a76b3
>>> 0x2000abe0: 0x681af5c8 0xb1334673 0x7e3cdeb8 0x2266a80b
>>> 0x2000abf0: 0x59d8de5b 0xe8eb9828 0x96d74690 0xb4b1ee9b
>>> 0x2000ac00: 0x95f0cad6 0x7d1b52fe 0xebcc146e 0x5f7dfaf5
>>> 0x2000ac10: 0x62dd2c19 0x1fc67ee7 0xf40a6a89 0xab77907c
>>
>> ^^^^^ looks bad, especially the top area. Should have the dump of registers
>> stored at the time of the crash.
>>
>>
>>> 0x2000ac20: 0x00000010 0x00039c74 0x2000ad28 0x0002329f
>>> 0x2000ac30: 0xd87c5730 0xa203a288 0x00000010 0x00039c74
>>> 0x2000ac40: 0x2000ad28 0x00023485 0x00000000 0x00000000
>>> (gdb) p &__text
>>> No symbol "__text" in current context.
>>> (gdb) p &__etext
>>> $3 = (<data variable, no debug info> *) 0x3a9c8
>>> (gdb) p &__text
>>> No symbol "__text" in current context.
>>
>> This was probably added at the same time as OS_CRASH_STACKTRACE.
>> You’re looking for values between the start of your image slot and 0x3a9c8.
>>
>>> (gdb) x/i 0xd7229882
>>> 0xd7229882: movs r0, r0
>>> (gdb) list *0xd7229882
>>> (gdb) x/i 0x681af5c8
>>> 0x681af5c8: movs r0, r0
>>> (gdb) x/i 0x59d8de5b
>>> 0x59d8de5b: movs r0, r0
>>> (gdb) x/i 0x62dd2c19
>>> 0x62dd2c19: movs r0, r0
>>> (gdb) x/i 0x2000ad28
>>> 0x2000ad28: lsls r0, r2, #6
>>> (gdb) x/i 0x1fc67ee7
>>> 0x1fc67ee7: movs r0, r0
>>> (gdb) x/i 0xa203a288
>>> 0xa203a288: movs r0, r0
>>> (gdb) x/i 0xe8eb9828
>>> 0xe8eb9828: movs r0, r0
>>> (gdb) x/i 0xcf0f98cb
>>> 0xcf0f98cb: movs r0, r0
>>> (gdb) x/i 0x96d74690
>>> 0x96d74690: movs r0, r0
>>> (gdb) x/i 0xf40a6a89
>>> 0xf40a6a89: movs r0, r0
>>> (gdb) x/i 0x2000ad28
>>> 0x2000ad28: lsls r0, r2, #6
>>> (gdb) x/i 0x00000010
>>> 0x10: movs r0, r0
>>> (gdb) x/i 0x0002329f
>>> 0x2329f <shift_rows+108>: add sp, #20
>>> (gdb) x/i 0x00039c74
>>> 0x39c74 <sbox>: ldrb r3, [r4, #17]
>>> (gdb) x/i 0xa203a288
>>> 0xa203a288: movs r0, r0
>>> (gdb) x/i 0x0002329f
>>> 0x2329f <shift_rows+108>: add sp, #20
>>> (gdb) list *0x0002329f
>>> 0x2329f is in shift_rows
>>> (repos/apache-mynewt-core/crypto/tinycrypt/src/aes_encrypt.c:156).
>>> 151 t[0] = s[0]; t[1] = s[5]; t[2] = s[10]; t[3] = s[15];
>>> 152 t[4] = s[4]; t[5] = s[9]; t[6] = s[14]; t[7] = s[3];
>>> 153 t[8] = s[8]; t[9] = s[13]; t[10] = s[2]; t[11] = s[7];
>>> 154 t[12] = s[12]; t[13] = s[1]; t[14] = s[6]; t[15] = s[11];
>>> 155 (void) _copy(s, sizeof(t), t, sizeof(t));
>>> 156 }
>>> 157
>>> 158 int tc_aes_encrypt(uint8_t *out, const uint8_t *in, const
>>> TCAesKeySched_t s)
>>> 159 {
>>> 160 uint8_t state[Nk*Nb];
>>
>> That could be what is writing that random-looking data into the stack.
>> Encrypted data should look like gibberish.
>> Follow the stack a bit further, continuing from 0x2000ac50. See if you
>> find who called it.
>> I’m hazarding a guess that one of those args passed to aes_encrypt()
>> is pointing to the stack, and there’s not enough memory allocated to hold that
>> data.
>>
>>
>>>> On 31-Aug-2018, at 4:46 PM, marko kiiskila <[email protected]> wrote:
>>>>
>>>> Sure. Something like this:
>>>>
>>>> 000933 compat> crash div0
>>>> crash div0
>>>> 003157 Unhandled interrupt (3), exception sp 0x20001dd8
>>>> 003157 r0:0x00000000 r1:0x00017161 r2:0x00000000 r3:0x0000002a
>>>> 003157 r4:0x200041d6 r5:0x00000000 r6:0x20000318 r7:0x00000000
>>>> 003157 r8:0x00000000 r9:0x00000000 r10:0x00000000 r11:0x00000000
>>>> 003157 r12:0x00000000 lr:0x00014949 pc:0x00014978 psr:0x61000000
>>>> 003157 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x02000000
>>>> 003157 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>>
>>>> Then from gdb:
>>>>
>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>> hal_system_reset ()
>>>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>> 50 asm("bkpt");
>>>> (gdb) bt
>>>> #0 hal_system_reset ()
>>>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
>>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>>>> #2 0x0000a5b6 in os_default_irq_asm ()
>>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>>> #3 <signal handler called>
>>>> #4 0x00000000 in ?? ()
>>>> #5 0x0000812c in Reset_Handler ()
>>>> at
>>>> repos/apache-mynewt-core/hw/bsp/nrf52dk/src/arch/cortex_m4/gcc_startup_nrf52.s:180
>>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>>> (gdb) frame 1
>>>> #1 0x00008be8 in os_default_irq (tf=0x2000ffc0)
>>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>>>> 171 hal_system_reset();
>>>> (gdb) p/x *tf
>>>> $1 = {ef = 0x20001dd8, r4 = 0x200041d6, r5 = 0x0, r6 = 0x20000318, r7 = 0x0,
>>>> r8 = 0x0, r9 = 0x0, r10 = 0x0, r11 = 0x0, lr = 0xfffffffd}
>>>> (gdb) p/x *tf->ef
>>>> $2 = {r0 = 0x0, r1 = 0x17161, r2 = 0x0, r3 = 0x2a, r12 = 0x0, lr = 0x14949,
>>>> pc = 0x14978, psr = 0x61000000}
>>>> (gdb) x/32x 0x20001dd8
>>>> 0x20001dd8 <os_main_stack+3896>: 0x00000000 0x00017161 0x00000000 0x0000002a
>>>> 0x20001de8 <os_main_stack+3912>: 0x00000000 0x00014949 0x00014978 0x61000000
>>>> 0x20001df8 <os_main_stack+3928>: 0x00000003 0x00000000 0x00000000 0x0000002a
>>>> 0x20001e08 <os_main_stack+3944>: 0x00000001 0x00000002 0x0000000a 0x00014a21
>>>> 0x20001e18 <os_main_stack+3960>: 0x00014a15 0x0000ebd9 0x00000000 0x200041d0
>>>> 0x20001e28 <os_main_stack+3976>: 0x200041d6 0x00000000 0x0000000a 0x0001574d
>>>> 0x20001e38 <os_main_stack+3992>: 0x00015741 0x0000c925 0x200041d0 0x00000011
>>>> 0x20001e48 <os_main_stack+4008>: 0x00000073 0x200041d3 0x00000000 0x0000ede9
>>>> (gdb) p &__text
>>>> $3 = (<data variable, no debug info> *) 0x8020 <__isr_vector>
>>>> (gdb) p &__etext
>>>> $4 = (<data variable, no debug info> *) 0x175f0
>>>> (gdb) x/i 0x00017161
>>>> 0x17161: movs r0, r0
>>>> (gdb) x/i 0x00014949
>>>> 0x14949 <crash_device+12>: cbz r0, 0x1496a <crash_device+46>
>>>> (gdb) x/i 0x00014978
>>>> 0x14978 <crash_device+60>: sdiv r3, r3, r2
>>>> (gdb) x/i 0x00014a21
>>>> 0x14a21 <crash_cli_cmd+12>: cbz r0, 0x14a28 <crash_cli_cmd+20>
>>>> (gdb) x/i 0x00014a15
>>>> 0x14a15 <crash_cli_cmd>: push {r3, lr}
>>>> (gdb) list *0x14949
>>>> 0x14949 is in crash_device
>>>> (repos/apache-mynewt-core/test/crash_test/src/crash_test.c:42).
>>>> warning: Source file is more recent than executable.
>>>> 37 int
>>>> 38 crash_device(char *how)
>>>> 39 {
>>>> 40 volatile int val1, val2, val3;
>>>> 41
>>>> 42 if (!strcmp(how, "div0")) {
>>>> 43
>>>> 44 val1 = 42;
>>>> 45 val2 = 0;
>>>> 46
>>>> (gdb) list *0x00014a21
>>>> 0x14a21 is in crash_cli_cmd
>>>> (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>>>> 36 };
>>>> 37
>>>> 38 static int
>>>> 39 crash_cli_cmd(int argc, char **argv)
>>>> 40 {
>>>> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
>>>> 42 return 0;
>>>> 43 }
>>>> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>>>> 45 return 0;
>>>> (gdb) list *0x14a21
>>>> 0x14a21 is in crash_cli_cmd
>>>> (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>>>> 36 };
>>>> 37
>>>> 38 static int
>>>> 39 crash_cli_cmd(int argc, char **argv)
>>>> 40 {
>>>> 41 if (argc >= 2 && crash_device(argv[1]) == 0) {
>>>> 42 return 0;
>>>> 43 }
>>>> 44 console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>>>> 45 return 0;
>>>>
>>>> Good luck.
>>>>
>>>>> On Aug 31, 2018, at 2:10 PM, Aditya Xavier <[email protected]>
>>>>> wrote:
>>>>>
>>>>> It seems OS_CRASH_STACKTRACE was introduced after 1.4.1 and hence is not in
>>>>> the release.
>>>>>
>>>>> If I change the release, I believe there would be many API changes to be
>>>>> done on the MESH side.
>>>>>
>>>>> Can you guide me on how to "manually walk the stack looking for
>>>>> things which look like pointers to text"?
>>>>>
>>>>> My gdb skills are pretty weak.
>>>>>
>>>>> I tried gdb where, with the following outcome.
>>>>>
>>>>> (gdb) c
>>>>> Continuing.
>>>>>
>>>>>
>>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>>> hal_system_reset () at
>>>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>>> 50 asm("bkpt");
>>>>> (gdb)
>>>>> Continuing.
>>>>>
>>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>>> hal_system_reset () at
>>>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>>> 50 asm("bkpt");
>>>>> (gdb) where
>>>>> #0 hal_system_reset () at
>>>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>>> #1 0x0000bf2e in os_default_irq (tf=0x2000ffc8) at
>>>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>>>> #2 0x0000da56 in os_default_irq_asm () at
>>>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>>>> #3 <signal handler called>
>>>>> #4 0x00000000 in ?? ()
>>>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>>>>
>>>>>
>>>>>
>>>>>> On 31-Aug-2018, at 4:30 PM, marko kiiskila <[email protected]> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Aug 31, 2018, at 1:47 PM, Aditya Xavier
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi !
>>>>>>>
>>>>>>> Am having an issue with sending and receiving a Mesh message. Though am
>>>>>>> positive the problem is more towards releasing the semaphore.
>>>>>>>
>>>>>>> Action Received over MESH Length :- 15
>>>>>>> 012273 Unhandled interrupt (3), exception sp 0x2000abd0
>>>>>>> 012273 r0:0xd7229882 r1:0xd929b3bb r2:0xcf0f98cb r3:0x5c5a76b3
>>>>>>> 012273 r4:0x1b000000 r5:0x2000acc0 r6:0x2000aca0 r7:0x00000008
>>>>>>> 012273 r8:0x00000000 r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
>>>>>>> 012273 r12:0x681af5c8 lr:0xb1334673 pc:0x7e3cdeb8 psr:0x2266a80b
>>>>>>> 012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
>>>>>>> 012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>>>>>
>>>>>>> Am sending a group mesh message for testing. The sequence of events is
>>>>>>> as follows.
>>>>>>>
>>>>>>> Button TASK -> send message over MESH -> Mesh receives message on model
>>>>>>> -> copies the data and releases the semaphore for another task
>>>>>>> -> LOG Task starts and logs the message.
>>>>>>>
>>>>>>> In this entire flow, the moment I receive the message and release the
>>>>>>> semaphore the firmware crashes.
>>>>>>>
>>>>>>> I tried increasing the STACK size of the LOG task, however that didn’t
>>>>>>> help.
>>>>>>>
>>>>>>> Could someone let me know how to understand where / why the crash is
>>>>>>> happening ?
>>>>>>
>>>>>> Looking at your registers they seem to be garbage, so I’m guessing stack
>>>>>> corruption of some sort; it does not have to be an overflow.
>>>>>> Try turning on OS_CRASH_STACKTRACE, or manually walk the stack
>>>>>> looking for things which look like pointers to text.
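
The "look for things which look like pointers to text" step can be scripted rather than done by eye. What follows is only a rough sketch, not part of Mynewt or gdb: the helper names (text_candidates, gdb_commands) are made up for illustration, the regular expression assumes the "<tick> <stack address>: <value>" line shape seen in the OS_CRASH_STACKTRACE dump in this thread, and the text range passed in is an assumption (0x8000 as a stand-in for the image slot start, 0x3a9c8 taken from the earlier `p &__etext` output, which may differ for a rebuilt image).

```python
import re

# Assumed line shape, taken from the dump in this thread:
#   "000486 0x2000abec: 0x0003b4d8"
STACK_LINE = re.compile(r"(0x[0-9a-fA-F]{8}):\s+(0x[0-9a-fA-F]{8})\s*$")

# A few lines copied from the dump above, plus one register line that
# must be ignored by the parser.
SAMPLE = """\
000486 r12:0x7e3cdeb8 lr:0x2266a80b pc:0x59d8de5b psr:0xe8eb9828
000486 0x2000abec: 0x0003b4d8
000486 0x2000abf4: 0x000246a7
000486 0x2000ac04: 0x0003b4d8
000486 0x2000ac0c: 0x0002488d
"""


def text_candidates(log, text_start, text_end):
    """Return (stack_addr, value) pairs whose value lies in [text_start, text_end)."""
    hits = []
    for line in log.splitlines():
        match = STACK_LINE.search(line)
        if match is None:
            continue  # register lines, banners, and so on
        stack_addr = int(match.group(1), 16)
        value = int(match.group(2), 16)
        if text_start <= value < text_end:
            hits.append((stack_addr, value))
    return hits


def gdb_commands(hits):
    """Turn candidate values into x/i commands to paste into gdb."""
    return ["x/i 0x%08x" % value for _, value in hits]


if __name__ == "__main__":
    # Both bounds are assumptions; substitute the values for your own image.
    for command in gdb_commands(text_candidates(SAMPLE, 0x8000, 0x3a9c8)):
        print(command)
```

A value inside the text range is only a candidate. Stale data left on the stack can pass the filter too, so each hit still needs checking with x/i and list * in gdb to see whether it sits just after a call.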
