Re: Mynewt crash when releasing semaphore

marko kiiskila Fri, 31 Aug 2018 05:35:57 -0700

It’s easiest to inspect these addresses with gdb :)

arm-none-eabi-gdb bin/targets/……. .elf


Then start feeding it those addresses to see which ones look likely to be part
of the call chain.

x/i 0x0003b4d8
x/i 0x000246a7
x/i 0x0003b4d8
etc
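
If the list is long, it can help to narrow it down before typing it into gdb.
Below is a rough Python sketch (not part of Mynewt) that pulls the
address/value pairs out of a pasted stack trace and keeps the values that fall
inside the text region and have bit 0 set, since Thumb return addresses on
Cortex-M are odd. The function names and the text bounds used in the example
(0x8000 and 0x40000) are assumptions for illustration; take the real bounds
from p &__etext (and p &__text, if your build has it) on your own ELF. Even
values in range can still be interesting, e.g. pointers to constant data, so
treat this as a first pass only.

```python
import re

# Matches "<stack address>: <value>" pairs as printed in the stack trace dump.
ENTRY = re.compile(r"(0x[0-9a-fA-F]{8}):\s+(0x[0-9a-fA-F]{8})")


def likely_return_addresses(log, text_start, text_end):
    """Return (stack_addr, value) pairs that look like Thumb return addresses."""
    hits = []
    for line in log.splitlines():
        match = ENTRY.search(line)
        if not match:
            continue  # register lines, banners, etc.
        stack_addr = int(match.group(1), 16)
        value = int(match.group(2), 16)
        in_text = text_start <= value < text_end
        thumb_bit = value & 1  # return addresses on Cortex-M have bit 0 set
        if in_text and thumb_bit:
            hits.append((stack_addr, value))
    return hits


def gdb_commands(hits):
    """Turn the candidates into commands to paste at the (gdb) prompt."""
    return [f"info symbol {value:#x}" for _, value in hits]


if __name__ == "__main__":
    # Two lines copied from the dump; the bounds are placeholders (assumptions).
    sample = (
        "000486  0x2000abec: 0x0003b4d8\n"
        "000486  0x2000abf4: 0x000246a7\n"
    )
    for cmd in gdb_commands(likely_return_addresses(sample, 0x8000, 0x40000)):
        print(cmd)
```

Each printed line can be pasted at the (gdb) prompt; info symbol names the
function, and list *<address> then shows the source around it.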

> On Aug 31, 2018, at 3:30 PM, Aditya Xavier <[email protected]> 
> wrote:
> 
> I am really bad at GDB. Also, it’s like a rabbit hole :)
> 
> I ported over my application with the git version of Mynewt-core, and enabled 
> OS_CRASH_STACKTRACE.
> 
> With it enabled, the following is the dump.
> 
> #mesh-onoff STATUS: Sent !
> Action Received over MESH Length :- 14
> 000486 Unhandled interrupt (3), exception sp 0x2000aba0
> 000486  r0:0xcf0f98cb  r1:0x5c5a76b3  r2:0x681af5c8  r3:0xb1334673
> 000486  r4:0x2000ac68  r5:0x00000007  r6:0x00000000  r7:0x200008a9
> 000486  r8:0x2000acf0  r9:0x00012101 r10:0xd7229882 r11:0xd929b3bb
> 000486 r12:0x7e3cdeb8  lr:0x2266a80b  pc:0x59d8de5b psr:0xe8eb9828
> 000486 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x00040000
> 000486 BFAR:0xe000ed38 MMFAR:0xe000ed34
> 000486 task:DECODE_TASK
> 000486  0x2000abec: 0x0003b4d8
> 000486  0x2000abf4: 0x000246a7
> 000486  0x2000ac04: 0x0003b4d8
> 000486  0x2000ac0c: 0x0002488d
> 000486  0x2000ac4c: 0x00012101
> 000486  0x2000ad0c: 0x0000c1e7
> 000486  0x2000ad1c: 0x0000c1e7
> 000486  0x2000ad2c: 0x0000c211
> 000486  0x2000ad30: 0x0003ad44
> 000486  0x2000ad3c: 0x00013023
> 000486  0x2000ad58: 0x000238e1
> 000486  0x2000ad60: 0x00037f81
> 000486  0x2000ad6c: 0x00023a79
> 000486  0x2000ad70: 0x00039b80
> 000486  0x2000ad74: 0x00039b7f
> 000486  0x2000ad84: 0x00023587
> 000486  0x2000ada8: 0x000087cd
> 000486  0x2000adc4: 0x0000d51d
> 000486  0x2000adc8: 0x0000d51c
> 000486  0x2000add8: 0x000398cd
> 000486  0x2000ade4: 0x000087e9
> 000486  0x2000ae08: 0x00010001
> 000486  0x2000ae0c: 0x0001c239
> 000486  0x2000ae10: 0x0003b35c
> 000486  0x2000ae1c: 0x00020001
> 000486  0x2000ae20: 0x0001c38d
> 000486  0x2000ae30: 0x00030001
> 000486  0x2000ae34: 0x0001c509
> 000486  0x2000ae48: 0x0001c38d
> 000486  0x2000ae5c: 0x0001c509
> 000486  0x2000ae70: 0x0001c239
> 000486  0x2000ae74: 0x0003b37c
> 000486  0x2000ae84: 0x0001c38d
> 000486  0x2000ae98: 0x0001c509
> 000486  0x2000aeac: 0x0001c54d
> 000486  0x2000aec0: 0x0001c239
> 000486  0x2000aec4: 0x0003ba28
> 000486  0x2000aed4: 0x0001c38d
> 000486  0x2000aee8: 0x0001c509
> 000486  0x2000aefc: 0x0001c38d
> 000486  0x2000af10: 0x0001c509
> 000486  0x2000af24: 0x0001c54d
> 000486  0x2000af38: 0x0001c38d
> 000486  0x2000af4c: 0x0001c509
> 000486  0x2000af60: 0x0001c38d
> 000486  0x2000af74: 0x0001c509
> 000486  0x2000af88: 0x0001c54d
> 000486  0x2000af9c: 0x0001c38d
> 000486  0x2000afb0: 0x0001c509
> 
> 
>> On 31-Aug-2018, at 5:21 PM, marko kiiskila <[email protected]> wrote:
>> 
>> Some suggestions (inline).
>> 
>>> On Aug 31, 2018, at 2:32 PM, Aditya Xavier <[email protected]> 
>>> wrote:
>>> 
>>> Gosh, this doesn’t make much sense to me :(
>>> 
>>> (gdb) monitor go
>>> (gdb) monitor reset
>>> Resetting target
>>> (gdb) c
>>> Continuing.
>>> 
>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>> hal_system_reset () at 
>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> 50              asm("bkpt");
>>> (gdb) bt
>>> #0  hal_system_reset () at 
>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>> #1  0x0000bf2e in os_default_irq (tf=0x2000ffc8) at 
>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>> #2  0x0000da56 in os_default_irq_asm () at 
>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>> #3  <signal handler called>
>>> #4  0x00000000 in ?? ()
>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>> (gdb) frame 1
>>> #1  0x0000bf2e in os_default_irq (tf=0x2000ffc8) at 
>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>> 170     hal_system_reset();
>>> (gdb) p/x *tf
>>> $1 = {ef = 0x2000abd0, r4 = 0x1b000000, r5 = 0x2000acc0, r6 = 0x2000aca0, 
>>> r7 = 0x7, r8 = 0x0, r9 = 0x200008a9, r10 = 0x2000ad28, r11 = 0x11d91, lr = 
>>> 0xfffffffd}
>>> (gdb) p/x *tf->ef
>>> $2 = {r0 = 0xd7229882, r1 = 0xd929b3bb, r2 = 0xcf0f98cb, r3 = 0x5c5a76b3, 
>>> r12 = 0x681af5c8, lr = 0xb1334673, pc = 0x7e3cdeb8, psr = 0x2266a80b}
>>> (gdb) x/32x 0xd7229882
>>> 0xd7229882: 0x00000000      0x00000000      0x00000000      0x00000000
>>> 0xd7229892: 0x00000000      0x00000000      0x00000000      0x00000000
>>> 0xd72298a2: 0x00000000      0x00000000      0x00000000      0x00000000
>>> 0xd72298b2: 0x00000000      0x00000000      0x00000000      0x00000000
>>> 0xd72298c2: 0x00000000      0x00000000      0x00000000      0x00000000
>>> 0xd72298d2: 0x00000000      0x00000000      0x00000000      0x00000000
>>> 0xd72298e2: 0x00000000      0x00000000      0x00000000      0x00000000
>>> 0xd72298f2: 0x00000000      0x00000000      0x00000000      0x00000000
>>> (gdb) x/32x 0x2000abd0
>>> 0x2000abd0: 0xd7229882      0xd929b3bb      0xcf0f98cb      0x5c5a76b3
>>> 0x2000abe0: 0x681af5c8      0xb1334673      0x7e3cdeb8      0x2266a80b
>>> 0x2000abf0: 0x59d8de5b      0xe8eb9828      0x96d74690      0xb4b1ee9b
>>> 0x2000ac00: 0x95f0cad6      0x7d1b52fe      0xebcc146e      0x5f7dfaf5
>>> 0x2000ac10: 0x62dd2c19      0x1fc67ee7      0xf40a6a89      0xab77907c
>> 
>> ^^^^^ looks bad, especially the top area. It should hold the dump of registers
>> stored at the time of the crash.
>> 
>> 
>>> 0x2000ac20: 0x00000010      0x00039c74      0x2000ad28      0x0002329f
>>> 0x2000ac30: 0xd87c5730      0xa203a288      0x00000010      0x00039c74
>>> 0x2000ac40: 0x2000ad28      0x00023485      0x00000000      0x00000000
>>> (gdb) p &__text
>>> No symbol "__text" in current context.
>>> (gdb)  p &__etext
>>> $3 = (<data variable, no debug info> *) 0x3a9c8
>>> (gdb) p &__text
>>> No symbol "__text" in current context.
>> 
>> The __text symbol was probably added at the same time as OS_CRASH_STACKTRACE.
>> You’re looking for values between the start of your image slot and 0x3a9c8.
>> 
>>> (gdb) x/i 0xd7229882
>>> 0xd7229882: movs    r0, r0
>>> (gdb) list *0xd7229882
>>> (gdb) x/i 0x681af5c8
>>> 0x681af5c8: movs    r0, r0
>>> (gdb) x/i 0x59d8de5b
>>> 0x59d8de5b: movs    r0, r0
>>> (gdb) x/i 0x62dd2c19
>>> 0x62dd2c19: movs    r0, r0
>>> (gdb) x/i 0x2000ad28
>>> 0x2000ad28: lsls    r0, r2, #6
>>> (gdb) x/i 0x1fc67ee7
>>> 0x1fc67ee7: movs    r0, r0
>>> (gdb) x/i 0xa203a288
>>> 0xa203a288: movs    r0, r0
>>> (gdb) x/i 0xe8eb9828
>>> 0xe8eb9828: movs    r0, r0
>>> (gdb) x/i 0xcf0f98cb
>>> 0xcf0f98cb: movs    r0, r0
>>> (gdb) x/i 0x96d74690
>>> 0x96d74690: movs    r0, r0
>>> (gdb) x/i 0xf40a6a89
>>> 0xf40a6a89: movs    r0, r0
>>> (gdb) x/i 0x2000ad28
>>> 0x2000ad28: lsls    r0, r2, #6
>>> (gdb) x/i 0x00000010
>>> 0x10:       movs    r0, r0
>>> (gdb) x/i 0x0002329f
>>> 0x2329f <shift_rows+108>:   add     sp, #20
>>> (gdb) x/i 0x00039c74
>>> 0x39c74 <sbox>:     ldrb    r3, [r4, #17]
>>> (gdb) x/i 0xa203a288
>>> 0xa203a288: movs    r0, r0
>>> (gdb) x/i 0x0002329f
>>> 0x2329f <shift_rows+108>:   add     sp, #20
>>> (gdb) list *0x0002329f
>>> 0x2329f is in shift_rows 
>>> (repos/apache-mynewt-core/crypto/tinycrypt/src/aes_encrypt.c:156).
>>> 151         t[0]  = s[0]; t[1] = s[5]; t[2] = s[10]; t[3] = s[15];
>>> 152         t[4]  = s[4]; t[5] = s[9]; t[6] = s[14]; t[7] = s[3];
>>> 153         t[8]  = s[8]; t[9] = s[13]; t[10] = s[2]; t[11] = s[7];
>>> 154         t[12] = s[12]; t[13] = s[1]; t[14] = s[6]; t[15] = s[11];
>>> 155         (void) _copy(s, sizeof(t), t, sizeof(t));
>>> 156 }
>>> 157 
>>> 158 int tc_aes_encrypt(uint8_t *out, const uint8_t *in, const 
>>> TCAesKeySched_t s)
>>> 159 {
>>> 160         uint8_t state[Nk*Nb];
>> 
>> That could be what is writing the random-looking data to the stack; encrypted
>> data should look like gibberish.
>> Follow the stack a bit further, continuing from 0x2000ac50, and see if you can
>> find who called it. I’m hazarding a guess that one of the args passed to
>> tc_aes_encrypt() points to the stack, and there’s not enough memory allocated
>> to hold that data.
>> 
>> 
>>>> On 31-Aug-2018, at 4:46 PM, marko kiiskila <[email protected]> wrote:
>>>> 
>>>> Sure. Something like this:
>>>> 
>>>> 000933 compat> crash div0
>>>> crash div0
>>>> 003157 Unhandled interrupt (3), exception sp 0x20001dd8
>>>> 003157  r0:0x00000000  r1:0x00017161  r2:0x00000000  r3:0x0000002a
>>>> 003157  r4:0x200041d6  r5:0x00000000  r6:0x20000318  r7:0x00000000
>>>> 003157  r8:0x00000000  r9:0x00000000 r10:0x00000000 r11:0x00000000
>>>> 003157 r12:0x00000000  lr:0x00014949  pc:0x00014978 psr:0x61000000
>>>> 003157 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x02000000
>>>> 003157 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>> 
>>>> Then from gdb:
>>>> 
>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>> hal_system_reset ()
>>>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>> 50             asm("bkpt");
>>>> (gdb) bt
>>>> #0  hal_system_reset ()
>>>> at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>> #1  0x00008be8 in os_default_irq (tf=0x2000ffc0)
>>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>>>> #2  0x0000a5b6 in os_default_irq_asm ()
>>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>>> #3  <signal handler called>
>>>> #4  0x00000000 in ?? ()
>>>> #5  0x0000812c in Reset_Handler ()
>>>> at 
>>>> repos/apache-mynewt-core/hw/bsp/nrf52dk/src/arch/cortex_m4/gcc_startup_nrf52.s:180
>>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>>> (gdb) frame 1
>>>> #1  0x00008be8 in os_default_irq (tf=0x2000ffc0)
>>>> at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:171
>>>> 171            hal_system_reset();
>>>> (gdb) p/x *tf
>>>> $1 = {ef = 0x20001dd8, r4 = 0x200041d6, r5 = 0x0, r6 = 0x20000318, r7 = 
>>>> 0x0, 
>>>> r8 = 0x0, r9 = 0x0, r10 = 0x0, r11 = 0x0, lr = 0xfffffffd}
>>>> (gdb) p/x *tf->ef
>>>> $2 = {r0 = 0x0, r1 = 0x17161, r2 = 0x0, r3 = 0x2a, r12 = 0x0, lr = 
>>>> 0x14949, 
>>>> pc = 0x14978, psr = 0x61000000}
>>>> (gdb) x/32x 0x20001dd8
>>>> 0x20001dd8 <os_main_stack+3896>:   0x00000000      0x00017161      0x00000000      0x0000002a
>>>> 0x20001de8 <os_main_stack+3912>:   0x00000000      0x00014949      0x00014978      0x61000000
>>>> 0x20001df8 <os_main_stack+3928>:   0x00000003      0x00000000      0x00000000      0x0000002a
>>>> 0x20001e08 <os_main_stack+3944>:   0x00000001      0x00000002      0x0000000a      0x00014a21
>>>> 0x20001e18 <os_main_stack+3960>:   0x00014a15      0x0000ebd9      0x00000000      0x200041d0
>>>> 0x20001e28 <os_main_stack+3976>:   0x200041d6      0x00000000      0x0000000a      0x0001574d
>>>> 0x20001e38 <os_main_stack+3992>:   0x00015741      0x0000c925      0x200041d0      0x00000011
>>>> 0x20001e48 <os_main_stack+4008>:   0x00000073      0x200041d3      0x00000000      0x0000ede9
>>>> (gdb) p &__text
>>>> $3 = (<data variable, no debug info> *) 0x8020 <__isr_vector>
>>>> (gdb) p &__etext
>>>> $4 = (<data variable, no debug info> *) 0x175f0
>>>> (gdb) x/i 0x00017161
>>>> 0x17161:   movs    r0, r0
>>>> (gdb) x/i 0x00014949
>>>> 0x14949 <crash_device+12>: cbz     r0, 0x1496a <crash_device+46>
>>>> (gdb) x/i 0x00014978
>>>> 0x14978 <crash_device+60>: sdiv    r3, r3, r2
>>>> (gdb) x/i 0x00014a21
>>>> 0x14a21 <crash_cli_cmd+12>:        cbz     r0, 0x14a28 <crash_cli_cmd+20>
>>>> (gdb) x/i 0x00014a15
>>>> 0x14a15 <crash_cli_cmd>:   push    {r3, lr}
>>>> (gdb) list *0x14949
>>>> 0x14949 is in crash_device 
>>>> (repos/apache-mynewt-core/test/crash_test/src/crash_test.c:42).
>>>> warning: Source file is more recent than executable.
>>>> 37 int
>>>> 38 crash_device(char *how)
>>>> 39 {
>>>> 40     volatile int val1, val2, val3;
>>>> 41 
>>>> 42     if (!strcmp(how, "div0")) {
>>>> 43 
>>>> 44         val1 = 42;
>>>> 45         val2 = 0;
>>>> 46 
>>>> (gdb) list *0x00014a21
>>>> 0x14a21 is in crash_cli_cmd 
>>>> (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>>>> 36 };
>>>> 37 
>>>> 38 static int
>>>> 39 crash_cli_cmd(int argc, char **argv)
>>>> 40 {
>>>> 41     if (argc >= 2 && crash_device(argv[1]) == 0) {
>>>> 42         return 0;
>>>> 43     }
>>>> 44     console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>>>> 45     return 0;
>>>> (gdb) list *0x14a21
>>>> 0x14a21 is in crash_cli_cmd 
>>>> (repos/apache-mynewt-core/test/crash_test/src/crash_cli.c:41).
>>>> 36 };
>>>> 37 
>>>> 38 static int
>>>> 39 crash_cli_cmd(int argc, char **argv)
>>>> 40 {
>>>> 41     if (argc >= 2 && crash_device(argv[1]) == 0) {
>>>> 42         return 0;
>>>> 43     }
>>>> 44     console_printf("Usage crash [div0|jump0|ref0|assert|wdog]\n");
>>>> 45     return 0;
>>>> 
>>>> good luck.
>>>> 
>>>>> On Aug 31, 2018, at 2:10 PM, Aditya Xavier <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> It seems OS_CRASH_STACKTRACE was introduced after 1.4.1 and hence is not in
>>>>> the release.
>>>>>
>>>>> If I change the release, I believe there would be many API changes to make
>>>>> on the MESH side.
>>>>>
>>>>> Can you guide me on how to “manually walk the stack looking for things
>>>>> which look like pointers to text”?
>>>>>
>>>>> My gdb skills are pretty weak.
>>>>> 
>>>>> I tried gdb where, with the following outcome.
>>>>> 
>>>>> (gdb) c
>>>>> Continuing.
>>>>> 
>>>>> 
>>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>>> hal_system_reset () at 
>>>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>>> 50                    asm("bkpt");
>>>>> (gdb) 
>>>>> Continuing.
>>>>> 
>>>>> Program received signal SIGTRAP, Trace/breakpoint trap.
>>>>> hal_system_reset () at 
>>>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>>> 50                    asm("bkpt");
>>>>> (gdb) where
>>>>> #0  hal_system_reset () at 
>>>>> repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:50
>>>>> #1  0x0000bf2e in os_default_irq (tf=0x2000ffc8) at 
>>>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:170
>>>>> #2  0x0000da56 in os_default_irq_asm () at 
>>>>> repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
>>>>> #3  <signal handler called>
>>>>> #4  0x00000000 in ?? ()
>>>>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 31-Aug-2018, at 4:30 PM, marko kiiskila <[email protected]> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Aug 31, 2018, at 1:47 PM, Aditya Xavier 
>>>>>>> <[email protected]> wrote:
>>>>>>> 
>>>>>>> Hi !
>>>>>>> 
>>>>>>> I am having an issue with sending and receiving a Mesh message, though I am
>>>>>>> positive the problem has more to do with releasing the semaphore.
>>>>>>> 
>>>>>>> Action Received over MESH Length :- 15
>>>>>>> 012273 Unhandled interrupt (3), exception sp 0x2000abd0
>>>>>>> 012273  r0:0xd7229882  r1:0xd929b3bb  r2:0xcf0f98cb  r3:0x5c5a76b3
>>>>>>> 012273  r4:0x1b000000  r5:0x2000acc0  r6:0x2000aca0  r7:0x00000008
>>>>>>> 012273  r8:0x00000000  r9:0x200008a9 r10:0x2000ad28 r11:0x00011d91
>>>>>>> 012273 r12:0x681af5c8  lr:0xb1334673  pc:0x7e3cdeb8 psr:0x2266a80b
>>>>>>> 012273 ICSR:0x00411803 HFSR:0x40000000 CFSR:0x00040000
>>>>>>> 012273 BFAR:0xe000ed38 MMFAR:0xe000ed34
>>>>>>> 
>>>>>>> I am sending a group mesh message for testing. The sequence of events is
>>>>>>> as follows.
>>>>>>>
>>>>>>> Button TASK -> send message over MESH -> Mesh receives message on model
>>>>>>> -> copies the data and releases the Semaphore for another task
>>>>>>> -> LOG Task starts and logs the message.
>>>>>>>
>>>>>>> In this entire flow, the moment I receive the message and release the
>>>>>>> semaphore, the firmware crashes.
>>>>>>> 
>>>>>>> I tried increasing the STACK size of the LOG task, however that didn’t 
>>>>>>> help.
>>>>>>> 
>>>>>>> Could someone let me know how to understand where / why the crash is 
>>>>>>> happening ?
>>>>>> 
>>>>>> Looking at your registers, they seem to be garbage, so I’m guessing stack
>>>>>> corruption of some sort; it does not have to be an overflow.
>>>>>> Try turning on OS_CRASH_STACKTRACE, or manually walk the stack looking for
>>>>>> things which look like pointers to text.
>
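
One more thing worth reading in the fault dumps quoted above is the fault
status. As a small sketch (the helper and table names are made up for
illustration; the bit positions are the ARMv7-M CFSR and HFSR layout), the
values can be decoded like this:

```python
# Bit number -> flag name, per the ARMv7-M Configurable Fault Status Register:
# bits 0-7 MemManage, bits 8-15 BusFault, bits 16-31 UsageFault.
CFSR_BITS = {
    0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR", 4: "MSTKERR",
    5: "MLSPERR", 7: "MMARVALID",
    8: "IBUSERR", 9: "PRECISERR", 10: "IMPRECISERR", 11: "UNSTKERR",
    12: "STKERR", 13: "LSPERR", 15: "BFARVALID",
    16: "UNDEFINSTR", 17: "INVSTATE", 18: "INVPC", 19: "NOCP",
    24: "UNALIGNED", 25: "DIVBYZERO",
}

# HardFault Status Register flags.
HFSR_BITS = {1: "VECTTBL", 30: "FORCED", 31: "DEBUGEVT"}


def decode(value, table):
    """Return the names of the flags set in value, lowest bit first."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # Values copied from the two crash dumps in this thread.
    print("mesh crash CFSR:", decode(0x00040000, CFSR_BITS))
    print("div0 crash CFSR:", decode(0x02000000, CFSR_BITS))
    print("HFSR:", decode(0x40000000, HFSR_BITS))
```

With the values from this thread, CFSR 0x00040000 decodes to INVPC (invalid PC
load) and the div0 example's 0x02000000 to DIVBYZERO, while HFSR 0x40000000 is
FORCED in both. Neither MMARVALID nor BFARVALID is set, so the BFAR/MMFAR
lines do not point at a faulting address.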
