> On Apr 19, 2017, at 4:33 PM, Jacob Rosenthal <[email protected]> wrote:
>
> On Wed, Apr 19, 2017 at 11:19 AM, marko kiiskila <[email protected]> wrote:
>
>> Just general comments, I hope I’m not saying things which are
>> too obvious.
>>
> More specific would be even better :) I don't think my gdb is up to par
>
>> Either g_os_run_list or one of the task structures is getting smashed.
>> As you know your tasks beforehand, you can walk through them manually
>> to figure out which one it is.
>>
> How do I know the tasks beforehand? I would guess something in imgr_upload
> is corrupting it? So print as that function starts and ends? How do I walk
> through them manually?
You could do this, for example:
(gdb) source repos/apache-mynewt-core/compiler/gdbmacros/os.gdb
(gdb) os_tasks
  prio state      stack  stksz       task name
*  255   0x1    0xae7d4  16384    0x9e780 idle
   127   0x2    0x9b128   5376    0x95cd8 main
     0   0x2    0x95a2c  16384    0x859dc uartpoll
     2   0x2    0xb4338   4096    0x9d1d0 socket
     9   0x2    0x85908   4096    0x818a8 ble_hs
This was from a native build target that I happened to have a debugger
attached to, but you would get the same type of data from actual targets
as well.
The pointer to the os_task structure is under the ‘task’ column. Here I'm
picking the idle task for closer inspection:
(gdb) set print pretty
(gdb) p *(struct os_task *)0x9e780
$3 = {
t_stackptr = 0xae63c <g_idle_task_stack+65128>,
t_stacktop = 0xae7d4 <g_os_idle_ctr>,
t_stacksize = 16384,
t_taskid = 0 '\000',
t_prio = 255 '\377',
t_state = 1 '\001',
t_flags = 0 '\000',
t_lockcnt = 0 '\000',
t_pad = 0 '\000',
t_name = 0x6b8d8 "idle",
t_func = 0x192f0 <os_idle_task>,
t_arg = 0x0,
t_obj = 0x0,
t_sanity_check = {
sc_checkin_last = 0,
sc_checkin_itvl = 0,
sc_func = 0x0,
sc_arg = 0x0,
sc_next = {
sle_next = 0x0
}
},
t_next_wakeup = 0,
t_run_time = 52837,
t_ctx_sw_cnt = 50124,
t_os_task_list = {
stqe_next = 0x95cd8 <os_main_task>
},
t_os_list = {
tqe_next = 0x0,
tqe_prev = 0x8143c <g_os_run_list>
},
t_obj_list = {
sle_next = 0x0
}
}
And then I’ll compute where the task stack starts (t_stacksize is counted
in os_stack_t words, which are 4 bytes each here):
t_stacktop - sizeof(os_stack_t) * t_stacksize
(gdb) x/x 0xae7d4-16384*4
0x9e7d4 <g_idle_task_stack>: 0xdeadbeef
So that’s where the stack starts. Then I’ll inspect that end of the stack,
the lowest addresses, which are the last to be used, and see if it still
has the fill pattern ‘0xdeadbeef’:
(gdb) x/32x 0x9e7d4
0x9e7d4 <g_idle_task_stack>:    0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
0x9e7e4 <g_idle_task_stack+16>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
0x9e7f4 <g_idle_task_stack+32>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
0x9e804 <g_idle_task_stack+48>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
0x9e814 <g_idle_task_stack+64>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
So this stack has never been used all the way to its end, i.e. it has not
overflowed.
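The same check can be sketched outside gdb. This is only an illustration
in Python, not an existing Mynewt tool: the function names are made up,
the 4-byte word size is taken from the x4 in the gdb command above, and
the sample dump is hypothetical data.

```python
FILL_PATTERN = 0xDEADBEEF  # fill value seen in the gdb dump above


def stack_base(stacktop, stacksize_words, word_bytes=4):
    """Lowest stack address: t_stacktop - sizeof(os_stack_t) * t_stacksize.

    word_bytes=4 is an assumption matching the target shown above.
    """
    return stacktop - word_bytes * stacksize_words


def untouched_words(words, fill=FILL_PATTERN):
    """Count words, starting at the lowest address, that still hold the fill."""
    count = 0
    for word in words:
        if word != fill:
            break
        count += 1
    return count


# Values of t_stacktop and t_stacksize from the idle task above.
print(hex(stack_base(0xAE7D4, 16384)))

# Hypothetical dump read from the stack base upwards:
# 20 untouched words followed by two words that have been written.
dump = [FILL_PATTERN] * 20 + [0x0009E780, 0x000192F0]
print(untouched_words(dump))
```

If the count comes out as zero, the very first word has been overwritten
and the task has most likely run past the end of its stack.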
>
>>
>> Usually the culprit is a stack overflow, so once you find out which task
>> structure is being corrupted, look for the stack just after it in memory.
>>
>> nm output piped to sort is your friend in locating that stack.
>>
> nm output?
[pi@raspberrypi:~/src/incubator-mynewt-blinky]$ arm-linux-gnueabihf-nm \
bin/targets/bleprph_oic_linux/app/apps/bleprph_oic/bleprph_oic.elf | sort | more
I.e. get the symbols from my elf file and sort them by address.
Then let’s look at what the idle stack would overwrite if it were not
big enough:
...
0009e780 B g_idle_task
0009e7d0 B g_os_started
0009e7d4 B g_idle_task_stack
It looks like an idle stack overflow would most likely corrupt those 2 items
(g_os_started, then g_idle_task) first. And if it corrupts that task
structure, it’s game over.
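Finding the neighbours can be scripted too. A rough Python sketch,
assuming the usual three-column "address type name" nm output; the helper
names are invented for this example and the input is just the three lines
shown above:

```python
def parse_nm(text):
    """Return (address, type, name) tuples sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skips "..." and undefined symbols, which have no address
        addr, kind, name = parts
        try:
            symbols.append((int(addr, 16), kind, name))
        except ValueError:
            continue  # first column was not a hex address
    return sorted(symbols)


def symbols_below(symbols, name, count=2):
    """Names of the symbols located just below `name`, nearest first."""
    index = next(i for i, sym in enumerate(symbols) if sym[2] == name)
    start = max(0, index - count)
    return [sym[2] for sym in reversed(symbols[start:index])]


NM_OUTPUT = """\
...
0009e780 B g_idle_task
0009e7d0 B g_os_started
0009e7d4 B g_idle_task_stack
"""

# What would an overflow of the idle stack hit first?
print(symbols_below(parse_nm(NM_OUTPUT), "g_idle_task_stack"))
```

The list is ordered nearest first because the stack grows towards lower
addresses, so the closest symbol below the stack is the first victim.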
BTW, we don't have gdb scripts for checking task stack usage yet. We
probably should :)
Happy hacking,
M
>>
>>
>> Hope this helps,
>> M
>>
>
> Thanks for the help