Reminds me of that old Wendy's TV commercial: "Where's the deadbeef?"
Alan

-----Original Message-----
From: marko kiiskila [mailto:[email protected]]
Sent: Wednesday, April 19, 2017 5:00 PM
To: [email protected]
Subject: Re: newtmgr image upload nrf51dk disconnects with reason=8

> On Apr 19, 2017, at 4:33 PM, Jacob Rosenthal <[email protected]> wrote:
>
> On Wed, Apr 19, 2017 at 11:19 AM, marko kiiskila <[email protected]> wrote:
>
>> Just general comments, I hope I’m not saying things which are too
>> obvious.
>>
> More specific would be even better :) I don't think my gdb is up to par
>
>> Either g_os_run_list or one of the task structures is getting smashed.
>> As you know your tasks beforehand, you can walk through them manually
>> to figure which one it is.
>>
> How do I know the tasks beforehand? I would guess something in
> imgr_upload is corrupting it? So print as that function starts and
> ends? How do I walk through them manually?

You could do this, for example:

(gdb) source repos/apache-mynewt-core/compiler/gdbmacros/os.gdb
(gdb) os_tasks
  prio state    stack  stksz     task name
* 255   0x1  0xae7d4  16384  0x9e780 idle
  127   0x2  0x9b128   5376  0x95cd8 main
    0   0x2  0x95a2c  16384  0x859dc uartpoll
    2   0x2  0xb4338   4096  0x9d1d0 socket
    9   0x2  0x85908   4096  0x818a8 ble_hs

This was from a native build target that I happened to have a debugger
attached to. But you would get the same type of data from actual targets
as well. The pointer to the os_task structure is under the ‘task’ column.

Here I'm picking the idle task for closer inspection:

(gdb) set print pretty
(gdb) p *(struct os_task *)0x9e780
$3 = {
  t_stackptr = 0xae63c <g_idle_task_stack+65128>,
  t_stacktop = 0xae7d4 <g_os_idle_ctr>,
  t_stacksize = 16384,
  t_taskid = 0 '\000',
  t_prio = 255 '\377',
  t_state = 1 '\001',
  t_flags = 0 '\000',
  t_lockcnt = 0 '\000',
  t_pad = 0 '\000',
  t_name = 0x6b8d8 "idle",
  t_func = 0x192f0 <os_idle_task>,
  t_arg = 0x0,
  t_obj = 0x0,
  t_sanity_check = {
    sc_checkin_last = 0,
    sc_checkin_itvl = 0,
    sc_func = 0x0,
    sc_arg = 0x0,
    sc_next = {
      sle_next = 0x0
    }
  },
  t_next_wakeup = 0,
  t_run_time = 52837,
  t_ctx_sw_cnt = 50124,
  t_os_task_list = {
    stqe_next = 0x95cd8 <os_main_task>
  },
  t_os_list = {
    tqe_next = 0x0,
    tqe_prev = 0x8143c <g_os_run_list>
  },
  t_obj_list = {
    sle_next = 0x0
  }
}

And then I’ll compute where the task stack starts,
t_stacktop - sizeof(os_stack_t) * t_stacksize

(gdb) x/x 0xae7d4-16384*4
0x9e7d4 <g_idle_task_stack>:    0xdeadbeef

So that’s where the stack starts. Then I’ll inspect that end of the
stack (the lowest addresses, which are the last ones the stack grows
into), and see if it still has the fill pattern ‘0xdeadbeef'

(gdb) x/32x 0x9e7d4
0x9e7d4 <g_idle_task_stack>:    0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
0x9e7e4 <g_idle_task_stack+16>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
0x9e7f4 <g_idle_task_stack+32>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
0x9e804 <g_idle_task_stack+48>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
0x9e814 <g_idle_task_stack+64>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef

So this stack has not been used completely.

>
>>
>> Usually the culprit is stack overflow, so once you find out which task
>> structure is being corrupted, look for the stack just after that in
>> memory.
>>
>> nm output piped to sort is your friend in locating that stack.
>>
> nm output?

[pi@raspberrypi:~/src/incubator-mynewt-blinky]$ arm-linux-gnueabihf-nm bin/targets/bleprph_oic_linux/app/apps/bleprph_oic/bleprph_oic.elf | sort | more

I.e. get symbols from my elf-file, sort them by address.
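
If paging through the sorted nm output by hand gets tedious, the same lookup can be sketched in a few lines of Python. This is only an illustration, not part of the Mynewt tooling: the helper names parse_nm and symbols_below are made up for this example, and it assumes plain three-column nm output ("address type name") and a stack that grows toward lower addresses, so the symbols just below the stack symbol are the first ones an overflow would hit. The three sample lines are the idle-task symbols from this same elf file.

```python
def parse_nm(text):
    """Parse nm lines of the form '<hex addr> <type> <name>', sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skips undefined symbols, '...' and blank lines
        addr, kind, name = parts
        try:
            symbols.append((int(addr, 16), kind, name))
        except ValueError:
            continue  # first column was not a hex address
    return sorted(symbols)


def symbols_below(symbols, name, count=2):
    """Return up to `count` symbols just below `name`, nearest first.

    Hypothetical helper: assumes the stack grows downward, so these are
    the first things an overflow of the stack `name` would overwrite.
    """
    names = [s[2] for s in symbols]
    idx = names.index(name)
    return symbols[max(0, idx - count):idx][::-1]


# Sample lines, deliberately out of order to show the sort step.
NM_OUTPUT = """\
0009e7d4 B g_idle_task_stack
0009e780 B g_idle_task
0009e7d0 B g_os_started
"""

victims = symbols_below(parse_nm(NM_OUTPUT), "g_idle_task_stack")
for addr, kind, name in victims:
    print(f"{addr:#010x} {kind} {name}")
```

On a real target you would feed it the full nm output instead of the three sample lines, and pass the name of whichever stack sits next to the corrupted structure.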

And then let’s continue with what the idle stack would overwrite, if it
was not big enough:

...
0009e780 B g_idle_task
0009e7d0 B g_os_started
0009e7d4 B g_idle_task_stack

Looks like an idle stack overflow would most likely corrupt those 2 items
first. And if it corrupts that task structure, it’s game over.

BTW, gdb scripts looking for task stack use are missing. We probably
should have such :)

Happy hacking,
M

>>
>>
>> Hope this helps,
>> M
>>
>
> Thanks for the help
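
On the missing stack-usage script: the arithmetic behind such a check is small enough to sketch in Python. This is a rough illustration under stated assumptions, not an existing Mynewt or gdb macro. It assumes sizeof(os_stack_t) is 4 (as in the x/x command in this thread), that the stack grows toward lower addresses, and that the words have already been read from the target starting at the stack base. The function names and the demo word list are made up for the example.

```python
FILL = 0xdeadbeef  # fill pattern seen in the stack dump in this thread


def stack_base(stacktop, stacksize, word_size=4):
    """Lowest address of the stack: t_stacktop - sizeof(os_stack_t) * t_stacksize."""
    return stacktop - stacksize * word_size


def unused_words(words, fill=FILL):
    """Count words, starting at the stack base, that still hold the fill pattern.

    `words` is the stack contents ordered from the base upward. Counting
    stops at the first word that was overwritten at some point.
    """
    count = 0
    for w in words:
        if w != fill:
            break
        count += 1
    return count


# Values for the idle task from the os_tasks output.
base = stack_base(0xae7d4, 16384)
print(hex(base))

# Synthetic 8-word stack, for illustration only: 5 untouched, 3 used.
demo = [FILL] * 5 + [0x1, 0x2, 0x3]
free = unused_words(demo)
print(free, "of", len(demo), "words never used")
```

A count of zero means the stack has been used all the way to its base, which is the case to worry about. A nonzero count is not proof of safety, since a task can legitimately write the fill value, but it is a reasonable first indicator.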
