Indeed, the disconnect is a result of the erase. If I comment that out, I can get as far as a stack overflow.

The newt tool uses 56 for the first packet and 64 after. I'm not sure why yet, but let's just say I hardcode 56 in my node tool.

bleprph and blesplit have
OS_MAIN_STACK_SIZE: 428
Oddly enough, it only has to be 8 more to work:
OS_MAIN_STACK_SIZE: 436
though it could probably use more overhead than that.

Thoughts on what to do about the flash erase disconnecting?

On Thu, Apr 20, 2017 at 10:39 AM, Alan Graves <[email protected]> wrote:

> Reminds me of that old Wendy's TV commercial:
> "Where's the deadbeef?"
>
> ALan
>
> -----Original Message-----
> From: marko kiiskila [mailto:[email protected]]
> Sent: Wednesday, April 19, 2017 5:00 PM
> To: [email protected]
> Subject: Re: newtmgr image upload nrf51dk disconnects with reason=8
>
>
> > On Apr 19, 2017, at 4:33 PM, Jacob Rosenthal <[email protected]> wrote:
> >
> > On Wed, Apr 19, 2017 at 11:19 AM, marko kiiskila <[email protected]> wrote:
> >
> >> Just general comments, I hope I’m not saying things which are too
> >> obvious.
> >>
> > More specific would be even better :) I don't think my gdb is up to par
> >
> >> Either g_os_run_list or one of the task structures is getting smashed.
> >> As you know your tasks beforehand, you can walk through them manually
> >> to figure out which one it is.
> >>
> > How do I know the tasks beforehand? I would guess something in
> > imgr_upload is corrupting it? So print as that function starts and
> > ends? How do I walk through them manually?
>
> You could do this, for example:
>
> (gdb) source repos/apache-mynewt-core/compiler/gdbmacros/os.gdb
> (gdb) os_tasks
>   prio state      stack  stksz       task name
> *  255   0x1    0xae7d4  16384    0x9e780 idle
>    127   0x2    0x9b128   5376    0x95cd8 main
>      0   0x2    0x95a2c  16384    0x859dc uartpoll
>      2   0x2    0xb4338   4096    0x9d1d0 socket
>      9   0x2    0x85908   4096    0x818a8 ble_hs
>
> This was from a native build target I happened to have the debugger on.
> But you would get the same type of data from actual targets as well.
>
> The pointer to the os_task structure is under the ‘task’ column.
> Here I'm picking the idle task for closer inspection:
>
> (gdb) set print pretty
> (gdb) p *(struct os_task *)0x9e780
> $3 = {
>   t_stackptr = 0xae63c <g_idle_task_stack+65128>,
>   t_stacktop = 0xae7d4 <g_os_idle_ctr>,
>   t_stacksize = 16384,
>   t_taskid = 0 '\000',
>   t_prio = 255 '\377',
>   t_state = 1 '\001',
>   t_flags = 0 '\000',
>   t_lockcnt = 0 '\000',
>   t_pad = 0 '\000',
>   t_name = 0x6b8d8 "idle",
>   t_func = 0x192f0 <os_idle_task>,
>   t_arg = 0x0,
>   t_obj = 0x0,
>   t_sanity_check = {
>     sc_checkin_last = 0,
>     sc_checkin_itvl = 0,
>     sc_func = 0x0,
>     sc_arg = 0x0,
>     sc_next = {
>       sle_next = 0x0
>     }
>   },
>   t_next_wakeup = 0,
>   t_run_time = 52837,
>   t_ctx_sw_cnt = 50124,
>   t_os_task_list = {
>     stqe_next = 0x95cd8 <os_main_task>
>   },
>   t_os_list = {
>     tqe_next = 0x0,
>     tqe_prev = 0x8143c <g_os_run_list>
>   },
>   t_obj_list = {
>     sle_next = 0x0
>   }
> }
>
> And then I’ll compute where the task stack starts,
> t_stacktop - sizeof(os_stack_t) * t_stacksize
>
> (gdb) x/x 0xae7d4-16384*4
> 0x9e7d4 <g_idle_task_stack>: 0xdeadbeef
>
> So that’s where the stack memory starts. Then I’ll inspect that far end of
> the stack (the lowest addresses, which get used last) and see if it still
> has the fill pattern ‘0xdeadbeef’
>
> (gdb) x/32x 0x9e7d4
> 0x9e7d4 <g_idle_task_stack>:    0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
> 0x9e7e4 <g_idle_task_stack+16>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
> 0x9e7f4 <g_idle_task_stack+32>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
> 0x9e804 <g_idle_task_stack+48>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
> 0x9e814 <g_idle_task_stack+64>: 0xdeadbeef 0xdeadbeef 0xdeadbeef 0xdeadbeef
>
> So this stack has not been used completely.
>
> >
> >>
> >> Usually the culprit is stack overflow, so once you find out which task
> >> structure is being corrupted, look for the stack just after that in
> >> memory.
> >>
> >> nm output piped to sort is your friend in locating that stack.
> >>
> > nm output?
>
> [pi@raspberrypi:~/src/incubator-mynewt-blinky]$ arm-linux-gnueabihf-nm bin/targets/bleprph_oic_linux/app/apps/bleprph_oic/bleprph_oic.elf | sort | more
>
> I.e. get the symbols from my elf-file and sort them by address.
> And then let’s look at what the idle stack would overwrite if it was
> not big enough:
>
> ...
> 0009e780 B g_idle_task
> 0009e7d0 B g_os_started
> 0009e7d4 B g_idle_task_stack
>
> Looks like an idle stack overflow would most likely corrupt those 2 items
> first. And if it corrupts that task structure, it’s game over.
>
> BTW, gdb scripts looking for task stack use are missing. We probably
> should have such :)
>
> Happy hacking,
> M
>
> >>
> >>
> >> Hope this helps,
> >> M
> >>
> >
> > Thanks for the help
> >
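
For what it's worth, the stack check walked through in the quoted session (compute the stack base from t_stacktop and t_stacksize, then see how much of the 0xdeadbeef fill survives) can be sketched offline in Python. This is only an illustration, not an existing Mynewt or gdb script: the helper names stack_base and unused_words are made up, and it assumes sizeof(os_stack_t) is 4 and that the stack bytes are little-endian.

```python
import struct

FILL_WORD = 0xDEADBEEF  # fill pattern seen in the x/32x dump
WORD_SIZE = 4           # assumption: sizeof(os_stack_t) == 4


def stack_base(stacktop, stacksize_words, word_size=WORD_SIZE):
    """Lowest address of a task stack: t_stacktop - sizeof(os_stack_t) * t_stacksize."""
    return stacktop - stacksize_words * word_size


def unused_words(stack_bytes):
    """Count words, starting at the lowest address, that still hold the fill pattern.

    stack_bytes is the raw stack memory, lowest address first, assumed
    little-endian and a whole number of 32-bit words long.
    """
    count = 0
    for (word,) in struct.iter_unpack("<I", stack_bytes):
        if word != FILL_WORD:
            break
        count += 1
    return count


# Same arithmetic as "x/x 0xae7d4-16384*4" for the idle task.
print(hex(stack_base(0xAE7D4, 16384)))  # 0x9e7d4

# Made-up 64-word stack where only the top 10 words were ever written.
sample = struct.pack("<I", FILL_WORD) * 54 + struct.pack("<10I", *range(1, 11))
free = unused_words(sample)
print(free, "of", len(sample) // WORD_SIZE, "words never touched")
```

Run against the bytes of a real task stack saved from the debugger, a count near zero would point at the task that is running out of room.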
