Hello Jacob,

You can try increasing the supervision timeout in the BLE settings; that's
what I needed to do to get newtmgr working in Go.
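
My change was on the Go newtmgr side, but if you would rather fix it from
the firmware, a rough equivalent using the NimBLE host API would look
something like the sketch below. The function name and all the parameter
values here are illustrative placeholders, not the exact ones I used:

#include <stdint.h>
#include "host/ble_gap.h"

/* Illustrative sketch only: ask the central for a longer supervision
 * timeout once the connection is up.  bump_supervision_timeout and
 * every value below are placeholders. */
static int
bump_supervision_timeout(uint16_t conn_handle)
{
    struct ble_gap_upd_params params = {
        .itvl_min = 24,              /* 24 * 1.25 ms = 30 ms */
        .itvl_max = 40,              /* 40 * 1.25 ms = 50 ms */
        .latency = 0,
        .supervision_timeout = 400,  /* 400 * 10 ms = 4 s */
        .min_ce_len = 0,
        .max_ce_len = 0,
    };

    return ble_gap_update_params(conn_handle, &params);
}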

Regards,
Vipul Rahane

> On Apr 20, 2017, at 11:16 AM, Jacob Rosenthal <[email protected]> wrote:
> 
> Indeed the disconnect is a result of the erase. If I comment that out, I
> can get to a stack overflow.
> 
> The newt tool uses 56 for the first packet and 64 after; I'm not sure why
> yet, but let's just say I hardcode 56 in my node tool.
> 
> bleprph and blesplit have
> OS_MAIN_STACK_SIZE: 428
> 
> Oddly enough, it has to be only 8 more to work: OS_MAIN_STACK_SIZE: 436,
> though it could probably use more headroom than that.
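> 
> For reference, the change goes in the target's (or app's) syscfg.yml,
> something like this; the right value is still a guess:
> 
> syscfg.vals:
>     OS_MAIN_STACK_SIZE: 512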
> 
> Thoughts on what to do about flash erase disconnecting?
> 
> 
> On Thu, Apr 20, 2017 at 10:39 AM, Alan Graves <[email protected]>
> wrote:
> 
>> Reminds me of that old Wendy's TV commercial:
>> "Where's the deadbeef?"
>> 
>> Alan
>> 
>> -----Original Message-----
>> From: marko kiiskila [mailto:[email protected]]
>> Sent: Wednesday, April 19, 2017 5:00 PM
>> To: [email protected]
>> Subject: Re: newtmgr image upload nrf51dk disconnects with reason=8
>> 
>> 
>>> On Apr 19, 2017, at 4:33 PM, Jacob Rosenthal <[email protected]>
>> wrote:
>>> 
>>> On Wed, Apr 19, 2017 at 11:19 AM, marko kiiskila <[email protected]>
>> wrote:
>>> 
>>>> Just general comments; I hope I'm not saying things that are too
>>>> obvious.
>>>> 
>>> More specific would be even better :) I don't think my gdb skills are up to par
>>> 
>>>> Either g_os_run_list or one of the task structures is getting smashed.
>>>> As you know your tasks beforehand, you can walk through them manually
>>>> to figure out which one it is.
>>>> 
>>> How do I know the tasks beforehand? I would guess something in
>>> imgr_upload is corrupting it? So should I print them as that function
>>> starts and ends? How do I walk through them manually?
>> 
>> You could do this, for example:
>> 
>> (gdb) source repos/apache-mynewt-core/compiler/gdbmacros/os.gdb
>> (gdb) os_tasks
>> prio state      stack  stksz       task name
>> * 255   0x1    0xae7d4  16384    0x9e780 idle
>>  127   0x2    0x9b128   5376    0x95cd8 main
>>    0   0x2    0x95a2c  16384    0x859dc uartpoll
>>    2   0x2    0xb4338   4096    0x9d1d0 socket
>>    9   0x2    0x85908   4096    0x818a8 ble_hs
>> 
>> This was from a native build target that I happened to have a debugger
>> attached to, but you would get the same type of data from actual
>> hardware targets as well.
>> 
>> The pointer to the os_task structure is in the 'task' column. Here I'm
>> picking the idle task for closer inspection:
>> 
>> (gdb) set print pretty
>> (gdb) p *(struct os_task *)0x9e780
>> $3 = {
>>  t_stackptr = 0xae63c <g_idle_task_stack+65128>,
>>  t_stacktop = 0xae7d4 <g_os_idle_ctr>,
>>  t_stacksize = 16384,
>>  t_taskid = 0 '\000',
>>  t_prio = 255 '\377',
>>  t_state = 1 '\001',
>>  t_flags = 0 '\000',
>>  t_lockcnt = 0 '\000',
>>  t_pad = 0 '\000',
>>  t_name = 0x6b8d8 "idle",
>>  t_func = 0x192f0 <os_idle_task>,
>>  t_arg = 0x0,
>>  t_obj = 0x0,
>>  t_sanity_check = {
>>    sc_checkin_last = 0,
>>    sc_checkin_itvl = 0,
>>    sc_func = 0x0,
>>    sc_arg = 0x0,
>>    sc_next = {
>>      sle_next = 0x0
>>    }
>>  },
>>  t_next_wakeup = 0,
>>  t_run_time = 52837,
>>  t_ctx_sw_cnt = 50124,
>>  t_os_task_list = {
>>    stqe_next = 0x95cd8 <os_main_task>
>>  },
>>  t_os_list = {
>>    tqe_next = 0x0,
>>    tqe_prev = 0x8143c <g_os_run_list>
>>  },
>>  t_obj_list = {
>>    sle_next = 0x0
>>  }
>> }
>> 
>> And then I'll compute where the task stack starts: t_stacktop -
>> sizeof(os_stack_t) * t_stacksize.
>> 
>> (gdb) x/x 0xae7d4-16384*4
>> 0x9e7d4 <g_idle_task_stack>:    0xdeadbeef
>> 
>> So that's where the stack starts. Then I'll inspect that end of the
>> stack to see if it still has the fill pattern 0xdeadbeef:
>> 
>> (gdb) x/32x 0x9e7d4
>> 0x9e7d4 <g_idle_task_stack>:    0xdeadbeef  0xdeadbeef  0xdeadbeef  0xdeadbeef
>> 0x9e7e4 <g_idle_task_stack+16>: 0xdeadbeef  0xdeadbeef  0xdeadbeef  0xdeadbeef
>> 0x9e7f4 <g_idle_task_stack+32>: 0xdeadbeef  0xdeadbeef  0xdeadbeef  0xdeadbeef
>> 0x9e804 <g_idle_task_stack+48>: 0xdeadbeef  0xdeadbeef  0xdeadbeef  0xdeadbeef
>> 0x9e814 <g_idle_task_stack+64>: 0xdeadbeef  0xdeadbeef  0xdeadbeef  0xdeadbeef
>> 
>> So this stack has not been used completely.
>> 
>>> 
>>>> 
>>>> Usually the culprit is a stack overflow, so once you find out which
>>>> task structure is being corrupted, look for the stack just after it
>>>> in memory.
>>>> 
>>>> nm output piped to sort is your friend in locating that stack.
>>>> 
>>> nm output?
>> 
>> [pi@raspberrypi:~/src/incubator-mynewt-blinky]$ arm-linux-gnueabihf-nm bin/targets/bleprph_oic_linux/app/apps/bleprph_oic/bleprph_oic.elf | sort | more
>> 
>> I.e., get the symbols from my elf file and sort them by address. Then
>> let's see what the idle stack would overwrite first, if it were not big
>> enough:
>> 
>> ...
>> 0009e780 B g_idle_task
>> 0009e7d0 B g_os_started
>> 0009e7d4 B g_idle_task_stack
>> 
>> Looks like an idle stack overflow would most likely corrupt those two
>> items first. And if it corrupts the task structure, it's game over.
>> 
>> BTW, gdb scripts for checking task stack usage are missing. We probably
>> should have some :)
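>> 
>> Something along these lines might be a starting point: an untested
>> sketch that counts the 0xdeadbeef fill words still left at the low end
>> of a task's stack (task_stack_free is a made-up name):
>> 
>> define task_stack_free
>>   set $t = (struct os_task *)$arg0
>>   # bottom of the stack: t_stacksize os_stack_t words below t_stacktop
>>   set $bottom = $t->t_stacktop - $t->t_stacksize
>>   set $free = 0
>>   while $free < $t->t_stacksize && $bottom[$free] == 0xdeadbeef
>>     set $free = $free + 1
>>   end
>>   printf "%s: %d of %d stack words never used\n", $t->t_name, $free, $t->t_stacksize
>> end
>> 
>> You would invoke it with a task pointer from os_tasks, e.g.
>> task_stack_free 0x9e780.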
>> 
>> Happy hacking,
>> M
>> 
>>>> Hope this helps,
>>>> M
>>> 
>>> Thanks for the help
>> 
>> 
