Re: Reducing GATT write attribute's timeout and read attribute's BLE_HS_ENOMEM

Lukasz Wolnik Sun, 05 Aug 2018 18:04:04 -0700

Hi Chris,

I have resolved the issue. It wasn't my mbuf structure but a memory leak
in the MSYS_1 pool (caused by my app).


After verifying that the same actions recreated in btshell result in no
MSYS_1 memory leak (BTW, SHELL_TASK and its mpool stat is a great tool!) I
turned my attention back to my code. Ha!

A couple of hours later I found the culprit below in my ble_gattc_read
callback.



static int cb_on_read(..., ..., struct ble_gatt_attr *attr, ...)
{
        // To *save* time I copy-pasted and inlined the while loop below
        // from the print_mbuf function.
        while (attr->om != NULL)
        {
            strncpy(p_message, attr->om->om_data, attr->om->om_len);

            p_message += attr->om->om_len;
            attr->om = SLIST_NEXT(attr->om, om_next); // GUY
        }

        // To fix the above, simply use const struct os_mbuf *om = attr->om;
        // before the while loop and operate on a copy of the received
        // attribute's pointer.
        ...
}
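
For the archive, here is a minimal sketch of the fixed loop. The fake_mbuf
and fake_attr types and the copy_chain helper are hypothetical stand-ins so
that the snippet compiles without the Mynewt headers; they only mirror the
om_data, om_len and next-pointer fields used above. I also swapped strncpy
for a bounded memcpy, since the payload is not guaranteed to be a
NUL-terminated string. My understanding (an assumption, not verified against
the NimBLE source here) is that the stack frees whatever attr->om points to
after the callback returns, so leaving it at NULL means the chain never goes
back to the MSYS_1 pool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct os_mbuf and struct ble_gatt_attr.
 * Only the fields used by the loop are modelled. */
struct fake_mbuf {
    const unsigned char *om_data;   /* payload of this buffer */
    size_t om_len;                  /* number of payload bytes */
    struct fake_mbuf *om_next;      /* next buffer in the chain, or NULL */
};

struct fake_attr {
    struct fake_mbuf *om;           /* head of the chain; owned by the caller */
};

/* Copies the payload of the whole chain into dst, NUL-terminates it and
 * returns the number of payload bytes copied. Output is truncated to fit
 * dst_size. attr->om is never modified: the walk uses a local cursor. */
static size_t copy_chain(const struct fake_attr *attr, char *dst,
                         size_t dst_size)
{
    size_t used = 0;

    assert(attr != NULL && dst != NULL);
    if (dst_size == 0) {
        return 0;
    }

    for (const struct fake_mbuf *om = attr->om; om != NULL; om = om->om_next) {
        size_t n = om->om_len;
        size_t room = dst_size - 1 - used;  /* keep one byte for the NUL */

        if (n > room) {
            n = room;                       /* truncate rather than overflow */
        }
        memcpy(dst + used, om->om_data, n);
        used += n;
    }

    dst[used] = '\0';
    return used;
}
```

In the real callback the same idea would be a local const struct os_mbuf *om
cursor advanced with SLIST_NEXT(om, om_next), with attr->om left untouched.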


I finally have a stable Mynewt app <-> Android repeated communication even
on MSYS_1_BLOCK_COUNT = 8! So happy with it. Thanks again for the ride this
bug turned out to be.

Kind regards,
Łukasz

On Sun, Aug 5, 2018 at 8:22 PM Lukasz Wolnik <[email protected]>
wrote:

> Hi Chris,
>
> It's been more than a year but I have finally got some findings.
>
> In another Mynewt-powered device of mine I got the BLE_HS_ENOMEM error
> from ble_gattc_write_flat this time. I tried changing BLE_GATT_MAX_PROCS
> (down to 2 or up to 8) but with no effect.
>
> My device was keeping the connection alive but stopped receiving any
> communication from its peer. No notifications, reads, writes, etc. Luckily
> it was always happening on the 5th consecutive connection (each previous
> one terminated by my app using ble_gap_terminate(conn_handle,
> BLE_ERR_REM_USER_CONN_TERM)).
>
> By changing MSYS_1_BLOCK_COUNT I started to get BLE_HS_ENOMEM on a
> different connection for each value:
>
> 8 - 4th connection
> 12 - 5th connection (default value)
> 16 - 6th connection
>
> Yay! This thing is not random at all.
>
> I assume it can't be NimBLE that's not freeing its resources up so I'm
> going to look at my dynamically allocated structure that
> uses os_memblock_get functions. I'm going to read up on Mynewt mbufs.
>
> Thank you very much for pointing me in the right direction. I regained
> my faith in the BLE stack. No more panic attacks while pitching my device
> and getting to the demo part :)
>
> Kind regards,
> Łukasz
>
> On Mon, May 15, 2017 at 9:09 PM Łukasz Wolnik <[email protected]>
> wrote:
>
>> Hi Szymon,
>>
>> Thanks for the clarification. I was going to write a queue system for
>> GATT writes/reads but it looks like I should rely on the lower layer and
>> just reconnect if my central app has a problem communicating with
>> peripheral devices.
>>
>> Kind regards,
>> Łukasz
>>
>> On Mon, May 15, 2017 at 9:07 PM, Łukasz Wolnik <[email protected]>
>> wrote:
>>
>>> Hi Chris,
>>>
>>> Thanks a lot for your responses. They are very helpful and are radically
>>> shaping the way I'm going to develop the second version of my app.
>>>
>>> I'm pretty sure, when I investigated the issue with gdb, that the
>>> problem was not enough GATT procs available. I'll experiment with
>>> MSYS_1_BLOCK_COUNT and BLE_GATT_MAX_PROCS and let you know if increasing
>>> BLE_GATT_MAX_PROCS helped. Thanks a lot for sharing these two config
>>> values. And yes, it'd be great if alongside the error it would tell which
>>> resource is not available and what the current limits are.
>>>
>>> Right, so it's a 30 not a 20-second timeout. My app is a wearable item
>>> and it's crucial for it to be robust. I think what I can do is to manually
>>> disconnect a connection handle when I'm not getting a confirmation within 1
>>> second.
>>>
>>> Kind regards,
>>> Łukasz
>>>
>>> On Mon, May 15, 2017 at 7:06 PM, Christopher Collins <[email protected]>
>>> wrote:
>>>
>>>> On Mon, May 15, 2017 at 11:01:38AM -0700, Christopher Collins wrote:
>>>> > Hi Łukasz,
>>>> >
>>>> > On Mon, May 15, 2017 at 12:33:59PM +0100, Łukasz Wolnik wrote:
>>>> > > Hello,
>>>> > >
>>>> > > From time to time my ble_gattc_write_flat (run as central) is
>>>> > > timing out after 20 seconds while sending to an Android 6 phone (in
>>>> > > peripheral mode). Is there a way to reduce the timeout to just 1
>>>> > > second? At the moment if there's an issue with writing, my newt
>>>> > > program has to wait 20 seconds until it can respond to a timeout.
>>>> > >
>>>> > > What's the best strategy here? Keep "bombarding" the peripheral
>>>> > > with multiple writes until receiving the first confirmation? Or
>>>> > > reduce the timeout from 20 seconds (I don't know where this value
>>>> > > is coming from) and resend only when I get an HCI 19 timeout error
>>>> > > in the callback?
>>>>
>>>> Oops, I forgot to respond to your actual question :).  Sorry about that.
>>>> The 30-second timeout is hardcoded in the spec, and is currently not
>>>> configurable (Vol. 3, Part F, 3.3.3).  It might be useful to make this
>>>> configurable, but the device would not be standards compliant.
>>>>
>>>> Chris
>>>>
>>>
>>>
>>
