Hi,

I took a quick look through the code and it seems that your FreeRTOS port
uses basepri for critical sections. The problem here is that the radio IRQ
has priority 0, so it cannot be blocked by basepri; that's one reason why
NPL uses primask for its own macros. One function that can be affected in
particular is npl_freertos_eventq_remove(), which uses FreeRTOS calls to
enter a critical section. In some corner case it can therefore be called
from both ISR and task context, messing up the queue contents. Other calls
may have the same problem due to basepri usage in general. That could
explain problems which appear only in certain build configurations, since
they are triggered by timing.
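
To illustrate the masking rules behind this, here is a minimal host-side
sketch in C. It is a simplified model, not NimBLE or FreeRTOS code: the
struct, the function names and the basepri threshold of 2 are all
assumptions made for illustration, and the model ignores the priority of
whatever is currently executing. On Cortex-M, a non-zero BASEPRI masks only
interrupts whose priority value is numerically equal or greater, while
PRIMASK masks every interrupt with configurable priority, so a priority 0
radio IRQ passes straight through a basepri-based critical section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of Cortex-M interrupt masking (hypothetical names).
 * Priorities are logical levels: 0 is the most urgent. */
struct cpu_mask_state {
    bool primask;    /* true: all configurable-priority IRQs are masked */
    uint8_t basepri; /* 0: no masking; N > 0: masks priorities >= N */
};

/* Assumed example threshold for a basepri-based critical section. */
#define ASSUMED_BASEPRI_THRESHOLD 2

/* Returns true if an IRQ with priority irq_prio can still fire. */
static bool irq_can_preempt(const struct cpu_mask_state *cpu, uint8_t irq_prio)
{
    assert(cpu != NULL);
    if (cpu->primask) {
        return false; /* primask blocks everything, including priority 0 */
    }
    if (cpu->basepri != 0 && irq_prio >= cpu->basepri) {
        return false; /* basepri blocks only equal or less urgent IRQs */
    }
    return true;
}

/* Mask state inside a basepri-style critical section. */
static struct cpu_mask_state enter_basepri_critical(void)
{
    struct cpu_mask_state cpu = {
        .primask = false,
        .basepri = ASSUMED_BASEPRI_THRESHOLD,
    };
    return cpu;
}

/* Mask state inside a primask-style critical section. */
static struct cpu_mask_state enter_primask_critical(void)
{
    struct cpu_mask_state cpu = { .primask = true, .basepri = 0 };
    return cpu;
}
```

Under these assumptions, irq_can_preempt() reports that a priority 0 IRQ can
still fire in the basepri state but not in the primask state. That is the
window in which the radio ISR could touch the event queue while a task is in
the middle of modifying it.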

Best,
Andrzej



On Wed, Jun 17, 2020 at 9:05 PM <j...@codingfield.com> wrote:

> Hi,
>
> I use the init functions found in porting/nimble and
> porting/npl/freertos. There are functions to handle semaphores, queues,
> timers,... Is there any issue with this freertos port, if it's not
> official? I did not write any code related to the radio or irq as it
> seems to be handled by these functions.
>
> Here is the function I call in my main() to init nimble:
>
> void nimble_port_init(void) {
>    void os_msys_init(void);
>    void ble_store_ram_init(void);
>    ble_npl_eventq_init(&g_eventq_dflt);
>    os_msys_init();
>    ble_hs_init();
>    ble_store_ram_init();
>
>    int res;
>    res = hal_timer_init(5, NULL);
>    ASSERT(res == 0);
>    res = os_cputime_init(32768);
>    ASSERT(res == 0);
>    ble_ll_init();
>    ble_hci_ram_init();
>    nimble_port_freertos_init(BleHost);
> }
>
> And BleHost():
>
> void BleHost(void *) {
>    struct ble_npl_event *ev;
>
>    while (1) {
>      ev = ble_npl_eventq_get(&g_eventq_dflt, BLE_NPL_TIME_FOREVER);
>      ble_npl_event_run(ev);
>    }
> }
>
> If you want to have a look at the complete code, it's hosted on GitHub.
> main() is in src/main.cpp, and the nimble code is in
> src/libs/mynewt-nimble:
> https://github.com/JF002/Pinetime
>
> Is there anything specific I should check?
>
> Thanks,
>
> JF
>
>
> On 17/06/2020 10:35, Andrzej Kaczmarek wrote:
> > Hi,
> >
> > How did you set up interrupts on FreeRTOS (e.g. radio) and critical
> > sections?
> > These kinds of problems are common if you have misconfigured locking,
> > interrupt priorities or something similar. Since we do not have any
> > official port for FreeRTOS (there's only NPL, but it does not deal with
> > hardware), I assume you did this on your own, so it would be good to
> > know more details.
> >
> > Best,
> > Andrzej
> >
> >
> > On Tue, Jun 16, 2020 at 8:29 PM <j...@codingfield.com> wrote:
> >
> >> Hi,
> >>
> >> I was able to do more tests using BLE_MONITOR_RTT and btmon.
> >> Here are 3 captures from 2 phones (Nexus 5 and Huawei). The one from
> >> the Nexus contains 3 successful connections followed by a failed one.
> >> Both from the Huawei contain only 1 failed attempt. For each test,
> >> you'll find the output of btmon and the capture from Wireshark.
> >> The files are available here:
> >> https://seafile.codingfield.com/d/3e17cb3f43da4eedaefd/
> >>
> >> Note that I observed that the connection is more often successful when
> >> BLE_MONITOR_RTT is enabled than when it is disabled. Remember it was
> >> the same with Debug/Release: there are more successful connections in
> >> Debug than in Release.
> >> It makes me think of a timing issue, but it's weird that it works
> >> better when the code is supposed to run slower...
> >>
> >> I hope you'll be able to make sense of these captures!
> >>
> >> Thanks for your help!
> >>
> >> JF
> >>
> >>
> >> On 15/06/2020 22:54, j...@codingfield.com wrote:
> >> > Hi,
> >> >
> >> > Sorry, I do not mean to spam this mailing list. I just wanted to say I
> >> > misread the article, and managed to use the monitor over RTT. I built
> >> > the tool rtt2pty, ran it and then launched "btmon -J nrf52" and I see
> >> > very detailed logs about BLE communication!
> >> >
> >> > It's getting late, I'll analyze them tomorrow evening.
> >> >
> >> > Thanks for your help, and sorry for the spam :)
> >> >
> >> > JF
> >> >
> >> > On 15/06/2020 22:25, j...@codingfield.com wrote:
> >> >> Hi again,
> >> >>
> >> >> Ok, I had a look at BLE_MONITOR_RTT, but I can't figure out how to
> >> >> enable it in my code (which does not use mynewt as RTOS): at first,
> >> >> the code did not compile. I tried to fix the compilation errors (for
> >> >> example, I had to define struct File_methods and struct File).
> >> >> Then, I built the code, but couldn't see any messages coming out of
> >> >> JLinkRTTClient.
> >> >>
> >> >> Is it supposed to work with JLinkRTTClient, or do I need to use
> >> >> btshell running on another nrf board? (I do not have a nrf52840 on
> >> >> hand...).
> >> >>
> >> >> By the way, I already use the RTT for logging, and managed to enable
> >> >> some loggers in nimble. Are the same messages available using the
> >> >> logger, or do I really need bt_monitor?
> >> >>
> >> >> Thx!
> >> >>
> >> >> JF
> >> >>
> >> >> On 15/06/2020 20:38, j...@codingfield.com wrote:
> >> >>> Hi,
> >> >>>
> >> >>> The ACKs you're talking about, are they the "Rcvd Read" entries
> >> >>> listed in the capture?
> >> >>>
> >> >>> If BLE_MONITOR_RTT means that nimble supports JLink RTT, then yes, I
> >> >>> have a console available. I'll try to enable it. I'll also have a
> >> >>> look at the stats and how to use them.
> >> >>>
> >> >>> Thanks for the pointers, I'll report to you as soon as I have more
> >> >>> information!
> >> >>>
> >> >>> JF
> >> >>>
> >> >>> On 14/06/2020 23:18, Łukasz Rymanowski wrote:
> >> >>>> Hi,
> >> >>>>
> >> >>>> so indeed your logs show that after trying to read included
> >> >>>> services from the GATT service, something bad happened. The
> >> >>>> controller was able to ACK the 2 following master packets, then it
> >> >>>> stopped sending ACKs.
> >> >>>> Do you have a console available? If so, does it show anything
> >> >>>> interesting?
> >> >>>>
> >> >>>> For further debugging I suggest looking at BLE_MONITOR_RTT to grab
> >> >>>> HCI logs
> >> >>>> (https://www.codecoup.pl/blog/support-for-btmon-in-mynewt/).
> >> >>>> With this we will know if the response is still in the host or is
> >> >>>> in the controller.
> >> >>>>
> >> >>>> Then we could check some stats, e.g. ble_ll_stats - maybe that
> >> >>>> could show us something interesting - but first let us see where
> >> >>>> it got stuck.
> >> >>>>
> >> >>>> Best
> >> >>>> Łukasz
> >> >>>>
> >> >>>> On Sun, 14 Jun 2020 at 21:06, <j...@codingfield.com> wrote:
> >> >>>>
> >> >>>>> Hi,
> >> >>>>>
> >> >>>>> It looks like the attachments were dropped somewhere... You can
> >> >>>>> download them on my seafile instance:
> >> >>>>> https://seafile.codingfield.com/d/64d53a6ca6b44ae6b5d7/
> >> >>>>>
> >> >>>>> It seems to fail during the discovery of the characteristics on
> >> >>>>> the device. From what I understand, it happens during the
> >> >>>>> discovery of Generic Access Profile and Generic Attribute Profile,
> >> >>>>> which are handled by nimble.
> >> >>>>> I may be wrong, I'm a beginner with BLE and nimble!
> >> >>>>> Maybe the captures will make more sense to you?
> >> >>>>>
> >> >>>>> Yeah, it's very strange that it works when the code is supposed to
> >> >>>>> run slower... It makes me think of a timing or memory issue, but I
> >> >>>>> couldn't find it...
> >> >>>>>
> >> >>>>> Thanks,
> >> >>>>> JF
> >> >>>>>
> >> >>>>> On 14/06/2020 20:46, Łukasz Rymanowski wrote:
> >> >>>>> > Hi,
> >> >>>>> >
> >> >>>>> > Thanks for the report, however it looks like the attachments
> >> >>>>> > are missing.
> >> >>>>> >
> >> >>>>> > First of all you need to understand which read fails, i.e. check
> >> >>>>> > the handle and make sure that your application responds
> >> >>>>> > correctly.
> >> >>>>> > It is very suspicious that the read works with -Og, but still I
> >> >>>>> > would focus on that failing read first.
> >> >>>>> >
> >> >>>>> > Thanks
> >> >>>>> > Lukasz
> >> >>>>> >
> >> >>>>> >
> >> >>>>> > On Sun, Jun 14, 2020, 20:25 <j...@codingfield.com> wrote:
> >> >>>>> >
> >> >>>>> >> Hello,
> >> >>>>> >>
> >> >>>>> >> I'm working on a firmware for the Pinetime, a smartwatch based
> >> >>>>> >> on the NRF52832. The code is written in C/C++ and uses FreeRTOS.
> >> >>>>> >> I've recently switched from the Nordic Softdevice to NimBLE as
> >> >>>>> >> BLE stack. I used the freertos port from the 'porting' folder
> >> >>>>> >> of the source code of nimble.
> >> >>>>> >>
> >> >>>>> >> I did many tests using my PC on Linux: it can connect and
> >> >>>>> >> communicate with the NRF52 without issue.
> >> >>>>> >> However, things are not that easy when I try to connect using
> >> >>>>> >> an Android phone (I tried with a Huawei Psomething and my old
> >> >>>>> >> Nexus5): the connection fails most of the time.
> >> >>>>> >>
> >> >>>>> >> I did a lot of debugging, logging, sniffing... and I still
> >> >>>>> >> cannot understand why it's not working as expected. Here are
> >> >>>>> >> some of my observations:
> >> >>>>> >>
> >> >>>>> >>   - When Android successfully connects, I receive the following
> >> >>>>> >>     GAP events: BLE_GAP_EVENT_CONNECT and then
> >> >>>>> >>     BLE_GAP_EVENT_CONN_UPDATE 2 times. The first update sets the
> >> >>>>> >>     connection interval to 6, the second one to 40.
> >> >>>>> >>   - When it fails, I receive only the BLE_GAP_EVENT_CONNECT
> >> >>>>> >>     event and the 1st BLE_GAP_EVENT_CONN_UPDATE.
> >> >>>>> >>   - Using the sniffer, I noticed that the last packet is a read
> >> >>>>> >>     request from the phone. It looks like the NRF52 never
> >> >>>>> >>     responds to this last read.
> >> >>>>> >>   - It fails in the discovery steps (when the Android phone
> >> >>>>> >>     discovers all the services, characteristics and attributes)
> >> >>>>> >>     but not always at the same place.
> >> >>>>> >>   - I noticed that the tasks (ll task and host task) are not in
> >> >>>>> >>     deadlock BUT it looks like the radio ISR (ble_phy_isr() in
> >> >>>>> >>     ble_phy.c) is not called anymore.
> >> >>>>> >>   - When I build the very same code in DEBUG (-Og instead of
> >> >>>>> >>     -O3), it works perfectly!
> >> >>>>> >>
> >> >>>>> >> You'll find attached 2 captures I did with Wireshark and the
> >> >>>>> >> NRF Sniffer (running on a NRF52-DK), one failed and one
> >> >>>>> >> successful attempt to connect.
> >> >>>>> >>
> >> >>>>> >> I'm running out of ideas to debug this further. Is there a
> >> >>>>> >> configuration issue (there are so many parameters in syscfg.h)?
> >> >>>>> >> What could I try? Where should I search? Do you need more info
> >> >>>>> >> to understand the issue?
> >> >>>>> >>
> >> >>>>> >> Could you help me fix this issue?
> >> >>>>> >>
> >> >>>>> >> Thanks,
> >> >>>>> >>
> >> >>>>> >> JF
> >> >>>>>
> >>
>
