Hi Gregory, thank you for the quick response! RE: 'lockcount' needing to be in a critical section: That's good information to know, I wasn't aware before. I'm searching nuttx for exact matches of the string 'sched_lock();' to find occurrences and seeing this is not always done. Yeah a critical section (up_irq_save <https://github.com/apache/nuttx/blob/master/arch/arm/include/armv7-m/irq.h#L405>), at least on arm arch, calls get_base_pri() <https://github.com/apache/nuttx/blob/master/arch/arm/include/armv7-m/irq.h#L317-L326>, which does have a memory barrier so that memory will be coherent throughout the entire system.
> All places where the lockcount is *incremented* should be within a critical section Emphasis mine, I believe all places where lockcount is written OR read require atomicity and the most up-to-date value in memory with respect to another, so we'd need a critical section in those cases too. I like the idea of adding a DEBUGASSERT, my only fear is that others users such as myself have used 'sched_lock()' in their code and will be met with a runtime crash when they pull in these changes. > If we are in a critical section, then other threads > will not run at least until the thread blocks or terminates (even in SMP > mode). Yeah, I agree now, the critical section should solve this since it has the barrier i mentioned above. On a similar note, I believe all mutex and semaphore functions should have a critical section as well. E.g., when nxmutex_is_hold() <https://github.com/apache/nuttx/blob/master/libs/libc/misc/lib_mutex.c#L147> checks mutex->holder, it must absolutely have the latest up-to-date value to verify ownership, otherwise there is a bug. What are your thoughts on this? On Wed, Aug 28, 2024 at 5:28 PM Gregory Nutt <spudan...@gmail.com> wrote: > > On 8/28/2024 6:14 PM, Daniel Appiagyei wrote: > > Hi all, > > Looking at sched_lock.c ( > > > https://github.com/apache/nuttx/blob/master/sched/sched/sched_lock.c#L187 > ), > > , specifically at the non-SMP sched_lock() implementation, how do we > ensure > > that: > > 1. There's no data-race when incrementing the lock count: > > 'rtcb->lockcount++;', (given that no mutex or critical section is > > surrounding that code so other tasks or IRQs are free to interrupt the > > assembly instructions), and > > All places where the lockcount is incremented should be within a > critical section. I believe that is true. But if you find place where > that is not true, that would indeed be a bug. > > Perhaps a DEBUGASSERTion that we are in a critical section before > incrementing lockcount would be a good verification of this belief. > > > 2. The scheduler is 100% prevented from allowing a different task to run > > after lock_count is incremented given that we don't use a data memory > > barrier or data synchronization barrier to ensure the write to > 'lock_count' > > is visible to the rest of the system in memory? > Adding a barrier might be a good idea, although I have never seen this > to be an issue. If we are in a critical section, then other threads > will not run at least until the thread blocks or terminates (even in SMP > mode). > > For 1, see cppreference 'Threads and data races' on data races being > > undefined behavior ( > https://en.cppreference.com/w/c/language/memory_model). > > > > Best, > > Daniel -- *Daniel Appiagyei | Embedded Software Engineer *Email: daniel.appiag...@braincorp.com <bog...@braincorporation.com>*Brain* * Corp™ *10182 Telesis Ct, Suite 100 San Diego, CA 92121 (858)-689-7600 www.braincorp.com