Hi Gregory, thank you for the quick response!

RE: 'lockcount' needing to be in a critical section:
That's good information to know, I wasn't aware before. I'm searching nuttx
for exact matches of the string 'sched_lock();' to find occurrences and
seeing this is not always done. Yeah a critical section (up_irq_save
<https://github.com/apache/nuttx/blob/master/arch/arm/include/armv7-m/irq.h#L405>),
at least on arm arch, calls get_base_pri()
<https://github.com/apache/nuttx/blob/master/arch/arm/include/armv7-m/irq.h#L317-L326>,
which does have a memory barrier so that memory will be coherent throughout
the entire system.

> All places where the lockcount is *incremented* should be within a
critical section
Emphasis mine, I believe all places where lockcount is written OR read
require atomicity and the most up-to-date value in memory with respect to
another, so we'd need a critical section in those cases too.
I like the idea of adding a DEBUGASSERT, my only fear is that others users
such as myself have used 'sched_lock()' in their code and will be met with
a runtime crash when they pull in these changes.

> If we are in a critical section, then other threads
> will not run at least until the thread blocks or terminates (even in SMP
> mode).

Yeah, I agree now, the critical section should solve this since it has the
barrier i mentioned above.

On a similar note, I believe all mutex and semaphore functions should have
a critical section as well. E.g., when nxmutex_is_hold()
<https://github.com/apache/nuttx/blob/master/libs/libc/misc/lib_mutex.c#L147>
checks mutex->holder, it must absolutely have the latest up-to-date value
to verify ownership, otherwise there is a bug. What are your thoughts on
this?

On Wed, Aug 28, 2024 at 5:28 PM Gregory Nutt <spudan...@gmail.com> wrote:

>
> On 8/28/2024 6:14 PM, Daniel Appiagyei wrote:
> > Hi all,
> > Looking at sched_lock.c (
> >
> https://github.com/apache/nuttx/blob/master/sched/sched/sched_lock.c#L187
> ),
> > , specifically at the non-SMP sched_lock() implementation, how do we
> ensure
> > that:
> > 1. There's no data-race when incrementing the lock count:
> > 'rtcb->lockcount++;', (given that no mutex or critical section is
> > surrounding that code so other tasks or IRQs are free to interrupt the
> > assembly instructions), and
>
> All places where the lockcount is incremented should be within a
> critical section.   I believe that is true.  But if you find place where
> that is not true, that would indeed be a bug.
>
> Perhaps a DEBUGASSERTion that we are in a critical section before
> incrementing lockcount would be a good verification of this belief.
>
> > 2. The scheduler is 100% prevented from allowing a different task to run
> > after lock_count is incremented given that we don't use a data memory
> > barrier or data synchronization barrier to ensure the write to
> 'lock_count'
> > is visible to the rest of the system in memory?
> Adding a barrier might be a good idea, although I have never seen this
> to be an issue.  If we are in a critical section, then other threads
> will not run at least until the thread blocks or terminates (even in SMP
> mode).
> > For 1, see cppreference 'Threads and data races' on data races being
> > undefined behavior (
> https://en.cppreference.com/w/c/language/memory_model).
> >
> > Best,
> > Daniel



-- 


*Daniel Appiagyei | Embedded Software Engineer *Email:
daniel.appiag...@braincorp.com
<bog...@braincorporation.com>*Brain*
* Corp™ *10182 Telesis Ct, Suite 100
San Diego, CA 92121

(858)-689-7600
www.braincorp.com

Reply via email to