This file: https://github.com/apache/nuttx/commit/9de9f8168d6de8eab8d3a97aac21aacc4e84dd84#diff-ac1d3cade3d9ab380d40fe31f8c006b3035c7b53d04c28e23d9f1ce9a176572c
On Tue, May 6, 2025 at 1:18 PM Alan C. Assis <acas...@gmail.com> wrote: > Hi Serg, > > I did an analysis here and this macro was introduced in Jan 10 2024, so it > was more than 1 year ago: > > > https://github.com/apache/nuttx/commit/9de9f8168d6de8eab8d3a97aac21aacc4e84dd84 > > BR, > > Alan > > On Tue, May 6, 2025 at 1:12 PM Alan C. Assis <acas...@gmail.com> wrote: > >> WOW! Nice catch! >> >> Question to some people with more experience in the scheduler: >> >> Why wasn't this issue detected before? It was added more than 9 months >> ago. >> >> Is there some way to test and enforce that nxsched_process_delivered() >> and other schedule functions are working as expected? >> >> BR, >> >> Alan >> >> On Tue, May 6, 2025 at 11:14 AM Serg Podtynnyi <s...@podtynnyi.com.invalid> >> wrote: >> >>> Hi, I am still researching the sched problem on my config >>> >>> Found this macro code from last September, looks like pre vs prev >>> typo, right? >>> >>> #definedq_insert_mid(pre,mid,next)\ >>> do\ >>> {\ >>> mid->flink =next;\ >>> mid->blink =prev;\ >>> pre->flink =mid;\ >>> next->blink =mid;\ >>> }\ >>> while(0) >>> >>> >>> >>> On 4/24/25 15:06, hujun260 wrote: >>> > The lockcount can be understood as a local variable, so no race >>> conditions will occur. >>> > 编辑 >>> > 分享 >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > At 2025-04-24 15:54:48, "Serg Podtynnyi"<s...@podtynnyi.com.INVALID> >>> wrote: >>> >> Hi, I tried my best to understand the logic behind unlock and >>> >> merge_pending routines and come up with this fix that works for me. >>> >> I split the lockcount handling in 2 parts when lockcount is > 1 just >>> >> decrement fast as always, >>> >> but when it's 1 we enter critical section and make it 0 inside >>> >> it(pre-emption is enabled). >>> >> I think there is a small time after decrement in the if statement and >>> >> enter_critical_section_wo_note() in original code where can race >>> happen. >>> >> Does this make sense? Sorry if it's a dumb question - scheduler stuff >>> >> with SMP is rocket-science level for me yet. >>> >> >>> >> diff --git a/sched/sched/sched_unlock.c b/sched/sched/sched_unlock.c >>> >> >>> >> index 4c7d681454..15fc39b546 100644 >>> >> --- a/sched/sched/sched_unlock.c >>> >> +++ b/sched/sched/sched_unlock.c >>> >> @@ -65,13 +65,14 @@ void sched_unlock(void) >>> >> >>> >> DEBUGASSERT(rtcb == NULL || rtcb->lockcount > 0); >>> >> >>> >> - /* Check if the lock counter has decremented to zero. If so, >>> >> - * then pre-emption has been re-enabled. >>> >> - */ >>> >> - >>> >> - if (rtcb != NULL && --rtcb->lockcount == 0) >>> >> + if (rtcb != NULL && rtcb->lockcount > 1) >>> >> + { >>> >> + rtcb->lockcount--; >>> >> + } >>> >> + else if (rtcb != NULL && rtcb->lockcount == 1) >>> >> { >>> >> irqstate_t flags = enter_critical_section_wo_note(); >>> >> + rtcb->lockcount = 0; >>> >> >>> >> /* Note that we no longer have pre-emption disabled. */ >>> >> >>> >> -- >>> >> 2.49.0 >>> >> On 4/22/25 14:32, hujun260 wrote: >>> >> >>> >> It seems to have nothing to do with the race condition, and since the >>> rtcb is this_task, it will only be modified by one CPU.<br/><br/>As for why >>> your modification can solve the problem, I haven't figured it out yet. Can >>> you further analyze the reasons for the failure to boot? >>> >> At 2025-04-22 13:25:13, "Serg Podtynnyi"<s...@podtynnyi.com.INVALID> >>> wrote: >>> >> >>> >>> Hi All, Gregory, Sebastien, >>> >>> >>> >>> I'm porting NuttX on rp2350(pico2 board inside ClockWork PicoCalc) >>> and >>> >>> having trouble getting SMP (2-core) config running >>> >>> >>> >>> without patching the sched lock/unlock routines. >>> >>> >>> >>> I am looking at the commit >>> >>> >>> https://github.com/apache/nuttx/commit/914ae532e68bf4ca75c758d6f541a1d2fcdce8c3 >>> >>> and reverting it helps to boot the system. >>> >>> >>> >>> This the the patch to make it work for me, I updated the current >>> >>> lock/unlock by moving the enter_critical_section above the access to >>> >>> rtcb->lockcount >>> >>> >>> >>> Do you think it could be the case of a race condition with >>> rtcb->lockcount ? >>> >>> >>> >>> PS >>> >>> >>> >>> It's almost 11 years since I last touched NuttX - it's a please to >>> work >>> >>> with it again. >>> >>> >>> >>> >>> >>> From 86fa0a65fdd37804154faf3db52e2826281cdfbd Mon Sep 17 00:00:00 >>> 2001 >>> >>> From: Serg Podtynnyi<s...@podtynnyi.com> >>> >>> Date: Tue, 22 Apr 2025 11:37:48 +0700 >>> >>> Subject: sched: update lock/unlock critical section entry >>> >>> >>> >>> Level up enter_critical_section in sched lock/unlock to cover >>> >>> rtcb->lockcount access >>> >>> >>> >>> Signed-off-by: Serg Podtynnyi<s...@podtynnyi.com> >>> >>> --- >>> >>> sched/sched/sched_lock.c | 4 ++-- >>> >>> sched/sched/sched_unlock.c | 4 ++-- >>> >>> 2 files changed, 4 insertions(+), 4 deletions(-) >>> >>> >>> >>> diff --git a/sched/sched/sched_lock.c b/sched/sched/sched_lock.c >>> >>> index d8f75e7c90..9cc8da3df4 100644 >>> >>> --- a/sched/sched/sched_lock.c >>> >>> +++ b/sched/sched/sched_lock.c >>> >>> @@ -82,18 +82,18 @@ void sched_lock(void) >>> >>> * operations on this thread (on any CPU) >>> >>> */ >>> >>> >>> >>> + irqstate_t flags = enter_critical_section_wo_note(); >>> >>> if (rtcb != NULL && rtcb->lockcount++ == 0) >>> >>> { >>> >>> #if (CONFIG_SCHED_CRITMONITOR_MAXTIME_PREEMPTION >= 0) || \ >>> >>> defined(CONFIG_SCHED_INSTRUMENTATION_PREEMPTION) >>> >>> - irqstate_t flags = enter_critical_section_wo_note(); >>> >>> >>> >>> /* Note that we have pre-emption locked */ >>> >>> >>> >>> nxsched_critmon_preemption(rtcb, true, >>> return_address(0)); >>> >>> sched_note_preemption(rtcb, true); >>> >>> - leave_critical_section_wo_note(flags); >>> >>> #endif >>> >>> } >>> >>> + leave_critical_section_wo_note(flags); >>> >>> } >>> >>> } >>> >>> diff --git a/sched/sched/sched_unlock.c b/sched/sched/sched_unlock.c >>> >>> index 4c7d681454..d21e007587 100644 >>> >>> --- a/sched/sched/sched_unlock.c >>> >>> +++ b/sched/sched/sched_unlock.c >>> >>> @@ -69,9 +69,9 @@ void sched_unlock(void) >>> >>> * then pre-emption has been re-enabled. >>> >>> */ >>> >>> >>> >>> + irqstate_t flags = enter_critical_section_wo_note(); >>> >>> if (rtcb != NULL && --rtcb->lockcount == 0) >>> >>> { >>> >>> - irqstate_t flags = enter_critical_section_wo_note(); >>> >>> >>> >>> /* Note that we no longer have pre-emption disabled. */ >>> >>> >>> >>> @@ -162,7 +162,7 @@ void sched_unlock(void) >>> >>> } >>> >>> #endif >>> >>> >>> >>> - leave_critical_section_wo_note(flags); >>> >>> } >>> >>> + leave_critical_section_wo_note(flags); >>> >>> } >>> >>> } >>> >>> -- >>> >>> 2.49.0 >>> >>> >>> >>> >>> >>> -- >>> >>> Serg Podtynnyi >>> >>> >>> >> -- Serg Podtynnyi >>> >>> -- >>> Serg Podtynnyi >>> >>>