Yes, I understood that, but what if CPU A and CPU B both enter the unlock routine with lockcount == 1 on the same task, and after the decrement they both take the "preemption re-enabled" path and manipulate the linked task list?
On April 24, 2025 8:06:36 AM UTC, hujun260 <hujun...@163.com> wrote:
>The lockcount can be understood as a local variable, so no race conditions
>will occur.
>
>At 2025-04-24 15:54:48, "Serg Podtynnyi" <s...@podtynnyi.com.INVALID> wrote:
>>Hi, I tried my best to understand the logic behind the unlock and
>>merge_pending routines and came up with this fix that works for me.
>>I split the lockcount handling into two parts: when lockcount is > 1,
>>just decrement fast as always; but when it is 1, we enter the critical
>>section and set it to 0 inside it (pre-emption is enabled).
>>I think there is a small window between the decrement in the if statement
>>and enter_critical_section_wo_note() in the original code where a race
>>can happen.
>>Does this make sense? Sorry if it's a dumb question - scheduler stuff
>>with SMP is rocket-science level for me yet.
>>
>>diff --git a/sched/sched/sched_unlock.c b/sched/sched/sched_unlock.c
>>index 4c7d681454..15fc39b546 100644
>>--- a/sched/sched/sched_unlock.c
>>+++ b/sched/sched/sched_unlock.c
>>@@ -65,13 +65,14 @@ void sched_unlock(void)
>>
>>       DEBUGASSERT(rtcb == NULL || rtcb->lockcount > 0);
>>
>>-      /* Check if the lock counter has decremented to zero.  If so,
>>-       * then pre-emption has been re-enabled.
>>-       */
>>-
>>-      if (rtcb != NULL && --rtcb->lockcount == 0)
>>+      if (rtcb != NULL && rtcb->lockcount > 1)
>>+        {
>>+          rtcb->lockcount--;
>>+        }
>>+      else if (rtcb != NULL && rtcb->lockcount == 1)
>>         {
>>           irqstate_t flags = enter_critical_section_wo_note();
>>+          rtcb->lockcount = 0;
>>
>>           /* Note that we no longer have pre-emption disabled. */
>>
>>--
>>2.49.0
>>
>>On 4/22/25 14:32, hujun260 wrote:
>>
>>It seems to have nothing to do with the race condition, and since the rtcb
>>is this_task, it will only be modified by one CPU.
>>
>>As for why your modification can solve the problem, I haven't figured it
>>out yet. Can you further analyze the reasons for the failure to boot?
>>At 2025-04-22 13:25:13, "Serg Podtynnyi" <s...@podtynnyi.com.INVALID> wrote:
>>
>>>Hi All, Gregory, Sebastien,
>>>
>>>I'm porting NuttX to the rp2350 (pico2 board inside the ClockWork
>>>PicoCalc) and having trouble getting the SMP (2-core) config running
>>>without patching the sched lock/unlock routines.
>>>
>>>I am looking at the commit
>>>https://github.com/apache/nuttx/commit/914ae532e68bf4ca75c758d6f541a1d2fcdce8c3
>>>and reverting it helps to boot the system.
>>>
>>>This is the patch that makes it work for me: I updated the current
>>>lock/unlock by moving the enter_critical_section above the access to
>>>rtcb->lockcount.
>>>
>>>Do you think it could be the case of a race condition with
>>>rtcb->lockcount?
>>>
>>>PS
>>>
>>>It's almost 11 years since I last touched NuttX - it's a pleasure to
>>>work with it again.
>>>
>>>
>>>From 86fa0a65fdd37804154faf3db52e2826281cdfbd Mon Sep 17 00:00:00 2001
>>>From: Serg Podtynnyi <s...@podtynnyi.com>
>>>Date: Tue, 22 Apr 2025 11:37:48 +0700
>>>Subject: sched: update lock/unlock critical section entry
>>>
>>>Level up enter_critical_section in sched lock/unlock to cover
>>>rtcb->lockcount access
>>>
>>>Signed-off-by: Serg Podtynnyi <s...@podtynnyi.com>
>>>---
>>> sched/sched/sched_lock.c   | 4 ++--
>>> sched/sched/sched_unlock.c | 4 ++--
>>> 2 files changed, 4 insertions(+), 4 deletions(-)
>>>
>>>diff --git a/sched/sched/sched_lock.c b/sched/sched/sched_lock.c
>>>index d8f75e7c90..9cc8da3df4 100644
>>>--- a/sched/sched/sched_lock.c
>>>+++ b/sched/sched/sched_lock.c
>>>@@ -82,18 +82,18 @@ void sched_lock(void)
>>>        * operations on this thread (on any CPU)
>>>        */
>>>
>>>+      irqstate_t flags = enter_critical_section_wo_note();
>>>       if (rtcb != NULL && rtcb->lockcount++ == 0)
>>>         {
>>> #if (CONFIG_SCHED_CRITMONITOR_MAXTIME_PREEMPTION >= 0) || \
>>>     defined(CONFIG_SCHED_INSTRUMENTATION_PREEMPTION)
>>>-          irqstate_t flags = enter_critical_section_wo_note();
>>>
>>>           /* Note that we have pre-emption locked */
>>>
>>>           nxsched_critmon_preemption(rtcb, true, return_address(0));
>>>           sched_note_preemption(rtcb, true);
>>>-          leave_critical_section_wo_note(flags);
>>> #endif
>>>         }
>>>+      leave_critical_section_wo_note(flags);
>>>     }
>>> }
>>>diff --git a/sched/sched/sched_unlock.c b/sched/sched/sched_unlock.c
>>>index 4c7d681454..d21e007587 100644
>>>--- a/sched/sched/sched_unlock.c
>>>+++ b/sched/sched/sched_unlock.c
>>>@@ -69,9 +69,9 @@ void sched_unlock(void)
>>>        * then pre-emption has been re-enabled.
>>>        */
>>>
>>>+      irqstate_t flags = enter_critical_section_wo_note();
>>>       if (rtcb != NULL && --rtcb->lockcount == 0)
>>>         {
>>>-          irqstate_t flags = enter_critical_section_wo_note();
>>>
>>>           /* Note that we no longer have pre-emption disabled. */
>>>
>>>@@ -162,7 +162,7 @@ void sched_unlock(void)
>>>         }
>>> #endif
>>>
>>>-          leave_critical_section_wo_note(flags);
>>>     }
>>>+      leave_critical_section_wo_note(flags);
>>> }
>>> }
>>>--
>>>2.49.0
>>>
>>>
>>>--
>>>Serg Podtynnyi
>>
>>--
>>Serg Podtynnyi