Yes, I understood that, but what if CPU A and CPU B both enter the unlock routine with lockcount == 1 on the same task, and after the decrement they both take the "preemption re-enabled" path and manipulate the linked task list?
On April 24, 2025 8:06:36 AM UTC, hujun260 <hujun...@163.com> wrote:
>The lockcount can be understood as a local variable, so no race conditions
>will occur.
>
>At 2025-04-24 15:54:48, "Serg Podtynnyi" <s...@podtynnyi.com.INVALID> wrote:
>>Hi, I tried my best to understand the logic behind the unlock and
>>merge_pending routines and came up with this fix that works for me.
>>I split the lockcount handling into two parts: when lockcount is > 1,
>>just decrement fast as always; but when it is 1, we enter the critical
>>section and set it to 0 inside it (pre-emption is enabled).
>>I think there is a small window between the decrement in the if statement
>>and enter_critical_section_wo_note() in the original code where a race
>>can happen.
>>Does this make sense? Sorry if it's a dumb question - scheduler stuff
>>with SMP is rocket-science level for me yet.
>>
>>diff --git a/sched/sched/sched_unlock.c b/sched/sched/sched_unlock.c
>>index 4c7d681454..15fc39b546 100644
>>--- a/sched/sched/sched_unlock.c
>>+++ b/sched/sched/sched_unlock.c
>>@@ -65,13 +65,14 @@ void sched_unlock(void)
>>
>>       DEBUGASSERT(rtcb == NULL || rtcb->lockcount > 0);
>>
>>-      /* Check if the lock counter has decremented to zero.  If so,
>>-       * then pre-emption has been re-enabled.
>>-       */
>>-
>>-      if (rtcb != NULL && --rtcb->lockcount == 0)
>>+      if (rtcb != NULL && rtcb->lockcount > 1)
>>+        {
>>+          rtcb->lockcount--;
>>+        }
>>+      else if (rtcb != NULL && rtcb->lockcount == 1)
>>         {
>>           irqstate_t flags = enter_critical_section_wo_note();
>>+          rtcb->lockcount = 0;
>>
>>           /* Note that we no longer have pre-emption disabled. */
>>
>>--
>>2.49.0
>>
>>On 4/22/25 14:32, hujun260 wrote:
>>
>>It seems to have nothing to do with the race condition, and since the rtcb
>>is this_task, it will only be modified by one CPU.
>>
>>As for why your modification can solve the problem, I haven't figured it
>>out yet. Can you further analyze the reasons for the failure to boot?
>>At 2025-04-22 13:25:13, "Serg Podtynnyi" <s...@podtynnyi.com.INVALID> wrote:
>>
>>>Hi All, Gregory, Sebastien,
>>>
>>>I'm porting NuttX to the rp2350 (pico2 board inside the ClockWork
>>>PicoCalc) and having trouble getting the SMP (2-core) config running
>>>without patching the sched lock/unlock routines.
>>>
>>>I am looking at the commit
>>>https://github.com/apache/nuttx/commit/914ae532e68bf4ca75c758d6f541a1d2fcdce8c3
>>>and reverting it helps to boot the system.
>>>
>>>This is the patch that makes it work for me: I updated the current
>>>lock/unlock by moving the enter_critical_section above the access to
>>>rtcb->lockcount.
>>>
>>>Do you think it could be the case of a race condition with
>>>rtcb->lockcount?
>>>
>>>PS
>>>
>>>It's almost 11 years since I last touched NuttX - it's a pleasure to
>>>work with it again.
>>>
>>>
>>>From 86fa0a65fdd37804154faf3db52e2826281cdfbd Mon Sep 17 00:00:00 2001
>>>From: Serg Podtynnyi <s...@podtynnyi.com>
>>>Date: Tue, 22 Apr 2025 11:37:48 +0700
>>>Subject: sched: update lock/unlock critical section entry
>>>
>>>Level up enter_critical_section in sched lock/unlock to cover
>>>rtcb->lockcount access
>>>
>>>Signed-off-by: Serg Podtynnyi <s...@podtynnyi.com>
>>>---
>>> sched/sched/sched_lock.c   | 4 ++--
>>> sched/sched/sched_unlock.c | 4 ++--
>>> 2 files changed, 4 insertions(+), 4 deletions(-)
>>>
>>>diff --git a/sched/sched/sched_lock.c b/sched/sched/sched_lock.c
>>>index d8f75e7c90..9cc8da3df4 100644
>>>--- a/sched/sched/sched_lock.c
>>>+++ b/sched/sched/sched_lock.c
>>>@@ -82,18 +82,18 @@ void sched_lock(void)
>>>        * operations on this thread (on any CPU)
>>>        */
>>>
>>>+      irqstate_t flags = enter_critical_section_wo_note();
>>>       if (rtcb != NULL && rtcb->lockcount++ == 0)
>>>         {
>>> #if (CONFIG_SCHED_CRITMONITOR_MAXTIME_PREEMPTION >= 0) || \
>>>     defined(CONFIG_SCHED_INSTRUMENTATION_PREEMPTION)
>>>-          irqstate_t flags = enter_critical_section_wo_note();
>>>
>>>           /* Note that we have pre-emption locked */
>>>
>>>           nxsched_critmon_preemption(rtcb, true, return_address(0));
>>>           sched_note_preemption(rtcb, true);
>>>-          leave_critical_section_wo_note(flags);
>>> #endif
>>>         }
>>>+      leave_critical_section_wo_note(flags);
>>>     }
>>> }
>>>diff --git a/sched/sched/sched_unlock.c b/sched/sched/sched_unlock.c
>>>index 4c7d681454..d21e007587 100644
>>>--- a/sched/sched/sched_unlock.c
>>>+++ b/sched/sched/sched_unlock.c
>>>@@ -69,9 +69,9 @@ void sched_unlock(void)
>>>        * then pre-emption has been re-enabled.
>>>        */
>>>
>>>+      irqstate_t flags = enter_critical_section_wo_note();
>>>       if (rtcb != NULL && --rtcb->lockcount == 0)
>>>         {
>>>-          irqstate_t flags = enter_critical_section_wo_note();
>>>
>>>           /* Note that we no longer have pre-emption disabled. */
>>>
>>>@@ -162,7 +162,7 @@ void sched_unlock(void)
>>>         }
>>> #endif
>>>
>>>-          leave_critical_section_wo_note(flags);
>>>     }
>>>+      leave_critical_section_wo_note(flags);
>>> }
>>> }
>>>--
>>>2.49.0
>>>
>>>
>>>--
>>>Serg Podtynnyi
>>
>>--
>>Serg Podtynnyi