Re: The behavior of spin_lock needs everyone's advice

Sebastien Lorquet Wed, 05 Feb 2025 00:54:08 -0800

Hello Xiaomi dev team,

If I understand correctly, spinlocks are basically a no-op when no SMP is involved, which is the case for the majority of CPUs supported by NuttX. You shall not break all platforms.

I urge you all to revert any commit that might have broken something in this domain, redesign it carefully on your own code fork, and then push a final working version to nuttx.

Even if nuttx has releases, you are basically pushing untested changes to a main branch (in core code!) that is used by many people.


You are critically breaking nuttx for everyone.

You are a large company with financial resources, so PLEASE do your job, install a gitea or gitlab on a raspberry pi in your company and do your dev in your private fork.


It is too much influence to push THAT MUCH breakage onto everyone.

The NuttX project is mature, it has real industrial users, it is not a playground for your funny OS hacking.

PLEASE RESPOND and ACT to save the project. This problem needs to be addressed at the root level.

Let me remind you that NuttX 12.8.0 is fully broken on my stm32f4 board and that I cannot repair it until next week because my schedule is fully booked on another critical project.

Our customer does not understand why adding tcp keepalive requires that much work. We are going to lose customers because of your bad NuttX community behaviour.

I do not want to additionally debug your locking crap on a cpu that does not need any of it.



PLEASE DO SOMETHING RESPONSIBLE.

Sebastien


On 05/02/2025 08:13, chao an wrote:

>It isn't an initialization problem, the real cause is some code abusing
>spin lock (lock/unlock in different threads).
>After holding sched_lock in spinlock_irqsave, the api requires that the
>lock/unlock come from the same thread.

So this brings up a potential problem with spin_lock, right?
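
To make the concern concrete, here is a minimal toy model in C of what goes wrong when a lock that also takes the scheduler lock is released from a different thread. This is only a sketch under assumptions: the toy_* names are hypothetical and are not NuttX code, and the per-task counter merely stands in for whatever sched_lock()/sched_unlock() track internally.

```c
#include <assert.h>

/* Toy model, NOT NuttX source: each task has its own scheduler-lock count. */
struct toy_task
{
  int lockcount;            /* > 0 means "do not preempt this task" */
};

struct toy_spinlock
{
  int locked;               /* 1 while the lock is held */
};

/* Hypothetical lock that also takes the caller's scheduler lock. */
static void toy_spin_lock(struct toy_spinlock *l, struct toy_task *caller)
{
  caller->lockcount++;      /* stands in for sched_lock() */
  assert(!l->locked);       /* single-threaded model: lock must be free */
  l->locked = 1;
}

/* Hypothetical unlock that releases the caller's scheduler lock. */
static void toy_spin_unlock(struct toy_spinlock *l, struct toy_task *caller)
{
  assert(l->locked);
  l->locked = 0;
  caller->lockcount--;      /* stands in for sched_unlock() */
}

/* Returns 1 if both tasks ended with a balanced count, 0 otherwise. */
static int toy_balanced(const struct toy_task *a, const struct toy_task *b)
{
  return a->lockcount == 0 && b->lockcount == 0;
}
```

In this model, locking from task A and unlocking from task B leaves A with a count of 1 (never preemptible again) and B with a count of -1, while a same-task lock/unlock pair stays balanced.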

>All performance critical code is being evaluated carefully to skip the
>sched_lock/sched_unlock, like this:
>https://github.com/apache/nuttx/pull/15695
>And the optimization of sched_lock is under heavy development.
*raw_spin_lock_irqsave()* is not equivalent to a version without *sched_lock()*. On the contrary, in the Linux kernel, raw_spin_lock_irqsave() means that preemption is disabled by default.
https://github.com/torvalds/linux/blob/master/include/linux/spinlock_api_smp.h#L104-L113



>Here is the summary why I prefer to add sched_lock in
>spin_lock/spin_lock_irqsave by default:
> - The default(short) api should be safe since not all people understand
>   the schedule behaviour deeply, please see the related discussion:

> https://github.com/apache/nuttx/issues/9531
> https://github.com/apache/nuttx/issues/1138
> https://www.kernel.org/doc/Documentation/preempt-locking.txt
> - spin_lock without sched_lock is unusable in the normal thread context,
>   since the scheduler may suspend the spinlock owner thread at any time
1. spin_lock can be applied in scenarios requiring more fine-grained control. For example, in some cases, we can conditionally decide whether to hold the spin_lock, rather than simply forcing all users to use the version with sched_lock (shown in the attached image.png).
> - API semantic align with Linux to avoid Linux developer use the unsafe
>   api without any notification
2. NuttX is not Linux. We need to enable API users to truly understand what happens inside the API just by looking at its name.
> - the caller of enter_critical_section can call sem_post/mq_send without
>   problem, spin_lock/spin_lock_irqsave can only achieve the same behaviour by
>   holding sched lock
3. We should handle scenarios that may cause scheduling differently. We should use *spin_lock_irqsave_nopreempt()* instead of providing developers with a low-performance version by default.
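
The split proposed in point 3 could look roughly like the following toy single-CPU model in C. It is only a sketch under assumptions: every toy_* name is hypothetical, the real spin_lock_irqsave_nopreempt() may be implemented quite differently, and the counters simply stand in for masking interrupts and for sched_lock()/sched_unlock().

```c
#include <assert.h>

/* Toy single-CPU model, NOT NuttX source. */
struct toy_cpu
{
  int irq_disabled;         /* 1 while interrupts are masked */
  int sched_locked;         /* nesting count of the scheduler lock */
  int spin_held;            /* 1 while the spinlock is held */
};

typedef int toy_irqstate_t;

/* Plain variant: mask interrupts and take the lock, nothing else. */
static toy_irqstate_t toy_spin_lock_irqsave(struct toy_cpu *c)
{
  toy_irqstate_t flags = c->irq_disabled;
  c->irq_disabled = 1;
  assert(!c->spin_held);
  c->spin_held = 1;
  return flags;
}

static void toy_spin_unlock_irqrestore(struct toy_cpu *c,
                                       toy_irqstate_t flags)
{
  assert(c->spin_held);
  c->spin_held = 0;
  c->irq_disabled = flags;  /* restore the saved interrupt state */
}

/* Opt-in variant: same as above, plus the scheduler lock. */
static toy_irqstate_t toy_spin_lock_irqsave_nopreempt(struct toy_cpu *c)
{
  toy_irqstate_t flags = toy_spin_lock_irqsave(c);
  c->sched_locked++;        /* stands in for sched_lock() */
  return flags;
}

static void toy_spin_unlock_irqrestore_nopreempt(struct toy_cpu *c,
                                                 toy_irqstate_t flags)
{
  c->sched_locked--;        /* stands in for sched_unlock() */
  toy_spin_unlock_irqrestore(c, flags);
}
```

With this shape, callers that never trigger scheduling inside the critical section pay only for the plain variant, and the name of the second variant tells the reader that preemption is also held off.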
