anchao commented on PR #15705:
URL: https://github.com/apache/nuttx/pull/15705#issuecomment-2623338648

   
   > Yes, but you first need to change the execution of ALL IRQs from the interrupt context to the thread context. Before that happens, the sched lock must be taken at the same time we hold the spinlock.
   
   > Let's emphasize the key points:
   > 
   > 1. A busy-loop lock (e.g. spinlock) must hold the sched lock too
   > 2. A sleep lock (e.g. mutex) doesn't/shouldn't hold the sched lock
   > 
   > So, you can't do the change in separate steps without breaking the code:
   > 
   > 1. Remove the sched lock first
   > 2. Change the spinlock to a mutex later
   > 
   > Let's consider a VERY COMMON case in SMP:
   > 
   > 1. thread0 on CPU0 holds a spinlock without holding the sched lock
   > 2. thread1 on CPU1 wakes up thread2
   > 3. The OS decides to suspend thread0 and resume thread2 on CPU0
   > 4. thread1 on CPU1 wants to take the spinlock held by thread0
   > 
   > But thread1 has no way to know when it can finish the busy-wait loop, since thread0 is already in the suspended state. So it's VERY VERY dangerous to schedule out a thread which holds a spinlock.
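   
   For reference, here is a minimal sketch of the scenario quoted above (illustrative code, not actual NuttX sources; only spin_lock(), spin_unlock() and SP_UNLOCKED are real NuttX symbols, and the critical-section contents are placeholders):
   
   ```c
   #include <nuttx/spinlock.h>
   
   static spinlock_t g_lock = SP_UNLOCKED;
   
   /* thread0, running on CPU0 */
   
   static void thread0_func(void)
   {
     spin_lock(&g_lock);     /* busy-wait lock taken, no sched lock held */
   
     /* If the OS suspends thread0 right here and resumes thread2 on CPU0,
      * g_lock stays locked while thread0 is not running on any CPU.
      */
   
     /* ... short critical-section work ... */
   
     spin_unlock(&g_lock);
   }
   
   /* thread1, running on CPU1, after it has woken thread2 */
   
   static void thread1_func(void)
   {
     spin_lock(&g_lock);     /* busy-waits; CPU1 spins for an unbounded time
                              * because the lock holder has been scheduled out
                              */
   
     /* ... */
   
     spin_unlock(&g_lock);
   }
   ```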
   
   Let’s first unify our understanding:
   1. There was no deadlock problem in NuttX's API semantics before, and all code worked very well, regardless of SMP or AMP.
   
   Right?
   
   2. The deadlock problem was caused by Xiaomi's changes (from you @xiaoxiang781216 and @hujun260 ).
   3. You first replaced **enter_critical_section()** with **spin_lock_irqsave()**.
   4. After finding the deadlock, you replaced **spin_lock_irqsave()** with **spin_lock_irqsave()** + **sched_lock()**.
   5. In order to simplify the code, you folded **sched_lock()** into **spin_lock_irqsave()**. Now you find that such a change breaks all ESP devices and degrades the performance of the entire system.
   6. Look at what you are doing now: in order to fix the startup crash, you are preparing to replace **spin_lock_irqsave()** with **raw_spin_lock_irqsave()** across all the kernel code (the sketch after this list shows the caller-side view of each step).
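   
   A rough caller-side sketch of steps 3-6 (a hypothetical call site, not the actual driver code; the nesting of sched_lock() around the spinlock in step 4 and the raw_spin_unlock_irqrestore() counterpart name are assumptions):
   
   ```c
   #include <nuttx/irq.h>
   #include <nuttx/spinlock.h>
   #include <sched.h>
   
   static spinlock_t g_lock = SP_UNLOCKED;
   
   static void call_site(void)
   {
     irqstate_t flags;
   
     /* Step 3: enter_critical_section() replaced by spin_lock_irqsave() */
   
     flags = spin_lock_irqsave(&g_lock);
     /* ... critical section ... */
     spin_unlock_irqrestore(&g_lock, flags);
   
     /* Step 4: the same call site after the deadlock was found:
      * sched_lock() is added around the spinlock.
      */
   
     sched_lock();
     flags = spin_lock_irqsave(&g_lock);
     /* ... critical section ... */
     spin_unlock_irqrestore(&g_lock, flags);
     sched_unlock();
   
     /* Step 5: sched_lock() is folded into spin_lock_irqsave() itself, so
      * the call site looks like step 3 again but now always pays for
      * sched_lock()/sched_unlock().
      *
      * Step 6: call sites that cannot afford that are rewritten to the
      * raw variant, which keeps the old behavior (no sched_lock).
      */
   
     flags = raw_spin_lock_irqsave(&g_lock);
     /* ... critical section ... */
     raw_spin_unlock_irqrestore(&g_lock, flags);
   }
   ```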
   
   I can't imagine how much board-level and chip code will need similar adjustments in the future.
   How can other developers continue to trust NuttX when the kernel API is so unstable?
   
   Most kernel code does not need to hold **sched_lock()**; making it implicit is unfair to code that never causes scheduling (see the sketch below). New semantics should be expressed with new APIs instead of modifying the behavior of the original API.
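   
   A minimal sketch of the kind of call site this refers to (hypothetical device structure and field names): a few-instruction data update that never wakes another thread, so an implicit sched_lock() only adds overhead here.
   
   ```c
   #include <stdint.h>
   #include <nuttx/irq.h>
   #include <nuttx/spinlock.h>
   
   struct my_dev_s                   /* hypothetical device state */
   {
     spinlock_t lock;
     uint32_t   rx_count;
   };
   
   static void my_dev_count_rx(struct my_dev_s *dev)
   {
     irqstate_t flags = spin_lock_irqsave(&dev->lock);
   
     dev->rx_count++;                /* no wakeup, no scheduling decision */
   
     spin_unlock_irqrestore(&dev->lock, flags);
   }
   ```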
   
   > > 3. You can look at the spin_lock implementation in other RTOSes; none of them disable preemption (e.g. **zephyr**):
   > >    https://github.com/zephyrproject-rtos/zephyr/blob/c66a0bb2a9269dc483079d5cc24ba77b8c332a9a/include/zephyr/spinlock.h#L182-L214
   > 
   > zephyr's k_spin_lock disables IRQs too, but nuttx's spin_lock doesn't disable IRQs.
   
   Then let's remove **spin_lock_irqsave()** and move the **irqsave** into **spin_lock()**, ok?
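   
   For comparison, a sketch of the zephyr caller side (using only the public k_spin_lock()/k_spin_unlock() API; the surrounding function is illustrative): one call masks local IRQs and takes the busy-wait lock, which is what merging irqsave into nuttx's spin_lock() would look like from the caller's point of view.
   
   ```c
   #include <zephyr/spinlock.h>
   
   static struct k_spinlock zlock;
   
   void zephyr_caller(void)
   {
     /* k_spin_lock() masks local IRQs and returns the saved state as a key */
   
     k_spinlock_key_t key = k_spin_lock(&zlock);
   
     /* ... critical section, local IRQs masked ... */
   
     k_spin_unlock(&zlock, key);
   }
   ```
   
   On the nuttx side, the closest equivalent today is spin_lock_irqsave()/spin_unlock_irqrestore(); the suggestion above would make plain spin_lock() behave like k_spin_lock().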
   
   > > Now we have 4 APIs available to developers:
   > > 1. spin_lock 2. spin_lock_nopreempt 3. spin_lock_irqsave 4. spin_lock_irqsave_nopreempt
   > > If you think the code needs to hold sched_lock, please use **spin_lock_nopreempt()** instead of **spin_lock()**.
   > 
   > No, the correct semantics (same as Linux) is:
   > 
   > 1. spin_lock
   > 2. spin_lock_preempt
   > 3. spin_lock_irqsave
   > 4. spin_lock_irqsave_preempt
   > 
   > Otherwise, most code will suffer random deadlocks or long busy-wait loops like the example I mentioned previously. Only specially designed code can use spin_lock_preempt/spin_lock_irqsave_preempt as expected. The bottom line is that the most commonly used API should be SMP SAFE.
   
   No, keep the current API semantics unchanged, and use new APIs to represent the new features (roughly as sketched below).
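   
   A sketch of the split being argued for here (wrapper names are illustrative with a my_ prefix, this is not the actual NuttX implementation, and the exact nesting of sched_lock() is an assumption): the existing API keeps its semantics, and the "also disable preemption" behavior gets its own new name.
   
   ```c
   #include <sched.h>
   #include <nuttx/irq.h>
   #include <nuttx/spinlock.h>
   
   /* Existing semantics, unchanged: mask local IRQs + take the busy-wait lock */
   
   static inline irqstate_t my_spin_lock_irqsave(spinlock_t *lock)
   {
     return spin_lock_irqsave(lock);
   }
   
   /* New name for the new semantics: additionally disable preemption */
   
   static inline irqstate_t my_spin_lock_irqsave_nopreempt(spinlock_t *lock)
   {
     sched_lock();
     return spin_lock_irqsave(lock);
   }
   
   static inline void my_spin_unlock_irqrestore_nopreempt(spinlock_t *lock,
                                                          irqstate_t flags)
   {
     spin_unlock_irqrestore(lock, flags);
     sched_unlock();
   }
   ```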

