wyr-7 commented on PR #18210:
URL: https://github.com/apache/nuttx/pull/18210#issuecomment-3822130045

   > Please provide test logs for these cases:
   > 
   > > ```
   > > Verified on QEMU ARMv7-A simulator with multimedia profile
   > > Reader-writer semaphore operations verified:
   > >     Single reader access patterns
   > >     Single writer access patterns
   > >     Mixed reader-writer contention scenarios
   > >     Waiter notification behavior verified
   > >     No unnecessary context switches on optimized paths
   > > Static analysis shows improved compliance metrics
   > > ```
   > > 
   > > Testing
   > > Reader-writer semaphore scenarios:
   > > - Multiple readers with no writers
   > > - Multiple writers (exclusive access)
   > > - Mixed reader-writer access patterns
   > > - Waiter wake-up correctness
   > > - Lock holder tracking
   > > - Context switch reduction verification
   > 
   > > Performance: Reduced context switch overhead in high-contention scenarios
   > 
   > How did you verify this performance improvement? Can you please share the 
scenario and the results?
   
   Thank you for the review. This optimization has been validated in Vela OS 
(Xiaomi's embedded operating system based on NuttX) and is running in 
production across multiple product lines including wearable devices, IoT 
devices, and automotive systems.
   
   The optimization addresses two inefficiencies:
   
   - up_write(): The original code calls up_wait() unconditionally, even 
during a partial release of a recursive lock (when writer is still > 0). This 
causes unnecessary semaphore posts when no waiter can actually acquire the lock.
   - up_read(): The original code checks only waiter > 0 and not whether 
readers remain (reader > 0), so it can call up_wait() while other readers 
still hold the lock. Write waiters wake up, find the lock unavailable, and go 
back to sleep.
   
   The fix ensures up_wait() is only called when the lock is actually available 
for acquisition - when writer reaches 0 or when the last reader releases.
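
To make the two conditions concrete, here is a minimal sketch in C. It is a simplified, single-threaded model and not the NuttX implementation: struct rwsem_model, model_up_wait() and the *_before / *_after function names are hypothetical, the mutex is omitted, and the posts counter only stands in for the nxsem_post() calls that the real up_wait() would issue.

```c
#include <assert.h>

/* Hypothetical model of the counters discussed above. The field names follow
 * the comment (reader, writer, waiter); the real NuttX structure differs.
 */
struct rwsem_model
{
  int reader; /* Number of tasks currently holding the read lock */
  int writer; /* Recursion depth of the current write holder */
  int waiter; /* Number of tasks blocked waiting for the lock */
  int posts;  /* Wake-ups issued so far (stand-in for nxsem_post()) */
};

/* Model of up_wait(): assumed to issue one wake-up per blocked waiter. */
static void model_up_wait(struct rwsem_model *rw)
{
  rw->posts += rw->waiter;
}

/* Behaviour described as "original": wake waiters on every write release,
 * even when the recursive write lock is only partially released.
 */
static void up_write_before(struct rwsem_model *rw)
{
  rw->writer--;
  model_up_wait(rw);
}

/* Behaviour described as "fixed": wake waiters only when writer reaches 0. */
static void up_write_after(struct rwsem_model *rw)
{
  if (--rw->writer == 0)
    {
      model_up_wait(rw);
    }
}

/* Behaviour described as "original": wake waiters whenever any exist,
 * regardless of how many readers still hold the lock.
 */
static void up_read_before(struct rwsem_model *rw)
{
  rw->reader--;
  if (rw->waiter > 0)
    {
      model_up_wait(rw);
    }
}

/* Behaviour described as "fixed": wake waiters only when the last reader
 * releases the lock.
 */
static void up_read_after(struct rwsem_model *rw)
{
  if (--rw->reader == 0)
    {
      model_up_wait(rw);
    }
}
```

In this model, a write lock held at recursion depth 2 with one waiter produces a wake-up on the first release in the "before" variant and none in the "after" variant until the depth reaches 0. The same holds for three readers and one waiter: the "after" variant issues its wake-up only on the third release.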
   
   This has been running in Vela OS production for several months on ARM 
Cortex-A SMP platforms. We observed:
   
   - Reduced scheduler overhead in recursive locking scenarios (VFS layer, 
graphics subsystem)
   - Improved responsiveness under high reader concurrency
   - No regressions in extensive stress testing and long-running stability 
tests
   
   The issue was identified through production profiling showing unnecessary 
nxsem_post() calls and spurious wake-ups in rwsem release paths.
   
   Correctness
   The optimization preserves all semantics - waiters are still woken at the 
correct time, all operations remain mutex-protected, and it passes all existing 
tests. It simply eliminates redundant wake-up operations.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
