Fix-Point opened a new pull request, #18131: URL: https://github.com/apache/nuttx/pull/18131
## Summary This PR is part of the https://github.com/apache/nuttx/pull/17675. It introduces a series of fixes and improvements to the high-resolution timer (hrtimer) subsystem, primarily focusing on functional correctness, synchronization in SMP environments, and API enhancements. The changes were necessary to address several issues: 1. **SMP Synchronization:** On 32-bit platforms, read/write operations to the global `g_hrtimer_running` array were not atomic, potentially leading to data races. This is fixed by replacing the spinlock with a sequence lock (`seqcount`) to protect concurrent access. 2. **Timer State Management:** The previous mechanism using `hrtimer->expired` as a version number was flawed as it's not monotonic. The new implementation introduces an ownership encoding scheme (using the LSB of the pointer in `g_hrtimer_running`) to enforce the invariant that a cancelled timer cannot be modified again, ensuring correct behavior during concurrent cancellation and callback execution. 3. **API Improvements:** A new `hrtimer_gettime()` API is added to allow querying the remaining delay of a timer in nanoseconds. For code reusability, the internal `hrtimer_gettime()` helper is moved to the public interface as `clock_systime_nsec()`. 4. **Functional Fixes:** Corrects the timer programming logic for tickless and non-tickless alarm modes, ensures relative delays in `hrtimer_start` are capped at `HRTIMER_MAX_DELAY` to prevent overflow, and simplifies the cancellation logic. 5. **Code Simplification:** The `hrtimer_cancel` and `hrtimer_cancel_sync` functions are refactored to have clearer semantics (asynchronous vs. synchronous cancellation) and a helper function `hrtimer_wait` is introduced to consolidate the waiting logic. The main modifications include: * Adding `clock_systime_nsec()` inline function to `clock.h`. * Defining `HRTIMER_MAX_DELAY` and updating documentation in `hrtimer.h`. * Adding the `hrtimer_gettime()` API prototype. * Replacing the global spinlock with a sequence lock (`g_hrtimer_lock`) in the internal header and implementation files. * Rewriting SMP ownership tracking using the new encoding scheme. * Adding a new source file `hrtimer_gettime.c`. * Updating `hrtimer_start`, `hrtimer_cancel`, `hrtimer_process`, and related functions to use the new synchronization and state management mechanisms. ## Impact * **Users (Developers):** * The semantics of `hrtimer_cancel` and `hrtimer_cancel_sync` are now more explicitly documented. `hrtimer_cancel` provides limited ownership (allows restarting but not freeing), while `hrtimer_cancel_sync` provides full ownership (allows safe deallocation). Users must choose the appropriate function based on their needs to avoid concurrency errors. * A new API `hrtimer_gettime(FAR hrtimer_t *timer)` is available to query the time until a timer's next expiration in nanoseconds. * The utility function `clock_systime_nsec()` is available for other parts of the system to get the current system time in nanoseconds. * **Build Process:** The build system files (`CMakeLists.txt`, `Make.defs`) are updated to include the new `hrtimer_gettime.c` source file. * **Hardware (SMP):** The changes significantly improve the correctness of hrtimer operations in SMP environments by ensuring proper synchronization and state management across cores. For non-SMP systems, the synchronization overhead might change slightly due to the lock type change. * **Documentation:** The in-code documentation for the modified functions (`hrtimer_cancel`, `hrtimer_cancel_sync`, `hrtimer_start`) has been updated to reflect the new behavior, assumptions, and return values. This provides clearer guidance for developers. * **Security & Robustness:** The fixes prevent potential race conditions and invalid state transitions in SMP, leading to a more stable system. Capping the relative delay prevents overflow-related undefined behavior. * **Compatibility:** The return value of `hrtimer_cancel` now has an additional meaning: a value > 0 indicates the timer callback is running. Code that only checked for negative values (errors) might need adjustment if it intends to handle the "running" case. The internal locking mechanism change is transparent to the API but ensures correct concurrent operation. ## Test Tested on `rv-virt:smp`, `ostest` passed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
