Fix-Point opened a new pull request, #17276: URL: https://github.com/apache/nuttx/pull/17276
## Summary

This PR proposes ClockDevice, a new timer driver abstraction for NuttX. The new CLOCKDEVICE timer hardware abstraction delivers:

- Functional correctness: Thread-safe and overflow-free
- High performance: Up to 3x faster on ARMv7A platforms
- Theoretically optimal timing precision: Uses hardware cycle counts as the time unit
- Time-unit independent interfaces: Decouples timer drivers from OS time subsystems
- Minimalist driver implementation: Reduces driver code size by nearly 70%

## Impact

These commits affect the timing subsystem, as well as the following architectures:

- arm-v7a/arm-v7r/arm-v8r
- arm-v8a
- riscv
- sim
- tricore
- intel64

## Testing

To evaluate the performance improvements, we conducted tests on three platforms: qemu-intel64/KVM, imx8qm-mek/arm64, and qemu-armv7a. We measured the CPU cycle overhead for:

- Reading the current time
- Setting a timer
- Handling a timer callback

Each operation was executed 10 million times, and the results were averaged. The results demonstrate significant performance improvements.

On qemu-armv7a, software division is the performance bottleneck. ClockDevice brings the following speedups:

- clock_gettime 2.56x
- clock_systime_ticks 3.00x
- wd_start 1.67x
- wd_start_cancel 1.43x
- timer expiration 2.36x

<img width="579" height="435" alt="image" src="https://github.com/user-attachments/assets/dc7394c2-0cc8-4c99-81a8-2920f14fbc2b" />

On qemu-intel64/KVM, ClockDevice achieved up to a 1.42x performance improvement. In particular, for the clock_gettime API, NuttX with ClockDevice performed 1.31x better than Linux Kernel 6.8.0-51.

<img width="597" height="448" alt="image" src="https://github.com/user-attachments/assets/5c3a2dba-5823-41f3-9e73-17f0327de25c" />

On imx8qm-mek/arm64, ClockDevice achieved up to a 1.68x performance improvement.
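For readers unfamiliar with why software division dominates on cores without a hardware divider, the sketch below shows one common way to divide by a fixed value (for example, a cycles-per-tick constant) using a precomputed reciprocal and two 32x32 multiplies. It is an illustration only and is not taken from this PR: the names `invdiv_u32`, `invdiv_u32_init`, and `invdiv_u32_div` are hypothetical, and the sketch assumes a 32-bit dividend and a divisor greater than 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a NuttX API): reciprocal of a fixed divisor,
 * stored as the two 32-bit halves of m = ceil(2^64 / divisor).
 */

struct invdiv_u32
{
  uint32_t m_hi;  /* Upper 32 bits of m */
  uint32_t m_lo;  /* Lower 32 bits of m */
};

/* One-time setup. This is the only place a real division is performed. */

static inline struct invdiv_u32 invdiv_u32_init(uint32_t divisor)
{
  struct invdiv_u32 inv;
  uint64_t m;

  assert(divisor > 1);            /* divisor == 1 would overflow m */

  m        = UINT64_MAX / divisor + 1;
  inv.m_hi = (uint32_t)(m >> 32);
  inv.m_lo = (uint32_t)m;
  return inv;
}

/* Returns n / divisor, computed as floor(m * n / 2^64) with two
 * 32x32->64 multiplies, so no 128-bit type and no division is needed.
 */

static inline uint32_t invdiv_u32_div(uint32_t n, struct invdiv_u32 inv)
{
  uint64_t hi = (uint64_t)inv.m_hi * n;
  uint64_t lo = (uint64_t)inv.m_lo * n;

  return (uint32_t)((hi + (lo >> 32)) >> 32);
}
```

A caller would run `invdiv_u32_init()` once when the timer frequency is known and then use `invdiv_u32_div()` on each conversion. Whether this wins depends on the target: it only pays off where a multiply is cheaper than the available division.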
Note that on early ARM64 platforms, the INVDIV optimization does not help, because the hardware division instruction UDIV costs fewer CPU cycles on average than INVDIV.

<img width="607" height="455" alt="image" src="https://github.com/user-attachments/assets/576edd31-8a26-4a1c-8a5f-c86b8402157a" />

## Plan

Because extensive code modifications are required, some work remains unfinished. The planned tasks are listed below.

- [x] 1. Simplify the timer drivers and add SMP initialization.
- [x] 2. Remove the callback and arg from the oneshot API.
- [x] 3. Remove the tick-based oneshot API.
- [x] 4. Add a new count-based oneshot API (CLKDEV).
- [x] 5. Introduce an optimized fast path for the count-based oneshot API.
- [x] 6. Reimplement the commonly used timer drivers with the count-based oneshot API.
- [WIP] 7. Inline arch_alarm for performance (3% ~ 5% less execution time for clock_gettime, wd_start and wd_expiration).

Architecture support:

- [x] arm-v7a/v7r/v8r generic timer
- [x] goldfish
- [x] arm-v8a generic timer
- [x] sim
- [x] intel64 tsc
- [x] tricore systimer
- [WIP/Soon] risc-v

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
