hujun260 commented on PR #13486: URL: https://github.com/apache/nuttx/pull/13486#issuecomment-2357375331
> > The benefits are very significant. Before the modification, obtaining the interrupt status required three steps:
> >
> > 1. Obtain the CPU index
> > 2. Access the global variable
> > 3. Disable/enable interrupts
> >
> > This process involved at least 6 CPU instructions.
> >
> > However, now it only requires a single CPU instruction.
>
> 1. The IRQ enable/disable switch in `up_interrupt_context()` could actually be removed, as a 32-bit access is atomic on an arm32 CPU core
> 2. The instruction cycle timings of MCR may bring more overhead, requiring **6 cycles in the worst case**
>
> https://developer.arm.com/documentation/100026/0104/smr1465219161191
>
> Do we have a relevant performance test? For example, how many cycles does it take to call up_set_current_regs()/up_current_regs() 10,000 times with/without this PR?

Firstly, IRQ masking cannot be removed here, because we must ensure that the current task is not rescheduled after the CPU index is acquired. Otherwise, the CPU index may no longer correspond to the CPU on which the current task is running, leading to logical errors. The implementation of this_task follows a similar principle.

The current implementation needs at least 3 msr/mrs instructions plus 4 normal instructions, so the benefit of this optimization is evident. After optimization, only a single msr instruction is needed, with no additional overhead.

Unfortunately, we haven't tested this single optimization point in isolation. Instead, we tested the entire message sending/receiving process, and each test incorporates multiple optimization points.
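To make the rescheduling argument concrete, here is a minimal host-side sketch in plain C. It is not NuttX code: every name in it (`sim_cpu`, `sim_try_migrate`, `read_unlocked`, `read_locked`, `read_percpu_reg`, and the counter `sim_sysreg_ops`) is hypothetical, and the CPU index, the interrupt mask and the per-CPU register are all simulated with ordinary variables. It assumes that a task can only be migrated while interrupts are enabled, and it counts each simulated system-register access so the two schemes can be compared.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 2                         /* hypothetical CPU count */

/* Simulated hardware state (plain variables, not real registers). */
static int  sim_cpu        = 0;         /* CPU the simulated task runs on */
static bool sim_irq_on     = true;      /* simulated interrupt-enable flag */
static int  sim_sysreg_ops = 0;         /* simulated msr/mrs-style accesses */

/* Old scheme: a global array indexed by CPU. Slot c holds 100 + c. */
static int g_slot[NCPUS] = {100, 101};

/* New scheme: one simulated per-CPU register holding the same value. */
static int sim_percpu_reg[NCPUS] = {100, 101};

static bool sim_irq_save(void)
{
  bool old = sim_irq_on;
  sim_irq_on = false;
  sim_sysreg_ops++;
  return old;
}

static void sim_irq_restore(bool old)
{
  sim_irq_on = old;
  sim_sysreg_ops++;
}

static int sim_this_cpu(void)
{
  sim_sysreg_ops++;
  return sim_cpu;
}

/* Simulated scheduler: the task moves only while interrupts are enabled.
 * A real scheduler would defer the move; here it is simply dropped.
 */
static void sim_try_migrate(int new_cpu)
{
  assert(new_cpu >= 0 && new_cpu < NCPUS);
  if (sim_irq_on)
    {
      sim_cpu = new_cpu;
    }
}

/* Without masking: the task may move between reading the index and
 * reading the slot, so the returned value can belong to the old CPU.
 */
int read_unlocked(int migrate_to)
{
  int cpu = sim_this_cpu();
  sim_try_migrate(migrate_to);          /* preemption inside the window */
  return g_slot[cpu];
}

/* With masking: the index stays valid, at the cost of three accesses. */
int read_locked(int migrate_to)
{
  bool flags = sim_irq_save();
  int  cpu   = sim_this_cpu();
  sim_try_migrate(migrate_to);          /* blocked: interrupts are masked */
  int  value = g_slot[cpu];
  sim_irq_restore(flags);
  return value;
}

/* Per-CPU register: a single access that always refers to the CPU the
 * task is running on, so no index lookup and no masking are needed.
 */
int read_percpu_reg(void)
{
  sim_sysreg_ops++;
  return sim_percpu_reg[sim_cpu];
}
```

Calling `read_unlocked(1)` while the task starts on CPU 0 returns CPU 0's value even though the task ends up on CPU 1, which is the kind of mismatch described above. `read_locked(1)` avoids the mismatch with three simulated accesses, and `read_percpu_reg()` needs only one. The counts say nothing about real cycle timings, which would still have to be measured on hardware.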
