jlaitine opened a new issue, #16430: URL: https://github.com/apache/nuttx/issues/16430
### Description / Steps to reproduce the issue

**Problem statement**

This is an old bug/vulnerability, which has been in NuttX for ages. Signal delivery is done from several contexts, and in some cases the only way to execute the signal properly is to keep the critical section. For example, for RISC-V there is code like this:

https://github.com/apache/nuttx/blob/f4cc2c262853bce7321bf82661a93426eedfd854/arch/risc-v/src/common/riscv_schedulesigaction.c#L93

Similar code exists for all platforms; some platforms have even more branches to deal with different situations. A user wouldn't know that sometimes (randomly) interrupts and scheduling are disabled for the whole duration of the signal action handler.

**Suggested resolution**

In principle, all of this is unnecessary. Signals should be executed from only one place/context on every platform. The correct place is after the context switch, before returning to the user code. To keep the real-time behaviour, there should be a way to directly trigger the context switch at the time of signal dispatch.

Also, in SMP it is not necessary to force the delivery of the signal by locking the other CPU and scheduling the signal action via an SMP call. It should be enough to add the action to the pending action queue and trigger a re-schedule on that CPU. If the task/thread happens to migrate to another CPU before the triggered re-schedule happens, it doesn't matter: the signal action is executed anyway when the thread is next scheduled to run on the other CPU.

Some time ago, I made a quick hack to test an alternative way to schedule signal actions for RISC-V, removing the critical section:

https://github.com/apache/nuttx/blob/f4cc2c262853bce7321bf82661a93426eedfd854/arch/risc-v/src/common/riscv_schedulesigaction.c#L93

This works fine in flat mode, on platforms where the context switch is done via the IRQ.
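To make the proposed delivery model concrete, here is a minimal host-side sketch in C. It is not NuttX code: `struct sim_tcb`, `sim_queue_sigaction()`, `sim_return_to_user()` and `sim_record_handler()` are hypothetical names invented for illustration, and the real kernel structures and locking are deliberately left out. It only shows the idea that dispatch merely queues the action and requests a re-schedule, while the handler runs from a single place on the return-to-user path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_PENDING 8

typedef void (*sim_sighandler_t)(int signo);

/* Hypothetical, simplified task control block (not NuttX's struct tcb_s) */

struct sim_tcb
{
  sim_sighandler_t handler[SIM_MAX_PENDING]; /* Queued signal handlers */
  int              signo[SIM_MAX_PENDING];   /* Signal number per entry */
  size_t           npending;                 /* Number of queued actions */
  bool             resched_requested;        /* Set at dispatch time */
};

/* Example handler used for demonstration: records what it was called with */

int sim_handler_calls = 0;
int sim_last_signo    = 0;

void sim_record_handler(int signo)
{
  sim_handler_calls++;
  sim_last_signo = signo;
}

/* Dispatch side: only record the action and request a re-schedule.
 * The handler is NOT executed here, so no critical section needs to be
 * held while handler code runs.
 */

bool sim_queue_sigaction(struct sim_tcb *tcb, sim_sighandler_t handler,
                         int signo)
{
  assert(tcb != NULL && handler != NULL);

  if (tcb->npending >= SIM_MAX_PENDING)
    {
      return false; /* Queue full, action not recorded */
    }

  tcb->handler[tcb->npending] = handler;
  tcb->signo[tcb->npending]   = signo;
  tcb->npending++;
  tcb->resched_requested      = true;
  return true;
}

/* Single delivery point: meant to model the end of the context switch
 * path, just before returning to the task's user code.  Returns the
 * number of handlers that were run.
 */

size_t sim_return_to_user(struct sim_tcb *tcb)
{
  size_t delivered = 0;

  assert(tcb != NULL);
  tcb->resched_requested = false;

  while (delivered < tcb->npending)
    {
      tcb->handler[delivered](tcb->signo[delivered]);
      delivered++;
    }

  tcb->npending = 0;
  return delivered;
}
```

In this sketch, queuing `sim_record_handler` with `sim_queue_sigaction()` leaves `sim_handler_calls` untouched; the handler runs only when `sim_return_to_user()` is called for that task, whichever CPU that happens on.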
However, that quick hack is obviously not a generic solution: not all platforms enter irq_dispatch() at context switch. So, in order to fix this, one would need to:

- Identify the correct place in the code to schedule the signal action for each platform (in the context switch path). There should be just one place and one context from which to call "up_schedule_sigaction".
- Identify the correct way to trigger a context switch from within "nxsig_queue_action". For the single-CPU case, "up_switch_context" is probably the easiest one. In the SMP case, the cleanest solution is probably a dedicated syscall, common to all platforms, that requests a re-schedule. This can be triggered via an IPI in the same way as the "smp_call" is done.
- Clean up "up_schedule_sigaction" for each platform, removing the extra branches.

I started looking into this some time ago but never found time to continue, so I decided to write this issue so that anyone interested can continue on this.

### On which OS does this issue occur?

[OS: Linux]

### What is the version of your OS?

22.04.1-Ubuntu

### NuttX Version

master

### Issue Architecture

[Arch: all]

### Issue Area

[Area: Kernel]

### Host information

_No response_

### Verification

- [x] I have verified before submitting the report.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@nuttx.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org