[I] [BUG] Signal action handlers executed in critical section [nuttx]

via GitHub Thu, 22 May 2025 23:26:56 -0700


jlaitine opened a new issue, #16430:
URL: https://github.com/apache/nuttx/issues/16430

### Description / Steps to reproduce the issue

**Problem statement**

This is an old bug/vulnerability that has been in NuttX for ages. Signal
delivery is done from several contexts, and in some of them the only way to
execute the signal action properly is to keep holding the critical section.

For example, for RISC-V there is code like this:

https://github.com/apache/nuttx/blob/f4cc2c262853bce7321bf82661a93426eedfd854/arch/risc-v/src/common/riscv_schedulesigaction.c#L93

Similar code exists for all platforms; some platforms have even more
branches to deal with different situations.

The user has no way of knowing that sometimes (seemingly at random) interrupts
and scheduling are disabled for the whole duration of the signal action handler.

**Suggested resolution**

In principle, none of this is necessary. Signal actions should be executed from
only one place/context on every platform. The correct place is after the context
switch, before returning to the user code. To preserve real-time behaviour,
there should also be a way to trigger the context switch directly at the time
of signal dispatch.
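
The idea above can be sketched with a toy model. This is only an illustration under stated assumptions, not NuttX code: every name in it (`toy_task`, `toy_queue_action`, `toy_return_to_user`, the `g_*` flags) is hypothetical, and the critical section is modelled as a plain boolean. It shows dispatch doing nothing more than queueing the action and requesting a context switch, with the handler run from a single delivery point after the critical section has been released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. Every name here is hypothetical, not a NuttX API. */

#define TOY_MAX_PENDING 4

typedef void (*toy_handler_t)(int signo);

struct toy_task
{
  toy_handler_t handler[TOY_MAX_PENDING]; /* Queued signal actions (FIFO) */
  int           signo[TOY_MAX_PENDING];
  size_t        npending;
};

static bool g_in_critical;       /* Models "interrupts/scheduling disabled" */
static bool g_resched_requested; /* Models the context switch request */
static int  g_handlers_run;
static int  g_handlers_in_critical;

/* Signal dispatch: queue the action and request a context switch. */

bool toy_queue_action(struct toy_task *t, toy_handler_t handler, int signo)
{
  bool queued = false;

  g_in_critical = true;          /* Short critical section: queue only */
  if (t->npending < TOY_MAX_PENDING)
    {
      t->handler[t->npending] = handler;
      t->signo[t->npending]   = signo;
      t->npending++;
      g_resched_requested = true;
      queued = true;
    }

  g_in_critical = false;
  return queued;
}

/* The single delivery point: the end of the context switch path, just
 * before returning to user code.
 */

void toy_return_to_user(struct toy_task *t)
{
  g_resched_requested = false;
  for (; ; )
    {
      toy_handler_t handler;
      int signo;
      size_t i;

      g_in_critical = true;      /* Protect the queue while dequeuing */
      if (t->npending == 0)
        {
          g_in_critical = false;
          break;
        }

      handler = t->handler[0];
      signo   = t->signo[0];
      for (i = 1; i < t->npending; i++)
        {
          t->handler[i - 1] = t->handler[i];
          t->signo[i - 1]   = t->signo[i];
        }

      t->npending--;
      g_in_critical = false;     /* Released before the handler runs */

      assert(handler != NULL);
      handler(signo);
    }
}

/* Example handler that records the state it was called in. */

void toy_handler(int signo)
{
  (void)signo;
  g_handlers_run++;
  if (g_in_critical)
    {
      g_handlers_in_critical++;
    }
}
```

In this model the handler never observes the critical section, because the only place that calls it has already dropped the lock. The real work is finding the equivalent of `toy_return_to_user` in each platform's context switch path.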

Also, in SMP it is not necessary to force delivery of the signal by locking the
other CPU and scheduling the signal action via an SMP call. It should be enough
to add the action to the pending action queue and trigger a re-schedule on that
CPU. If the task/thread happens to migrate to another CPU before the triggered
re-schedule happens, it doesn't matter: the signal action is executed anyway
when the thread is next scheduled to run on the other CPU.
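
The migration argument can be checked with another small toy model. Again, this is a sketch under assumptions rather than NuttX code: the names (`toy_smp_task`, `toy_smp_queue_action`, `toy_smp_migrate`, `toy_smp_schedule`) are hypothetical, and the re-schedule IPI is modelled as a per-CPU flag. The pending action lives with the task, so a stale re-schedule request on the old CPU is harmless and the action is delivered wherever the task runs next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. Every name here is hypothetical, not a NuttX API. */

#define TOY_NCPUS 2

struct toy_smp_task
{
  int cpu;        /* CPU the task is currently assigned to */
  int npending;   /* Number of queued signal actions */
  int ndelivered; /* Number of signal actions delivered so far */
};

static bool g_resched_ipi[TOY_NCPUS]; /* Pending re-schedule request per CPU */

/* Signal dispatch: queue the action on the task and poke the task's CPU.
 * No other CPU is locked and no handler runs here.
 */

void toy_smp_queue_action(struct toy_smp_task *t)
{
  assert(t->cpu >= 0 && t->cpu < TOY_NCPUS);
  t->npending++;
  g_resched_ipi[t->cpu] = true;  /* Stands in for a re-schedule IPI */
}

/* The task moves to another CPU; its pending actions move with it. */

void toy_smp_migrate(struct toy_smp_task *t, int cpu)
{
  assert(cpu >= 0 && cpu < TOY_NCPUS);
  t->cpu = cpu;
}

/* One pass through the scheduler on 'cpu'. 'next' is the task picked to
 * run, or NULL if the CPU goes idle. Returns the number of signal actions
 * delivered before returning to user code.
 */

int toy_smp_schedule(int cpu, struct toy_smp_task *next)
{
  int delivered = 0;

  assert(cpu >= 0 && cpu < TOY_NCPUS);
  g_resched_ipi[cpu] = false;    /* The request has been consumed */

  if (next != NULL && next->cpu == cpu)
    {
      delivered         = next->npending;
      next->ndelivered += delivered;
      next->npending    = 0;
    }

  return delivered;
}
```

If the task is queued on CPU 0, migrates to CPU 1, and CPU 0 then handles the stale request with nothing to run, no action is lost. The next `toy_smp_schedule(1, &task)` delivers it.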

Some time ago, I made a quick hack to test an alternative way to schedule
signal actions for RISC-V, removing the critical section:

https://github.com/apache/nuttx/blob/f4cc2c262853bce7321bf82661a93426eedfd854/arch/risc-v/src/common/riscv_schedulesigaction.c#L93

This works fine in flat mode, on platforms where the context switch is done via
the IRQ. But obviously it is not a generic solution: not all platforms enter
irq_dispatch() at context switch.

So, in order to fix this, one would need to:

- Identify the correct place in the code to schedule the signal action for
each platform (in the context switch path). There should be just one place and
one context from which to call "up_schedule_sigaction".
- Identify the correct way to trigger a context switch from within
"nxsig_queue_action". For the single-CPU case, "up_switch_context" is probably
the easiest one. In the SMP case, the cleanest solution is probably a dedicated
syscall, common to all platforms, requesting a re-schedule. This can be
triggered via an IPI in the same way as the "smp_call" is done.
- Clean up "up_schedule_sigaction" for each platform, removing the extra
branches.

I started looking into this some time ago but never found time to continue,
so I decided to write this issue so that anyone interested can pick it up.

### On which OS does this issue occur?

[OS: Linux]

### What is the version of your OS?

22.04.1-Ubuntu

### NuttX Version

master

### Issue Architecture

[Arch: all]

### Issue Area

[Area: Kernel]

### Host information

_No response_

### Verification

- [x] I have verified before submitting the report.

