pussuw opened a new issue, #14448:
URL: https://github.com/apache/nuttx/issues/14448

   ### Description / Steps to reproduce the issue
   
   There is a serious issue with the current asynchronous signal delivery
   system: it can force another CPU to resume execution at a point where this
   must not happen. Take the following example, where CPU0 takes a semaphore
   and CPU1 sends a signal to the task running on CPU0:
   
   ```
   CPU0                                                CPU1
   nxsem_wait() // Take semaphore                                     
   enter_critical_section()                 
   ... in atomic section ...                           
   up_switch_context(this_task(), rtcb)
   ---> next process , atomic section over             enter_critical_section()
                                                       nxsig_queue_action()
                                                       nxsched_smp_call_single()
                                                         // Setup interrupt on CPU0 to run sig_handler
                                                         nxsched_smp_call_single(stcb->cpu, sig_handler, &arg, true);
                                                       <--- SMP_CALL interrupt pends on CPU0
                                                       leave_critical_section()
   ---> SMP_CALL interrupt fires on CPU0
                  |
                  v
   nxsched_smp_call_handler()
     // Run sig_handler on CPU0
     sig_handler()
     up_schedule_sigaction()
     // up_schedule_sigaction makes task on CPU0 return to riscv_sigdeliver
     tcb->xcp.regs[REG_EPC] = (uintptr_t)riscv_sigdeliver;
                 |
                 v
   riscv_smp_call_handler()
     // riscv_smp_call_handler restores (new) context, EPC=riscv_sigdeliver
     tcb = current_task(cpu);
     riscv_savecontext(tcb);
     nxsched_process_delivered(cpu);
     tcb = current_task(cpu);
     riscv_restorecontext(tcb);
                  |
                  v
   riscv_smp_call_handler() interrupt returns
                  |
                  v
   riscv_sigdeliver()
                  |
                  v
   signal_handler()
     // Signal handler runs in userspace
             ***CRASH***
   ```
   
   If the process on CPU0 crashes in the signal handler, the semaphore taken
   on CPU0 does *not* get freed, causing a resource leak.

   The leak is not an issue for user resources but is catastrophic for kernel
   resources!
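
The failure mode can be sketched with a toy model in plain C. Nothing below is NuttX code: `toy_sem`, `toy_tcb`, and every `toy_*` function are hypothetical stand-ins, and the model assumes that a fault in the handler kills the task without ever returning to the interrupted kernel path. It only illustrates why redirecting the resume address while a kernel semaphore is held can leak that semaphore.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not NuttX structures. */

struct toy_sem
{
  int count;                    /* 1 = free, 0 = held */
};

struct toy_tcb;
typedef void (*toy_entry_t)(struct toy_tcb *tcb);

struct toy_tcb
{
  toy_entry_t resume_pc;        /* Stands in for xcp.regs[REG_EPC] */
  toy_entry_t saved_pc;         /* Original resume point, kept for later */
  struct toy_sem *held;         /* Kernel semaphore currently held, if any */
  bool handler_faults;          /* Assumption: does the user handler crash? */
  bool alive;
};

/* Normal resume point: finish the kernel work and release the semaphore. */

static void toy_kernel_path(struct toy_tcb *tcb)
{
  if (tcb->held != NULL)
    {
      tcb->held->count++;
      tcb->held = NULL;
    }
}

/* Stands in for riscv_sigdeliver(): run the handler, then resume. */

static void toy_sigdeliver(struct toy_tcb *tcb)
{
  if (tcb->handler_faults)
    {
      tcb->alive = false;       /* Task dies; kernel path never resumes */
      return;
    }

  tcb->resume_pc = tcb->saved_pc;
  tcb->saved_pc(tcb);
}

/* Stands in for up_schedule_sigaction(): redirect the resume address. */

static void toy_schedule_sigaction(struct toy_tcb *tcb)
{
  tcb->saved_pc  = tcb->resume_pc;
  tcb->resume_pc = toy_sigdeliver;
}

/* Stands in for the interrupt return that restores the (new) context. */

static void toy_return_from_interrupt(struct toy_tcb *tcb)
{
  assert(tcb->resume_pc != NULL);
  tcb->resume_pc(tcb);
}

/* Returns the semaphore count after the scenario: 1 = released, 0 = leaked. */

int toy_run(bool signal_arrives, bool handler_faults)
{
  struct toy_sem sem = { .count = 1 };
  struct toy_tcb tcb =
  {
    .resume_pc      = toy_kernel_path,
    .saved_pc       = NULL,
    .held           = NULL,
    .handler_faults = handler_faults,
    .alive          = true,
  };

  sem.count--;                  /* Task takes the semaphore */
  tcb.held = &sem;

  if (signal_arrives)
    {
      toy_schedule_sigaction(&tcb);
    }

  toy_return_from_interrupt(&tcb);
  return sem.count;
}
```

In this model `toy_run(false, false)` and `toy_run(true, false)` both leave the semaphore released, while `toy_run(true, true)` leaves it held forever, which corresponds to the leak described above.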
   
   
   
   
   ### On which OS does this issue occur?
   
   [OS: Linux]
   
   ### What is the version of your OS?
   
   Irrelevant
   
   ### NuttX Version
   
   master
   
   ### Issue Architecture
   
   [Arch: all]
   
   ### Issue Area
   
   [Area: Posix]
   
   ### Verification
   
   - [X] I have verified before submitting the report.


