hartmannathan commented on PR #17468:
URL: https://github.com/apache/nuttx/pull/17468#issuecomment-3660536525
> @anchao @acassis
>
> **Patch Motivation & Proper Solution:**
>
> The main purpose of this patch is to address use-after-free issues that
can occur when a thread unexpectedly exits. For example, in scenarios such as
proc_open(), when nxsched_get_tcb obtains the TCB but the corresponding thread
exits at a higher priority, or during calls to functions like`
nxclock_gettime(), sched_backtrace(), or mm_memdump(),` a similar risk exists.
These problems should be systematically addressed by introducing critical
sections or by employing other synchronization strategies in those critical
paths.
>
>
>
> **Not Just SMP:**
>
> These use-after-free problems can occur in both SMP and UP environments —
this is not an SMP-only issue.
>
>
>
> **Limitations of Critical Sections:**
>
> Even with the addition of critical sections, these issues may not be fully
resolved. For instance, consider the following example:
>
>
>
> ```
>
>
>
> mm_memdump()
>
> {
>
> enter_critical_section();
>
> tcb = nxsched_get_tcb(pid);
>
> syslog(); // might wake up or schedule another thread
>
> name = get_task_name(tcb);
>
> leave_critical_section();
>
> }
>
> ```
>
>
>
> In this scenario, even with enter_critical_section(), there can still be a
race if syslog() internally leads to context switches that affect the TCB’s
lifetime.
>
>
>
> **On Mainstream RTOS Practice:**
>
> The notion that "mainstream RTOS do not incorporate this feature, as the
number of threads is predetermined in most scenarios" is not accurate. In a
typical IoT system, creating and destroying threads (pthreads) is frequent and
widespread—thread counts are rarely static. While in safety-critical automotive
contexts thread counts might be fixed, such scenarios are the exception, not
the rule, for consumer and general-purpose products.
>
>
>
> **On Performance and Size Impact:**
>
> Statements such as "it will exert an extremely detrimental impact on both
size and performance" are not substantiated without concrete measurement or
testing. According to the data provided by @hujun260, the actual performance
impact is minimal; please carefully review the test report. Regarding code
size: yes, adding additional safety checks will increase flash usage, and we
will supplement the PR with precise flash size measurements.
>
>
>
> If there is a better alternative to solve this problem, then this patch is
unnecessary. Patches are welcome.
@GUIDINGLI @hujun260 This explanation is extremely helpful for giving
context and understanding this change. I recommend to please add this
explanation to the PR description and also to the commit log, so that it can be
easily understood in the future. Thank you!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]