wangzhi16 opened a new pull request, #17564: URL: https://github.com/apache/nuttx/pull/17564
This reverts commit 29e50ffa7374fa4c473d8d4f8cb3506665443d3e.

## Summary

Placing the main thread's TCB and the group in the same memory block, so that they are allocated and freed together, causes the following two problems:

1. When the main thread creates a child thread and detaches it, the main thread may have exited while its TCB (Task Control Block) has not yet been released.
2. As a result, the main thread's TCB can be double-freed.

The root of the problem is that the main thread's TCB and the group are bound together. When releasing the main thread's TCB, an additional check is needed to ensure the main thread was the last to leave the group. If this check and the `free` operation were guaranteed to be atomic, the logic would be sound and no double free would occur. However, this atomicity cannot be fully guaranteed: if another `free` operation blocks, the problem still arises. For example:

> The main thread exits first, executes `group_leave` to remove itself from the group, then executes another `free` operation (such as `nxtask_joindestroy`) that blocks, and the scheduler switches to a child thread.

> When the child thread executes `group_leave`, it discovers that it is the last thread to exit and releases the group (the child thread assumes the main thread has already performed `realloc`, so it releases only the group structure, which in fact releases the main thread's TCB along with it). It then releases its own TCB and exits.

> When execution switches back to the main thread, during `release` it checks that there are no other threads in the group and assumes it was the last thread to exit. Both the main thread and the child thread believe they were the last to leave the group, which is the root cause of the problem. The main thread then executes the `free` operation, resulting in a double free.

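
The interleaving described above can be replayed in a small, deterministic C model. This is only a sketch under stated assumptions: `model_shared_s`, `model_group_leave`, `model_shared_exit`, and `model_split_exit` are hypothetical names, not NuttX APIs, and frees are counted rather than performed so that the model itself is safe to run. The second function assumes that, after the revert, the main thread's TCB and the group are separate allocations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not NuttX code: one block holds both the main
 * thread's TCB and the group, so releasing either releases the block.
 */

struct model_shared_s
{
  int members;     /* Threads that have not yet left the group */
  int block_frees; /* Times the shared TCB+group block was freed */
};

/* Remove one member; report whether the group is now empty. */

static bool model_group_leave(int *members)
{
  assert(*members > 0);
  return --(*members) == 0;
}

/* Replays the failing interleaving; returns how often the shared
 * block is freed (2 means a double free).
 */

int model_shared_exit(void)
{
  struct model_shared_s s = { 2, 0 };

  /* Main thread: leaves the group, then blocks in another free(). */

  (void)model_group_leave(&s.members);

  /* Child thread: last member out, so it releases the group, which
   * is the same block as the main thread's TCB.
   */

  if (model_group_leave(&s.members))
    {
      s.block_frees++;
    }

  /* Main thread resumes: the group is empty, so it also believes it
   * was last and frees its TCB, i.e. the same block again.
   */

  if (s.members == 0)
    {
      s.block_frees++;
    }

  return s.block_frees;
}

/* Same interleaving with separate allocations (assumed post-revert
 * layout); returns the largest free count of any single block.
 */

int model_split_exit(void)
{
  int members = 2;
  int group_frees = 0;
  int tcb_frees = 0;

  (void)model_group_leave(&members);   /* Main thread leaves, blocks */

  if (model_group_leave(&members))     /* Child is last: frees group */
    {
      group_frees++;
    }

  tcb_frees++;                         /* Main frees only its own TCB */

  return group_frees > tcb_frees ? group_frees : tcb_frees;
}
```

In this model, `model_shared_exit()` returns 2 (the shared block is freed twice), while `model_split_exit()` returns 1, because the "was I the last to leave" check no longer decides who frees the main thread's TCB.
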
## Impact

None

## Testing

### Before revert:

Configuration options that need to be enabled:

```
CONFIG_SCHED_HAVE_PARENT=y
CONFIG_CANCELLATION_POINTS=y
```

Test on qemu-armv7a:

build: `tools/configure.sh qemu-armv7a:nsh; make -j16`

result:

<img width="1290" height="359" alt="image" src="https://github.com/user-attachments/assets/69390a1e-e934-40d7-bf2e-f34b1333f9bb" />

The crash shown above was caused by the execution flow below.

```
thread_parent ------------------------------------thread_child
detach()
group_leave()
nxsig_cleanup()
free()---->block!
---------------------------------------------------group_leave()
---------------------------------------------------check no member in group, release thread_parent.
---------------------------------------------------exit()
release_tcb()--->double free!!!
```

### After revert:

Test on qemu-armv7a:

build: `tools/configure.sh qemu-armv7a:nsh; make -j16`

result:

<img width="932" height="154" alt="image" src="https://github.com/user-attachments/assets/e6685ccf-98c9-454f-bcfb-ad09c2e7d449" />

**_PRs without testing information will not be accepted. We will request test logs._**

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
