[email protected], le mar. 03 mars 2026 04:08:15 +0000, a ecrit:
> #### S-1: `_hurd_sigstate_delete` use-after-free (CRITICAL)
> 
> **Files**: `hurdsig.c:149-170` (delete), `hurdsig.c:1210-1226` (signal 
> delivery)
> 
> `_hurd_sigstate_delete` unlinks `ss` from the list under `_hurd_siglock`, then
> frees it **without acquiring `ss->lock`**. Meanwhile, the signal thread in
> `post_all_pending_signals` holds `ss->lock` (acquired during list traversal)
> but releases `_hurd_siglock` before calling `post_pending(ss, ...)`. A 
> concurrent
> `_hurd_sigstate_delete` will free `ss` while the signal thread still holds
> `ss->lock` and accesses `ss->pending`, `ss->thread`, etc.
> 
> **Impact**: Heap corruption, crash. A thread exiting during process-wide 
> signal
> delivery triggers this. This is the most likely cause of ext2fs deadlocks and
> random crashes under SMP load.

> #### H-1: `__pthread_kill` TOCTOU on `kernel_thread` (CRITICAL)
> 
> **File**: `sysdeps/hurd/htl/pt-kill.c:40-44`
> 
> ```c
> if (pthread->kernel_thread == MACH_PORT_DEAD)  // line 40: no lock
>     return 0;
> ss = _hurd_thread_sigstate (pthread->kernel_thread);  // line 44: race!
> ```
> 
> Between the check at line 40 and the use at line 44, the target thread can 
> exit,
> setting `kernel_thread = MACH_PORT_DEAD` and destroying the port. Thread A
> then passes a dead/recycled port to `_hurd_thread_sigstate`. No lock protects
> `kernel_thread`.
> 
> **Impact**: Use of dead Mach port, misrouted signals, potential crash. This 
> is a
> classic TOCTOU race that manifests readily on SMP.

> #### H-2: `kernel_thread` written with plain store (CRITICAL)
> 
> **File**: `sysdeps/mach/htl/pt-thread-terminate.c:61`
> 
> ```c
> thread->kernel_thread = MACH_PORT_DEAD;  // plain store, no barrier
> ```
> 
> No atomic operation, no memory barrier. Combined with H-1's unlocked read,
> this creates a data race. On x86 the store is atomic (aligned word), but
> there's no ordering guarantee that other fields (e.g., `nr_refs`) are visible
> before the `MACH_PORT_DEAD` sentinel is seen.

> #### H-6: `__pthread_dealloc_finish` writes `terminated` outside lock (LOW on 
> x86)
> 
> **File**: `htl/pt-dealloc.c:72`
> 
> `terminated = TRUE` is written after the thread structure was added to the 
> free
> list (under `__pthread_free_threads_lock`). The store happens outside the lock
> scope, with no memory barrier. `__pthread_alloc` reads `terminated` under the
> lock but the happens-before relationship is broken. On x86 TSO this works; on
> weaker architectures, alloc could see stale `terminated == FALSE`.

I was looking at such issue the other week, prompted by some race
conditions seen on thread termination.

The relation between a thread existing and ss has always been sketchy,
at least because htl used to be completely separate from the upstream
glibc.

One of the main blockers I see is that the liveness of ss is not
tied to the liveness of a pthread_t. We should migrate to making
the ss getting allocated by __pthread_thread_alloc and freed by
__pthread_thread_terminate, to keep coherency between the two, and
making sure that we don't leak resources. Notably, if one calls
_hurd_thread_sigstate on a thread that is terminating and has already
unregistered itself from the list of threads, _hurd_thread_sigstate will
allocate a new ss which will never be freed.

Along the path, we should use the thread->nr_refs reference counter in
functions that manipulate a pthread_t. For instance, in __pthread_kill,
we want __pthread_getid to take a reference so the thread structure
stays there, as well as the ss, until we are finished with it.

I have added that as todo on the contributing page.

Samuel

Reply via email to