Re: [PATCH v3 0/2] fs/pipe: reduce pipe->mutex contention by pre-allocating outside the lock

Oleg Nesterov Wed, 17 Jun 2026 01:53:07 -0700

On 06/16, Josh Triplett wrote:
>
> On Sun, May 24, 2026 at 07:44:57AM -0700, Breno Leitao wrote:
> > This series pre-allocates pages outside pipe->mutex in
> > anon_pipe_write(): for writes that span more than one full page, up
> > to PIPE_PREALLOC_MAX (8) pages are allocated via a per-page
> > alloc_page() loop before the mutex is taken. anon_pipe_get_page()
> > then drains the prealloc array first, falls back to the per-pipe
> > tmp_page[] cache, and only enters the allocator under the mutex for
> > the leftover pages (writes larger than PIPE_PREALLOC_MAX, single-page
> > writes that skip prealloc, or shortfalls when the prealloc loop
> > fails). Leftover prealloc pages are recycled into tmp_page[] before
> > unlock and any remainder is put_page()'d after unlock, keeping the
> > allocator out of the critical section on both sides.
> [...]
> > I also vibe-coded a microbenchmark to validate the change. It sweeps
> > writers x readers over {1,2,5} x {1,5,10} with 64KB writes against a
> > 1 MB pipe and prints throughput + latency percentiles per config.
>
> How do the numbers compare with 1-byte writes/reads? (It's fine if
> they're not *faster*, just want to make sure they don't get any
> *worse*. This case comes up a lot with pipes used for synchronization or
> event reporting, such as with make.)


Note the "for writes that span more than one full page" above. The
pre-allocation does nothing if total_len <= PAGE_SIZE, so 1-byte writes
should take the same path as before.
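
To illustrate the gate, here is a userspace sketch. It is not the code from
the patch: the helper name, the 4096-byte page size, and the round-up of
total_len to whole pages are all assumptions made for illustration. Only
the "total_len <= PAGE_SIZE does nothing" rule and the cap of 8 come from
the cover letter.

```c
#include <stddef.h>

/* Assumed for illustration; the kernel's PAGE_SIZE depends on the arch. */
#define SKETCH_PAGE_SIZE    4096UL
/* PIPE_PREALLOC_MAX is 8 according to the cover letter. */
#define SKETCH_PREALLOC_MAX 8UL

/*
 * Hypothetical helper: number of pages to allocate before taking
 * pipe->mutex. Returns 0 when the write fits in a single page, so
 * small writes never touch the pre-allocation path.
 */
static size_t sketch_prealloc_count(size_t total_len)
{
	size_t pages;

	if (total_len <= SKETCH_PAGE_SIZE)
		return 0;

	/* Round up to whole pages (an assumption, see above). */
	pages = total_len / SKETCH_PAGE_SIZE +
		(total_len % SKETCH_PAGE_SIZE != 0);

	return pages < SKETCH_PREALLOC_MAX ? pages : SKETCH_PREALLOC_MAX;
}
```

Under these assumptions a 1-byte write and a write of exactly one page both
yield 0, while the 64KB writes from the microbenchmark hit the cap of 8.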

Oleg.

Re: [PATCH v3 0/2] fs/pipe: reduce pipe->mutex contention by pre-allocating outside the lock
