Re: [PATCH v3 0/2] fs/pipe: reduce pipe->mutex contention by pre-allocating outside the lock

Breno Leitao Wed, 17 Jun 2026 03:26:48 -0700

On Wed, Jun 17, 2026 at 10:52:40AM +0200, Oleg Nesterov wrote:
> On 06/16, Josh Triplett wrote:
> >
> > On Sun, May 24, 2026 at 07:44:57AM -0700, Breno Leitao wrote:
> > > This series pre-allocates pages outside pipe->mutex in
> > > anon_pipe_write(): for writes that span more than one full page, up
> > > to PIPE_PREALLOC_MAX (8) pages are allocated via a per-page
> > > alloc_page() loop before the mutex is taken. anon_pipe_get_page()
> > > then drains the prealloc array first, falls back to the per-pipe
> > > tmp_page[] cache, and only enters the allocator under the mutex for
> > > the leftover pages (writes larger than PIPE_PREALLOC_MAX, single-page
> > > writes that skip prealloc, or shortfalls when the prealloc loop
> > > fails). Leftover prealloc pages are recycled into tmp_page[] before
> > > unlock and any remainder is put_page()'d after unlock, keeping the
> > > allocator out of the critical section on both sides.
> > [...]
> > > I also vibe-coded a microbenchmark to validate the change. It sweeps
> > > writers x readers over {1,2,5} x {1,5,10} with 64KB writes against a
> > > 1 MB pipe and prints throughput + latency percentiles per config.
> >
> > How do the numbers compare with 1-byte writes/reads? (It's fine if
> > they're not *faster*, just want to make sure they don't get any
> > *worse*. This case comes up a lot with pipes used for synchronization or
> > event reporting, such as with make.)
> 
> Note the "for writes that span more than one full page" above. Pre-allocate
> does nothing if total_len <= PAGE_SIZE.


Exactly. The pre-allocation only triggers for multi-page writes:

anon_pipe_get_page_prealloc() returns immediately when total_len <= PAGE_SIZE,
so a 1-byte write (or any write of at most one page) never enters the new path.

anon_pipe_get_page() then falls through to the existing tmp_page/alloc_page
logic exactly as before; the only added cost is one length check and one pop
from an empty prealloc array, both trivially predicted branches.
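
The gating described above can be sketched in plain userspace C. This is
not the code from the patch: sketch_prealloc_count(), SKETCH_PAGE_SIZE and
SKETCH_PREALLOC_MAX are hypothetical names, the 4096-byte page size is an
assumption, and rounding the page count up is a guess at how "up to
PIPE_PREALLOC_MAX (8) pages" is computed. Only the total_len <= PAGE_SIZE
early return and the cap of 8 come from the thread.

```c
#include <stddef.h>

/* Assumed values for illustration only; the real constants are kernel-side. */
#define SKETCH_PAGE_SIZE    4096UL
#define SKETCH_PREALLOC_MAX 8UL   /* PIPE_PREALLOC_MAX (8) per the cover letter */

/*
 * Hypothetical helper: number of pages a write of total_len bytes would
 * pre-allocate before the mutex is taken.
 */
static size_t sketch_prealloc_count(size_t total_len)
{
	size_t pages;

	/* Writes of at most one page skip the pre-allocation path entirely. */
	if (total_len <= SKETCH_PAGE_SIZE)
		return 0;

	/* Assumption: round up to whole pages, then cap at the prealloc limit. */
	pages = (total_len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	return pages < SKETCH_PREALLOC_MAX ? pages : SKETCH_PREALLOC_MAX;
}
```

Under these assumptions a 1-byte write yields 0 pages, while a 64KB write
(16 pages) is capped at 8 and the remaining pages would go through the
existing path under the mutex.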

Measured it anyway, _just to be sure_, with a 1-byte ping-pong (perf bench sched pipe -s 1):

    baseline:  2.674 usecs/op
    patched:   2.710 usecs/op   (+1.3%, within run-to-run noise)
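
For anyone who wants a rough cross-check without perf, a 1-byte ping-pong
can be approximated in userspace as below. This is a minimal sketch, not
the methodology of perf bench: pingpong_usecs_per_round_trip() is a
hypothetical helper, it times a full round trip (one write and one read on
each side), and it does no CPU pinning or warm-up, so absolute numbers
will not be directly comparable to the figures above.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: bounce one byte between parent and child over two
 * pipes, iters times. Returns mean microseconds per round trip, or -1.0
 * on error.
 */
static double pingpong_usecs_per_round_trip(int iters)
{
	int ping[2], pong[2];
	struct timespec t0, t1;
	char byte = 'x';
	pid_t pid;
	int i;

	if (iters <= 0 || pipe(ping) < 0)
		return -1.0;
	if (pipe(pong) < 0) {
		close(ping[0]);
		close(ping[1]);
		return -1.0;
	}

	pid = fork();
	if (pid < 0) {
		close(ping[0]); close(ping[1]);
		close(pong[0]); close(pong[1]);
		return -1.0;
	}

	if (pid == 0) {
		/* Child: echo every byte received on ping back on pong. */
		close(ping[1]);
		close(pong[0]);
		for (i = 0; i < iters; i++) {
			if (read(ping[0], &byte, 1) != 1 ||
			    write(pong[1], &byte, 1) != 1)
				_exit(1);
		}
		_exit(0);
	}

	/* Parent: send one byte, wait for the echo, repeat. */
	close(ping[0]);
	close(pong[1]);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++) {
		if (write(ping[1], &byte, 1) != 1 ||
		    read(pong[0], &byte, 1) != 1)
			break;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	close(ping[1]);
	close(pong[0]);
	waitpid(pid, NULL, 0);

	if (i != iters)
		return -1.0;

	return ((t1.tv_sec - t0.tv_sec) * 1e6 +
		(t1.tv_nsec - t0.tv_nsec) / 1e3) / iters;
}
```

Calling pingpong_usecs_per_round_trip(100000) on a baseline and a patched
kernel, several times each, gives a feel for the run-to-run noise that a
difference of about 1% has to be judged against.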

--breno

Re: [PATCH v3 0/2] fs/pipe: reduce pipe->mutex contention by pre-allocating outside the lock
