Barry Song <[email protected]> writes:

> On Fri, Jun 19, 2026 at 12:41 PM Ritesh Harjani (IBM)
> <[email protected]> wrote:
>>
>> THP_SWAP avoids splitting of a transparent huge folio into 32 smaller
>> 64K folios (Radix-64K pagesize / 2M PMD) or into 256 smaller 64K folios
>> (Hash-64K pagesize / 16M PMD), during swapout. This improves the
>> swapping performance since all the bookkeeping & I/O submission happens
>> once per large folio. More details at [1].
>>
>> PowerPC Book3S64 could not enable this before because PMD_ORDER is
>> selected at runtime depending upon the chosen MMU. The earlier patches
>> in this series turn SWAPFILE_CLUSTER into a runtime value and introduce
>> an ARCH_MAX_PMD_ORDER upper-bound override for SWAP_NR_ORDERS. With those
>> changes, we can now enable THP SWAP for Book3S64.
>>
>> This increases bandwidth throughput with zram backend for swapout by
>> 40-50% with Radix and 100-130% with Hash (Tested by Sayali)
>
> Thanks!
>
> I am curious about the contents of the anonymous memory being tested
> and the compression algorithm used by zram.
>
I am sure it was derived from your microbenchmark itself, which you had
shared here (so a repetitive pattern), with the default zram compression
algorithm. Thanks for that :)

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0637c505f

I think I got your point - I can mention that it was a microbenchmark
similar to yours and not a real-world workload test. Is this what you
meant here?

-ritesh

>>
>> [1]: https://lore.kernel.org/all/[email protected]/
>
> Best Regards
> Barry
