On Mon, Feb 9, 2026 at 3:29 PM Andres Freund <[email protected]> wrote:
>
> Hi,
>
> On 2026-02-09 14:41:12 +0100, Jakub Wartak wrote:
> > I've thought that the potential main reason of the hit would be slow fork(),
> > so I had an idea why we fork() with majority of memory being shared_buffers
> > (BufferBlocks) that is not really used inside postmaster itself
> > (I mean it does not use it, only backends do use it). I've thought it could
> > be cool if we could just init the memory, leave just the fd from memfd_create
> > for s_b around (that is unmap() BufferBlocks from the postmaster thus lowering
> > its RSS/smaps footprint) and then on fork() the fork() would NOT have to copy
> > that big kernel VMA for shared_buffers. Instead (in theory - only the fd that
> > is the reference - thereby we could increase the scalability of the postmaster
> > (kernel would need to perform less work during fork()). Later on, the classic
> > backends on their side would mmap() the region back from the fd created earlier
> > (in postmaster) using memfd_create(2), but that would happen as part of many
> > backends (so workload would be spread across many CPUs).
>
> FWIW, when looking at this in the past there were two noteworthy things:
>
> 1) The main driver of slowness was *NOT* shared buffers, but all the libraries
>    we link to. Particularly openssl makes things a *lot* slower, due to all
>    the small mappings it creates.  If you compare the fork speed of a postgres
>    with minimal dependencies and one with all the dependencies, you'll see a
>    huge difference.
>
>    The reason that openssl is so bad is that it modifies data in all the
>    copy-on-write mappings during process exit processing. See [1].
>
>
> 2) A lot of the slowness isn't actually from the fork overhead itself, but
>    from fork competing with the processing during process exit, as both taking
>    conflicting locks.

Interesting, thanks for sharing this. I've studied fork() itself a little bit
more (fork() vs. various factors, without crazy exit() handlers). See the
attached results from 2 machines, or just run the fork_bench C proggie. My
conclusions on 6.14.x are as follows (these are mostly notes for myself from
studying this, but I think I'll share; maybe just one variable is missing here:
how fork() ends up being affected by NUMA - future TODO for me):

MAP_SHARED (findings for this $thread)
--------------------------------------
a) In the "mmap-MAP_SHARED" cases, the max number of fork()s/sec drops, but
only slightly, as the number of (still only MAP_SHARED!) segments increases.
This applies both with huge pages and without them. memfd_normal seems to
behave in an almost identical way, so at least from that angle the patch seems
to be ok (assuming it has just two segments today; yesterday it had 6 for
me ;))

b) My wild trick/assumption - not related to $thread - under "memfd_unmap"
that I've posted earlier - assuming it would double postmaster scalability -
has fizzled: as you say, the overhead of unmapping the segments before
fork()ing and keeping just the memfd to restore that MAP_SHARED mapping
in the child for some reason degrades performance compared to just
letting them persist or using MADV_DONTNEED. Probably it's page faulting
as you say, I haven't measured that. RIP idea.

MAP_PRIVATE (this can be ignored for the purposes of this $thread)
------------------------------------------------------------------
Nevertheless, it is quite interesting to see how those two modes compare, and
it touches on the aspect of openssl and e.g. io_uring creating many VMAs too
[1].

c) MAP_PRIVATE seems to be way slower because fork() must copy PTEs and
mark them as CoW. Performance drops as the total memory (number of pages)
increases. We should not have big MAP_ANONYMOUS|MAP_PRIVATE segments (or
even just many segments [1]) in the postmaster if we want fast fork().

But even so, having a lot of MAP_PRIVATE memory (in some edge case? large
heap?) really benefits from huge pages.
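
To put rough numbers on c), here is a minimal sketch, assuming the per-fork
cost of a populated MAP_PRIVATE mapping scales with the number of page-table
entries the kernel has to copy. pages_to_copy() is a hypothetical helper (not
part of fork_bench), and the 4 kB / 2 MB page sizes are the usual x86-64
values, assumed here rather than measured.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of page-sized entries backing a fully
 * populated mapping of mapping_bytes, rounded up to whole pages.
 * Under the stated assumption, this is roughly what fork() has to walk
 * and mark copy-on-write for a MAP_PRIVATE mapping.
 */
uint64_t pages_to_copy(uint64_t mapping_bytes, uint64_t page_bytes)
{
    return (mapping_bytes + page_bytes - 1) / page_bytes;
}
```

For a 16 GB mapping, pages_to_copy(16ULL << 30, 4096) gives 4194304 entries,
while pages_to_copy(16ULL << 30, 2ULL << 20) gives 8192, a 512x difference.
The measured gap for 16 GB of MAP_PRIVATE in the tables below is on the order
of 100x, so the per-entry copy is evidently not the only cost involved.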

My takeaway from this - and it's unrelated to this $thread, but still an
interesting finding for the future: once we have multithreading, we
might not be able to fork() efficiently from there (or there will be a
huge impact from the MAP_PRIVATE/big heap of all threads). It will clearly
depend on the architecture: but if the postmaster is removed and one
giant PID has multiple TIDs and somebody wants to run COPY
TO/FROM PROGRAM often from there, we are screwed unless those segments are
marked MADV_DONTFORK.
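
For illustration, a minimal sketch of what excluding a segment from fork()
via the madvise(2) flag MADV_DONTFORK looks like on Linux.
child_sees_mapping() is a hypothetical helper, not PostgreSQL code: it maps a
few private anonymous pages, optionally marks them MADV_DONTFORK, forks, and
lets the child probe the range with mincore(), which fails with ENOMEM when
the range is not mapped.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper. Returns 1 if a forked child still has the parent's
 * private mapping, 0 if the mapping is absent in the child, -1 on error.
 */
int child_sees_mapping(int use_dontfork)
{
    size_t len = 4 * (size_t) sysconf(_SC_PAGESIZE);
    unsigned char vec[4];       /* one byte per page, as mincore() requires */
    int status;
    pid_t pid;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    memset(p, 'A', len);        /* populate the pages before fork() */

    if (use_dontfork && madvise(p, len, MADV_DONTFORK) != 0) {
        munmap(p, len);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        munmap(p, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: report via exit status whether the range is mapped. */
        if (mincore(p, len, vec) == 0)
            _exit(1);
        _exit(errno == ENOMEM ? 0 : 2);
    }

    if (waitpid(pid, &status, 0) < 0) {
        munmap(p, len);
        return -1;
    }
    munmap(p, len);
    if (!WIFEXITED(status) || WEXITSTATUS(status) > 1)
        return -1;
    return WEXITSTATUS(status);
}
```

With use_dontfork = 0 the child inherits the mapping as a CoW copy; with
use_dontfork = 1 the range is simply not present in the child, so fork() has
nothing to copy for it. The catch is that the child must never touch such a
range, which is why it only fits segments the forked program does not need.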

> I seriously doubt it's a good idea to delay the mmapping until after the fork,
> that'll just lead to more different mappings to exist that then all need to be
> tracked separately by the kernel.

Right, the raw numbers are not showing this as a good idea.

-J.

[1] - https://www.postgresql.org/message-id/7bduf2aqh6ygz7qugmb65ohczozeed36oscviebhjcvussjqt4%405fcoh7427txo
Starting Sweep (HugePages: ENABLED, Max Mem: 16GB, Max Time/Test: 2.0s)
Mode                 | Segs  | SizeGB | Forks/sec   
----------------------------------------------------------------------
mmap-MAP_SHARED      | 1     | 1      | 11147.96    
mmap-MAP_SHARED      | 2     | 1      | 11159.49    
mmap-MAP_SHARED      | 4     | 1      | 11352.20    
mmap-MAP_SHARED      | 8     | 1      | 10733.16    
mmap-MAP_SHARED      | 16    | 1      | 10264.63    
mmap-MAP_SHARED      | 1     | 2      | 11745.78    
mmap-MAP_SHARED      | 2     | 2      | 11873.96    
mmap-MAP_SHARED      | 4     | 2      | 11821.10    
mmap-MAP_SHARED      | 8     | 2      | 10799.39    
mmap-MAP_SHARED      | 1     | 4      | 11943.26    
mmap-MAP_SHARED      | 2     | 4      | 11842.58    
mmap-MAP_SHARED      | 4     | 4      | 10946.43    
mmap-MAP_SHARED      | 1     | 8      | 11402.92    
mmap-MAP_SHARED      | 2     | 8      | 11701.11    
mmap-MAP_SHARED      | 1     | 16     | 11924.66    
----------------------------------------------------------------------
mmap-MAP_PRIVATE     | 1     | 1      | 6691.29     
mmap-MAP_PRIVATE     | 2     | 1      | 2640.46     
mmap-MAP_PRIVATE     | 4     | 1      | 1094.15     
mmap-MAP_PRIVATE     | 8     | 1      | 583.11      
mmap-MAP_PRIVATE     | 16    | 1      | 315.90      
mmap-MAP_PRIVATE     | 1     | 2      | 2611.27     
mmap-MAP_PRIVATE     | 2     | 2      | 1171.92     
mmap-MAP_PRIVATE     | 4     | 2      | 623.90      
mmap-MAP_PRIVATE     | 8     | 2      | 315.12      
mmap-MAP_PRIVATE     | 1     | 4      | 1136.03     
mmap-MAP_PRIVATE     | 2     | 4      | 601.53      
mmap-MAP_PRIVATE     | 4     | 4      | 318.12      
mmap-MAP_PRIVATE     | 1     | 8      | 626.62      
mmap-MAP_PRIVATE     | 2     | 8      | 313.79      
mmap-MAP_PRIVATE     | 1     | 16     | 318.76      
----------------------------------------------------------------------
memfd_normal         | 1     | 1      | 12030.31    
memfd_normal         | 2     | 1      | 11910.85    
memfd_normal         | 4     | 1      | 11290.21    
memfd_normal         | 8     | 1      | 10934.57    
memfd_normal         | 16    | 1      | 10211.38    
memfd_normal         | 1     | 2      | 11841.26    
memfd_normal         | 2     | 2      | 11679.65    
memfd_normal         | 4     | 2      | 11603.79    
memfd_normal         | 8     | 2      | 11159.31    
memfd_normal         | 1     | 4      | 12076.96    
memfd_normal         | 2     | 4      | 12020.41    
memfd_normal         | 4     | 4      | 11157.79    
memfd_normal         | 1     | 8      | 11907.32    
memfd_normal         | 2     | 8      | 11647.34    
memfd_normal         | 1     | 16     | 11596.94    
----------------------------------------------------------------------
memfd_MADV_DONTNEED  | 1     | 1      | 11883.35    
memfd_MADV_DONTNEED  | 2     | 1      | 11558.44    
memfd_MADV_DONTNEED  | 4     | 1      | 11272.62    
memfd_MADV_DONTNEED  | 8     | 1      | 11406.86    
memfd_MADV_DONTNEED  | 16    | 1      | 10397.43    
memfd_MADV_DONTNEED  | 1     | 2      | 12038.87    
memfd_MADV_DONTNEED  | 2     | 2      | 11832.76    
memfd_MADV_DONTNEED  | 4     | 2      | 11595.20    
memfd_MADV_DONTNEED  | 8     | 2      | 11045.98    
memfd_MADV_DONTNEED  | 1     | 4      | 12084.83    
memfd_MADV_DONTNEED  | 2     | 4      | 11696.95    
memfd_MADV_DONTNEED  | 4     | 4      | 11359.32    
memfd_MADV_DONTNEED  | 1     | 8      | 12103.40    
memfd_MADV_DONTNEED  | 2     | 8      | 12009.92    
memfd_MADV_DONTNEED  | 1     | 16     | 11995.43    
----------------------------------------------------------------------
memfd_unmap          | 1     | 1      | 11489.67    
memfd_unmap          | 2     | 1      | 10878.10    
memfd_unmap          | 4     | 1      | 10294.98    
memfd_unmap          | 8     | 1      | 9788.87     
memfd_unmap          | 16    | 1      | 8231.27     
memfd_unmap          | 1     | 2      | 11091.21    
memfd_unmap          | 2     | 2      | 10794.88    
memfd_unmap          | 4     | 2      | 10565.36    
memfd_unmap          | 8     | 2      | 9575.96     
memfd_unmap          | 1     | 4      | 11341.85    
memfd_unmap          | 2     | 4      | 11077.75    
memfd_unmap          | 4     | 4      | 10298.29    
memfd_unmap          | 1     | 8      | 11251.97    
memfd_unmap          | 2     | 8      | 11038.77    
memfd_unmap          | 1     | 16     | 10997.57    
----------------------------------------------------------------------
Starting Sweep (HugePages: ENABLED, Max Mem: 16GB, Max Time/Test: 2.0s)
Mode                 | Segs  | SizeGB | Forks/sec   
----------------------------------------------------------------------
mmap-MAP_SHARED      | 1     | 1      | 11908.55    
mmap-MAP_SHARED      | 2     | 1      | 11832.81    
mmap-MAP_SHARED      | 4     | 1      | 11613.15    
mmap-MAP_SHARED      | 8     | 1      | 11246.94    
mmap-MAP_SHARED      | 16    | 1      | 10601.63    
mmap-MAP_SHARED      | 1     | 2      | 11903.21    
mmap-MAP_SHARED      | 2     | 2      | 11802.68    
mmap-MAP_SHARED      | 4     | 2      | 11574.65    
mmap-MAP_SHARED      | 8     | 2      | 11223.44    
mmap-MAP_SHARED      | 1     | 4      | 11907.58    
mmap-MAP_SHARED      | 2     | 4      | 11794.99    
mmap-MAP_SHARED      | 4     | 4      | 11588.97    
mmap-MAP_SHARED      | 1     | 8      | 11909.40    
mmap-MAP_SHARED      | 2     | 8      | 11791.38    
mmap-MAP_SHARED      | 1     | 16     | 11873.59    
----------------------------------------------------------------------
mmap-MAP_PRIVATE     | 1     | 1      | 4570.23     
mmap-MAP_PRIVATE     | 2     | 1      | 2839.87     
mmap-MAP_PRIVATE     | 4     | 1      | 1606.92     
mmap-MAP_PRIVATE     | 8     | 1      | 861.98      
mmap-MAP_PRIVATE     | 16    | 1      | 447.01      
mmap-MAP_PRIVATE     | 1     | 2      | 2823.46     
mmap-MAP_PRIVATE     | 2     | 2      | 1608.86     
mmap-MAP_PRIVATE     | 4     | 2      | 862.26      
mmap-MAP_PRIVATE     | 8     | 2      | 448.19      
mmap-MAP_PRIVATE     | 1     | 4      | 1617.32     
mmap-MAP_PRIVATE     | 2     | 4      | 865.24      
mmap-MAP_PRIVATE     | 4     | 4      | 449.12      
mmap-MAP_PRIVATE     | 1     | 8      | 864.51      
mmap-MAP_PRIVATE     | 2     | 8      | 449.74      
mmap-MAP_PRIVATE     | 1     | 16     | 449.38      
----------------------------------------------------------------------
memfd_normal         | 1     | 1      | 11923.72    
memfd_normal         | 2     | 1      | 11801.36    
memfd_normal         | 4     | 1      | 11627.44    
memfd_normal         | 8     | 1      | 11222.97    
memfd_normal         | 16    | 1      | 10520.60    
memfd_normal         | 1     | 2      | 11912.25    
memfd_normal         | 2     | 2      | 11829.10    
memfd_normal         | 4     | 2      | 11582.64    
memfd_normal         | 8     | 2      | 11195.80    
memfd_normal         | 1     | 4      | 11889.64    
memfd_normal         | 2     | 4      | 11811.02    
memfd_normal         | 4     | 4      | 11588.37    
memfd_normal         | 1     | 8      | 11864.42    
memfd_normal         | 2     | 8      | 11753.83    
memfd_normal         | 1     | 16     | 11923.17    
----------------------------------------------------------------------
memfd_MADV_DONTNEED  | 1     | 1      | 11948.33    
memfd_MADV_DONTNEED  | 2     | 1      | 11801.81    
memfd_MADV_DONTNEED  | 4     | 1      | 11588.46    
memfd_MADV_DONTNEED  | 8     | 1      | 11190.21    
memfd_MADV_DONTNEED  | 16    | 1      | 10510.68    
memfd_MADV_DONTNEED  | 1     | 2      | 11963.67    
memfd_MADV_DONTNEED  | 2     | 2      | 11818.99    
memfd_MADV_DONTNEED  | 4     | 2      | 11579.91    
memfd_MADV_DONTNEED  | 8     | 2      | 11237.64    
memfd_MADV_DONTNEED  | 1     | 4      | 11936.34    
memfd_MADV_DONTNEED  | 2     | 4      | 11797.59    
memfd_MADV_DONTNEED  | 4     | 4      | 11604.26    
memfd_MADV_DONTNEED  | 1     | 8      | 11905.17    
memfd_MADV_DONTNEED  | 2     | 8      | 11779.81    
memfd_MADV_DONTNEED  | 1     | 16     | 11917.30    
----------------------------------------------------------------------
memfd_unmap          | 1     | 1      | 11180.23    
memfd_unmap          | 2     | 1      | 10989.48    
memfd_unmap          | 4     | 1      | 10652.56    
memfd_unmap          | 8     | 1      | 9901.21     
memfd_unmap          | 16    | 1      | 8851.35     
memfd_unmap          | 1     | 2      | 11193.39    
memfd_unmap          | 2     | 2      | 10970.04    
memfd_unmap          | 4     | 2      | 10627.30    
memfd_unmap          | 8     | 2      | 9870.38     
memfd_unmap          | 1     | 4      | 11204.30    
memfd_unmap          | 2     | 4      | 10998.67    
memfd_unmap          | 4     | 4      | 10628.48    
memfd_unmap          | 1     | 8      | 11190.34    
memfd_unmap          | 2     | 8      | 11004.03    
memfd_unmap          | 1     | 16     | 11134.26    
----------------------------------------------------------------------

Starting Sweep (HugePages: DISABLED, Max Mem: 16GB, Max Time/Test: 2.0s)
Mode                 | Segs  | SizeGB | Forks/sec   
----------------------------------------------------------------------
mmap-MAP_SHARED      | 1     | 1      | 11567.86    
mmap-MAP_SHARED      | 2     | 1      | 11508.67    
mmap-MAP_SHARED      | 4     | 1      | 11391.57    
mmap-MAP_SHARED      | 8     | 1      | 11107.72    
mmap-MAP_SHARED      | 16    | 1      | 10611.98    
mmap-MAP_SHARED      | 1     | 2      | 11586.96    
mmap-MAP_SHARED      | 2     | 2      | 11546.36    
mmap-MAP_SHARED      | 4     | 2      | 11404.84    
mmap-MAP_SHARED      | 8     | 2      | 11115.03    
mmap-MAP_SHARED      | 1     | 4      | 11595.14    
mmap-MAP_SHARED      | 2     | 4      | 11461.75    
mmap-MAP_SHARED      | 4     | 4      | 11353.02    
mmap-MAP_SHARED      | 1     | 8      | 11582.73    
mmap-MAP_SHARED      | 2     | 8      | 11540.00    
mmap-MAP_SHARED      | 1     | 16     | 11604.37    
----------------------------------------------------------------------
mmap-MAP_PRIVATE     | 1     | 1      | 48.95       
mmap-MAP_PRIVATE     | 2     | 1      | 28.03       
mmap-MAP_PRIVATE     | 4     | 1      | 15.89       
mmap-MAP_PRIVATE     | 8     | 1      | 8.36        
mmap-MAP_PRIVATE     | 16    | 1      | 4.28        
mmap-MAP_PRIVATE     | 1     | 2      | 28.43       
mmap-MAP_PRIVATE     | 2     | 2      | 15.69       
mmap-MAP_PRIVATE     | 4     | 2      | 8.37        
mmap-MAP_PRIVATE     | 8     | 2      | 4.29        
mmap-MAP_PRIVATE     | 1     | 4      | 15.80       
mmap-MAP_PRIVATE     | 2     | 4      | 8.37        
mmap-MAP_PRIVATE     | 4     | 4      | 4.28        
mmap-MAP_PRIVATE     | 1     | 8      | 8.37        
mmap-MAP_PRIVATE     | 2     | 8      | 4.28        
mmap-MAP_PRIVATE     | 1     | 16     | 4.28        
----------------------------------------------------------------------
memfd_normal         | 1     | 1      | 11611.36    
memfd_normal         | 2     | 1      | 11537.67    
memfd_normal         | 4     | 1      | 11371.99    
memfd_normal         | 8     | 1      | 11109.29    
memfd_normal         | 16    | 1      | 10559.06    
memfd_normal         | 1     | 2      | 11586.08    
memfd_normal         | 2     | 2      | 11529.02    
memfd_normal         | 4     | 2      | 11387.26    
memfd_normal         | 8     | 2      | 11111.67    
memfd_normal         | 1     | 4      | 11605.29    
memfd_normal         | 2     | 4      | 11545.69    
memfd_normal         | 4     | 4      | 11391.74    
memfd_normal         | 1     | 8      | 11600.11    
memfd_normal         | 2     | 8      | 11551.25    
memfd_normal         | 1     | 16     | 11631.39    
----------------------------------------------------------------------
memfd_MADV_DONTNEED  | 1     | 1      | 11579.61    
memfd_MADV_DONTNEED  | 2     | 1      | 11506.17    
memfd_MADV_DONTNEED  | 4     | 1      | 11371.25    
memfd_MADV_DONTNEED  | 8     | 1      | 11072.90    
memfd_MADV_DONTNEED  | 16    | 1      | 10534.44    
memfd_MADV_DONTNEED  | 1     | 2      | 11604.92    
memfd_MADV_DONTNEED  | 2     | 2      | 11499.87    
memfd_MADV_DONTNEED  | 4     | 2      | 11335.11    
memfd_MADV_DONTNEED  | 8     | 2      | 11074.28    
memfd_MADV_DONTNEED  | 1     | 4      | 11591.28    
memfd_MADV_DONTNEED  | 2     | 4      | 11517.90    
memfd_MADV_DONTNEED  | 4     | 4      | 11359.59    
memfd_MADV_DONTNEED  | 1     | 8      | 11609.35    
memfd_MADV_DONTNEED  | 2     | 8      | 11501.70    
memfd_MADV_DONTNEED  | 1     | 16     | 11554.79    
----------------------------------------------------------------------
memfd_unmap          | 1     | 1      | 10947.47    
memfd_unmap          | 2     | 1      | 10753.28    
memfd_unmap          | 4     | 1      | 10460.15    
memfd_unmap          | 8     | 1      | 9794.75     
memfd_unmap          | 16    | 1      | 8898.24     
memfd_unmap          | 1     | 2      | 10941.07    
memfd_unmap          | 2     | 2      | 10773.05    
memfd_unmap          | 4     | 2      | 10450.26    
memfd_unmap          | 8     | 2      | 9812.40     
memfd_unmap          | 1     | 4      | 10900.59    
memfd_unmap          | 2     | 4      | 10745.32    
memfd_unmap          | 4     | 4      | 10483.82    
memfd_unmap          | 1     | 8      | 10931.35    
memfd_unmap          | 2     | 8      | 10743.44    
memfd_unmap          | 1     | 16     | 10864.12    
----------------------------------------------------------------------
Starting Sweep (HugePages: DISABLED, Max Mem: 16GB, Max Time/Test: 2.0s)
Mode                 | Segs  | SizeGB | Forks/sec   
----------------------------------------------------------------------
mmap-MAP_SHARED      | 1     | 1      | 12244.18    
mmap-MAP_SHARED      | 2     | 1      | 11631.35    
mmap-MAP_SHARED      | 4     | 1      | 12374.73    
mmap-MAP_SHARED      | 8     | 1      | 11916.43    
mmap-MAP_SHARED      | 16    | 1      | 10885.38    
mmap-MAP_SHARED      | 1     | 2      | 12701.33    
mmap-MAP_SHARED      | 2     | 2      | 12488.93    
mmap-MAP_SHARED      | 4     | 2      | 12483.24    
mmap-MAP_SHARED      | 8     | 2      | 11703.18    
mmap-MAP_SHARED      | 1     | 4      | 12659.76    
mmap-MAP_SHARED      | 2     | 4      | 12417.44    
mmap-MAP_SHARED      | 4     | 4      | 12642.99    
mmap-MAP_SHARED      | 1     | 8      | 12463.28    
mmap-MAP_SHARED      | 2     | 8      | 12486.95    
mmap-MAP_SHARED      | 1     | 16     | 11921.65    
----------------------------------------------------------------------
mmap-MAP_PRIVATE     | 1     | 1      | 26.91       
mmap-MAP_PRIVATE     | 2     | 1      | 13.85       
mmap-MAP_PRIVATE     | 4     | 1      | 8.65        
mmap-MAP_PRIVATE     | 8     | 1      | 6.53        
mmap-MAP_PRIVATE     | 16    | 1      | 4.14        
mmap-MAP_PRIVATE     | 1     | 2      | 13.70       
mmap-MAP_PRIVATE     | 2     | 2      | 8.57        
mmap-MAP_PRIVATE     | 4     | 2      | 6.21        
mmap-MAP_PRIVATE     | 8     | 2      | 4.20        
mmap-MAP_PRIVATE     | 1     | 4      | 8.73        
mmap-MAP_PRIVATE     | 2     | 4      | 6.34        
mmap-MAP_PRIVATE     | 4     | 4      | 4.09        
mmap-MAP_PRIVATE     | 1     | 8      | 6.31        
mmap-MAP_PRIVATE     | 2     | 8      | 4.03        
mmap-MAP_PRIVATE     | 1     | 16     | 4.13        
----------------------------------------------------------------------
memfd_normal         | 1     | 1      | 12169.04    
memfd_normal         | 2     | 1      | 12204.76    
memfd_normal         | 4     | 1      | 12002.28    
memfd_normal         | 8     | 1      | 11272.83    
memfd_normal         | 16    | 1      | 10703.16    
memfd_normal         | 1     | 2      | 12351.89    
memfd_normal         | 2     | 2      | 11459.10    
memfd_normal         | 4     | 2      | 12159.86    
memfd_normal         | 8     | 2      | 11333.70    
memfd_normal         | 1     | 4      | 12522.09    
memfd_normal         | 2     | 4      | 11972.86    
memfd_normal         | 4     | 4      | 11639.81    
memfd_normal         | 1     | 8      | 11995.48    
memfd_normal         | 2     | 8      | 11801.37    
memfd_normal         | 1     | 16     | 12156.54    
----------------------------------------------------------------------
memfd_MADV_DONTNEED  | 1     | 1      | 11040.77    
memfd_MADV_DONTNEED  | 2     | 1      | 9978.94     
memfd_MADV_DONTNEED  | 4     | 1      | 10240.39    
memfd_MADV_DONTNEED  | 8     | 1      | 10383.98    
memfd_MADV_DONTNEED  | 16    | 1      | 10137.84    
memfd_MADV_DONTNEED  | 1     | 2      | 11614.28    
memfd_MADV_DONTNEED  | 2     | 2      | 10961.29    
memfd_MADV_DONTNEED  | 4     | 2      | 10311.64    
memfd_MADV_DONTNEED  | 8     | 2      | 10350.50    
memfd_MADV_DONTNEED  | 1     | 4      | 11184.21    
memfd_MADV_DONTNEED  | 2     | 4      | 11274.44    
memfd_MADV_DONTNEED  | 4     | 4      | 10870.33    
memfd_MADV_DONTNEED  | 1     | 8      | 11501.86    
memfd_MADV_DONTNEED  | 2     | 8      | 11055.71    
memfd_MADV_DONTNEED  | 1     | 16     | 11550.38    
----------------------------------------------------------------------
memfd_unmap          | 1     | 1      | 10653.19    
memfd_unmap          | 2     | 1      | 10162.33    
memfd_unmap          | 4     | 1      | 9651.69     
memfd_unmap          | 8     | 1      | 8701.46     
memfd_unmap          | 16    | 1      | 7492.18     
memfd_unmap          | 1     | 2      | 10435.16    
memfd_unmap          | 2     | 2      | 10020.18    
memfd_unmap          | 4     | 2      | 10151.69    
memfd_unmap          | 8     | 2      | 8746.93     
memfd_unmap          | 1     | 4      | 10220.18    
memfd_unmap          | 2     | 4      | 9965.44     
memfd_unmap          | 4     | 4      | 9342.17     
memfd_unmap          | 1     | 8      | 10609.20    
memfd_unmap          | 2     | 8      | 10210.61    
memfd_unmap          | 1     | 16     | 10569.24    
----------------------------------------------------------------------
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <sys/time.h>
#include <fcntl.h>
#include <getopt.h>

#define TEST_DURATION_SEC 2.0
#define MAX_TOTAL_GB 16

typedef enum {
    MODE_MMAP_SHARED,
    MODE_MMAP_PRIVATE,
    MODE_MEMFD_NORMAL,
    MODE_MEMFD_DONTNEED,
    MODE_MEMFD_UNMAP
} bench_mode_t;

const char* get_mode_name(bench_mode_t mode) {
    switch(mode) {
        case MODE_MMAP_SHARED:    return "mmap-MAP_SHARED";
        case MODE_MMAP_PRIVATE:   return "mmap-MAP_PRIVATE";
        case MODE_MEMFD_NORMAL:   return "memfd_normal";
        case MODE_MEMFD_DONTNEED: return "memfd_MADV_DONTNEED";
        case MODE_MEMFD_UNMAP:    return "memfd_unmap";
        default: return "unknown";
    }
}

double get_now() {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1000000.0;
}

/*
 * Set up num_segs segments of size_gb GB each according to mode, touch every
 * page, then fork() in a loop for TEST_DURATION_SEC seconds.
 * Returns forks/sec, or -1.0 if a segment could not be set up.
 */
double run_benchmark(bench_mode_t mode, int num_segs, int size_gb, int use_huge) {
    size_t segment_size = (size_t)size_gb * 1024 * 1024 * 1024ULL;
    int *fds = malloc(num_segs * sizeof(int));
    void **ptrs = malloc(num_segs * sizeof(void *));
    double rate = -1.0;
    double start_time, total_time;
    long fork_count = 0;

    if (fds == NULL || ptrs == NULL) {
        free(fds); free(ptrs);
        return -1.0;
    }

    /* Mark all slots unused so cleanup is safe after a partial setup. */
    for (int i = 0; i < num_segs; i++) {
        fds[i] = -1;
        ptrs[i] = MAP_FAILED;
    }

    for (int i = 0; i < num_segs; i++) {
        int shared_flag = (mode == MODE_MMAP_PRIVATE) ? MAP_PRIVATE : MAP_SHARED;

        if (mode >= MODE_MEMFD_NORMAL) {
            char name[32];
            snprintf(name, sizeof(name), "bench_%d", i);
            int mfd_flags = MFD_CLOEXEC | (use_huge ? MFD_HUGETLB : 0);
            fds[i] = memfd_create(name, mfd_flags);
            if (fds[i] < 0) goto cleanup;
            if (ftruncate(fds[i], segment_size) < 0) goto cleanup;
            ptrs[i] = mmap(NULL, segment_size, PROT_READ | PROT_WRITE, shared_flag, fds[i], 0);
        } else {
            int flags = shared_flag | MAP_ANONYMOUS | (use_huge ? MAP_HUGETLB : 0);
            ptrs[i] = mmap(NULL, segment_size, PROT_READ | PROT_WRITE, flags, -1, 0);
        }

        if (ptrs[i] == MAP_FAILED) goto cleanup;
        /* Touch every page so the mapping is fully populated before fork(). */
        memset(ptrs[i], 'A', segment_size);

        if (mode == MODE_MEMFD_DONTNEED) madvise(ptrs[i], segment_size, MADV_DONTNEED);
        if (mode == MODE_MEMFD_UNMAP) {
            /* Keep only the fd; the child maps the segment again after fork(). */
            munmap(ptrs[i], segment_size);
            ptrs[i] = MAP_FAILED;
        }
    }

    start_time = get_now();
    while ((get_now() - start_time) < TEST_DURATION_SEC) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) {
            if (mode == MODE_MEMFD_UNMAP) {
                for (int j = 0; j < num_segs; j++)
                    mmap(NULL, segment_size, PROT_READ | PROT_WRITE, MAP_SHARED, fds[j], 0);
            }
            _exit(0);
        } else {
            waitpid(pid, NULL, 0);
            fork_count++;
        }
    }
    total_time = get_now() - start_time;
    rate = (double)fork_count / total_time;

cleanup:
    /* Release whatever was set up, including after a partial failure. */
    for (int i = 0; i < num_segs; i++) {
        if (ptrs[i] != MAP_FAILED) munmap(ptrs[i], segment_size);
        if (fds[i] >= 0) close(fds[i]);
    }
    free(fds); free(ptrs);
    return rate;
}

void print_usage(char* prog) {
    printf("Usage: %s [--huge-pages | --no-huge-pages]\n", prog);
    exit(1);
}

int main(int argc, char **argv) {
    int huge_flag = -1;
    
    static struct option long_options[] = {
        {"huge-pages", no_argument, 0, 'h'},
        {"no-huge-pages", no_argument, 0, 'n'},
        {0, 0, 0, 0}
    };

    int opt;
    while ((opt = getopt_long(argc, argv, "hn", long_options, NULL)) != -1) {
        switch (opt) {
            case 'h': huge_flag = 1; break;
            case 'n': huge_flag = 0; break;
            default: print_usage(argv[0]);
        }
    }

    if (huge_flag == -1) print_usage(argv[0]);

    printf("Starting Sweep (HugePages: %s, Max Mem: %dGB, Max Time/Test: %.1fs)\n", 
            huge_flag ? "ENABLED" : "DISABLED", MAX_TOTAL_GB, TEST_DURATION_SEC);
    printf("%-20s | %-5s | %-6s | %-12s\n", "Mode", "Segs", "SizeGB", "Forks/sec");
    printf("----------------------------------------------------------------------\n");

    bench_mode_t modes[] = {MODE_MMAP_SHARED, MODE_MMAP_PRIVATE, MODE_MEMFD_NORMAL, MODE_MEMFD_DONTNEED, MODE_MEMFD_UNMAP};

    for (int m = 0; m < 5; m++) {
        for (int s = 1; s <= 16; s *= 2) {
            for (int n = 1; n <= 16; n *= 2) {
                if (n * s > MAX_TOTAL_GB) continue;

                double rate = run_benchmark(modes[m], n, s, huge_flag);
                
                if (rate < 0) {
                    printf("%-20s | %-5d | %-6d | [ERR: OOM/POOL]\n", get_mode_name(modes[m]), n, s);
                } else {
                    printf("%-20s | %-5d | %-6d | %-12.2f\n", get_mode_name(modes[m]), n, s, rate);
                }
            }
        }
        printf("----------------------------------------------------------------------\n");
    }

    return 0;
}
