On Tue, 10 Jun 2025, Nathan Bossart wrote:

> I also wrote a couple of test programs to show the difference between
> fseeko-ing and fread-ing through a file with various sizes.  On a Linux
> machine, I see this:
>
>     log2(n) | fseeko  | fread
>    ---------+---------+-------
>           1 | 109.288 | 5.528
>           2 |  54.881 | 2.848
>           3 |   27.65 | 1.504
>           4 |  13.953 | 0.834
>           5 |     7.1 |  0.49
>           6 |   3.665 | 0.322
>           7 |   1.944 | 0.244
>           8 |   1.085 | 0.201
>           9 |   0.658 | 0.185
>          10 |   0.443 | 0.175
>          11 |   0.253 | 0.171
>          12 |   0.102 | 0.162
>          13 |   0.075 |  0.13
>          14 |   0.061 | 0.114
>          15 |   0.054 |   0.1
>
> So, fseeko() starts winning around 4096 bytes.  On macOS, the differences
> aren't quite as dramatic, but 4096 bytes is the break-even point there,
> too.  I imagine there's a buffer around that size somewhere...

Thank you for benchmarking! Before answering in more depth, I'm curious:
what read-seek pattern do you see at the system call level (as shown by
strace)? In pg_restore it was a constant loop of read(4K)-lseek(8-16K).
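
For anyone following along, here is a minimal sketch of the kind of
comparison under discussion: advancing through a file in steps of n bytes,
once with fseeko() and once by fread-ing into a scratch buffer. The actual
test programs are not included in this thread, so the function names
(skip_with_fseeko, skip_with_fread), the 4096-byte scratch buffer and the
loop structure are all assumptions made for illustration.

```c
/* Needed for the fseeko() declaration under strict -std=c99/c11. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: advance through fp in steps of n bytes using
 * fseeko(SEEK_CUR).  Returns the number of full steps that fit in
 * file_size, or -1 if a seek fails.
 */
static long
skip_with_fseeko(FILE *fp, off_t file_size, size_t n)
{
    long        steps = 0;
    off_t       pos;

    assert(n > 0);              /* a zero step would never terminate */

    for (pos = 0; pos + (off_t) n <= file_size; pos += (off_t) n)
    {
        if (fseeko(fp, (off_t) n, SEEK_CUR) != 0)
            return -1;
        steps++;
    }
    return steps;
}

/*
 * Hypothetical helper: advance through fp in steps of n bytes by reading
 * and discarding the data.  Returns the number of full steps completed
 * before end of file, or -1 on a read error.
 */
static long
skip_with_fread(FILE *fp, size_t n)
{
    char        buf[4096];      /* scratch buffer; size is an assumption */
    long        steps = 0;

    assert(n > 0);

    for (;;)
    {
        size_t      remaining = n;

        while (remaining > 0)
        {
            size_t      chunk = remaining < sizeof(buf) ? remaining : sizeof(buf);
            size_t      got = fread(buf, 1, chunk, fp);

            if (got == 0)       /* end of file or error */
                return ferror(fp) ? -1 : steps;
            remaining -= got;
        }
        steps++;
    }
}
```

Calling each helper from a small main() on the same file and running it
under "strace -e trace=read,lseek" should show which system calls stdio
actually issues for a given step size, which is the pattern asked about
above.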

Dimitris

