On Thu, Mar 12, 2026 at 16:21, Bryan Green <[email protected]> wrote:

> I modified your memcpy1.c program so that the version functions are not
> inlined. I changed the memcpy function call in version 1, added volatile
> to block some dead-code elimination (DCE), and added a range of N values
> to keep the compiler from specializing the code for N = 4. Before these
> changes the compiler applied DCE and the test1 function was just a ret.
>
> The interesting issue is the use of malloc versus the stack. The malloc
> case will probably track PG's use of palloc more closely, so I would say
> that in that case this is an optimization. It might be fun to compile PG
> with and without the patch (in debug mode) and actually see what gets
> generated for this function.
>
> Here are the results I got using your modified benchmark:
> --- stack allocated ---
> stack  n=1  v1(patch): 49721599 ns  v2(original): 21477302 ns  ratio: 2.315  original wins
> stack  n=2  v1(patch): 52065462 ns  v2(original): 28765199 ns  ratio: 1.810  original wins
> stack  n=3  v1(patch): 58914958 ns  v2(original): 39726110 ns  ratio: 1.483  original wins
> stack  n=4  v1(patch): 64585275 ns  v2(original): 47046397 ns  ratio: 1.373  original wins
> stack  n=5  v1(patch): 73929844 ns  v2(original): 58588698 ns  ratio: 1.262  original wins
> stack  n=6  v1(patch): 95465376 ns  v2(original): 67807817 ns  ratio: 1.408  original wins
> stack  n=7  v1(patch): 86910226 ns  v2(original): 76999488 ns  ratio: 1.129  original wins
> stack  n=8  v1(patch): 107765417 ns  v2(original): 86046016 ns  ratio: 1.252  original wins
>
> --- malloc allocated ---
> malloc n=1  v1(patch): 133283824 ns  v2(original): 141361091 ns  ratio: 0.943  patch wins
> malloc n=2  v1(patch): 145625895 ns  v2(original): 180912711 ns  ratio: 0.805  patch wins
> malloc n=3  v1(patch): 153975594 ns  v2(original): 228459879 ns  ratio: 0.674  patch wins
> malloc n=4  v1(patch): 154483094 ns  v2(original): 248157408 ns  ratio: 0.623  patch wins
> malloc n=5  v1(patch): 157710598 ns  v2(original): 298795018 ns  ratio: 0.528  patch wins
> malloc n=6  v1(patch): 165196636 ns  v2(original): 332940132 ns  ratio: 0.496  patch wins
> malloc n=7  v1(patch): 169576370 ns  v2(original): 358438778 ns  ratio: 0.473  patch wins
> malloc n=8  v1(patch): 184463815 ns  v2(original): 403721513 ns  ratio: 0.457  patch wins
>
Thanks for your attention and tests.

I think the patch can go ahead, then.
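
For anyone following the thread without the attachment, here is a minimal sketch of the kind of harness the quoted message describes: non-inlined copy functions, a volatile sink against DCE, a run-time N from 1 to 8, and a stack versus malloc destination. It is not the actual memcpy1.c. The names copy_memcpy, copy_loop, time_copy and run_all are hypothetical, the uint32_t element type is a guess, and doing the malloc/free inside the timed loop is my assumption, so the numbers it prints are not comparable to the ones above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime under strict -std=c99/c11 */
#endif
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_N 8

/* noinline (GCC/Clang) keeps the copy out of the timing loop's optimizer. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Hypothetical stand-in for "version 1": a single memcpy call. */
NOINLINE void copy_memcpy(uint32_t *dst, const uint32_t *src, size_t n)
{
    memcpy(dst, src, n * sizeof(uint32_t));
}

/* Hypothetical stand-in for "version 2": an element-by-element loop. */
NOINLINE void copy_loop(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

typedef void (*copy_fn)(uint32_t *, const uint32_t *, size_t);

/* Storing a copied element here keeps the copy from being removed as dead code. */
static volatile uint32_t sink;

static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Time `iters` copies of n elements; returns elapsed ns, or -1 on bad input. */
int64_t time_copy(copy_fn fn, size_t n, long iters, int use_malloc)
{
    uint32_t src[MAX_N] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint32_t stack_dst[MAX_N];
    int64_t start;

    if (n == 0 || n > MAX_N)
        return -1;

    start = now_ns();
    for (long i = 0; i < iters; i++)
    {
        /* Assumption: the heap case allocates and frees inside the loop. */
        uint32_t *dst = use_malloc ? malloc(n * sizeof(uint32_t)) : stack_dst;

        if (dst == NULL)
            return -1;
        fn(dst, src, n);
        sink = dst[n - 1];
        if (use_malloc)
            free(dst);
    }
    return now_ns() - start;
}

/* Print one row per N for both destinations; ratio is v1 / v2, as in the thread. */
void run_all(long iters)
{
    for (int use_malloc = 0; use_malloc <= 1; use_malloc++)
        for (size_t n = 1; n <= MAX_N; n++)
        {
            int64_t v1 = time_copy(copy_memcpy, n, iters, use_malloc);
            int64_t v2 = time_copy(copy_loop, n, iters, use_malloc);

            printf("%s n=%zu  v1: %lld ns  v2: %lld ns  ratio: %.3f\n",
                   use_malloc ? "malloc" : "stack ", n,
                   (long long) v1, (long long) v2, (double) v1 / (double) v2);
        }
}
```

Calling run_all(10000000) from a main() and building with -O2 should give one line per N for each destination. A ratio below 1 means the memcpy variant was faster in that run.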

best regards,
Ranier Vilela
