On Thu, 1 Dec 2022 at 18:18, John Naylor <john.nay...@enterprisedb.com> wrote:
> I then tested a Power8 machine (also kernel 3.10 gcc 4.8). Configure reports
> "checking for __builtin_prefetch... yes", but I don't think it does anything
> here, as the results are within noise level. A quick search didn't turn up
> anything informative on this platform, and I'm not motivated to dig deeper.
> In any case, it doesn't make things worse.

Thanks for testing the Power8 hardware.

Andres just let me test on some Apple M1 hardware (those cores are
insanely fast!)

Using the table and running the script from [1], with trimmed-down
output, I see:

Master @ edf12e7bbd
Testing a -> 158.037 ms
Testing a2 -> 164.442 ms
Testing a3 -> 171.523 ms
Testing a4 -> 189.892 ms
Testing a5 -> 217.197 ms
Testing a6 -> 186.790 ms
Testing a7 -> 189.491 ms
Testing a8 -> 195.384 ms
Testing a9 -> 200.547 ms
Testing a10 -> 206.149 ms
Testing a11 -> 211.708 ms
Testing a12 -> 217.976 ms
Testing a13 -> 224.565 ms
Testing a14 -> 230.642 ms
Testing a15 -> 237.372 ms
Testing a16 -> 244.110 ms

(checking for __builtin_prefetch... yes)

Master + v2-0001 + v2-0005
Testing a -> 157.477 ms
Testing a2 -> 163.720 ms
Testing a3 -> 171.159 ms
Testing a4 -> 186.837 ms
Testing a5 -> 205.220 ms
Testing a6 -> 184.585 ms
Testing a7 -> 189.879 ms
Testing a8 -> 195.650 ms
Testing a9 -> 201.220 ms
Testing a10 -> 207.162 ms
Testing a11 -> 213.255 ms
Testing a12 -> 219.313 ms
Testing a13 -> 225.763 ms
Testing a14 -> 237.337 ms
Testing a15 -> 239.440 ms
Testing a16 -> 245.740 ms

It does not seem like there's any improvement on this architecture.
There is a very small increase in performance from "a" to "a6", but a
very small decrease from "a7" to "a16". It's likely within the expected
noise level.

David

[1] https://postgr.es/m/caaphdvqwexy_6jgmb39vr3oqxz_w6stafkq52hodvwaw-19...@mail.gmail.com
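
For anyone wanting to compare the two runs line by line, the per-test
difference can be worked out with a few lines of Python. This is only a
minimal sketch: the timings are copied from the output above, and the
names (tests, master, patched, pct_change) are my own and not part of
the benchmark script from [1].

```python
# Timings in milliseconds, copied from the two runs above,
# in test order a, a2, ..., a16.
tests = ["a"] + [f"a{i}" for i in range(2, 17)]

master = [158.037, 164.442, 171.523, 189.892, 217.197, 186.790,
          189.491, 195.384, 200.547, 206.149, 211.708, 217.976,
          224.565, 230.642, 237.372, 244.110]

patched = [157.477, 163.720, 171.159, 186.837, 205.220, 184.585,
           189.879, 195.650, 201.220, 207.162, 213.255, 219.313,
           225.763, 237.337, 239.440, 245.740]


def pct_change(before, after):
    """Relative change in run time, in percent (negative means faster)."""
    return (after - before) / before * 100.0


# Hypothetical helper structure: test name -> percent change vs. master.
changes = {name: pct_change(m, p)
           for name, m, p in zip(tests, master, patched)}

for name, delta in changes.items():
    print(f"{name:>4}: {delta:+.2f}%")

# Split the tests by the sign of the change.
faster = [name for name, delta in changes.items() if delta < 0]
slower = [name for name, delta in changes.items() if delta > 0]
print("faster with patch:", ", ".join(faster))
print("slower with patch:", ", ".join(slower))
```

A single run per test says nothing about variance, so the signs printed
here should be read alongside repeated runs before drawing conclusions.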