On Tue, Jan 13, 2026 at 1:12 PM Mark Wong <[email protected]> wrote:
> On Fri, Jan 09, 2026 at 05:21:45PM +0300, Nazir Bilal Yavuz wrote:
> > Were you able to understand why Mark's benchmark results are different
> > from ours?
>
> Not yet... I had some guesses, which is why I suggested the processor
> pinning and using a ramdisk. But we haven't tried applying all of those
> to my laptop, which has 3 core types, or the POWER system, which may be
> interesting to use a ram disk on.
>
> I'm curious though, and admittedly haven't tried looking myself yet,
> about how the SIMD calls might look across different processor
> architectures. We'll try to get that on the POWER system soon...
>
> Regards,
> Mark
> --
> Mark Wong <[email protected]>
> EDB https://enterprisedb.com
>

Hello!

Nazir, I'm glad you are finding the benchmarks useful. I have more! :-)

All of these benchmarks are all-in-RAM, because I do think that is the
best way of getting closest to the theoretical best and worst case
scenarios.

My laptop:

master: (852558b9)

text, no special: 14996
text, 1/3 special: 17270
csv, no special: 18274
csv, 1/3 special: 23852

v3

text, no special: 11282 (24.7% speedup)
text, 1/3 special: 15748 (8.8% speedup) <-- I don't believe this but it's what I got
csv, no special: 11571 (36.6% speedup)
csv, 1/3 special: 19934 (16.4% speedup) <-- I don't believe this but it's what I got

v4.2

text, no special: 11139 (25.7% speedup)
text, 1/3 special: 18900 (9.4% regression)
csv, no special: 11490 (37.1% speedup)
csv, 1/3 special: 26134 (9.5% regression)

An AWS EC2 t2.2xlarge instance

master: (852558b9)

text, no special: 20677
text, 1/3 special: 22660
csv, no special: 24534
csv, 1/3 special: 30999

v3

text, no special: 17534 (15.2% speedup)
text, 1/3 special: 22816 (0.6% regression)
csv, no special: 17664 (28.0% speedup)
csv, 1/3 special: 29338 (5.3% speedup) <-- I don't believe this but it's what I got

v4.2

text, no special: 17459 (15.5% speedup)
text, 1/3 special: 25051 (10.5% regression)
csv, no special: 17574 (28.3% speedup)
csv, 1/3 special: 32092 (3.5% regression)

An AWS EC2 t4g.2xlarge instance (using ARM processor; first test of ARM
processor!)

master: (852558b9)

text, no special: 22081
text, 1/3 special: 25100
csv, no special: 27296
csv, 1/3 special: 32344

v3

text, no special: 17724 (19.7% speedup)
text, 1/3 special: 27606 (9.9% regression) <-- yikes! We would want to test this more
csv, no special: 17597 (35.5% speedup)
csv, 1/3 special: 32597 (0.8% regression)

v4.2

text, no special: 17674 (20% speedup)
text, 1/3 special: 25773 (2.6% regression) <-- this regression is less than for the v3 patch? Atypical...
csv, no special: 17651 (35.3% speedup)
csv, 1/3 special: 34055 (5.3% regression)

Yes, I think I agree with you that the everything-in-RAM benchmarks will
make the regressions more pronounced, just like the everything-in-RAM
benchmarks make the improvements more pronounced.

I am not sure why the CSV regression is usually worse than the TXT
regression (even for the v3 patch, which has smaller regressions than the
v4.2 patch). I probably should look over some flame graphs and see if I
can find the place where the CSV-parsing code is so much slower. The CSV
regression is actually a bit frustrating (at around 5%) because the TXT
regression, at less than 1% (for the v3 patch), is so much easier to bear.

Here are some copy-to benchmarks for the v4 patch that applies SIMD to the
copy-to code. These were all-in-RAM tests.

My laptop

master: (852558b9)

text, no special: 2948
text, 1/3 special: 11258
csv, no special: 6245
csv, 1/3 special: 11258

v4 (copy to)

text, no special: 2126 (27.9% speedup)
text, 1/3 special: 12080 (7.3% regression) <-- did not see such a big regression before
csv, no special: 2432 (61.0% speedup)
csv, 1/3 special: 12344 (4.0% regression) <-- did not see such a big regression before

An AWS EC2 t2.2xlarge instance

master: (852558b9)

text, no special: 4647
text, 1/3 special: 13865
csv, no special: 5421
csv, 1/3 special: 15284

v4 (copy to)

text, no special: 2460 (47.0% speedup)
text, 1/3 special: 14023 (1.1% regression)
csv, no special: 2667 (50.7% speedup)
csv, 1/3 special: 15251 (0.2% speedup)

An AWS EC2 t4g.2xlarge instance (using ARM processor; first test of ARM
processor!)

master: (852558b9)

text, no special: 6951
text, 1/3 special: 17857
csv, no special: 7951
csv, 1/3 special: 18504

v4 (copy to)

text, no special: 3372 (51.4% speedup)
text, 1/3 special: 15713 (12.0% speedup)
csv, no special: 3233 (59.3% speedup)
csv, 1/3 special: 1622 (12.3% speedup)

Once again, the v4 patch for copy-to seems like a clearer win, though, to
be fair, there were regressions when running on my laptop. (I'm starting
to think servers or desktops are better than laptops for testing these
things, though maybe that's my bias: it just seems like the server results
are always less surprising.)

Hope you all continue to find these useful...
--
-- Manni Wood
EDB: https://www.enterprisedb.com
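
P.S. For anyone checking the percentages, here is a minimal sketch of how a
speedup/regression figure can be derived from a master timing and a patched
timing (lower is better). The function name and the rounding to one decimal
are assumptions made for illustration, not the script used for the numbers
in this mail, so the last digit may differ from some of the posted figures.

```python
def percent_change(master_time: float, patched_time: float) -> str:
    """Describe a patched timing relative to master; lower timings are better.

    Hypothetical helper for illustration only. Rounds to one decimal place,
    which may not match how the posted figures were rounded.
    """
    # Positive delta: the patch is faster than master. Negative: it is slower.
    delta = (master_time - patched_time) / master_time * 100.0
    label = "speedup" if delta >= 0 else "regression"
    return f"{abs(delta):.1f}% {label}"


# Laptop, copy-from, text, 1/3 special: master vs. v3 and master vs. v4.2
print(percent_change(17270, 15748))  # 8.8% speedup
print(percent_change(17270, 18900))  # 9.4% regression
```

Both the speedup and the regression are expressed relative to the master
timing, so the two kinds of figures are directly comparable within one row.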
