On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote:
> Anyway, here are both patches for comparison. I'll stop hacking for a
> while and let you see what you make of these.
>
> Regards,
> Dean
>
> Attachments:
> * v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
> * v5-add-mul_var_int.patch
I've now benchmarked the patches on all my machines, see bench_mul_var.sql for details.

Summary of benchmark results:

         cpu          | var1ndigits |                           winner
----------------------+-------------+-------------------------------------------------------------
 AMD Ryzen 9 7950X3D  |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 AMD Ryzen 9 7950X3D  |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 AMD Ryzen 9 7950X3D  |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 Apple M3 Max         |           1 | v5-add-mul_var_int.patch
 Apple M3 Max         |           2 | v5-add-mul_var_int.patch
 Apple M3 Max         |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 Intel Core i9-14900K |           1 | v5-add-mul_var_int.patch
 Intel Core i9-14900K |           2 | v5-add-mul_var_int.patch
 Intel Core i9-14900K |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
(9 rows)

Performance ratio against HEAD per CPU and var1ndigits:

         cpu          | var1ndigits |                           version                           | performance_ratio
----------------------+-------------+-------------------------------------------------------------+-------------------
 AMD Ryzen 9 7950X3D  |           1 | HEAD                                                        |              1.00
 AMD Ryzen 9 7950X3D  |           1 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.11
 AMD Ryzen 9 7950X3D  |           1 | v5-add-mul_var_int.patch                                    |              1.07
 AMD Ryzen 9 7950X3D  |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.12
 AMD Ryzen 9 7950X3D  |           2 | HEAD                                                        |              1.00
 AMD Ryzen 9 7950X3D  |           2 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.10
 AMD Ryzen 9 7950X3D  |           2 | v5-add-mul_var_int.patch                                    |              1.11
 AMD Ryzen 9 7950X3D  |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.13
 AMD Ryzen 9 7950X3D  |           3 | HEAD                                                        |              1.00
 AMD Ryzen 9 7950X3D  |           3 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.10
 AMD Ryzen 9 7950X3D  |           3 | v5-add-mul_var_int.patch                                    |              0.98
 AMD Ryzen 9 7950X3D  |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.15
 Apple M3 Max         |           1 | HEAD                                                        |              1.00
 Apple M3 Max         |           1 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.07
 Apple M3 Max         |           1 | v5-add-mul_var_int.patch                                    |              1.08
 Apple M3 Max         |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.07
 Apple M3 Max         |           2 | HEAD                                                        |              1.00
 Apple M3 Max         |           2 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.09
 Apple M3 Max         |           2 | v5-add-mul_var_int.patch                                    |              1.21
 Apple M3 Max         |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.10
 Apple M3 Max         |           3 | HEAD                                                        |              1.00
 Apple M3 Max         |           3 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.09
 Apple M3 Max         |           3 | v5-add-mul_var_int.patch                                    |              0.99
 Apple M3 Max         |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.09
 Intel Core i9-14900K |           1 | HEAD                                                        |              1.00
 Intel Core i9-14900K |           1 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.05
 Intel Core i9-14900K |           1 | v5-add-mul_var_int.patch                                    |              1.07
 Intel Core i9-14900K |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.06
 Intel Core i9-14900K |           2 | HEAD                                                        |              1.00
 Intel Core i9-14900K |           2 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.06
 Intel Core i9-14900K |           2 | v5-add-mul_var_int.patch                                    |              1.08
 Intel Core i9-14900K |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.06
 Intel Core i9-14900K |           3 | HEAD                                                        |              1.00
 Intel Core i9-14900K |           3 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.04
 Intel Core i9-14900K |           3 | v5-add-mul_var_int.patch                                    |              1.00
 Intel Core i9-14900K |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.04
(36 rows)

The queries to produce the above are in bench_csv_queries.txt

/Joel
CREATE TABLE bench (cpu text, var1ndigits int, version text, time numeric);

\COPY bench FROM bench.csv WITH CSV HEADER;

WITH ranked_bench AS (
    SELECT
        cpu,
        var1ndigits,
        version,
        ROW_NUMBER() OVER (PARTITION BY cpu, var1ndigits ORDER BY AVG(time)) AS rn
    FROM bench
    GROUP BY cpu, var1ndigits, version
)
SELECT
    cpu,
    var1ndigits,
    version AS winner
FROM ranked_bench
WHERE rn = 1
ORDER BY cpu, var1ndigits;

WITH avg_times AS (
    SELECT
        cpu,
        var1ndigits,
        version,
        AVG(time) AS avg_time
    FROM bench
    GROUP BY cpu, var1ndigits, version
),
head_times AS (
    SELECT
        cpu,
        var1ndigits,
        avg_time AS head_avg_time
    FROM avg_times
    WHERE version = 'HEAD'
)
SELECT
    a.cpu,
    a.var1ndigits,
    a.version,
    ROUND(h.head_avg_time / a.avg_time, 2) AS performance_ratio
FROM avg_times a
JOIN head_times h ON a.cpu = h.cpu AND a.var1ndigits = h.var1ndigits
ORDER BY a.cpu, a.var1ndigits, a.version;
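As a cross-check on the two queries, the same figures can be recomputed outside the database: the winner is the version with the lowest average time per (cpu, var1ndigits), and the performance ratio is the average HEAD time divided by the version's average time, rounded to two decimals. The following is only a minimal Python sketch under stated assumptions: the function names load_times, performance_ratios and winners are made up for illustration, and the embedded sample holds just the Apple M3 Max, var1ndigits = 2 rows for three of the versions, copied from bench.csv.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Sample rows copied from bench.csv (Apple M3 Max, var1ndigits = 2 only).
SAMPLE_CSV = """\
cpu,var1ndigits,version,time
Apple M3 Max,2,HEAD,3397.181
Apple M3 Max,2,HEAD,3371.557
Apple M3 Max,2,HEAD,3356.081
Apple M3 Max,2,HEAD,3371.946
Apple M3 Max,2,HEAD,3385.859
Apple M3 Max,2,v5-add-mul_var_int.patch,2782.388
Apple M3 Max,2,v5-add-mul_var_int.patch,2780.564
Apple M3 Max,2,v5-add-mul_var_int.patch,2781.664
Apple M3 Max,2,v5-add-mul_var_int.patch,2776.481
Apple M3 Max,2,v5-add-mul_var_int.patch,2781.443
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3074.596
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3073.305
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3078.297
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3084.720
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3076.612
"""


def load_times(text):
    """Group the time samples by (cpu, var1ndigits, version)."""
    times = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["cpu"], int(row["var1ndigits"]), row["version"])
        times[key].append(float(row["time"]))
    return times


def performance_ratios(times):
    """Average HEAD time divided by each version's average time, 2 decimals."""
    avg = {key: mean(samples) for key, samples in times.items()}
    return {
        (cpu, nd, version): round(avg[(cpu, nd, "HEAD")] / t, 2)
        for (cpu, nd, version), t in avg.items()
    }


def winners(times):
    """Version with the lowest average time for each (cpu, var1ndigits)."""
    best = {}
    for (cpu, nd, version), samples in times.items():
        t = mean(samples)
        if (cpu, nd) not in best or t < best[(cpu, nd)][1]:
            best[(cpu, nd)] = (version, t)
    return {key: version for key, (version, _) in best.items()}


times = load_times(SAMPLE_CSV)
ratios = performance_ratios(times)
best = winners(times)

for key in sorted(ratios):
    print(key, ratios[key])
print(best)
```

On this sample the sketch gives 1.21 for v5-add-mul_var_int.patch and 1.10 for v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch, which agrees with the Apple M3 Max, var1ndigits = 2 rows reported above. It uses floating point rather than numeric, so it is a sanity check, not a replacement for the queries.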
cpu,var1ndigits,version,time
Apple M3 Max,1,HEAD,3090.147
Apple M3 Max,1,HEAD,3095.153
Apple M3 Max,1,HEAD,3096.725
Apple M3 Max,1,HEAD,3070.083
Apple M3 Max,1,HEAD,3081.267
Apple M3 Max,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2883.311
Apple M3 Max,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2882.971
Apple M3 Max,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2884.639
Apple M3 Max,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2884.728
Apple M3 Max,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2887.346
Apple M3 Max,1,v5-add-mul_var_int.patch,2859.045
Apple M3 Max,1,v5-add-mul_var_int.patch,2854.941
Apple M3 Max,1,v5-add-mul_var_int.patch,2851.976
Apple M3 Max,1,v5-add-mul_var_int.patch,2863.930
Apple M3 Max,1,v5-add-mul_var_int.patch,2864.494
Apple M3 Max,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2869.741
Apple M3 Max,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2870.023
Apple M3 Max,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2870.653
Apple M3 Max,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2869.711
Apple M3 Max,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2878.178
Apple M3 Max,2,HEAD,3397.181
Apple M3 Max,2,HEAD,3371.557
Apple M3 Max,2,HEAD,3356.081
Apple M3 Max,2,HEAD,3371.946
Apple M3 Max,2,HEAD,3385.859
Apple M3 Max,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3084.038
Apple M3 Max,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3082.308
Apple M3 Max,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3089.160
Apple M3 Max,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3083.793
Apple M3 Max,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3081.382
Apple M3 Max,2,v5-add-mul_var_int.patch,2782.388
Apple M3 Max,2,v5-add-mul_var_int.patch,2780.564
Apple M3 Max,2,v5-add-mul_var_int.patch,2781.664
Apple M3 Max,2,v5-add-mul_var_int.patch,2776.481
Apple M3 Max,2,v5-add-mul_var_int.patch,2781.443
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3074.596
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3073.305
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3078.297
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3084.720
Apple M3 Max,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3076.612
Apple M3 Max,3,HEAD,3500.698
Apple M3 Max,3,HEAD,3490.170
Apple M3 Max,3,HEAD,3481.300
Apple M3 Max,3,HEAD,3486.962
Apple M3 Max,3,HEAD,3473.165
Apple M3 Max,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3196.107
Apple M3 Max,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3203.074
Apple M3 Max,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3215.363
Apple M3 Max,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3204.599
Apple M3 Max,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3203.819
Apple M3 Max,3,v5-add-mul_var_int.patch,3505.878
Apple M3 Max,3,v5-add-mul_var_int.patch,3544.366
Apple M3 Max,3,v5-add-mul_var_int.patch,3521.562
Apple M3 Max,3,v5-add-mul_var_int.patch,3510.695
Apple M3 Max,3,v5-add-mul_var_int.patch,3523.758
Apple M3 Max,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3203.702
Apple M3 Max,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3206.802
Apple M3 Max,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3208.966
Apple M3 Max,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3198.790
Apple M3 Max,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3202.307
Intel Core i9-14900K,1,HEAD,3294.044
Intel Core i9-14900K,1,HEAD,3296.176
Intel Core i9-14900K,1,HEAD,3263.968
Intel Core i9-14900K,1,HEAD,3262.892
Intel Core i9-14900K,1,HEAD,3263.531
Intel Core i9-14900K,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3125.577
Intel Core i9-14900K,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3126.274
Intel Core i9-14900K,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3122.623
Intel Core i9-14900K,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3123.057
Intel Core i9-14900K,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3124.143
Intel Core i9-14900K,1,v5-add-mul_var_int.patch,3056.697
Intel Core i9-14900K,1,v5-add-mul_var_int.patch,3053.200
Intel Core i9-14900K,1,v5-add-mul_var_int.patch,3053.484
Intel Core i9-14900K,1,v5-add-mul_var_int.patch,3053.548
Intel Core i9-14900K,1,v5-add-mul_var_int.patch,3052.770
Intel Core i9-14900K,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3076.845
Intel Core i9-14900K,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3078.734
Intel Core i9-14900K,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3078.344
Intel Core i9-14900K,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3079.325
Intel Core i9-14900K,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3077.803
Intel Core i9-14900K,2,HEAD,3398.412
Intel Core i9-14900K,2,HEAD,3380.465
Intel Core i9-14900K,2,HEAD,3349.829
Intel Core i9-14900K,2,HEAD,3351.368
Intel Core i9-14900K,2,HEAD,3346.712
Intel Core i9-14900K,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3180.399
Intel Core i9-14900K,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3179.782
Intel Core i9-14900K,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3181.158
Intel Core i9-14900K,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3180.896
Intel Core i9-14900K,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3184.998
Intel Core i9-14900K,2,v5-add-mul_var_int.patch,3116.356
Intel Core i9-14900K,2,v5-add-mul_var_int.patch,3111.412
Intel Core i9-14900K,2,v5-add-mul_var_int.patch,3116.011
Intel Core i9-14900K,2,v5-add-mul_var_int.patch,3117.422
Intel Core i9-14900K,2,v5-add-mul_var_int.patch,3118.254
Intel Core i9-14900K,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3189.580
Intel Core i9-14900K,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3188.140
Intel Core i9-14900K,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3187.830
Intel Core i9-14900K,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3189.053
Intel Core i9-14900K,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3189.717
Intel Core i9-14900K,3,HEAD,4171.285
Intel Core i9-14900K,3,HEAD,4162.330
Intel Core i9-14900K,3,HEAD,4142.480
Intel Core i9-14900K,3,HEAD,4137.238
Intel Core i9-14900K,3,HEAD,4137.180
Intel Core i9-14900K,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3983.753
Intel Core i9-14900K,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3983.630
Intel Core i9-14900K,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3983.684
Intel Core i9-14900K,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3992.266
Intel Core i9-14900K,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3984.959
Intel Core i9-14900K,3,v5-add-mul_var_int.patch,4159.903
Intel Core i9-14900K,3,v5-add-mul_var_int.patch,4155.236
Intel Core i9-14900K,3,v5-add-mul_var_int.patch,4156.482
Intel Core i9-14900K,3,v5-add-mul_var_int.patch,4154.075
Intel Core i9-14900K,3,v5-add-mul_var_int.patch,4152.739
Intel Core i9-14900K,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3982.057
Intel Core i9-14900K,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3983.134
Intel Core i9-14900K,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3984.192
Intel Core i9-14900K,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3982.792
Intel Core i9-14900K,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3980.789
AMD Ryzen 9 7950X3D,1,HEAD,3306.653
AMD Ryzen 9 7950X3D,1,HEAD,3307.260
AMD Ryzen 9 7950X3D,1,HEAD,3268.231
AMD Ryzen 9 7950X3D,1,HEAD,3276.653
AMD Ryzen 9 7950X3D,1,HEAD,3267.375
AMD Ryzen 9 7950X3D,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2954.318
AMD Ryzen 9 7950X3D,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2961.550
AMD Ryzen 9 7950X3D,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2963.000
AMD Ryzen 9 7950X3D,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2958.896
AMD Ryzen 9 7950X3D,1,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2957.998
AMD Ryzen 9 7950X3D,1,v5-add-mul_var_int.patch,3080.628
AMD Ryzen 9 7950X3D,1,v5-add-mul_var_int.patch,3064.396
AMD Ryzen 9 7950X3D,1,v5-add-mul_var_int.patch,3073.567
AMD Ryzen 9 7950X3D,1,v5-add-mul_var_int.patch,3074.281
AMD Ryzen 9 7950X3D,1,v5-add-mul_var_int.patch,3072.741
AMD Ryzen 9 7950X3D,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2932.485
AMD Ryzen 9 7950X3D,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2931.049
AMD Ryzen 9 7950X3D,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2932.589
AMD Ryzen 9 7950X3D,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2927.167
AMD Ryzen 9 7950X3D,1,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,2932.993
AMD Ryzen 9 7950X3D,2,HEAD,3505.663
AMD Ryzen 9 7950X3D,2,HEAD,3510.422
AMD Ryzen 9 7950X3D,2,HEAD,3470.290
AMD Ryzen 9 7950X3D,2,HEAD,3490.446
AMD Ryzen 9 7950X3D,2,HEAD,3470.264
AMD Ryzen 9 7950X3D,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3166.622
AMD Ryzen 9 7950X3D,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3162.453
AMD Ryzen 9 7950X3D,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3168.570
AMD Ryzen 9 7950X3D,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3173.971
AMD Ryzen 9 7950X3D,2,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3170.914
AMD Ryzen 9 7950X3D,2,v5-add-mul_var_int.patch,3135.849
AMD Ryzen 9 7950X3D,2,v5-add-mul_var_int.patch,3131.497
AMD Ryzen 9 7950X3D,2,v5-add-mul_var_int.patch,3138.993
AMD Ryzen 9 7950X3D,2,v5-add-mul_var_int.patch,3135.383
AMD Ryzen 9 7950X3D,2,v5-add-mul_var_int.patch,3143.074
AMD Ryzen 9 7950X3D,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3075.388
AMD Ryzen 9 7950X3D,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3078.407
AMD Ryzen 9 7950X3D,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3075.033
AMD Ryzen 9 7950X3D,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3078.055
AMD Ryzen 9 7950X3D,2,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3075.422
AMD Ryzen 9 7950X3D,3,HEAD,3809.949
AMD Ryzen 9 7950X3D,3,HEAD,3824.012
AMD Ryzen 9 7950X3D,3,HEAD,3782.624
AMD Ryzen 9 7950X3D,3,HEAD,3783.997
AMD Ryzen 9 7950X3D,3,HEAD,3761.864
AMD Ryzen 9 7950X3D,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3459.177
AMD Ryzen 9 7950X3D,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3474.086
AMD Ryzen 9 7950X3D,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3456.303
AMD Ryzen 9 7950X3D,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3475.646
AMD Ryzen 9 7950X3D,3,v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3450.919
AMD Ryzen 9 7950X3D,3,v5-add-mul_var_int.patch,3882.610
AMD Ryzen 9 7950X3D,3,v5-add-mul_var_int.patch,3885.023
AMD Ryzen 9 7950X3D,3,v5-add-mul_var_int.patch,3884.721
AMD Ryzen 9 7950X3D,3,v5-add-mul_var_int.patch,3894.463
AMD Ryzen 9 7950X3D,3,v5-add-mul_var_int.patch,3878.118
AMD Ryzen 9 7950X3D,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3302.385
AMD Ryzen 9 7950X3D,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3320.810
AMD Ryzen 9 7950X3D,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3306.740
AMD Ryzen 9 7950X3D,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3307.842
AMD Ryzen 9 7950X3D,3,v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch,3302.253
bench_mul_var.sql