Hi, first of all, thanks for all the great MPIR work! I've been using it for about 4 years to compute visually compelling deep Mandelbrot zoom videos.
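(For readers unfamiliar with the workload: the benchmark below times the standard Mandelbrot escape-time iteration z ← z² + c on arbitrary-precision floats. The sketch here is my own illustration, not code from MPIR or from the benchmark — it uses Python's stdlib decimal module as a stand-in for MPIR's mpf floats, with the precision set to roughly 256 bits; the function name and parameters are hypothetical.)

```python
from decimal import Decimal, getcontext

# ~256 bits of mantissa: 256 * log10(2) ≈ 77 decimal digits.
# With MPIR this would be mpf floats of 4 limbs on x64 instead.
getcontext().prec = 77

def mandel_iters(cr, ci, max_iter=1000):
    """Count escape-time iterations of z <- z^2 + c for c = cr + ci*i."""
    zr = Decimal(0)
    zi = Decimal(0)
    four = Decimal(4)
    for n in range(max_iter):
        zr2 = zr * zr
        zi2 = zi * zi
        if zr2 + zi2 > four:         # |z| > 2: the point escapes
            return n
        zi = 2 * zr * zi + ci        # imaginary part of z^2 + c
        zr = zr2 - zi2 + cr          # real part of z^2 + c
    return max_iter                  # assumed to stay bounded

print(mandel_iters(Decimal("1"), Decimal("0")))     # escapes after 3 iterations
print(mandel_iters(Decimal("-0.5"), Decimal("0")))  # bounded: returns max_iter
```

Each loop pass costs a handful of multiplications and additions at the working precision, which is why the MFlops figures below react so directly to the speed of small-operand mpf arithmetic.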
Yesterday I downloaded 3.0.0 and compiled it with VS 2015 Update 3 on an Intel Core i7-6900K (8 cores, Broadwell) under 64-bit Windows 10. Unfortunately, 3.0.0 seems to be about 8% slower than 2.7.2 when small floats are used. By small floats I mean a precision of up to 256 bits (4 limbs on x64). Compilation worked flawlessly for all 10 architectures I selected. To make sure the Visual Studio update is not the source of the problem, I also recompiled the 7 architectures I had been testing with 2.7.2.

The stats below are based on several hundred million Mandelbrot iterations per data point. All 16 threads of the 6900K are used, and all of them run at 100% load. For 128-bit precision floats I get the following speedup matrix over all compiled versions and architectures (column j holds the speedup of that row's version over version j):

Results from file: Run_2017-03-04T20_10_18.xml; number model: GMP128

 #  Version                               MFlops  Speedup over versions 1..16
 1  mpir_3_0_0_x64_gc                      252.6
 2  mpir_2_7_2_x64_gc                      274.7   8.73%
 3  mpir_3_0_0_x64_haswell_avx             357.9  41.67% 30.30%
 4  mpir_3_0_0_x64_skylake_avx             365.9  44.82% 33.20%  2.22%
 5  mpir_3_0_0_x64_haswell                 368.1  45.72% 34.02%  2.86%  0.62%
 6  mpir_3_0_0_x64_skylake                 371.0  46.84% 35.05%  3.65%  1.39%  0.77%
 7  mpir_3_0_0_x64_core2                   377.0  49.23% 37.26%  5.34%  3.05%  2.41%  1.63%
 8  mpir_3_0_0_x64_sandybridge_ivybridge   386.7  53.07% 40.79%  8.05%  5.70%  5.05%  4.25%  2.57%
 9  mpir_3_0_0_x64_nehalem_westmere        389.3  54.10% 41.74%  8.78%  6.41%  5.76%  4.95%  3.26%  0.67%
10  mpir_3_0_0_x64_nehalem                 389.5  54.19% 41.82%  8.84%  6.47%  5.82%  5.01%  3.32%  0.73%  0.06%
11  mpir_3_0_0_x64_sandybridge             395.1  56.39% 43.84% 10.39%  7.99%  7.33%  6.51%  4.80%  2.17%  1.48%  1.43%
12  mpir_2_7_2_x64_haswell                 398.3  57.66% 45.01% 11.28%  8.87%  8.20%  7.37%  5.65%  3.00%  2.31%  2.25%  0.81%
13  mpir_2_7_2_x64_sandybridge_ivybridge   404.3  60.04% 47.20% 12.97% 10.51%  9.83%  8.99%  7.24%  4.55%  3.85%  3.79%  2.33%  1.51%
14  mpir_2_7_2_x64_sandybridge             405.2  60.40% 47.52% 13.22% 10.76% 10.07%  9.23%  7.48%  4.78%  4.08%  4.02%  2.56%  1.74%  0.22%
15  mpir_2_7_2_x64_nehalem_westmere        417.3  65.16% 51.91% 16.58% 14.05% 13.35% 12.48% 10.67%  7.90%  7.18%  7.12%  5.61%  4.76%  3.20%  2.97%
16  mpir_2_7_2_x64_core2                   419.0  65.85% 52.54% 17.07% 14.53% 13.82% 12.95% 11.14%  8.35%  7.62%  7.56%  6.05%  5.20%  3.63%  3.40%  0.42%
17  mpir_2_7_2_x64_nehalem                 422.8  67.37% 53.94% 18.14% 15.58% 14.86% 13.99% 12.16%  9.34%  8.61%  8.55%  7.02%  6.16%  4.58%  4.35%  1.34%  0.92%

I have taken these measurements three times with the same results. The six fastest versions are all 2.7.2 builds. Note that architecture-specific compilation and the Broadwell CPU do not seem to be the issue, since the slowest two versions, the generic-C builds mpir_3_0_0_x64_gc and mpir_2_7_2_x64_gc, also differ by about 8%. Both were compiled on the same machine within 5 minutes of each other with VS 2015. Another hint that architecture-specific compilation and optimization are working fine is that once I test with 1024-bit precision, the fastest version is mpir_3_0_0_x64_skylake_avx (the Broadwell CPU used in this test already has most of the improvements of Skylake). Unfortunately, I very rarely zoom to a magnification that needs 1024 bits.

I have not done any tuning yet, but my understanding is that for operand sizes of 1, 2 or 3 limbs it should not matter anyway.

Any hints or ideas on what I may be doing wrong? Does this also happen on other OSes/CPUs?

Thanks and best regards,
Marcus