Torbjörn Granlund <t...@gmplib.org> writes:

> OK, so the code is 3-ways unrolled. That's always a bit inconvenient
> and tends to cause some code bloat.
I don't remember at all why I did it that way. Maybe it was faster than
two-way, and there were too few registers for 4-way? Do you expect one
really needs to go beyond 2-way for this type of loop, where each
iteration does a fair amount of work?

> * Accumulate differently, say 4 consecutive limbs at a time, with carry
>   being alive. That will require more registers for sure. By using
>   adcx and adox, one may accumulate to the same registers in two chains
>   semi-simultaneously.

With adox/adcx, I think it should work to compute both of a U and b V
one (or a few) limbs at a time, using the O flag as a short-lived carry
flag, and a longer-lived register to hold the high limb. And then add
the results together using C as a long-lived carry, living between
iterations. Is there a neat way to clear the O flag without clobbering
C?

Another thing that I think could give a substantial speedup for gcd in
the Lehmer range is to implement addsubmul_1msb0. If we have a single
carry flag, each iteration would take as input a, b (both < B/2), two
full limbs u, v, and a carry limb which is a two's complement signed
number. Then compute a u + b v + c as a 2-limb two's complement number
(fits, thanks to the restrictions on a, b). Store the low half; the high
half becomes the c input to the next iteration.

If we have adox/adcx, use the same strategy as suggested for
addaddmul_1msb0, but subtract rather than add in the chain with the
long-lived carry.

> I suspect the present code is far from optimal on modern x86 CPUs which
> can sustain 1 64x64->128 multiply per cycle. I feel confident that we
> could reach close to 1 c/l.

That's the fun thing with GMP, there are always ways to improve the code
you wrote some years ago ;-)

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
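For reference, the single-carry recurrence for addsubmul_1msb0 described above could look roughly like the C sketch below. This is only an illustration under several assumptions, not code from GMP: I take the function to compute a U - b V (the name suggests a difference, and that is what makes the carry signed), I assume 64-bit limbs with B = 2^64, and I rely on the GCC/Clang `unsigned __int128` extension. The names `limb_t` and `addsubmul_1msb0_ref` are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* stand-in for mp_limb_t, assuming 64-bit limbs */

/* Hypothetical reference version (not a GMP function):
   {rp, n} = a * {up, n} - b * {vp, n}, returning the signed carry-out limb.
   Requires a, b < B/2 = 2^63, so that a*u - b*v + c always fits in a
   2-limb two's complement number. */
static int64_t
addsubmul_1msb0_ref (limb_t *rp, const limb_t *up, const limb_t *vp,
                     size_t n, limb_t a, limb_t b)
{
  int64_t c = 0;  /* signed carry limb, carried between iterations */

  for (size_t i = 0; i < n; i++)
    {
      /* Work modulo 2^128 in unsigned arithmetic; the true value lies in
         (-2^127, 2^127), so the low 128 bits are its two's complement
         representation. The carry is sign-extended before being added. */
      unsigned __int128 t = (unsigned __int128) a * up[i]
                            - (unsigned __int128) b * vp[i]
                            + (unsigned __int128) (__int128) c;

      rp[i] = (limb_t) t;                  /* store the low half */
      c = (int64_t) (limb_t) (t >> 64);    /* high half is the next carry */
    }
  return c;
}
```

A negative return value means a U < b V, with {rp, n} holding the low n limbs of the two's complement difference. An assembly loop would keep c in a register instead of going through a 128-bit temporary, but the data flow per limb should be the same.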