Hi,

On Mon, 19 December 2016 6:21 pm, Adrien Prost-Boucle wrote:
> That said, the interesting part in my code is these functions:
> - sqrt32_inv() for single 32-bit words
> - sqrt64_inv() for single 64-bit words
> - sqrt64x2_inv() for double 64-bit words

Is there a reason why you defined three different invsqrt8_ arrays?
Doesn't invsqrttab already contain suitable values?

On the other hand, both sqrt64_ and sqrt64x2_ use invroot*invroot, so maybe
the table could store both the value and its square.

> I noted that GMP fallback function umul_ppmm(), in longlong.h in GMP code,
> uses 4 multiplications where the Karatsuba method would only requires 3,
> I was wondering whether optimization was possible...

Reducing the number of multiplications is possible... but I bet a Karatsuba
umul_ppmm() is not faster than the plain version (at least not on current
64-bit CPUs ;-)

Regards,
m

-- 
http://bodrato.it/papers/
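
P.S. To make the comparison concrete, here is a minimal sketch of the two
approaches for a 64x64 -> 128 bit product built from 32-bit halves. The names
umul_ppmm_4(), umul_ppmm_3() and check_umul_agree() are made up for this
example; they are plain functions, not the umul_ppmm() macro from longlong.h,
and no timing is claimed. The sketch only shows where the extra work of the
3-multiplication variant goes: absolute differences, a sign, and a 65-bit
middle term.

```c
#include <assert.h>
#include <stdint.h>

#define LOW32(x) ((x) & UINT64_C(0xFFFFFFFF))

/* Schoolbook version: 4 half-word multiplications. */
static void umul_ppmm_4(uint64_t *hi, uint64_t *lo, uint64_t u, uint64_t v)
{
    uint64_t ul = LOW32(u), uh = u >> 32;
    uint64_t vl = LOW32(v), vh = v >> 32;

    uint64_t x0 = ul * vl;
    uint64_t x1 = ul * vh;
    uint64_t x2 = uh * vl;
    uint64_t x3 = uh * vh;

    x1 += x0 >> 32;                 /* cannot overflow */
    x1 += x2;                       /* may overflow... */
    if (x1 < x2)
        x3 += UINT64_C(1) << 32;    /* ...so propagate the carry */

    *hi = x3 + (x1 >> 32);
    *lo = (x1 << 32) | LOW32(x0);
}

/* Karatsuba (subtractive) version: 3 half-word multiplications.
   Uses ul*vh + uh*vl = x0 + x3 + (uh - ul)*(vl - vh). */
static void umul_ppmm_3(uint64_t *hi, uint64_t *lo, uint64_t u, uint64_t v)
{
    uint64_t ul = LOW32(u), uh = u >> 32;
    uint64_t vl = LOW32(v), vh = v >> 32;

    uint64_t x0 = ul * vl;
    uint64_t x3 = uh * vh;

    /* Absolute differences fit in 32 bits; keep the sign separately. */
    uint64_t du = uh >= ul ? uh - ul : ul - uh;
    uint64_t dv = vl >= vh ? vl - vh : vh - vl;
    int neg = (uh >= ul) != (vl >= vh);
    uint64_t d = du * dv;

    /* Middle term x0 + x3 +/- d needs up to 65 bits: (mid_hi, mid_lo). */
    uint64_t mid_lo = x0 + x3;
    uint64_t mid_hi = mid_lo < x0;
    if (neg) {
        mid_hi -= mid_lo < d;       /* borrow */
        mid_lo -= d;
    } else {
        mid_lo += d;
        mid_hi += mid_lo < d;       /* carry */
    }

    /* product = x3 * 2^64 + mid * 2^32 + x0 */
    uint64_t l = x0 + (mid_lo << 32);
    uint64_t c = l < x0;
    *lo = l;
    *hi = x3 + (mid_lo >> 32) + (mid_hi << 32) + c;
}

/* Hypothetical helper: assert that both variants agree on (u, v). */
static void check_umul_agree(uint64_t u, uint64_t v)
{
    uint64_t h4, l4, h3, l3;
    umul_ppmm_4(&h4, &l4, u, v);
    umul_ppmm_3(&h3, &l3, u, v);
    assert(h4 == h3 && l4 == l3);
}
```

One multiplication is saved, but it is paid for with two comparisons, two
conditional subtractions, a sign test and a longer carry chain.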