This implementation I made in <https://github.com/SciNim/impulse/blob/49b813232507470a047727712acda105b84c7815/impulse/fft/factorization.nim#L23-L39> is faster on my machine func gcd_impulse(u, v: int): int = # From https://github.com/SciNim/impulse/blob/49b813232507470a047727712acda105b84c7815/impulse/fft/factorization.nim#L23-L39 if u == 0: return v if v == 0: return u let shift = countTrailingZeroBits(u or v) var u = u shr u.countTrailingZeroBits() var v = v while true: v = v shr v.countTrailingZeroBits() if u > v: swap(u, v) v = v - u if v == 0: return u shl shift Run
gist: <https://gist.github.com/mratsim/afb82da37287fb79dcd6825acec23f65> gcd in stdlib: 982 micro second gcdLAR: 1195 micro second gcdLAR2: 1648 micro second gcdLAR3: 1479 micro second gcdLAR4: 871 micro second gcdSub: 871 micro second gcdSub2: 1081 micro second gcd_impulse: 857 micro second Run