On Thursday, 13 October 2016 at 00:32:36 UTC, safety0ff wrote:


It made little difference: LDC compiled it into AVX2 vectorized addition (vpmovzxbq and vpaddq).

Measurements without -mcpu=native:
overhead 0.336s
bytes    0.610s
without branch hints 0.852s
code pasted 0.766s
