On Thursday, 23 May 2019 at 21:50:38 UTC, Alex wrote:
I've used very small LUTs, e.g. a length of 5, and it didn't significantly change anything.

Use a size that is a power of two (2^n), then mask the index; hopefully that will let the compiler drop the bounds checks.

E.g. if the LUT size is 16, then index the LUT with "i & 15"?
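
To make the masking idea concrete, here is a minimal sketch. It is written in Rust purely for illustration; the language, the table contents and the names (`LUT_SIZE`, `MASK`, `build_lut`, `lookup`) are my assumptions and not taken from the code being discussed. Whether the bounds check actually disappears depends on the compiler and optimization level, so check the generated code rather than relying on it.

```rust
/// The table size must be a power of two so that `i & MASK` always lands in range.
const LUT_SIZE: usize = 16;
const MASK: usize = LUT_SIZE - 1; // 15, i.e. 0b1111

/// Hypothetical table for illustration: the squares of 0..16.
fn build_lut() -> [u32; LUT_SIZE] {
    let mut lut = [0u32; LUT_SIZE];
    for (i, slot) in lut.iter_mut().enumerate() {
        *slot = (i * i) as u32;
    }
    lut
}

/// Masked lookup: the index is provably below LUT_SIZE, so an optimizing
/// compiler can usually (not always) remove the bounds check.
fn lookup(lut: &[u32; LUT_SIZE], i: usize) -> u32 {
    lut[i & MASK]
}

fn main() {
    let lut = build_lut();
    // 17 & 15 == 1, so index 17 wraps around to slot 1.
    println!("{}", lookup(&lut, 17));
}
```

Note that masking changes the meaning for out-of-range indices: they wrap around silently instead of raising an error, so this only fits when wrapping is acceptable or the index is known to be in range anyway.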

I haven't tested this well, but I was thrown off by the results: they should easily have been the other way around. I expected quite a significant speedup (several factors), not the reverse.

Well, you could multiply the elapsed time by the clock frequency and divide by the number of iterations to get the number of cycles per iteration. If it is more than a dozen, then something is wrong.
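
The arithmetic above can be sketched as follows, again in Rust for illustration only. The function name `cycles_per_iteration` and the numbers in `main` are hypothetical, not measurements from this thread. It also assumes a fixed clock frequency, which is only approximate on CPUs that change frequency under load.

```rust
/// cycles per iteration = elapsed seconds * clock frequency (Hz) / iterations
fn cycles_per_iteration(elapsed_secs: f64, clock_hz: f64, iterations: u64) -> f64 {
    elapsed_secs * clock_hz / iterations as f64
}

fn main() {
    // Hypothetical numbers: 0.5 s on a 3 GHz core for 100 million iterations.
    let cpi = cycles_per_iteration(0.5, 3.0e9, 100_000_000);
    println!("{cpi:.1} cycles per iteration");
}
```

With these made-up inputs the result is 15 cycles per iteration, which by the rule of thumb above would already be suspicious for a simple table lookup.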

