On Thursday, 27 February 2020 at 09:41:20 UTC, Basile B. wrote:
On Thursday, 27 February 2020 at 09:33:28 UTC, Dennis Cote wrote:
[...]
Sorry, but no. I think you have missed how this has changed
since the first message.
1. The way it was tested initially was wrong, because LLVM was
optimizing some things in some tests and not in others, due to
literal constants.
2. Apparently there is a branchless version that is fast
when tested with unbiased input (to be verified).
This version is:
---
ubyte decimalLength9_4(const uint v) pure nothrow
{
    return 1 + (v >= 10) +
               (v >= 100) +
               (v >= 1000) +
               (v >= 10000) +
               (v >= 100000) +
               (v >= 1000000) +
               (v >= 10000000) +
               (v >= 100000000);
}
---
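A quick way to sanity-check the branchless version is to compare it against a plain division loop at every power-of-ten boundary. The sketch below is only an illustration: `decimalLengthRef` is a hypothetical helper that does not come from the test program, and the check assumes the input has at most 9 decimal digits, as the function name suggests.

```d
import std.stdio : writeln;

// Branchless version, as posted above.
ubyte decimalLength9_4(const uint v) pure nothrow
{
    return 1 + (v >= 10) +
               (v >= 100) +
               (v >= 1000) +
               (v >= 10000) +
               (v >= 100000) +
               (v >= 1000000) +
               (v >= 10000000) +
               (v >= 100000000);
}

// Hypothetical reference (not from the thread): counts digits by
// repeated division.
ubyte decimalLengthRef(uint v) pure nothrow
{
    ubyte n = 1;
    while (v >= 10)
    {
        v /= 10;
        ++n;
    }
    return n;
}

void main()
{
    uint p = 1; // 10 ^^ (digits - 1)
    foreach (digits; 1 .. 10)
    {
        // Smallest value with `digits` digits.
        assert(decimalLength9_4(p) == digits);
        assert(decimalLength9_4(p) == decimalLengthRef(p));
        // Largest value with one digit fewer.
        if (digits > 1)
            assert(decimalLength9_4(p - 1) == digits - 1);
        p *= 10;
    }
    // Largest 9-digit value.
    assert(decimalLength9_4(999_999_999) == 9);
    writeln("ok");
}
```

Note that for values of 1000000000 and above the branchless version still returns 9, so it only agrees with the reference on the 9-digit range.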
However, I cannot see the improvement when running time on the test
program with 100000000 calls fed with random numbers.
See
https://forum.dlang.org/post/ctidwrnxvwwkouprj...@forum.dlang.org for the latest evolution of the discussion.
Maybe just add your version to the test program and run
time ./declen -c100000000 -f0 -s137 // original
time ./declen -c100000000 -f4 -s137 // the 100% branchless
time ./declen -c100000000 -f5 -s137 // the LUT + branchless for the bit numbers that need attention
time ./declen -c100000000 -f6 -s137 // assumed to be your version
to see if it beats the original. The thing is that I cannot do it
right now, but otherwise I will try tomorrow.
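For anyone without the test program at hand, the idea of feeding runtime random input (so that the compiler cannot fold literal constants) can be sketched roughly as follows. This is not the `declen` program: `sumLengths`, the seed handling and the iteration count are assumptions made for illustration, and the timing includes the cost of the random number generator.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : Random, uniform;
import std.stdio : writeln;

// Branchless version, as posted above.
ubyte decimalLength9_4(const uint v) pure nothrow
{
    return 1 + (v >= 10) +
               (v >= 100) +
               (v >= 1000) +
               (v >= 10000) +
               (v >= 100000) +
               (v >= 1000000) +
               (v >= 10000000) +
               (v >= 100000000);
}

// Hypothetical helper: calls the function `count` times on random
// values in [0, 10^9) and sums the results so the calls cannot be
// optimized away.
ulong sumLengths(size_t count, uint seed)
{
    auto rng = Random(seed);
    ulong sum = 0;
    foreach (i; 0 .. count)
    {
        const uint v = uniform(0u, 1_000_000_000u, rng);
        sum += decimalLength9_4(v);
    }
    return sum;
}

void main()
{
    // Smaller than the 100000000 calls used above; raise as needed.
    enum size_t count = 10_000_000;
    auto sw = StopWatch(AutoStart.yes);
    const sum = sumLengths(count, 137);
    sw.stop();
    writeln("sum = ", sum, ", elapsed = ", sw.peek);
}
```

Keep in mind that values drawn uniformly from [0, 10^9) have 9 digits about 90% of the time, so this input is friendly to branch prediction; a different distribution may be needed to show where a branchless version pays off.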