On Thursday, 27 February 2020 at 09:41:20 UTC, Basile B. wrote:
On Thursday, 27 February 2020 at 09:33:28 UTC, Dennis Cote wrote:
[...]

Sorry, but no. I think you have missed how this has changed since the first message. 1. The way it was tested initially was wrong, because LLVM was optimizing some of the tests and not others due to literal constants. 2. Apparently there is a branchless version that is fast when tested with unbiased input (to be verified).

This version is:

---
ubyte decimalLength9_4(const uint v) pure nothrow
{
    return 1 +  (v >= 10) +
                (v >= 100) +
                (v >= 1000) +
                (v >= 10000) +
                (v >= 100000) +
                (v >= 1000000) +
                (v >= 10000000) +
                (v >= 100000000) ;
}
---
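Before timing it, a quick sanity check of the branchless version against a plain loop at every power-of-ten boundary can be useful. The sketch below is only an illustration: `decimalLengthRef` is a hypothetical helper that is not part of the test program, and `decimalLength9_4` is repeated so that the snippet compiles on its own. It assumes, as the name suggests, that inputs have at most nine decimal digits.

```d
// Branchless version from above, repeated so the sketch is self-contained.
ubyte decimalLength9_4(const uint v) pure nothrow
{
    return 1 +  (v >= 10) +
                (v >= 100) +
                (v >= 1000) +
                (v >= 10000) +
                (v >= 100000) +
                (v >= 1000000) +
                (v >= 10000000) +
                (v >= 100000000) ;
}

// Hypothetical reference (an assumption, not from the test program):
// counts decimal digits by repeated division.
ubyte decimalLengthRef(uint v) pure nothrow
{
    ubyte n = 1;
    while (v >= 10)
    {
        v /= 10;
        ++n;
    }
    return n;
}

unittest
{
    // Check both sides of each boundary 10, 100, ..., 100_000_000.
    uint p = 10;
    foreach (_; 0 .. 8)
    {
        assert(decimalLength9_4(p - 1) == decimalLengthRef(p - 1));
        assert(decimalLength9_4(p) == decimalLengthRef(p));
        p *= 10;
    }
    // Extremes of the assumed nine-digit input range.
    assert(decimalLength9_4(0) == 1);
    assert(decimalLength9_4(999_999_999) == 9);
}
```

Compiling with `dmd -unittest -main` runs the checks. Note that for inputs of 1_000_000_000 and above the branchless version still returns 9, so the two functions only agree within the nine-digit range.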

But I cannot see the improvement when using `time` on the test program with 100000000 calls fed with random numbers.

see https://forum.dlang.org/post/ctidwrnxvwwkouprj...@forum.dlang.org for the latest evolution of the discussion.

Maybe just add your version to the test program and run

time ./declen -c100000000 -f0 -s137 // original
time ./declen -c100000000 -f4 -s137 // the 100% branchless
time ./declen -c100000000 -f5 -s137 // the LUT + branchless for the bit num that need attention
time ./declen -c100000000 -f6 -s137 // assumed to be your version

to see if it beats the original. The thing is that I cannot do it right now, but otherwise I will try tomorrow.
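If the `declen` program is not at hand, the same kind of comparison can be roughly approximated in-process. This is a minimal sketch and not how `declen` itself works: `timeCalls` is a hypothetical helper, and it assumes that `-c` is the call count and `-s` the random seed, which the commands above suggest but do not state. Inputs are drawn uniformly from the nine-digit range.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : Random, uniform;

// Branchless version from above, repeated so the sketch is self-contained.
ubyte decimalLength9_4(const uint v) pure nothrow
{
    return 1 +  (v >= 10) +
                (v >= 100) +
                (v >= 1000) +
                (v >= 10000) +
                (v >= 100000) +
                (v >= 1000000) +
                (v >= 10000000) +
                (v >= 100000000) ;
}

// Hypothetical helper: times `count` calls of `fun` on pseudo-random input.
// The sum of the results is written to `checksum` so that the optimizer
// cannot discard the calls. The measured time includes the cost of the
// random number generator, so only relative differences are meaningful.
Duration timeCalls(alias fun)(size_t count, uint seed, out ulong checksum)
{
    auto rng = Random(seed);
    auto sw = StopWatch(AutoStart.yes);
    ulong sum = 0;
    foreach (_; 0 .. count)
        sum += fun(uniform(0u, 1_000_000_000u, rng));
    sw.stop();
    checksum = sum;
    return sw.peek();
}
```

Calling `timeCalls!decimalLength9_4(100_000_000, 137, checksum)` from `main` would roughly mirror the `-c100000000 -s137` runs, and calling it with another candidate function and the same seed feeds both the same input sequence.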
