Hi Martin!

Good point about the command line flags, thanks!

These variants are close to the numberOfTrailingZeros_07 variant that I've already tested, though you did better by saving one arithmetic operation on the return line!

I'll rerun the benchmarks.

With kind regards,

Ivan


On 8/13/18 7:56 AM, Martin Buchholz wrote:
The number of plausible variants is astonishing!

---

Your use of -client and -server is outdated, which explains why you get the same results for both (-client is ignored).

I'm not sure what's blessed by the HotSpot team, but for C1 I use -XX:+TieredCompilation -XX:TieredStopAtLevel=1, and for C2 I use -XX:-TieredCompilation -server.
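
One possible way to confirm which flags a benchmark JVM actually received is to ask the JVM itself. The sketch below is only an illustration: the class name JitInfo and the method describe are hypothetical, and the exact compiler name reported varies between JVM builds, so treat the argument list as the more reliable part.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of the benchmark discussed in this thread.
final class JitInfo {
    // Reports the JIT compiler name and the JVM arguments of the running process.
    static String describe() {
        // getCompilationMXBean() returns null when the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        String name = (jit == null) ? "none (interpreter only)" : jit.getName();
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        return "JIT: " + name + ", JVM args: " + jvmArgs;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with each flag set shows whether the arguments reached the JVM unchanged.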

---

Now I understand the advantage of using ~i & (i - 1): it yields 0 for every odd input, so the subsequent zero check short-circuits for all odd numbers. With i & -i, only an input of 0 produces 0, so the same check cannot short-circuit odd inputs. That explains your results: they depend on being able to short-circuit.
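
A minimal sketch of the difference between the two expressions, for illustration only. The names NtzMaskDemo, maskBelowLowestBit and lowestSetBit are hypothetical and do not come from the benchmark code.

```java
// Illustrative only; class and method names are hypothetical.
final class NtzMaskDemo {
    // Sets exactly the bits below the lowest set bit of i.
    // Result is 0 for every odd i and -1 (all ones) for i == 0.
    static int maskBelowLowestBit(int i) {
        return ~i & (i - 1);
    }

    // Isolates the lowest set bit of i. Result is 0 only for i == 0.
    static int lowestSetBit(int i) {
        return i & -i;
    }

    public static void main(String[] args) {
        int[] samples = {1, 7, 12, Integer.MIN_VALUE, 0};
        for (int i : samples) {
            System.out.printf("i=%08x  ~i&(i-1)=%08x  i&-i=%08x%n",
                    i, maskBelowLowestBit(i), lowestSetBit(i));
        }
    }
}
```

For odd inputs such as 1 and 7 the first expression is 0 while the second is 1, so only the first lets an early zero test return immediately.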

So just use a more faithful inlining of nlz without trying to improve on it.

    static int ntz_inlineNlz5(int i) {
        i = ~i & (i - 1);
        if (i <= 0)
            return (i == 0) ? 0 : 32;
        int n = 1;
        if (i >= 1 << 16) { n += 16; i >>>= 16; }
        if (i >= 1 <<  8) { n +=  8; i >>>=  8; }
        if (i >= 1 <<  4) { n +=  4; i >>>=  4; }
        if (i >= 1 <<  2) { n +=  2; i >>>=  2; }
        return n + (i >>> 1);
    }

But it's hard to resist the urge to optimize out a branch:

    static int ntz_inlineNlz6(int i) {
        i = ~i & (i - 1);
        if (i <= 0) return i & 32;
        int n = 1;
        if (i >= 1 << 16) { n += 16; i >>>= 16; }
        if (i >= 1 <<  8) { n +=  8; i >>>=  8; }
        if (i >= 1 <<  4) { n +=  4; i >>>=  4; }
        if (i >= 1 <<  2) { n +=  2; i >>>=  2; }
        return n + (i >>> 1);
    }
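
Before benchmarking, it may be worth checking both variants against Integer.numberOfTrailingZeros. The following is a sketch of such a check, not part of the original benchmark: NtzCheck, sampleInputs and countMismatches are hypothetical names, and the seed and sample size are arbitrary.

```java
import java.util.Random;

// Sketch of a correctness check; the harness names are hypothetical.
final class NtzCheck {
    // Both variants are copied unchanged from the message above.
    static int ntz_inlineNlz5(int i) {
        i = ~i & (i - 1);
        if (i <= 0)
            return (i == 0) ? 0 : 32;
        int n = 1;
        if (i >= 1 << 16) { n += 16; i >>>= 16; }
        if (i >= 1 <<  8) { n +=  8; i >>>=  8; }
        if (i >= 1 <<  4) { n +=  4; i >>>=  4; }
        if (i >= 1 <<  2) { n +=  2; i >>>=  2; }
        return n + (i >>> 1);
    }

    static int ntz_inlineNlz6(int i) {
        i = ~i & (i - 1);
        if (i <= 0) return i & 32;
        int n = 1;
        if (i >= 1 << 16) { n += 16; i >>>= 16; }
        if (i >= 1 <<  8) { n +=  8; i >>>=  8; }
        if (i >= 1 <<  4) { n +=  4; i >>>=  4; }
        if (i >= 1 <<  2) { n +=  2; i >>>=  2; }
        return n + (i >>> 1);
    }

    // Edge cases (0, -1, every possible trailing-zero count) plus random values.
    static int[] sampleInputs(long seed, int randomCount) {
        int[] inputs = new int[2 + 64 + randomCount];
        int pos = 0;
        inputs[pos++] = 0;
        inputs[pos++] = -1;
        for (int k = 0; k < 32; k++) {
            inputs[pos++] = 1 << k;   // single bit at position k
            inputs[pos++] = -1 << k;  // all bits set from position k upward
        }
        Random random = new Random(seed);
        while (pos < inputs.length) {
            inputs[pos++] = random.nextInt();
        }
        return inputs;
    }

    // Counts inputs for which either variant disagrees with the JDK method.
    static int countMismatches(int[] inputs) {
        int mismatches = 0;
        for (int i : inputs) {
            int expected = Integer.numberOfTrailingZeros(i);
            if (ntz_inlineNlz5(i) != expected || ntz_inlineNlz6(i) != expected) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(sampleInputs(42L, 1_000_000)));
    }
}
```

The `i & 32` shortcut in ntz_inlineNlz6 relies on the masked value being either 0 (odd input) or -1 (input 0) when the `i <= 0` branch is taken, which gives 0 and 32 respectively.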


--
With kind regards,
Ivan Gerasimov
