Hi Martin!
Good point about the command line flags, thanks!
These variants are close to numberOfTrailingZeros_07 that I've already
tested, though you did better by saving one arithmetic operation at the
return line!
I'll rerun the benchmarks.
With kind regards,
Ivan
On 8/13/18 7:56 AM, Martin Buchholz wrote:
The number of plausible variants is astonishing!
---
Your use of -client and -server is outdated, which explains why you
get the same results for both (-client is ignored).
I'm not sure what's blessed by the HotSpot team, but for C1 I
use -XX:+TieredCompilation -XX:TieredStopAtLevel=1 and for C2 I
use -XX:-TieredCompilation -server
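
Since an ignored flag such as -client fails silently, it may help to have the benchmark log what the JVM was actually started with. The following is only a sketch and not part of the original thread; the class name JvmFlagsReport and the method jvmFlags are hypothetical, and it relies only on the standard RuntimeMXBean and system properties.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmFlagsReport {
    /** Returns the arguments the running JVM was started with (e.g. -XX:TieredStopAtLevel=1). */
    static List<String> jvmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        // java.vm.name and java.vm.info are standard system properties;
        // their exact text varies between JVM vendors and versions.
        System.out.println("VM:    " + System.getProperty("java.vm.name"));
        System.out.println("Mode:  " + System.getProperty("java.vm.info"));
        System.out.println("Flags: " + jvmFlags());
    }
}
```

Running it once with each flag set listed above and keeping the output next to the benchmark numbers makes it easier to tell later which compiler configuration produced which result.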
---
Now I understand the advantage of using ~i & (i - 1): the subsequent
zero check is a short-circuit for all odd numbers, better than i & -i,
which explains your results - they depend on being able to short-circuit.
So just use a more faithful inlining of nlz without trying to improve
on it.
static int ntz_inlineNlz5(int i) {
    // i becomes a mask with ones exactly where the input had trailing zeros
    i = ~i & (i - 1);
    // i == 0: the input was odd; i == -1: the input was 0
    if (i <= 0)
        return (i == 0) ? 0 : 32;
    int n = 1;
    if (i >= 1 << 16) { n += 16; i >>>= 16; }
    if (i >= 1 <<  8) { n +=  8; i >>>=  8; }
    if (i >= 1 <<  4) { n +=  4; i >>>=  4; }
    if (i >= 1 <<  2) { n +=  2; i >>>=  2; }
    return n + (i >>> 1);
}
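
To make the short-circuit argument concrete, here is a small sketch (added for illustration, not from the original message) that compares the two bit tricks on a few inputs. The class and method names are hypothetical. The point it shows: ~i & (i - 1) is 0 for every odd input and -1 only for input 0, so a single i <= 0 test catches both, whereas i & -i yields 1 for odd inputs and they fall through to the shift cascade.

```java
public class TrailingZeroMask {
    /** Ones in exactly the positions of the trailing zeros of i. */
    static int mask(int i) {
        return ~i & (i - 1);
    }

    /** Isolates the lowest set bit of i (0 if i == 0). */
    static int lowestBit(int i) {
        return i & -i;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 7, 8, 12, Integer.MIN_VALUE};
        for (int i : samples) {
            System.out.printf("i=%d mask=%d lowestBit=%d%n", i, mask(i), lowestBit(i));
        }
    }
}
```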
But it's hard to resist the urge to optimize out a branch:
static int ntz_inlineNlz6(int i) {
    i = ~i & (i - 1);
    // i is 0 (odd input) or -1 (zero input) here: 0 & 32 == 0, -1 & 32 == 32
    if (i <= 0) return i & 32;
    int n = 1;
    if (i >= 1 << 16) { n += 16; i >>>= 16; }
    if (i >= 1 <<  8) { n +=  8; i >>>=  8; }
    if (i >= 1 <<  4) { n +=  4; i >>>=  4; }
    if (i >= 1 <<  2) { n +=  2; i >>>=  2; }
    return n + (i >>> 1);
}
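
Before benchmarking variants like these, a quick correctness check against Integer.numberOfTrailingZeros seems worthwhile. The sketch below is an illustration under assumptions, not part of the original thread: the class NtzCheck and the helpers agrees and countMismatches are hypothetical names, the two ntz methods are copied from the message, and the random seed and sample count are arbitrary.

```java
import java.util.Random;

public class NtzCheck {
    // Copied from the message above.
    static int ntz_inlineNlz5(int i) {
        i = ~i & (i - 1);
        if (i <= 0)
            return (i == 0) ? 0 : 32;
        int n = 1;
        if (i >= 1 << 16) { n += 16; i >>>= 16; }
        if (i >= 1 <<  8) { n +=  8; i >>>=  8; }
        if (i >= 1 <<  4) { n +=  4; i >>>=  4; }
        if (i >= 1 <<  2) { n +=  2; i >>>=  2; }
        return n + (i >>> 1);
    }

    // Copied from the message above.
    static int ntz_inlineNlz6(int i) {
        i = ~i & (i - 1);
        if (i <= 0) return i & 32;
        int n = 1;
        if (i >= 1 << 16) { n += 16; i >>>= 16; }
        if (i >= 1 <<  8) { n +=  8; i >>>=  8; }
        if (i >= 1 <<  4) { n +=  4; i >>>=  4; }
        if (i >= 1 <<  2) { n +=  2; i >>>=  2; }
        return n + (i >>> 1);
    }

    /** True if both variants match the JDK result for x. */
    static boolean agrees(int x) {
        int expected = Integer.numberOfTrailingZeros(x);
        return ntz_inlineNlz5(x) == expected && ntz_inlineNlz6(x) == expected;
    }

    /** Counts inputs (edge cases, per-bit patterns, random values) where a variant disagrees. */
    static int countMismatches() {
        int mismatches = 0;
        int[] edgeCases = {0, 1, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : edgeCases) {
            if (!agrees(x)) mismatches++;
        }
        for (int k = 0; k < 32; k++) {
            if (!agrees(1 << k)) mismatches++;   // exactly one bit set
            if (!agrees(-1 << k)) mismatches++;  // bit k and everything above it set
        }
        Random rnd = new Random(42);             // arbitrary fixed seed for repeatability
        for (int n = 0; n < 1_000_000; n++) {
            if (!agrees(rnd.nextInt())) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```

The per-bit loop covers every possible result from 0 to 31, and the edge cases cover the zero input, so each branch of the shift cascade is exercised at least once.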
--
With kind regards,
Ivan Gerasimov