On Tue, Jan 10, 2012 at 03:31:34PM +0800, Liu Bo wrote:
> +static inline int generate_node_level(u64 randseed)
> +{
> +     int level = 0;
> +
> +     while (randseed && !(randseed & 3)) {
> +             randseed >>= 2;
> +             level++;
> +     }
> +
> +     return (level > MAXLEVEL ? MAXLEVEL : level);
> +}

This is counting number of trailing zeros * 2 (except when randseed ==
0), there's a gcc builtin for it __builtin_ctzll and you can turn it in
a loopless inlinable function:

static inline int generate_node_level(u64 randseed)
{
        return randseed == 0 ? 0 : __builtin_ctzll(randseed) >> 1
}

the builtin should be safe on all arches without the need of libgcc
support, there seem to be handcoded asm statements for each arch.

microbenchmarkg of builtin vs while-counter showed 2.3x speedup:

builtin: 1.866529 ns/loop
while:   4.265664 ns/loop
(1<<32 loops, on a generic intel x86_64 box)


and if MAXLEVEL is <= 16, then you can generate just 4 random bytes and
compute the level in the same way without any loss.


david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to