, 2009 7:12 PM
To: Matt Turner
Cc: mesa3d-dev@lists.sourceforge.net
Subject: Re: [Mesa3d-dev] minor u_math.h speedup fun
Matt Turner matts...@gmail.com writes:
On Sat, Nov 28, 2009 at 2:13 PM, Yang Zhao y...@yangman.ca wrote:
The speed-up is definitely there, but __builtin_popcount() will still
It sounds like you're ignoring the case where gallium is not built with
gcc (i.e. doesn't have __builtin_popcount available). We still need an
implementation to fall back on.
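A minimal sketch of the kind of fallback being asked for, assuming the standard __GNUC__ predefined macro is an acceptable compiler check (the tree may well use its own compiler-detection macro instead). The names bitcount() and bitcount_portable() are hypothetical and are not the functions from the patch:

```c
/* Portable fallback: test each bit in turn.
 * Slow, but it compiles with any C compiler. */
static unsigned
bitcount_portable(unsigned n)
{
   unsigned bits = 0;
   while (n) {
      bits += n & 1u;   /* add the lowest bit */
      n >>= 1;          /* move on to the next bit */
   }
   return bits;
}

/* Hypothetical wrapper: use the GCC builtin where it is available,
 * otherwise fall back to the portable loop above. */
static unsigned
bitcount(unsigned n)
{
#if defined(__GNUC__)
   return (unsigned) __builtin_popcount(n);
#else
   return bitcount_portable(n);
#endif
}
```

Callers only ever see bitcount(), so the fallback can be swapped for a faster portable routine later without touching them.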
On Mon, 2009-11-30 at 01:00 -0800, Keith Whitwell wrote:
I was perusing the commit log for mesa and stumbled upon the recently
added util_bitcount. It uses a rather naïve algorithm and I thought I'd
look into it as someone mentioned this problem to me before.
This is what I found, should anyone be interested:
Do your test again. I just pushed a fairly fast variable-length bitcount.
Sorry for not pushing it earlier.
Posting from a mobile, pardon my terseness. ~ C.
On Nov 28, 2009 10:12 AM, Joakim Sindholt b...@zhasha.com wrote:
I was perusing the commit log for mesa and stumbled upon the recently
The test results are in:
__builtin_popcount(): 12.677 seconds
fast_bitcount(): 7.218 seconds
kr_bitcount(): 33.172 seconds
naive(): 59.345 seconds
Also, the patch you committed says for (bits, n, bits++). Notice that
those are commas, not semicolons, so it will not compile.
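For reference, a sketch of what a loop of that shape looks like with the separators fixed, assuming the intent is the classic clear-the-lowest-set-bit count. The function name is hypothetical and the body is a guess at the intent, not a copy of the committed patch:

```c
/* Kernighan-style bit count: n &= n - 1 clears the lowest set bit,
 * so the loop body runs once per set bit rather than once per bit. */
static unsigned
kr_bitcount_sketch(unsigned n)
{
   unsigned bits;
   for (bits = 0; n; bits++) {   /* three clauses separated by semicolons */
      n &= n - 1;
   }
   return bits;
}
```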
On Sat, 2009-11-28 at 10:16 -0800, Corbin Simpson wrote:
Results from my 2 GHz Core 2.
__builtin_popcount(): 11.709 seconds
fast_bitcount(): 3.956 seconds
kr_bitcount(): 24.276 seconds
naive(): 38.493 seconds
Nothing even compares to fast_bitcount.
Matt
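The source of fast_bitcount() is not quoted in this thread, so the following is not that function. It is only a sketch of one well-known branch-free approach (summing bits in parallel within a 32-bit word), shown to illustrate why a routine with no loop can beat the per-bit and per-set-bit loops. The name swar_bitcount32() is hypothetical:

```c
#include <stdint.h>

/* Parallel (SWAR) bit count for a 32-bit word: a fixed number of
 * operations regardless of how many bits are set. */
static unsigned
swar_bitcount32(uint32_t n)
{
   n = n - ((n >> 1) & 0x55555555u);                 /* 2-bit partial sums */
   n = (n & 0x33333333u) + ((n >> 2) & 0x33333333u); /* 4-bit partial sums */
   n = (n + (n >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit partial sums */
   return (unsigned) ((n * 0x01010101u) >> 24);      /* add the four bytes */
}
```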
On Sat, Nov 28, 2009 at 2:13 PM, Yang Zhao y...@yangman.ca wrote:
The speed-up is definitely there, but __builtin_popcount() will still
be drastically faster when architecture-specific optimizations are
enabled:
I don't think this is the case (except with SSE4's popcnt
instruction, which