Re: [Mesa3d-dev] minor u_math.h speedup fun

2009-11-30 Thread Keith Whitwell
Matt Turner matts...@gmail.com writes: On Sat, Nov 28, 2009 at 2:13 PM, Yang Zhao y...@yangman.ca wrote: The speed-up is definitely there, but __builtin_popcount() will still

Re: [Mesa3d-dev] minor u_math.h speedup fun

2009-11-30 Thread Keith Whitwell
It sounds like you're ignoring the case where gallium is not built with gcc (i.e. doesn't have __builtin_popcount available). We still need an implementation to fall back on. On Mon, 2009-11-30 at 01:00 -0800, Keith Whitwell wrote
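The fallback concern above can be sketched as a compiler-guarded wrapper. This is only an illustration under stated assumptions: the names bitcount_fallback and bitcount_dispatch are hypothetical and are not the actual u_math.h code, and the __GNUC__ check is just one possible way to detect that the builtin is available.

```c
#include <stdint.h>

/* Hypothetical portable fallback: add up the low bit while shifting right.
 * Works with any C99 compiler; no builtins required. */
static inline unsigned bitcount_fallback(uint32_t n)
{
   unsigned bits = 0;
   while (n) {
      bits += n & 1u;
      n >>= 1;
   }
   return bits;
}

/* Hypothetical wrapper: use the GCC builtin when the compiler advertises
 * GNU C support, otherwise use the portable loop above. */
static inline unsigned bitcount_dispatch(uint32_t n)
{
#if defined(__GNUC__)
   return (unsigned)__builtin_popcount(n);
#else
   return bitcount_fallback(n);
#endif
}
```

Callers would only ever use bitcount_dispatch(), so a build without gcc still gets a working, if slower, implementation.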

[Mesa3d-dev] minor u_math.h speedup fun

2009-11-28 Thread Joakim Sindholt
I was perusing the commit log for mesa and stumbled upon the recently added util_bitcount. It uses a rather naïve algorithm and I thought I'd look into it as someone mentioned this problem to me before. This is what I found, should anyone be interested:
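The snippet is cut off before the findings themselves, so the following is not what was posted. It is a minimal sketch of one widely known alternative to a bit-at-a-time loop, the parallel (SWAR) population count for 32-bit values. The function name bitcount_swar is hypothetical.

```c
#include <stdint.h>

/* Hypothetical SWAR bit count: sum bits in parallel within the word.
 * Step 1 leaves a 2-bit count in each bit pair, step 2 a 4-bit count in
 * each nibble, step 3 an 8-bit count in each byte; the multiply then adds
 * the four byte counts into the top byte. */
static inline unsigned bitcount_swar(uint32_t n)
{
   n = n - ((n >> 1) & 0x55555555u);
   n = (n & 0x33333333u) + ((n >> 2) & 0x33333333u);
   n = (n + (n >> 4)) & 0x0F0F0F0Fu;
   return (unsigned)((n * 0x01010101u) >> 24);
}
```

The cost is a fixed handful of arithmetic operations regardless of how many bits are set, which is why variants of this approach are often compared against loop-based counts.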

Re: [Mesa3d-dev] minor u_math.h speedup fun

2009-11-28 Thread Corbin Simpson
Do your test again. I just pushed a fairly fast variable-length bitcount. Sorry for not pushing it earlier. Posting from a mobile, pardon my terseness. ~ C. On Nov 28, 2009 10:12 AM, Joakim Sindholt b...@zhasha.com wrote: I was perusing the commit log for mesa and stumbled upon the recently

Re: [Mesa3d-dev] minor u_math.h speedup fun

2009-11-28 Thread Joakim Sindholt
The test results are in:
  __builtin_popcount(): 12.677 seconds
  fast_bitcount():       7.218 seconds
  kr_bitcount():        33.172 seconds
  naive():              59.345 seconds
Also, the patch you committed says for (bits, n, bits++). Notice the commas are not semicolons.
On Sat, 2009-11-28 at 10:16 -0800, Corbin Simpson
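To make the comma/semicolon remark concrete: a C for statement needs its three clauses separated by semicolons, so for (bits, n, bits++) does not compile. The committed patch is not shown in this excerpt, so the following is only a sketch of what a corrected loop could look like, assuming the loop clears the lowest set bit on each pass (the Kernighan-style count that kr_bitcount presumably names). The function name bitcount_kr is hypothetical.

```c
#include <stdint.h>

/* Hypothetical corrected loop: init; condition; increment, separated by
 * semicolons. n &= n - 1 clears the lowest set bit, so the body runs once
 * per set bit and bits ends up holding the population count. */
static inline unsigned bitcount_kr(uint32_t n)
{
   unsigned bits;
   for (bits = 0; n; bits++)
      n &= n - 1;
   return bits;
}
```

The run time of this form depends on the number of set bits, so it is cheapest on sparse inputs.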

Re: [Mesa3d-dev] minor u_math.h speedup fun

2009-11-28 Thread Matt Turner
Results from my 2 GHz Core 2.
  __builtin_popcount(): 11.709 seconds
  fast_bitcount():       3.956 seconds
  kr_bitcount():        24.276 seconds
  naive():              38.493 seconds
Nothing even compares to fast_bitcount.
Matt

Re: [Mesa3d-dev] minor u_math.h speedup fun

2009-11-28 Thread Matt Turner
On Sat, Nov 28, 2009 at 2:13 PM, Yang Zhao y...@yangman.ca wrote: The speed-up is definitely there, but __builtin_popcount() will still be drastically faster when architecture-specific optimizations are enabled: I don't think this is the case (except for with SSE4's popcnt instruction, which

Re: [Mesa3d-dev] minor u_math.h speedup fun

2009-11-28 Thread Yang Zhao
2009/11/28 Matt Turner matts...@gmail.com: On Sat, Nov 28, 2009 at 2:13 PM, Yang Zhao y...@yangman.ca wrote: The speed-up is definitely there, but __builtin_popcount() will still be drastically faster when architecture-specific optimizations are enabled: I don't think this is the case