On Mon May 7 04:54:23 EDT 2012, [email protected] wrote:
> sorry for being vague.
>
> treating pixels as 64bit on amd64 as that is the natural size for the
> machine, vs using 32bits per pixel - 10 bits of r, g, and b or y, u,
> and v plus 2 spare leads to a significant speedup; where significant
> is a number lost in the mists of time.
>
> i believe this speedup is due to the reduction in the rate of cache
> line refills, as forsyth described.
i'm confused. (asserting parity error.) which one's faster?
given your problem one would assume that in the absence of
any real gotchas,
	processor bw >> memory bw ==> smaller integers faster
	memory bw >> processor bw ==> natural integers faster
(this is yet another reason that int_fast* are a half-baked idea.
how does the compiler know this relation for the target machine
ahead of time?)
don't get me wrong, i can easily believe there are gotchas, that's
why i'm confused.
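
for concreteness, here is a rough c sketch of the two layouts as i
understand them. the field order (r in the high bits), the names
pack10, red10, green10, blue10 and pixperline, and the 64-byte cache
line are all my assumptions, not anything from the original code.
it only shows the packing and how many pixels fit in one cache line;
it makes no claim about which is faster on a given machine.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * hypothetical 32-bit layout: 2 spare bits, then 10 bits each of
 * r, g, b (or y, u, v).  the field order is an assumption.
 */
static uint32_t
pack10(uint32_t r, uint32_t g, uint32_t b)
{
	return (r & 0x3ff) << 20 | (g & 0x3ff) << 10 | (b & 0x3ff);
}

static uint32_t red10(uint32_t p)   { return p >> 20 & 0x3ff; }
static uint32_t green10(uint32_t p) { return p >> 10 & 0x3ff; }
static uint32_t blue10(uint32_t p)  { return p & 0x3ff; }

/*
 * pixels delivered per cache line refill.  linebytes is 64 on
 * typical amd64 parts, but that is an assumption; pass the real
 * value for the target machine.
 */
static size_t
pixperline(size_t linebytes, size_t pixbytes)
{
	return linebytes / pixbytes;
}
```

with a 64-byte line, pixperline(64, sizeof(uint32_t)) is 16 and
pixperline(64, sizeof(uint64_t)) is 8, so the packed form needs half
the refills to walk the same image, at the cost of the shifts and
masks above.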
- erik