On 05/18/2012 03:54 PM, Roland Scheidegger wrote:
> Looks OK, though I wonder if we really need our own assembly here.
> In particular, if the compiler decides to use SSE, we really shouldn't use
> the FP stack for converting floats to ints. fistp is about twice as slow
> as the SSE conversion on newer CPUs, and it may additionally
> involve moving values from XMM registers to the FP stack.
> I suspect something like lroundf() would generate better code than the
> manual assembly (and far better than the C code), at least if things are
> compiled to use SSE2 (the same is of course true for the other functions
> like ceil etc.). But I guess that's not available everywhere...

For now, I'm just trying to fix the issue at hand. If anyone wants to look into using lroundf() and SSE code, that's great. I'm not really up to date on which approach is fastest on various CPUs.

-Brian
_______________________________________________
mesa-dev mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
