> on x86:
> benchmarked magic: 14.048889885s
> benchmarked div: 5.426952392s
> benchmarked mul: 4.034106976s
>
> on x86-64:
> benchmarked magic: 2.467789582s
> benchmarked div: 9.748067755s
> benchmarked mul: 8.665307997s
> Did you compile your 32-bit code with -mfpmath=sse? If not, could you try and
> post the results again? I'd be quite surprised if it turned out that the x87
> operations are faster than the SSE ones, but that's what your numbers show.

It was compiled with the following flags:
on x86: -march=i686 -mtune=generic -O2
on x86-64: -march=x86-64 -mtune=generic -O2

As you asked, here is the benchmark on 32-bit with -march=native -mfpmath=sse -O2:
benchmarked magic: 16.204160542s
benchmarked div: 9.719736771s
benchmarked mul: 8.638401181s

The slow SSE math is probably gcc's fault. With clang on 32-bit I got these numbers:
benchmarked magic: 19.441825239s
benchmarked div: 5.493691053s
benchmarked mul: 3.238189342s

But at first glance the code does not contain any x87 opcodes.
(clang doesn't understand -mfpmath=sse)
