On 11/07/2014 07:31 PM, Ian Romanick wrote:
On 11/07/2014 06:09 AM, Siavash Eliasi wrote:
On 11/07/2014 03:14 PM, Steven Newbury wrote:
On Thu, 2014-11-06 at 21:00 -0800, Matt Turner wrote:
On Thu, Nov 6, 2014 at 8:56 PM, Siavash Eliasi <siavashser...@gmail.com> wrote:
Then I recommend removing the "if (cpu_has_sse4_1)" check from this
patch and similar places: no runtime CPU dispatching actually
happens for the SSE-optimized code paths, so the check only adds
overhead (unnecessary branches) to the generated code.
No. Sorry, I realize I misread your previous question:

I guess checking "cpu_has_sse4_1" is unnecessary if the user can't
control it at runtime, because "USE_SSE41" is a compile-time check
and already requires the target machine to be SSE 4.1 capable.
USE_SSE41 is set if the *compiler* supports SSE 4.1. This allows you
to build the code and then use it only on systems that actually
support it.
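
To make the distinction concrete, here is a minimal sketch of the pattern being described: a compile-time guard that decides whether the SSE 4.1 path is built at all, plus a run-time test that decides whether it is used. Everything here except the names USE_SSE41, cpu_has_sse4_1 and X86_FEATURE_SSE4_1 is a hypothetical stand-in (the feature word, the bit value, and the two path functions), not Mesa's actual code.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the real feature word and flag value. */
#define X86_FEATURE_SSE4_1 (1u << 0)
static uint32_t cpu_features;   /* would be filled in once by CPU detection */

/* Run-time question: does the CPU we are running on support SSE 4.1? */
#define cpu_has_sse4_1 ((cpu_features & X86_FEATURE_SSE4_1) != 0)

/* Compile-time question: can the *compiler* emit SSE 4.1 code?
 * Assumed here to have been set by the build system. */
#define USE_SSE41 1

enum copy_path { COPY_GENERIC, COPY_SSE41 };

static enum copy_path copy_generic(void) { return COPY_GENERIC; }

#ifdef USE_SSE41
/* In a real tree this would sit in a file built with -msse4.1. */
static enum copy_path copy_sse41(void) { return COPY_SSE41; }
#endif

static enum copy_path pick_copy_path(void)
{
#ifdef USE_SSE41
    /* The path exists in the binary, but is taken only on capable CPUs. */
    if (cpu_has_sse4_1)
        return copy_sse41();
#endif
    return copy_generic();
}
```

With this shape, one binary can carry the SSE 4.1 path and still run on a CPU that lacks it, which is why the run-time check is not redundant with the compile-time one.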

All of this could have been pretty easily answered by a few greps
though...
I wonder what difference it would make to have an option to compile
out the run-time check code to avoid the additional overhead in cases
where the builder *knows* at compile time what the run-time system is?
(e.g. Gentoo)
I think that's possible. Since "cpu_has_sse4_1" and friends are simply
macros, they can be defined as "true" or "1" at compile time when the
build targets an SSE 4.1 capable machine, so the compiler can fold the
constant condition and drop the unnecessary runtime check.

I guess "common_x86_features.h" should be modified to something like this:

#ifdef __SSE4_1__
#define cpu_has_sse4_1 1
#else
#define cpu_has_sse4_1        (_mesa_x86_cpu_features & X86_FEATURE_SSE4_1)
#endif
I was thinking about doing something similar for cpu_has_xmm and
cpu_has_xmm2 for x64.  SSE and SSE2 are required parts of that
instruction set, so they're always there.
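
A rough sketch of what that could look like, assuming the same macro approach as above. The architecture test (__x86_64__ / _M_X64), the flag names and values, and the feature word are assumptions for illustration, not the contents of Mesa's header.

```c
#include <stdint.h>

/* Hypothetical feature bits and feature word, for illustration only. */
#define X86_FEATURE_XMM   (1u << 0)   /* SSE  */
#define X86_FEATURE_XMM2  (1u << 1)   /* SSE2 */
static uint32_t x86_cpu_features;

#if defined(__x86_64__) || defined(_M_X64)
/* SSE and SSE2 are part of the x86-64 baseline, so these are constants. */
#define cpu_has_xmm   1
#define cpu_has_xmm2  1
#else
/* On 32-bit x86 the answer still depends on the CPU found at run time. */
#define cpu_has_xmm   ((x86_cpu_features & X86_FEATURE_XMM) != 0)
#define cpu_has_xmm2  ((x86_cpu_features & X86_FEATURE_XMM2) != 0)
#endif

/* Callers keep the same source form on both targets. */
static int use_sse2_path(void)
{
    if (cpu_has_xmm2)
        return 1;
    return 0;
}
```

On an x86-64 build the condition is the literal 1, so an optimizing compiler is free to drop the branch; on a 32-bit build the test against the feature word remains.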

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


I can come up with a patch implementing the same for SSE, SSE2, SSE3 and SSSE3 if the current approach is fine by you.