On Wed, Dec 30, 2015 at 5:41 PM, Roland Scheidegger <[email protected]> wrote:
> Am 30.12.2015 um 10:46 schrieb Oded Gabbay:
>> On Wed, Dec 30, 2015 at 1:11 AM, Roland Scheidegger <[email protected]>
>> wrote:
>>>
>>> So, if I see that right, you will automatically generate binaries using
>>> power8 instructions if compiled on power8 capable box, which then won't
>>> run on boxes not supporting power8? Is that really what you want?
>>> Maybe some runtime detection would be a good idea (though I don't know
>>> if anyone cares about power7)?
>>
>> The problem is I don't think I can eliminate the build time check
>> (although I would very much like to) because I need:
>> 1. To pass a special flag to the GCC compiler: -mpower8-vector
>> 2. To define _ARCH_PWR8 so GCC will include the newer intrinsic
>>
>> Without those two things, I won't be able to use vec_vgbbd which I
>> need to implement the _mm_movemask_epi8 efficiently, and without that,
>> all this patch series can be thrown out the window. The emulation of
>> _mm_movemask_epi8 using regular instructions is just horrible.
>>
>> You are correct that once you build a binary with this flag on power8
>> machine, that binary won't run on power7 machine. You get "cannot
>> execute binary file"
>>
>> Unfortunately, I don't see a way around this because even if I
>> condition the use of vec_vgbbd on a runtime check/define, the library
>> still won't be executable because it was built with -mpower8-vector.
>>
>> Having said that, because I *assume* IBM right now mostly cares about
>> Linux running on POWER8 with little-endian, I think it is a fair
>> compromise.
>
> Note I don't have anything against a build time check. My concern here
> is something along the lines of unsuspecting distros shipping binaries
> which won't work, as it looks to me like this will get picked up
> automatically. That is different to how for instance sse41 is handled.
> That is I believe this should only get enabled if someone has specified
> some -mcpu=power8 or whatever flag explicitly somewhere already.
>
> Roland

I understand, and I share your concern.

Maybe we should add "--disable-pwr8-inst" to mesa's configure? If that
flag is given, configure would disable the POWER8-optimized code: it
would not add _ARCH_PWR8 to the defines and would not add
-mpower8-vector to the gcc flags.

What do you think?

     Oded

>
>>
>>      Oded
>>
>>> So far we didn't bother with that for SSE
>>> but it has to be said SSE2 is a really low bar (and the manual assembly
>>> stuff doesn't use anything more advanced, even though clearly things
>>> like the emulated mm_mullo_epi32 are suboptimal if your cpu supports
>>> sse41). And even then on non-x86 you actually might not get
>>> PIPE_ARCH_SSE if you didn't set gcc's compile flags accordingly.
>>>
>>> Roland
>>>
>>>
>>> Am 29.12.2015 um 17:12 schrieb Oded Gabbay:
>>>> To determine if we could use special POWER8 assembly directives, we first
>>>> need to detect whether we are running on POWER8 architecture. This patch
>>>> adds this detection to configure.ac and adds the necessary compilation
>>>> flags accordingly.
>>>>
>>>> Signed-off-by: Oded Gabbay <[email protected]>
>>>> ---
>>>>  configure.ac | 30 ++++++++++++++++++++++++++++++
>>>>  1 file changed, 30 insertions(+)
>>>>
>>>> diff --git a/configure.ac b/configure.ac
>>>> index f8a70be..1acd47e 100644
>>>> --- a/configure.ac
>>>> +++ b/configure.ac
>>>> @@ -396,6 +396,36 @@ fi
>>>>  AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
>>>>  AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
>>>>
>>>> +dnl Check for POWER8 Architecture
>>>> +PWR8_CFLAGS="-mpower8-vector"
>>>> +have_pwr8_intrinsics=no
>>>> +AC_MSG_CHECKING(whether we are running on POWER8 Architecture)
>>>> +save_CFLAGS=$CFLAGS
>>>> +CFLAGS="$PWR8_CFLAGS $CFLAGS"
>>>> +AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>>>> +#if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 &&
>>>> __GNUC_MINOR__ < 8))
>>>> +#error "Need GCC >= 4.8 for sane POWER8 support"
>>>> +#endif
>>>> +#include <altivec.h>
>>>> +int main () {
>>>> +    vector unsigned char r;
>>>> +    vector unsigned int v = vec_splat_u32 (1);
>>>> +    r = __builtin_vec_vgbbd ((vector unsigned char) v);
>>>> +    return 0;
>>>> +}]])], have_pwr8_intrinsics=yes)
>>>> +CFLAGS=$save_CFLAGS
>>>> +
>>>> +if test $have_pwr8_intrinsics = yes ; then
>>>> +   DEFINES="$DEFINES -D_ARCH_PWR8"
>>>> +   CXXFLAGS="$CXXFLAGS $PWR8_CFLAGS"
>>>> +   CFLAGS="$CFLAGS $PWR8_CFLAGS"
>>>> +else
>>>> +   PWR8_CFLAGS=
>>>> +fi
>>>> +
>>>> +AC_MSG_RESULT($have_pwr8_intrinsics)
>>>> +AC_SUBST([PWR8_CFLAGS], $PWR8_CFLAGS)
>>>> +
>>>>  dnl Can't have static and shared libraries, default to static if user
>>>>  dnl explicitly requested. If both disabled, set to static since shared
>>>>  dnl was explicitly requested.
>>>>
>>>
>
_______________________________________________
mesa-dev mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
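
To make the "--disable-pwr8-inst" idea a bit more concrete, here is a
rough, untested sketch of how such a switch might wrap the check from
the patch. The option name is the one proposed in this mail; the
variable name enable_pwr8 is only a placeholder, and the default of
"yes" is an assumption (an opt-in default would address the distro
concern more strictly). PWR8_CFLAGS is assumed to be set to
"-mpower8-vector" beforehand, as in the patch.

```m4
dnl Sketch only, untested: opt-out switch for the POWER8 code paths.
dnl "enable_pwr8" is a placeholder variable name, not part of the patch.
AC_ARG_ENABLE([pwr8-inst],
    [AS_HELP_STRING([--disable-pwr8-inst],
        [do not use POWER8 instructions, even if the compiler supports them])],
    [enable_pwr8=$enableval],
    [enable_pwr8=yes])

have_pwr8_intrinsics=no
if test "x$enable_pwr8" = xyes; then
    dnl The AC_COMPILE_IFELSE check from the patch would go here and
    dnl set have_pwr8_intrinsics=yes on success.
    :
fi

if test "x$have_pwr8_intrinsics" = xyes; then
    DEFINES="$DEFINES -D_ARCH_PWR8"
    CXXFLAGS="$CXXFLAGS $PWR8_CFLAGS"
    CFLAGS="$CFLAGS $PWR8_CFLAGS"
else
    dnl Disabled or unsupported: nothing POWER8-specific reaches the build.
    PWR8_CFLAGS=
fi
```

With this shape, running configure with --disable-pwr8-inst skips the
compile test entirely, so neither -D_ARCH_PWR8 nor -mpower8-vector ends
up in the flags, and the resulting binaries should not depend on POWER8.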
