Re: [Mesa-dev] [Mesa-stable] [PATCH] swr: fix -march flag for AVX

2016-06-16 Thread Chuck Atkins
After some architecture testing, it looks like we can drop the march
options and just enable the specific instruction sets via:

libswrAVX: -mavx
libswrAVX2: -mavx2 -mfma -mbmi2 -mf16c

The benefit of course is that with these flags swr works on AMD cpu's with
AVX and AVX2 instructions enabled.  The specific march flags can cause
instruction set compatibility mismatch since AMD CPUs that supported AVX
didn't necessarily support all of sandybridge instructions and similar for
AVX2 and haswell.  Regarding the extra options for fma, bmi2, and f16c, all
AMD CPUs that supported AVX2 also supported those as well.

- Chuck

On Mon, Jun 13, 2016 at 1:22 PM, Emil Velikov 
wrote:

> On 13 June 2016 at 15:43, Rowley, Timothy O 
> wrote:
> >
> >> On Jun 12, 2016, at 3:56 AM, Steven Newbury 
> wrote:
> >>
> >> On Fri, 2016-06-10 at 16:05 -0400, Ilia Mirkin wrote:
> >>> On Fri, Jun 10, 2016 at 3:43 PM, Tim Rowley  >>> om> wrote:
> 
>  Previously used core-avx-i was for ivybridge;
>  corei7-avx allows sandybridge.
>  ---
>   src/gallium/drivers/swr/Makefile.am | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
>  diff --git a/src/gallium/drivers/swr/Makefile.am
>  b/src/gallium/drivers/swr/Makefile.am
>  index d211f2e..8156cf2 100644
>  --- a/src/gallium/drivers/swr/Makefile.am
>  +++ b/src/gallium/drivers/swr/Makefile.am
>  @@ -124,7 +124,7 @@ COMMON_LDFLAGS = \
>   lib_LTLIBRARIES = libswrAVX.la libswrAVX2.la
> 
>   libswrAVX_la_CXXFLAGS = \
>  -   -march=core-avx-i \
>  +   -march=corei7-avx \
> >>> Just wondering if it'd be enough to say like
> >>>
> >>> -march=x86_64
> >>> -mavx
> >>>
> >>> and add -mavx2 for libswrAVX2.
> >>>
> >>> I suspect you've iterated through this 20 times and this has some
> >>> shortcoming I'm not thinking of, but figured I'd point it out just in
> >>> case it helps.
> >>>
> >>>   -ilia
> >>>
> >> Maybe I'm the only one who finds it horrible to override -march from
> >> within project build systems.
>
> >> It causes no end of problems with LTO,
> >> and results in objects being built inappropriately for the target as
> >> specified by the builder.
> Can you please elaborate on the exact problem ? Is there link where
> one can read more on the topic ?
>
> > I agree that the way swr is built can be improved and welcome
> ideas/discussion for work on the master branch, but for the moment I’m
> trying to get the 12.0 branch in good shape for swr with non-invasive
> changes (as we’ve tested the current setup pretty extensively).
> >
> Fully agree. Getting something more robust and less evasive for 12.0
> and work for 'the best' in master. Perhaps attribute((target)) perhaps
> folding the multiple binaries into one, or maybe something else ?
>
> >> Why not use function attributes?  Either compile a function
> >> specifically for a particular target, or specifically enable AVX though
> >> "__attribute__ ((__target__ ("avx")))"
> Is this going to produce correct binary if one provides -mno-avx
> and/or alike ? I believe for some CPUs one should (or the compiler
> does it for them) explicitly disable avx but honestly I've never
> looked if cpuid for the said CPUs shows avx as supported.
>
> In the latter case (cpuid listing AVX as supported, while one is
> recommended building with -mno-avx) how do you suggest we should
> handle things ?
>
> >> (ideally with a fall-back
> >> implementation detected at run-time when the instruction set isn't
> >> supported).  Even better, have it also protected at build-time by a
> >> simple check for __AVX__ so the code can be compiled out entirely where
> >> the package builder has specified -march(=native) as a build flag.
> The idea here is to have the binary build explicitly with avx/avx2 (if
> compiler is capable), even when doing generic builds. The picky part
> comes when one has to establish if the user provided one is 'more
> generic' or not (as per your example).
>
> Then again, based on my limited 'testing', building mesa with native
> and LTO brings us little to no improvements. 'Tested' running glxgears
> in small window.
>
> > One problem with the function attribute path is that as far as I know,
> MSVC does not have a similar feature.
> >
> Afaict MSVC has optimise pragma [1], just like GCC. Ideally they'll
> add an equivalent to attribute((target)) soon ?
>
> Even without that, swr (as seen in mesa) does not build MSVC. When
> that cases it should be a matter of factoring the attribute target as
> a macro and leaving it empty for the !GCC build ?
>
> TL;DR: IMHO fix is ok for now, but for the future we might want to
> explode other options.
>
> -Emil
> [1] https://msdn.microsoft.com/en-us/library/chh3fb0k.aspx
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> 

Re: [Mesa-dev] [Mesa-stable] [PATCH] swr: fix -march flag for AVX

2016-06-13 Thread Emil Velikov
On 13 June 2016 at 15:43, Rowley, Timothy O  wrote:
>
>> On Jun 12, 2016, at 3:56 AM, Steven Newbury  wrote:
>>
>> On Fri, 2016-06-10 at 16:05 -0400, Ilia Mirkin wrote:
>>> On Fri, Jun 10, 2016 at 3:43 PM, Tim Rowley >> om> wrote:

 Previously used core-avx-i was for ivybridge;
 corei7-avx allows sandybridge.
 ---
  src/gallium/drivers/swr/Makefile.am | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/gallium/drivers/swr/Makefile.am
 b/src/gallium/drivers/swr/Makefile.am
 index d211f2e..8156cf2 100644
 --- a/src/gallium/drivers/swr/Makefile.am
 +++ b/src/gallium/drivers/swr/Makefile.am
 @@ -124,7 +124,7 @@ COMMON_LDFLAGS = \
  lib_LTLIBRARIES = libswrAVX.la libswrAVX2.la

  libswrAVX_la_CXXFLAGS = \
 -   -march=core-avx-i \
 +   -march=corei7-avx \
>>> Just wondering if it'd be enough to say like
>>>
>>> -march=x86_64
>>> -mavx
>>>
>>> and add -mavx2 for libswrAVX2.
>>>
>>> I suspect you've iterated through this 20 times and this has some
>>> shortcoming I'm not thinking of, but figured I'd point it out just in
>>> case it helps.
>>>
>>>   -ilia
>>>
>> Maybe I'm the only one who finds it horrible to override -march from
>> within project build systems.

>> It causes no end of problems with LTO,
>> and results in objects being built inappropriately for the target as
>> specified by the builder.
Can you please elaborate on the exact problem ? Is there link where
one can read more on the topic ?

> I agree that the way swr is built can be improved and welcome 
> ideas/discussion for work on the master branch, but for the moment I’m trying 
> to get the 12.0 branch in good shape for swr with non-invasive changes (as 
> we’ve tested the current setup pretty extensively).
>
Fully agree. Getting something more robust and less evasive for 12.0
and work for 'the best' in master. Perhaps attribute((target)) perhaps
folding the multiple binaries into one, or maybe something else ?

>> Why not use function attributes?  Either compile a function
>> specifically for a particular target, or specifically enable AVX though
>> "__attribute__ ((__target__ ("avx")))"
Is this going to produce correct binary if one provides -mno-avx
and/or alike ? I believe for some CPUs one should (or the compiler
does it for them) explicitly disable avx but honestly I've never
looked if cpuid for the said CPUs shows avx as supported.

In the latter case (cpuid listing AVX as supported, while one is
recommended building with -mno-avx) how do you suggest we should
handle things ?

>> (ideally with a fall-back
>> implementation detected at run-time when the instruction set isn't
>> supported).  Even better, have it also protected at build-time by a
>> simple check for __AVX__ so the code can be compiled out entirely where
>> the package builder has specified -march(=native) as a build flag.
The idea here is to have the binary build explicitly with avx/avx2 (if
compiler is capable), even when doing generic builds. The picky part
comes when one has to establish if the user provided one is 'more
generic' or not (as per your example).

Then again, based on my limited 'testing', building mesa with native
and LTO brings us little to no improvements. 'Tested' running glxgears
in small window.

> One problem with the function attribute path is that as far as I know, MSVC 
> does not have a similar feature.
>
Afaict MSVC has optimise pragma [1], just like GCC. Ideally they'll
add an equivalent to attribute((target)) soon ?

Even without that, swr (as seen in mesa) does not build MSVC. When
that cases it should be a matter of factoring the attribute target as
a macro and leaving it empty for the !GCC build ?

TL;DR: IMHO fix is ok for now, but for the future we might want to
explode other options.

-Emil
[1] https://msdn.microsoft.com/en-us/library/chh3fb0k.aspx
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] swr: fix -march flag for AVX

2016-06-10 Thread Rowley, Timothy O

> On Jun 10, 2016, at 3:49 PM, Ilia Mirkin  wrote:
> 
> On Fri, Jun 10, 2016 at 4:48 PM, Rowley, Timothy O
>  wrote:
>> 
>>> On Jun 10, 2016, at 3:37 PM, Rowley, Timothy O  
>>> wrote:
>>> 
 
 On Jun 10, 2016, at 3:01 PM, Emil Velikov  wrote:
 
 On 10 June 2016 at 20:43, Tim Rowley  wrote:
> Previously used core-avx-i was for ivybridge;
> corei7-avx allows sandybridge.
 Which GCC version was required by the previous and which by the
 current version ?
>>> 
>>> Both options work back to gcc 4.8.x.  I picked the wrong architecture flag 
>>> when setting up the build system initially.
>>> 
> ---
> src/gallium/drivers/swr/Makefile.am | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/swr/Makefile.am 
> b/src/gallium/drivers/swr/Makefile.am
> index d211f2e..8156cf2 100644
> --- a/src/gallium/drivers/swr/Makefile.am
> +++ b/src/gallium/drivers/swr/Makefile.am
> @@ -124,7 +124,7 @@ COMMON_LDFLAGS = \
> lib_LTLIBRARIES = libswrAVX.la libswrAVX2.la
> 
> libswrAVX_la_CXXFLAGS = \
> -   -march=core-avx-i \
> +   -march=corei7-avx \
 I'm likely missing something but neither one seems listed in the 5.4
 [1] and 6.1 [2] manual. Should we be using one that's officially
 supported ?
>>> 
>>> According to Chuck Atkins, while the older options aren’t documented 
>>> anymore, they still work on gcc 5.x and 6.x.
>>> 
>>> Another possibility which Ilia also mentioned was to use feature flags 
>>> instead which have stayed consistent between gcc versions, i.e. “-mavx” and 
>>> “-mavx2 -mfma -mf16c” respectively for the avx and avx2 configurations.  
>>> Not too opposed to this except that we’ve had extensive testing with the 
>>> -march flags and changing the build configuration during the release 
>>> candidate stage seems a bit late.
>> 
>> Another thought along these lines - since the swr code makes heavy use of 
>> intrinsics, we would like to have the code optimized as best as possible to 
>> the target architecture (or at least, the baseline architecture which 
>> introduces AVX and AVX2 respectively), which is best done with the -march 
>> flags.
> 
> Actually it's best done with -mtune flags. -march, of course, also
> sets mtune. (But you can override it later.)

Ah, true.  Though the documented architecture flags for -mtune have changed 
along with -march between gcc 4.8 and gcc >=5, so we don’t really gain from 
using feature flags and -mtune versus -march.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] swr: fix -march flag for AVX

2016-06-10 Thread Ilia Mirkin
On Fri, Jun 10, 2016 at 4:48 PM, Rowley, Timothy O
 wrote:
>
>> On Jun 10, 2016, at 3:37 PM, Rowley, Timothy O  
>> wrote:
>>
>>>
>>> On Jun 10, 2016, at 3:01 PM, Emil Velikov  wrote:
>>>
>>> On 10 June 2016 at 20:43, Tim Rowley  wrote:
 Previously used core-avx-i was for ivybridge;
 corei7-avx allows sandybridge.
>>> Which GCC version was required by the previous and which by the
>>> current version ?
>>
>> Both options work back to gcc 4.8.x.  I picked the wrong architecture flag 
>> when setting up the build system initially.
>>
 ---
 src/gallium/drivers/swr/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/gallium/drivers/swr/Makefile.am 
 b/src/gallium/drivers/swr/Makefile.am
 index d211f2e..8156cf2 100644
 --- a/src/gallium/drivers/swr/Makefile.am
 +++ b/src/gallium/drivers/swr/Makefile.am
 @@ -124,7 +124,7 @@ COMMON_LDFLAGS = \
 lib_LTLIBRARIES = libswrAVX.la libswrAVX2.la

 libswrAVX_la_CXXFLAGS = \
 -   -march=core-avx-i \
 +   -march=corei7-avx \
>>> I'm likely missing something but neither one seems listed in the 5.4
>>> [1] and 6.1 [2] manual. Should we be using one that's officially
>>> supported ?
>>
>> According to Chuck Atkins, while the older options aren’t documented 
>> anymore, they still work on gcc 5.x and 6.x.
>>
>> Another possibility which Ilia also mentioned was to use feature flags 
>> instead which have stayed consistent between gcc versions, i.e. “-mavx” and 
>> “-mavx2 -mfma -mf16c” respectively for the avx and avx2 configurations.  Not 
>> too opposed to this except that we’ve had extensive testing with the -march 
>> flags and changing the build configuration during the release candidate 
>> stage seems a bit late.
>
> Another thought along these lines - since the swr code makes heavy use of 
> intrinsics, we would like to have the code optimized as best as possible to 
> the target architecture (or at least, the baseline architecture which 
> introduces AVX and AVX2 respectively), which is best done with the -march 
> flags.

Actually it's best done with -mtune flags. -march, of course, also
sets mtune. (But you can override it later.)

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] swr: fix -march flag for AVX

2016-06-10 Thread Rowley, Timothy O

> On Jun 10, 2016, at 3:37 PM, Rowley, Timothy O  
> wrote:
> 
>> 
>> On Jun 10, 2016, at 3:01 PM, Emil Velikov  wrote:
>> 
>> On 10 June 2016 at 20:43, Tim Rowley  wrote:
>>> Previously used core-avx-i was for ivybridge;
>>> corei7-avx allows sandybridge.
>> Which GCC version was required by the previous and which by the
>> current version ?
> 
> Both options work back to gcc 4.8.x.  I picked the wrong architecture flag 
> when setting up the build system initially.
> 
>>> ---
>>> src/gallium/drivers/swr/Makefile.am | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/src/gallium/drivers/swr/Makefile.am 
>>> b/src/gallium/drivers/swr/Makefile.am
>>> index d211f2e..8156cf2 100644
>>> --- a/src/gallium/drivers/swr/Makefile.am
>>> +++ b/src/gallium/drivers/swr/Makefile.am
>>> @@ -124,7 +124,7 @@ COMMON_LDFLAGS = \
>>> lib_LTLIBRARIES = libswrAVX.la libswrAVX2.la
>>> 
>>> libswrAVX_la_CXXFLAGS = \
>>> -   -march=core-avx-i \
>>> +   -march=corei7-avx \
>> I'm likely missing something but neither one seems listed in the 5.4
>> [1] and 6.1 [2] manual. Should we be using one that's officially
>> supported ?
> 
> According to Chuck Atkins, while the older options aren’t documented anymore, 
> they still work on gcc 5.x and 6.x.
> 
> Another possibility which Ilia also mentioned was to use feature flags 
> instead which have stayed consistent between gcc versions, i.e. “-mavx” and 
> “-mavx2 -mfma -mf16c” respectively for the avx and avx2 configurations.  Not 
> too opposed to this except that we’ve had extensive testing with the -march 
> flags and changing the build configuration during the release candidate stage 
> seems a bit late.

Another thought along these lines - since the swr code makes heavy use of 
intrinsics, we would like to have the code optimized as best as possible to the 
target architecture (or at least, the baseline architecture which introduces 
AVX and AVX2 respectively), which is best done with the -march flags.

> 
>> That aside, we might want to ensure that these are in sync with the
>> configure ones. One way it to set a variable, AC_SUBST it in configure
>> and use it in the makefile. Grep for SSE41 for example how we handle
>> it.
>> 
>> Thanks
>> Emil
>> 
>> [1] https://gcc.gnu.org/onlinedocs/gcc-5.4.0/gcc/x86-Options.html#x86-Options
>> [2] https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html#x86-Options
>> ___
>> mesa-stable mailing list
>> mesa-sta...@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-stable

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] swr: fix -march flag for AVX

2016-06-10 Thread Rowley, Timothy O

> On Jun 10, 2016, at 3:01 PM, Emil Velikov  wrote:
> 
> On 10 June 2016 at 20:43, Tim Rowley  wrote:
>> Previously used core-avx-i was for ivybridge;
>> corei7-avx allows sandybridge.
> Which GCC version was required by the previous and which by the
> current version ?

Both options work back to gcc 4.8.x.  I picked the wrong architecture flag when 
setting up the build system initially.

>> ---
>> src/gallium/drivers/swr/Makefile.am | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/src/gallium/drivers/swr/Makefile.am 
>> b/src/gallium/drivers/swr/Makefile.am
>> index d211f2e..8156cf2 100644
>> --- a/src/gallium/drivers/swr/Makefile.am
>> +++ b/src/gallium/drivers/swr/Makefile.am
>> @@ -124,7 +124,7 @@ COMMON_LDFLAGS = \
>> lib_LTLIBRARIES = libswrAVX.la libswrAVX2.la
>> 
>> libswrAVX_la_CXXFLAGS = \
>> -   -march=core-avx-i \
>> +   -march=corei7-avx \
> I'm likely missing something but neither one seems listed in the 5.4
> [1] and 6.1 [2] manual. Should we be using one that's officially
> supported ?

According to Chuck Atkins, while the older options aren’t documented anymore, 
they still work on gcc 5.x and 6.x.

Another possibility which Ilia also mentioned was to use feature flags instead 
which have stayed consistent between gcc versions, i.e. “-mavx” and “-mavx2 
-mfma -mf16c” respectively for the avx and avx2 configurations.  Not too 
opposed to this except that we’ve had extensive testing with the -march flags 
and changing the build configuration during the release candidate stage seems a 
bit late.

-Tim

> That aside, we might want to ensure that these are in sync with the
> configure ones. One way it to set a variable, AC_SUBST it in configure
> and use it in the makefile. Grep for SSE41 for example how we handle
> it.
> 
> Thanks
> Emil
> 
> [1] https://gcc.gnu.org/onlinedocs/gcc-5.4.0/gcc/x86-Options.html#x86-Options
> [2] https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html#x86-Options
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] swr: fix -march flag for AVX

2016-06-10 Thread Emil Velikov
On 10 June 2016 at 20:43, Tim Rowley  wrote:
> Previously used core-avx-i was for ivybridge;
> corei7-avx allows sandybridge.
Which GCC version was required by the previous and which by the
current version ?

> ---
>  src/gallium/drivers/swr/Makefile.am | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/swr/Makefile.am 
> b/src/gallium/drivers/swr/Makefile.am
> index d211f2e..8156cf2 100644
> --- a/src/gallium/drivers/swr/Makefile.am
> +++ b/src/gallium/drivers/swr/Makefile.am
> @@ -124,7 +124,7 @@ COMMON_LDFLAGS = \
>  lib_LTLIBRARIES = libswrAVX.la libswrAVX2.la
>
>  libswrAVX_la_CXXFLAGS = \
> -   -march=core-avx-i \
> +   -march=corei7-avx \
I'm likely missing something but neither one seems listed in the 5.4
[1] and 6.1 [2] manual. Should we be using one that's officially
supported ?

That aside, we might want to ensure that these are in sync with the
configure ones. One way it to set a variable, AC_SUBST it in configure
and use it in the makefile. Grep for SSE41 for example how we handle
it.

Thanks
Emil

[1] https://gcc.gnu.org/onlinedocs/gcc-5.4.0/gcc/x86-Options.html#x86-Options
[2] https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html#x86-Options
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev