Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-12 Thread Kenneth Graunke
On Saturday, August 12, 2017 1:25:25 AM PDT Mauro Rossi wrote:
> On Aug 11, 2017 5:14 PM, "Kenneth Graunke"  wrote:
> 
> On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote:
> > I do wonder what the target machine is (I haven't seen one that would
> > not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling
> > back to memcpy makes perfect sense without USE_SSE4_1;
> >
> > Reviewed-by: Tapani Pälli 
> 
> I agree - you really want SSE 4.1 support for decent performance on Atoms.
> The MOVNTDQA texture upload path, for example, gave us some really big
> performance gains on some ChromeOS workloads.
> 
> So, while we should patch this, it'd still be worth figuring out if it's
> possible to enable it in Android-x86.
> 
> --Ken
> 
> 
> Hi,
> 
> I checked on nougat-x86 build 32bit and the patches are Ok.
> 
> The users of android-x86 have the option to build with silvermont.mk and
> later
> to have an optimized build, but we currently try to support SSE3 devices.
> GNU/Linux distributions for desktop, like their kernel, may have that goal
> too.
> 
> Thanks
> M.

Well, right...but we do guard these with runtime checks.  So you should
be able to build in optional SSE 4.1 support but still have such a build
work on SSE3-only hardware...

I'm glad to hear it's an option, at least...

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-12 Thread Mauro Rossi
On Aug 11, 2017 5:14 PM, "Kenneth Graunke"  wrote:

On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote:
> I do wonder what the target machine is (I haven't seen one that would
> not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling
> back to memcpy makes perfect sense without USE_SSE4_1;
>
> Reviewed-by: Tapani Pälli 

I agree - you really want SSE 4.1 support for decent performance on Atoms.
The MOVNTDQA texture upload path, for example, gave us some really big
performance gains on some ChromeOS workloads.

So, while we should patch this, it'd still be worth figuring out if it's
possible to enable it in Android-x86.

--Ken


Hi,

I checked on nougat-x86 build 32bit and the patches are Ok.

The users of android-x86 have the option to build with silvermont.mk and
later
to have an optimized build, but we currently try to support SSE3 devices.
GNU/Linux distributions for desktop, like their kernel, may have that goal
too.

Thanks
M.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-11 Thread Kenneth Graunke
On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote:
> I do wonder what the target machine is (I haven't seen one that would 
> not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling 
> back to memcpy makes perfect sense without USE_SSE4_1;
> 
> Reviewed-by: Tapani Pälli 

I agree - you really want SSE 4.1 support for decent performance on Atoms.
The MOVNTDQA texture upload path, for example, gave us some really big
performance gains on some ChromeOS workloads.

So, while we should patch this, it'd still be worth figuring out if it's
possible to enable it in Android-x86.

--Ken

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-11 Thread Emil Velikov
On 11 August 2017 at 12:31, Tapani Pälli  wrote:
> On 08/11/2017 02:23 PM, Grazvydas Ignotas wrote:
>>
>> On Fri, Aug 11, 2017 at 8:52 AM, Kenneth Graunke 
>> wrote:
>>>
>>> This should hopefully fix build issues on 32-bit Android-x86.
>>>
>>> Cc: Mauro Rossi 
>>> Cc: Tapani Pälli 
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050
>>> ---
>>>   src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> Mauro, hopefully this helps?
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
>>> b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
>>> index ee591168283..9afd98edb87 100644
>>> --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
>>> +++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
>>> @@ -342,6 +342,7 @@ brw_get_buffer_subdata(struct gl_context *ctx,
>>>
>>>  unsigned int map_flags = MAP_READ;
>>>  mem_copy_fn memcpy_fn = memcpy;
>>> +#ifdef USE_SSE4_1
>>
>>
>> I don't think anything is defining USE_SSE4_1 in mesa, so you are
>> disabling this code for everyone.
>
>
> Right .. it should be USE_SSE41, r-b with that
>
Hmm you've beat me to it - yes USE_SSE41 it is. With that:
Reviewed-by: Emil Velikov 

Fwiw: The construct can be seen elsewhere - might want to move to a
helper at some later stage.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-11 Thread Tapani Pälli



On 08/11/2017 02:23 PM, Grazvydas Ignotas wrote:

On Fri, Aug 11, 2017 at 8:52 AM, Kenneth Graunke  wrote:

This should hopefully fix build issues on 32-bit Android-x86.

Cc: Mauro Rossi 
Cc: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050
---
  src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++
  1 file changed, 2 insertions(+)

Mauro, hopefully this helps?

diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c 
b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
index ee591168283..9afd98edb87 100644
--- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
+++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
@@ -342,6 +342,7 @@ brw_get_buffer_subdata(struct gl_context *ctx,

 unsigned int map_flags = MAP_READ;
 mem_copy_fn memcpy_fn = memcpy;
+#ifdef USE_SSE4_1


I don't think anything is defining USE_SSE4_1 in mesa, so you are
disabling this code for everyone.


Right .. it should be USE_SSE41, r-b with that



 if (!intel_obj->buffer->cache_coherent && cpu_has_sse4_1) {
/* Rather than acquire a new WB mmaping of the buffer object and pull
 * it into the CPU cache, keep using the WC mmap that we have for 
writes,
@@ -350,6 +351,7 @@ brw_get_buffer_subdata(struct gl_context *ctx,
map_flags |= MAP_COHERENT;
memcpy_fn = (mem_copy_fn) _mesa_streaming_load_memcpy;
 }
+#endif

 void *map = brw_bo_map(brw, intel_obj->buffer, map_flags);
 if (unlikely(!map)) {
--
2.14.0


Gražvydas
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-11 Thread Grazvydas Ignotas
On Fri, Aug 11, 2017 at 8:52 AM, Kenneth Graunke  wrote:
> This should hopefully fix build issues on 32-bit Android-x86.
>
> Cc: Mauro Rossi 
> Cc: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050
> ---
>  src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> Mauro, hopefully this helps?
>
> diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c 
> b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
> index ee591168283..9afd98edb87 100644
> --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
> +++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
> @@ -342,6 +342,7 @@ brw_get_buffer_subdata(struct gl_context *ctx,
>
> unsigned int map_flags = MAP_READ;
> mem_copy_fn memcpy_fn = memcpy;
> +#ifdef USE_SSE4_1

I don't think anything is defining USE_SSE4_1 in mesa, so you are
disabling this code for everyone.

> if (!intel_obj->buffer->cache_coherent && cpu_has_sse4_1) {
>/* Rather than acquire a new WB mmaping of the buffer object and pull
> * it into the CPU cache, keep using the WC mmap that we have for 
> writes,
> @@ -350,6 +351,7 @@ brw_get_buffer_subdata(struct gl_context *ctx,
>map_flags |= MAP_COHERENT;
>memcpy_fn = (mem_copy_fn) _mesa_streaming_load_memcpy;
> }
> +#endif
>
> void *map = brw_bo_map(brw, intel_obj->buffer, map_flags);
> if (unlikely(!map)) {
> --
> 2.14.0

Gražvydas
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-11 Thread Chris Wilson
Quoting Kenneth Graunke (2017-08-11 06:52:47)
> This should hopefully fix build issues on 32-bit Android-x86.
> 
> Cc: Mauro Rossi 
> Cc: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050
> ---
>  src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> Mauro, hopefully this helps?

Which symbol isn't there? I presume cpu_has_sse4_1 is defined to 0 if
!USE_SSE4_1 and so we could do

#ifndef USE_SSE4_1
static inline void *_mesa_streaming_load_memcpy(void *dst, void *src, size_t 
size)
{
unreachable("Illegal CPU");
}
#endif

The compiler should be quiet happy to do the dead-code elimination for
one less ifdef in the code. Worth it?
-Chris
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41

2017-08-11 Thread Tapani Pälli
I do wonder what the target machine is (I haven't seen one that would 
not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling 
back to memcpy makes perfect sense without USE_SSE4_1;


Reviewed-by: Tapani Pälli 

On 08/11/2017 08:52 AM, Kenneth Graunke wrote:

This should hopefully fix build issues on 32-bit Android-x86.

Cc: Mauro Rossi 
Cc: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050
---
  src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++
  1 file changed, 2 insertions(+)

Mauro, hopefully this helps?

diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c 
b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
index ee591168283..9afd98edb87 100644
--- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
+++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
@@ -342,6 +342,7 @@ brw_get_buffer_subdata(struct gl_context *ctx,
  
 unsigned int map_flags = MAP_READ;

 mem_copy_fn memcpy_fn = memcpy;
+#ifdef USE_SSE4_1
 if (!intel_obj->buffer->cache_coherent && cpu_has_sse4_1) {
/* Rather than acquire a new WB mmaping of the buffer object and pull
 * it into the CPU cache, keep using the WC mmap that we have for 
writes,
@@ -350,6 +351,7 @@ brw_get_buffer_subdata(struct gl_context *ctx,
map_flags |= MAP_COHERENT;
memcpy_fn = (mem_copy_fn) _mesa_streaming_load_memcpy;
 }
+#endif
  
 void *map = brw_bo_map(brw, intel_obj->buffer, map_flags);

 if (unlikely(!map)) {


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev