Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41
On Saturday, August 12, 2017 1:25:25 AM PDT Mauro Rossi wrote: > On Aug 11, 2017 5:14 PM, "Kenneth Graunke"wrote: > > On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote: > > I do wonder what the target machine is (I haven't seen one that would > > not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling > > back to memcpy makes perfect sense without USE_SSE4_1; > > > > Reviewed-by: Tapani Pälli > > I agree - you really want SSE 4.1 support for decent performance on Atoms. > The MOVNTDQA texture upload path, for example, gave us some really big > performance gains on some ChromeOS workloads. > > So, while we should patch this, it'd still be worth figuring out if it's > possible to enable it in Android-x86. > > --Ken > > > Hi, > > I checked on nougat-x86 build 32bit and the patches are Ok. > > The users of android-x86 have the option to build with silvermont.mk and > later > to have an optimized build, but we currently try to support SSE3 devices. > GNU/Linux distributions for desktop, like their kernel, may have that goal > too. > > Thanks > M. Well, right...but we do guard these with runtime checks. So you should be able to build in optional SSE 4.1 support but still have such a build work on SSE3-only hardware... I'm glad to hear it's an option, at least... signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41
On Aug 11, 2017 5:14 PM, "Kenneth Graunke"wrote: On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote: > I do wonder what the target machine is (I haven't seen one that would > not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling > back to memcpy makes perfect sense without USE_SSE4_1; > > Reviewed-by: Tapani Pälli I agree - you really want SSE 4.1 support for decent performance on Atoms. The MOVNTDQA texture upload path, for example, gave us some really big performance gains on some ChromeOS workloads. So, while we should patch this, it'd still be worth figuring out if it's possible to enable it in Android-x86. --Ken Hi, I checked on nougat-x86 build 32bit and the patches are Ok. The users of android-x86 have the option to build with silvermont.mk and later to have an optimized build, but we currently try to support SSE3 devices. GNU/Linux distributions for desktop, like their kernel, may have that goal too. Thanks M. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41
On Thursday, August 10, 2017 11:50:30 PM PDT Tapani Pälli wrote: > I do wonder what the target machine is (I haven't seen one that would > not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling > back to memcpy makes perfect sense without USE_SSE4_1; > > Reviewed-by: Tapani PälliI agree - you really want SSE 4.1 support for decent performance on Atoms. The MOVNTDQA texture upload path, for example, gave us some really big performance gains on some ChromeOS workloads. So, while we should patch this, it'd still be worth figuring out if it's possible to enable it in Android-x86. --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41
On 11 August 2017 at 12:31, Tapani Pälliwrote: > On 08/11/2017 02:23 PM, Grazvydas Ignotas wrote: >> >> On Fri, Aug 11, 2017 at 8:52 AM, Kenneth Graunke >> wrote: >>> >>> This should hopefully fix build issues on 32-bit Android-x86. >>> >>> Cc: Mauro Rossi >>> Cc: Tapani Pälli >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050 >>> --- >>> src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> Mauro, hopefully this helps? >>> >>> diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c >>> b/src/mesa/drivers/dri/i965/intel_buffer_objects.c >>> index ee591168283..9afd98edb87 100644 >>> --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c >>> +++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c >>> @@ -342,6 +342,7 @@ brw_get_buffer_subdata(struct gl_context *ctx, >>> >>> unsigned int map_flags = MAP_READ; >>> mem_copy_fn memcpy_fn = memcpy; >>> +#ifdef USE_SSE4_1 >> >> >> I don't think anything is defining USE_SSE4_1 in mesa, so you are >> disabling this code for everyone. > > > Right .. it should be USE_SSE41, r-b with that > Hmm you've beat me to it - yes USE_SSE41 it is. With that: Reviewed-by: Emil Velikov Fwiw: The construct can be seen elsewhere - might want to move to a helper at some later stage. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41
On 08/11/2017 02:23 PM, Grazvydas Ignotas wrote: On Fri, Aug 11, 2017 at 8:52 AM, Kenneth Graunkewrote: This should hopefully fix build issues on 32-bit Android-x86. Cc: Mauro Rossi Cc: Tapani Pälli Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050 --- src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++ 1 file changed, 2 insertions(+) Mauro, hopefully this helps? diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c b/src/mesa/drivers/dri/i965/intel_buffer_objects.c index ee591168283..9afd98edb87 100644 --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c +++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c @@ -342,6 +342,7 @@ brw_get_buffer_subdata(struct gl_context *ctx, unsigned int map_flags = MAP_READ; mem_copy_fn memcpy_fn = memcpy; +#ifdef USE_SSE4_1 I don't think anything is defining USE_SSE4_1 in mesa, so you are disabling this code for everyone. Right .. it should be USE_SSE41, r-b with that if (!intel_obj->buffer->cache_coherent && cpu_has_sse4_1) { /* Rather than acquire a new WB mmaping of the buffer object and pull * it into the CPU cache, keep using the WC mmap that we have for writes, @@ -350,6 +351,7 @@ brw_get_buffer_subdata(struct gl_context *ctx, map_flags |= MAP_COHERENT; memcpy_fn = (mem_copy_fn) _mesa_streaming_load_memcpy; } +#endif void *map = brw_bo_map(brw, intel_obj->buffer, map_flags); if (unlikely(!map)) { -- 2.14.0 Gražvydas ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41
On Fri, Aug 11, 2017 at 8:52 AM, Kenneth Graunkewrote: > This should hopefully fix build issues on 32-bit Android-x86. > > Cc: Mauro Rossi > Cc: Tapani Pälli > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050 > --- > src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++ > 1 file changed, 2 insertions(+) > > Mauro, hopefully this helps? > > diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c > b/src/mesa/drivers/dri/i965/intel_buffer_objects.c > index ee591168283..9afd98edb87 100644 > --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c > +++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c > @@ -342,6 +342,7 @@ brw_get_buffer_subdata(struct gl_context *ctx, > > unsigned int map_flags = MAP_READ; > mem_copy_fn memcpy_fn = memcpy; > +#ifdef USE_SSE4_1 I don't think anything is defining USE_SSE4_1 in mesa, so you are disabling this code for everyone. > if (!intel_obj->buffer->cache_coherent && cpu_has_sse4_1) { >/* Rather than acquire a new WB mmaping of the buffer object and pull > * it into the CPU cache, keep using the WC mmap that we have for > writes, > @@ -350,6 +351,7 @@ brw_get_buffer_subdata(struct gl_context *ctx, >map_flags |= MAP_COHERENT; >memcpy_fn = (mem_copy_fn) _mesa_streaming_load_memcpy; > } > +#endif > > void *map = brw_bo_map(brw, intel_obj->buffer, map_flags); > if (unlikely(!map)) { > -- > 2.14.0 Gražvydas ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41
Quoting Kenneth Graunke (2017-08-11 06:52:47) > This should hopefully fix build issues on 32-bit Android-x86. > > Cc: Mauro Rossi> Cc: Tapani Pälli > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050 > --- > src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++ > 1 file changed, 2 insertions(+) > > Mauro, hopefully this helps? Which symbol isn't there? I presume cpu_has_sse4_1 is defined to 0 if !USE_SSE4_1 and so we could do #ifndef USE_SSE4_1 static inline void *_mesa_streaming_load_memcpy(void *dst, void *src, size_t size) { unreachable("Illegal CPU"); } #endif The compiler should be quiet happy to do the dead-code elimination for one less ifdef in the code. Worth it? -Chris ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41
I do wonder what the target machine is (I haven't seen one that would not have ARCH_X86_HAVE_SSE4_1 true, both 32bit and 64bit) but falling back to memcpy makes perfect sense without USE_SSE4_1; Reviewed-by: Tapani PälliOn 08/11/2017 08:52 AM, Kenneth Graunke wrote: This should hopefully fix build issues on 32-bit Android-x86. Cc: Mauro Rossi Cc: Tapani Pälli Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050 --- src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 ++ 1 file changed, 2 insertions(+) Mauro, hopefully this helps? diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c b/src/mesa/drivers/dri/i965/intel_buffer_objects.c index ee591168283..9afd98edb87 100644 --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c +++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c @@ -342,6 +342,7 @@ brw_get_buffer_subdata(struct gl_context *ctx, unsigned int map_flags = MAP_READ; mem_copy_fn memcpy_fn = memcpy; +#ifdef USE_SSE4_1 if (!intel_obj->buffer->cache_coherent && cpu_has_sse4_1) { /* Rather than acquire a new WB mmaping of the buffer object and pull * it into the CPU cache, keep using the WC mmap that we have for writes, @@ -350,6 +351,7 @@ brw_get_buffer_subdata(struct gl_context *ctx, map_flags |= MAP_COHERENT; memcpy_fn = (mem_copy_fn) _mesa_streaming_load_memcpy; } +#endif void *map = brw_bo_map(brw, intel_obj->buffer, map_flags); if (unlikely(!map)) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev