[Mesa-dev] [Bug 108263] glGetTexImage with PBO is not accelerated on Gallium

2018-10-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108263

--- Comment #1 from Andrew Wesie  ---
Created attachment 141926
  --> https://bugs.freedesktop.org/attachment.cgi?id=141926&action=edit
Patch for piglit for PBO texture download testcase

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 108263] glGetTexImage with PBO is not accelerated on Gallium

2018-10-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108263

            Bug ID: 108263
           Summary: glGetTexImage with PBO is not accelerated on Gallium
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: enhancement
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: awe...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 141925
  --> https://bugs.freedesktop.org/attachment.cgi?id=141925&action=edit
Mesa patch for accelerated PBO texture downloads

In May 2016, a patchset
(https://lists.freedesktop.org/archives/mesa-dev/2016-May/117294.html) added
acceleration for glReadPixels PBO downloads. Support for glGetTexImage and
friends was left as future work.

As part of my efforts to find and fix performance hot spots in Wine's DirectX
layer, I submitted patches to support texture downloads using PBOs in Wine.
Unfortunately, on Mesa this does not improve performance, because glGetTexImage
and friends are still not accelerated for PBO downloads.

It would be great if Mesa could add support for accelerated texture downloads
using PBOs. In order to facilitate this, I put together a patch based on
glReadPixels and a test case in piglit. I am not familiar with the Mesa code or
conventions, but the patch passes the test case so it is probably close to
correct.



Re: [Mesa-dev] Blits with huge widths/heights

2018-10-06 Thread Marek Olšák
We could certainly increase the width and height to 32 bits for
pipe_blit_info, not pipe_box.

Marek
On Sat, Oct 6, 2018 at 4:01 PM Ilia Mirkin  wrote:
>
> There's a WebGL test here
>
> https://www.khronos.org/registry/webgl/sdk/tests/conformance2/rendering/blitframebuffer-size-overflow.html
>
> which does a fairly ridiculous blit:
>
> srcX0=-1, srcY0=-1, srcX1=2147483647, srcY1=2147483647,
> dstX0=-1, dstY0=-1, dstX1=2147483647, dstY1=2147483647
>
> The underlying src and dst textures are 8x8. We hit some precision
> issues in _mesa_clip_blit, but after fixing those, I run into the
> st_BlitFramebuffer logic which has this comment:
>
>    /* NOTE: If the src and dst dimensions don't match, we cannot simply adjust
>     * the integer coordinates to account for clipping (or scissors) because that
>     * would make us cut off fractional parts, affecting the result of the blit.
>     */
>
> That means that the height gets set to srcY0 - srcY1, which obviously
> overflows the int16_t.
>
> Is there any reasonable way of clipping it down without running into
> the fraction parts issue mentioned in the comment? Otherwise we have
> to go back to 32-bit height. (Not 100% sure that most drivers would do
> anything particularly reasonable here either...)
>
> Cheers,
>
>   -ilia


[Mesa-dev] Blits with huge widths/heights

2018-10-06 Thread Ilia Mirkin
There's a WebGL test here

https://www.khronos.org/registry/webgl/sdk/tests/conformance2/rendering/blitframebuffer-size-overflow.html

which does a fairly ridiculous blit:

srcX0=-1, srcY0=-1, srcX1=2147483647, srcY1=2147483647,
dstX0=-1, dstY0=-1, dstX1=2147483647, dstY1=2147483647

The underlying src and dst textures are 8x8. We hit some precision
issues in _mesa_clip_blit, but after fixing those, I run into the
st_BlitFramebuffer logic which has this comment:

   /* NOTE: If the src and dst dimensions don't match, we cannot simply adjust
    * the integer coordinates to account for clipping (or scissors) because that
    * would make us cut off fractional parts, affecting the result of the blit.
    */

That means that the height gets set to srcY0 - srcY1, which obviously
overflows the int16_t.
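
To make the arithmetic concrete, here is a minimal C sketch using the
coordinates from the test. The helpers blit_extent(), fits_int16() and
fits_int32() are hypothetical names made up for this illustration, not Mesa
functions, and the sketch assumes the extent is the difference of the two
coordinates, computed in 64 bits so the subtraction itself cannot overflow:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extent of a blit along one axis, computed in 64 bits
 * so that the subtraction cannot overflow for any pair of int32_t inputs. */
static int64_t
blit_extent(int32_t c0, int32_t c1)
{
   return (int64_t)c1 - (int64_t)c0;
}

/* Hypothetical helper: does the value fit in a signed 16-bit field? */
static bool
fits_int16(int64_t v)
{
   return v >= INT16_MIN && v <= INT16_MAX;
}

/* Hypothetical helper: does the value fit in a signed 32-bit field? */
static bool
fits_int32(int64_t v)
{
   return v >= INT32_MIN && v <= INT32_MAX;
}

/* Worked example with the coordinates from the WebGL test. */
static void
blit_extent_example(void)
{
   int64_t h = blit_extent(-1, 2147483647);

   assert(h == 2147483648LL);
   assert(!fits_int16(h));
   assert(!fits_int32(h));

   /* The extent of the underlying 8x8 texture fits easily. */
   assert(fits_int16(blit_extent(0, 8)));
}
```

Under these assumptions the unclipped extent is 2147483648, which is outside
the int16_t range and is also one more than INT32_MAX, so it does not fit in a
signed 32-bit field either.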

Is there any reasonable way of clipping it down without running into
the fraction parts issue mentioned in the comment? Otherwise we have
to go back to 32-bit height. (Not 100% sure that most drivers would do
anything particularly reasonable here either...)
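
The fractional-parts issue from the comment can be illustrated with a
one-dimensional C sketch. It assumes a plain linear mapping between the
destination and source ranges; struct axis and clip_axis() are hypothetical
names for this illustration and do not correspond to Mesa code:

```c
#include <assert.h>

/* Hypothetical 1D blit range: [src0, src1) is stretched onto [dst0, dst1). */
struct axis {
   double src0, src1;
   double dst0, dst1;
};

/* Clip the destination range to [dst_min, dst_max) and move the source
 * coordinates by the clipped amount scaled by the src/dst ratio, so that the
 * mapping between the two ranges stays the same. Assumes dst0 < dst1. */
static void
clip_axis(struct axis *a, double dst_min, double dst_max)
{
   assert(a->dst0 < a->dst1);
   const double scale = (a->src1 - a->src0) / (a->dst1 - a->dst0);

   if (a->dst0 < dst_min) {
      a->src0 += (dst_min - a->dst0) * scale;
      a->dst0 = dst_min;
   }
   if (a->dst1 > dst_max) {
      a->src1 -= (a->dst1 - dst_max) * scale;
      a->dst1 = dst_max;
   }
}

/* Two worked cases: a 3:2 stretch, and the equal-sized ranges of the test. */
static void
clip_axis_example(void)
{
   struct axis stretched = { 0.0, 3.0, -1.0, 1.0 };
   clip_axis(&stretched, 0.0, 8.0);
   assert(stretched.src0 == 1.5);   /* not representable as an integer */

   struct axis same_size = { -1.0, 2147483647.0, -1.0, 2147483647.0 };
   clip_axis(&same_size, 0.0, 8.0);
   assert(same_size.src0 == 0.0 && same_size.src1 == 8.0);
}
```

In the first case, clipping one destination pixel moves the source start to
1.5, which an integer box cannot express. In the second case the scale is 1 and
the clipped coordinates stay integral. Whether the same reasoning carries over
to the actual st_BlitFramebuffer paths is not something this sketch establishes.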

Cheers,

  -ilia


Re: [Mesa-dev] [PATCH 05/12] util: Add power-of-two divisor support to compute_fast_udiv_info

2018-10-06 Thread Jason Ekstrand

On October 6, 2018 13:19:15 Marek Olšák  wrote:


On Sat, Oct 6, 2018 at 12:11 AM Jason Ekstrand  wrote:


From: Marek Olšák 

---
src/util/fast_idiv_by_const.c | 21 +
src/util/fast_idiv_by_const.h |  5 +++--
2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/util/fast_idiv_by_const.c b/src/util/fast_idiv_by_const.c
index 65a9e640789..7b93316268c 100644
--- a/src/util/fast_idiv_by_const.c
+++ b/src/util/fast_idiv_by_const.c
@@ -52,6 +52,27 @@ util_compute_fast_udiv_info(uint64_t D, unsigned num_bits, unsigned UINT_BITS)

/* The eventual result */
struct util_fast_udiv_info result;

+   if (util_is_power_of_two_or_zero64(D)) {
+  unsigned div_shift = util_logbase2_64(D);
+
+  if (div_shift) {
+ /* Dividing by a power of two. */
+ result.multiplier = 1ull << (UINT_BITS - div_shift);
+ result.pre_shift = 0;
+ result.post_shift = 0;
+ result.increment = 0;
+ return result;
+  } else {
+ /* Dividing by 1. */
+ /* Assuming: floor((num + 1) * (2^32 - 1) / 2^32) = num */
+ result.multiplier = UINT_BITS == 64 ? UINT64_MAX :
+   (1ull << UINT_BITS) - 1;
+ result.pre_shift = 0;
+ result.post_shift = 0;
+ result.increment = 1;
+ return result;
+  }
+   }

/* The extra shift implicit in the difference between UINT_BITS and num_bits
*/
diff --git a/src/util/fast_idiv_by_const.h b/src/util/fast_idiv_by_const.h
index 231311f84be..3363fb9ee71 100644
--- a/src/util/fast_idiv_by_const.h
+++ b/src/util/fast_idiv_by_const.h
@@ -98,8 +98,9 @@ util_compute_fast_sdiv_info(int64_t D, unsigned SINT_BITS);
*   emit("result >>>= UINT_BITS")
*   if m.post_shift > 0: emit("result >>>= m.post_shift")
*
- * The shifts by UINT_BITS may be "free" if the high half of the full multiply
- * is put in a separate register.
+ * This second version works even if D is a power of two.  The shifts by


> I think you meant to say that the second version works even if D is 1.

Correct. I'll fix that.

--Jason





[Mesa-dev] [Mesa-stable] [PATCH] Scons: swr: fix LLVM >= 7 build

2018-10-06 Thread Liviu Prodea
I am more used to the merge/pull-request model of sending patches, so I am
going to link the patch instead of inlining it:

https://raw.githubusercontent.com/pal1000/mesa-dist-win/master/patches/upstream/scons-swr-llvm7.patch

This patch depends on series 50108 to be effective, but it can safely be merged
either before or after it.

Unfortunately, this patch does not help osmesa link with swr when using
LLVM >= 7, which is also an issue that series 50108 leaves unaddressed.

If you build both swr and osmesa together with SCons against LLVM 7.0, you get
the output below even with this patch applied; without it there would be many
more unresolved symbols. The patch eliminates 41 unresolved symbols, which
results in a successful build when osmesa is not built.

 

Generating code
Finished generating code
Finished generating code
Finished generating code
  Archiving build\windows-x86_64\gallium\drivers\swr\swr.lib ...
  Linking build\windows-x86_64\gallium\targets\osmesa\osmesa.dll ...
  Linking build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.dll ...
   Creating library build\windows-x86_64\gallium\targets\osmesa\osmesa.lib and 
object build\windows-x86_64\gallium\targets\osmesa\osmesa.exp
   Creating library build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.lib 
and object build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.exp
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::ModulePass * __cdecl llvm::createPGOInstrumentationUseLegacyPass(class 
llvm::StringRef)" 
(?createPGOInstrumentationUseLegacyPass@llvm@@YAPEAVModulePass@1@VStringRef@1@@Z)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::ModulePass * __cdecl llvm::createPGOInstrumentationGenLegacyPass(void)" 
(?createPGOInstrumentationGenLegacyPass@llvm@@YAPEAVModulePass@1@XZ)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::FunctionPass * __cdecl llvm::createPGOMemOPSizeOptLegacyPass(void)" 
(?createPGOMemOPSizeOptLegacyPass@llvm@@YAPEAVFunctionPass@1@XZ)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::ModulePass * __cdecl 
llvm::createPGOIndirectCallPromotionLegacyPass(bool,bool)" 
(?createPGOIndirectCallPromotionLegacyPass@llvm@@YAPEAVModulePass@1@_N0@Z)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::ModulePass * __cdecl llvm::createInstrProfilingLegacyPass(struct 
llvm::InstrProfOptions const &)" 
(?createInstrProfilingLegacyPass@llvm@@YAPEAVModulePass@1@AEBUInstrProfOptions@1@@Z)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "public: 
static struct llvm::GCOVOptions __cdecl llvm::GCOVOptions::getDefault(void)" 
(?getDefault@GCOVOptions@llvm@@SA?AU12@XZ)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::FunctionPass * __cdecl llvm::createBoundsCheckingLegacyPass(void)" 
(?createBoundsCheckingLegacyPass@llvm@@YAPEAVFunctionPass@1@XZ)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::ModulePass * __cdecl llvm::createGCOVProfilerPass(struct 
llvm::GCOVOptions const &)" 
(?createGCOVProfilerPass@llvm@@YAPEAVModulePass@1@AEBUGCOVOptions@1@@Z)
build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.dll : fatal error 
LNK1120: 8 unresolved externals
scons: *** [build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.dll] Error 
1120
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::ModulePass * __cdecl llvm::createPGOInstrumentationUseLegacyPass(class 
llvm::StringRef)" 
(?createPGOInstrumentationUseLegacyPass@llvm@@YAPEAVModulePass@1@VStringRef@1@@Z)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::ModulePass * __cdecl llvm::createPGOInstrumentationGenLegacyPass(void)" 
(?createPGOInstrumentationGenLegacyPass@llvm@@YAPEAVModulePass@1@XZ)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::FunctionPass * __cdecl llvm::createPGOMemOPSizeOptLegacyPass(void)" 
(?createPGOMemOPSizeOptLegacyPass@llvm@@YAPEAVFunctionPass@1@XZ)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::ModulePass * __cdecl 
llvm::createPGOIndirectCallPromotionLegacyPass(bool,bool)" 
(?createPGOIndirectCallPromotionLegacyPass@llvm@@YAPEAVModulePass@1@_N0@Z)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::ModulePass * __cdecl llvm::createInstrProfilingLegacyPass(struct 
llvm::InstrProfOptions const &)" 
(?createInstrProfilingLegacyPass@llvm@@YAPEAVModulePass@1@AEBUInstrProfOptions@1@@Z)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "public: 
static struct llvm::GCOVOptions __cdecl llvm::GCOVOptions::getDefault(void)" 
(?getDefault@GCOVOptions@llvm@@SA?AU12@XZ)
swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class 
llvm::FunctionPass * __cdecl llvm::createBoundsCheckingLegacyPass(void)" 
(?createBoundsCheckingLegacyPass@llvm@@YAPEAVFunctionPass@1@XZ)

Re: [Mesa-dev] [PATCH 06/12] util: Add tests for fast integer division by constants

2018-10-06 Thread Marek Olšák
With my comments addressed, patches 2 - 6 are:

Reviewed-by: Marek Olšák 

Since I will need to compute the division terms during draw calls, I
may need to switch the math to uint32_t for my case (e.g. via a C++
template).

Marek

On Sat, Oct 6, 2018 at 12:11 AM Jason Ekstrand  wrote:
>
> While I generally trust ridiculousfish to have done his homework, we've
> made some adjustments to suit the needs of Mesa and it'd be good to
> test those.  Also, there's no better place than unit tests to clearly
> document the different edge cases of the different methods.
> ---
>  configure.ac  |   1 +
>  src/util/Makefile.am  |   3 +-
>  src/util/meson.build  |   1 +
>  src/util/tests/fast_idiv_by_const/Makefile.am |  43 ++
>  .../fast_idiv_by_const_test.cpp   | 472 ++
>  src/util/tests/fast_idiv_by_const/meson.build |  30 ++
>  6 files changed, 549 insertions(+), 1 deletion(-)
>  create mode 100644 src/util/tests/fast_idiv_by_const/Makefile.am
>  create mode 100644 src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp
>  create mode 100644 src/util/tests/fast_idiv_by_const/meson.build
>
> diff --git a/configure.ac b/configure.ac
> index 34689826c98..7b0b2b20ba2 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -3198,6 +3198,7 @@ AC_CONFIG_FILES([Makefile
>   src/util/tests/hash_table/Makefile
>   src/util/tests/set/Makefile
>   src/util/tests/string_buffer/Makefile
> + src/util/tests/uint_inverse/Makefile
>   src/util/tests/vma/Makefile
>   src/util/xmlpool/Makefile
>   src/vulkan/Makefile])
> diff --git a/src/util/Makefile.am b/src/util/Makefile.am
> index d79f2b320be..9e633bf65d5 100644
> --- a/src/util/Makefile.am
> +++ b/src/util/Makefile.am
> @@ -24,7 +24,8 @@ SUBDIRS = . \
> tests/fast_idiv_by_const \
> tests/hash_table \
> tests/string_buffer \
> -   tests/set
> +   tests/set \
> +   tests/uint_inverse
>
>  if HAVE_STD_CXX11
>  SUBDIRS += tests/vma
> diff --git a/src/util/meson.build b/src/util/meson.build
> index cdbad98e7cb..49d84c16ebe 100644
> --- a/src/util/meson.build
> +++ b/src/util/meson.build
> @@ -170,6 +170,7 @@ if with_tests
>  )
>)
>
> +  subdir('tests/fast_idiv_by_const')
>subdir('tests/hash_table')
>subdir('tests/string_buffer')
>subdir('tests/vma')
> diff --git a/src/util/tests/fast_idiv_by_const/Makefile.am b/src/util/tests/fast_idiv_by_const/Makefile.am
> new file mode 100644
> index 000..1ebee09f59b
> --- /dev/null
> +++ b/src/util/tests/fast_idiv_by_const/Makefile.am
> @@ -0,0 +1,43 @@
> +# Copyright © 2018 Intel
> +#
> +#  Permission is hereby granted, free of charge, to any person obtaining a
> +#  copy of this software and associated documentation files (the "Software"),
> +#  to deal in the Software without restriction, including without limitation
> +#  the rights to use, copy, modify, merge, publish, distribute, sublicense,
> +#  and/or sell copies of the Software, and to permit persons to whom the
> +#  Software is furnished to do so, subject to the following conditions:
> +#
> +#  The above copyright notice and this permission notice (including the next
> +#  paragraph) shall be included in all copies or substantial portions of the
> +#  Software.
> +#
> +#  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> +#  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +#  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> +#  THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> +#  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> +#  FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> +#  IN THE SOFTWARE.
> +
> +AM_CPPFLAGS = \
> +   -I$(top_srcdir)/src \
> +   -I$(top_srcdir)/include \
> +   -I$(top_srcdir)/src/gallium/include \
> +   -I$(top_srcdir)/src/gtest/include \
> +   $(PTHREAD_CFLAGS) \
> +   $(DEFINES)
> +
> +TESTS = fast_idiv_by_const_test
> +
> +check_PROGRAMS = $(TESTS)
> +
> +fast_idiv_by_const_test_SOURCES = \
> +   fast_idiv_by_const_test.cpp
> +
> +fast_idiv_by_const_test_LDADD = \
> +   $(top_builddir)/src/gtest/libgtest.la \
> +   $(top_builddir)/src/util/libmesautil.la \
> +   $(PTHREAD_LIBS) \
> +   $(DLOPEN_LIBS)
> +
> +EXTRA_DIST = meson.build
> diff --git a/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp b/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp
> new file mode 100644
> index 000..34b149e1c6f
> --- /dev/null
> +++ b/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp
> @@ -0,0 +1,472 @@
> +/*
> + * Copyright © 2018 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> 

Re: [Mesa-dev] [PATCH 05/12] util: Add power-of-two divisor support to compute_fast_udiv_info

2018-10-06 Thread Marek Olšák
On Sat, Oct 6, 2018 at 12:11 AM Jason Ekstrand  wrote:
>
> From: Marek Olšák 
>
> ---
>  src/util/fast_idiv_by_const.c | 21 +
>  src/util/fast_idiv_by_const.h |  5 +++--
>  2 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/src/util/fast_idiv_by_const.c b/src/util/fast_idiv_by_const.c
> index 65a9e640789..7b93316268c 100644
> --- a/src/util/fast_idiv_by_const.c
> +++ b/src/util/fast_idiv_by_const.c
> @@ -52,6 +52,27 @@ util_compute_fast_udiv_info(uint64_t D, unsigned num_bits, unsigned UINT_BITS)
> /* The eventual result */
> struct util_fast_udiv_info result;
>
> +   if (util_is_power_of_two_or_zero64(D)) {
> +  unsigned div_shift = util_logbase2_64(D);
> +
> +  if (div_shift) {
> + /* Dividing by a power of two. */
> + result.multiplier = 1ull << (UINT_BITS - div_shift);
> + result.pre_shift = 0;
> + result.post_shift = 0;
> + result.increment = 0;
> + return result;
> +  } else {
> + /* Dividing by 1. */
> + /* Assuming: floor((num + 1) * (2^32 - 1) / 2^32) = num */
> + result.multiplier = UINT_BITS == 64 ? UINT64_MAX :
> +   (1ull << UINT_BITS) - 1;
> + result.pre_shift = 0;
> + result.post_shift = 0;
> + result.increment = 1;
> + return result;
> +  }
> +   }
>
> /* The extra shift implicit in the difference between UINT_BITS and num_bits
>  */
> diff --git a/src/util/fast_idiv_by_const.h b/src/util/fast_idiv_by_const.h
> index 231311f84be..3363fb9ee71 100644
> --- a/src/util/fast_idiv_by_const.h
> +++ b/src/util/fast_idiv_by_const.h
> @@ -98,8 +98,9 @@ util_compute_fast_sdiv_info(int64_t D, unsigned SINT_BITS);
>   *   emit("result >>>= UINT_BITS")
>   *   if m.post_shift > 0: emit("result >>>= m.post_shift")
>   *
> - * The shifts by UINT_BITS may be "free" if the high half of the full multiply
> - * is put in a separate register.
> + * This second version works even if D is a power of two.  The shifts by

I think you meant to say that the second version works even if D is 1.
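
The distinction can be checked numerically. The C sketch below assumes
UINT_BITS == 32 and pre_shift == post_shift == 0, as in the two special cases
of the patch. struct udiv32_info and the three functions are hypothetical
stand-ins written for this illustration, not the Mesa API. The
saturating-increment variant in particular is an assumption about what the
"first version" in the header comment looks like, since that part of the
header is not quoted above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields set by the patch (UINT_BITS == 32). */
struct udiv32_info {
   uint32_t multiplier;
   unsigned increment;
};

/* Mirrors the two special cases added by the patch for D == 2^k:
 * k >= 1 is a real power of two, k == 0 is division by 1. */
static struct udiv32_info
udiv32_info_pow2(unsigned k)
{
   struct udiv32_info info;

   assert(k < 32);
   if (k) {
      info.multiplier = (uint32_t)(1ull << (32 - k));
      info.increment = 0;
   } else {
      info.multiplier = UINT32_MAX;   /* (1ull << 32) - 1 */
      info.increment = 1;
   }
   return info;
}

/* "Add the multiplier" form: n * m + m == (n + 1) * m, evaluated in 64 bits. */
static uint32_t
udiv32_add_multiplier(uint32_t n, struct udiv32_info info)
{
   uint64_t result = (uint64_t)n * info.multiplier;

   if (info.increment)
      result += info.multiplier;
   return (uint32_t)(result >> 32);
}

/* Assumed alternative: increment n first, saturating at UINT32_MAX. */
static uint32_t
udiv32_saturating_increment(uint32_t n, struct udiv32_info info)
{
   if (info.increment && n != UINT32_MAX)
      n++;
   return (uint32_t)(((uint64_t)n * info.multiplier) >> 32);
}

/* Worked example: both forms agree except for D == 1 at n == UINT32_MAX. */
static void
udiv32_example(void)
{
   assert(udiv32_add_multiplier(100, udiv32_info_pow2(3)) == 100 / 8);
   assert(udiv32_saturating_increment(100, udiv32_info_pow2(3)) == 100 / 8);

   assert(udiv32_add_multiplier(UINT32_MAX, udiv32_info_pow2(0)) == UINT32_MAX);
   assert(udiv32_saturating_increment(UINT32_MAX, udiv32_info_pow2(0)) ==
          0xFFFFFFFEu);
}
```

With these definitions the power-of-two cases work in both forms, because the
increment is 0 and the multiply reduces to a right shift. Only D == 1 needs the
full 64-bit addition: with a saturating 32-bit increment, n == UINT32_MAX comes
out one too small.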

Marek


[Mesa-dev] [Bug 108135] AVX instructions leak outside of CPU feature check and cause SIGILL

2018-10-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108135

--- Comment #7 from Thiago Macieira  ---
You can scan for .o that have initialisers by searching for .init_array
sections



Re: [Mesa-dev] [PATCH] intel: Introducing Whiskey Lake platform

2018-10-06 Thread Lionel Landwerlin

On 05/10/2018 22:29, Rodrigo Vivi wrote:

Whiskey Lake uses the same gen graphics as Coffee Lake, including some
IDs that were previously marked as reserved on Coffee Lake but have now
moved to the WHL page.

This follows the ids and approach used on kernel's commit
b9be78531d27 ("drm/i915/whl: Introducing Whiskey Lake platform")
and commit c1c8f6fa731b ("drm/i915: Redefine some Whiskey Lake SKUs")

v2: Lionel noticed that GT{1,2,3} in the kernel did not follow the spec
with regard to the number of EUs, so the kernel has been updated.

Cc: Lionel Landwerlin 
Cc: José Roberto de Souza 
Cc: Anuj Phogat 
Signed-off-by: Rodrigo Vivi 



Looks good, thanks for following up on this :


Reviewed-by: Lionel Landwerlin 



---
  include/pci_ids/i965_pci_ids.h  | 10 +-
  src/intel/compiler/test_eu_validate.cpp |  1 +
  src/intel/dev/gen_device_info.c |  1 +
  src/intel/tools/aubinator.c |  2 +-
  4 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index 4efac638e9..cb33bea7d4 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -170,8 +170,6 @@ CHIPSET(0x3185, glk_2x6, "Intel(R) UHD Graphics 600 (Geminilake 2x6)")
  CHIPSET(0x3E90, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")
  CHIPSET(0x3E93, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")
  CHIPSET(0x3E99, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
-CHIPSET(0x3EA1, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
-CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
  CHIPSET(0x3E91, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
  CHIPSET(0x3E92, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
  CHIPSET(0x3E96, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
@@ -179,14 +177,16 @@ CHIPSET(0x3E98, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
  CHIPSET(0x3E9A, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
  CHIPSET(0x3E9B, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
  CHIPSET(0x3E94, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
-CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
-CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
  CHIPSET(0x3EA9, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
-CHIPSET(0x3EA2, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
  CHIPSET(0x3EA5, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
  CHIPSET(0x3EA6, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
  CHIPSET(0x3EA7, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
  CHIPSET(0x3EA8, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
+CHIPSET(0x3EA1, cfl_gt1, "Intel(R) HD Graphics (Whiskey Lake 2x6 GT1)")
+CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT1)")
+CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)")
+CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)")
+CHIPSET(0x3EA2, cfl_gt3, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT3)")
  CHIPSET(0x5A49, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
  CHIPSET(0x5A4A, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
  CHIPSET(0x5A41, cnl_3x8, "Intel(R) HD Graphics (Cannonlake 3x8 GT1)")
diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
index 744ae5806d..73300b2312 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -43,6 +43,7 @@ static const struct gen_info {
 { "aml", },
 { "glk", },
 { "cfl", },
+   { "whl", },
 { "cnl", },
 { "icl", },
  };
diff --git a/src/intel/dev/gen_device_info.c b/src/intel/dev/gen_device_info.c
index e2c6cbc710..5dbd060757 100644
--- a/src/intel/dev/gen_device_info.c
+++ b/src/intel/dev/gen_device_info.c
@@ -60,6 +60,7 @@ gen_device_name_to_pci_device_id(const char *name)
{ "aml", 0x591C },
{ "glk", 0x3185 },
{ "cfl", 0x3E9B },
+  { "whl", 0x3EA1 },
{ "cnl", 0x5a52 },
{ "icl", 0x8a52 },
 };
diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index ef0f7650b1..1458875a31 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -300,7 +300,7 @@ int main(int argc, char *argv[])
   if (id < 0) {
  fprintf(stderr, "can't parse gen: '%s', expected brw, g4x, ilk, "
  "snb, ivb, hsw, byt, bdw, chv, skl, bxt, kbl, "
-"aml, glk, cfl, cnl, icl", optarg);
+"aml, glk, cfl, whl, cnl, icl", optarg);
  exit(EXIT_FAILURE);
   } else {
  pci_id = id;
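
The CHIPSET(...) lines in the first hunk appear to follow the X-macro pattern,
where the including file defines CHIPSET before pulling in the list. The
standalone C sketch below shows that pattern with only the five entries added
by this patch. WHL_CHIPSETS, struct chipset_entry, lookup_chipset() and
chipset_family() are hypothetical names for this illustration, not identifiers
from the Mesa tree:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The five entries added by the patch, written as an X-macro list.
 * WHL_CHIPSETS is a hypothetical name used only in this sketch. */
#define WHL_CHIPSETS(CHIPSET) \
   CHIPSET(0x3EA1, cfl_gt1, "Intel(R) HD Graphics (Whiskey Lake 2x6 GT1)") \
   CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT1)") \
   CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)") \
   CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)") \
   CHIPSET(0x3EA2, cfl_gt3, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT3)")

struct chipset_entry {
   uint16_t pci_id;
   const char *family;   /* device-info family, stringified */
   const char *name;     /* marketing name */
};

/* Expand each list entry into one table row. */
#define AS_ENTRY(id, family, name) { id, #family, name },
static const struct chipset_entry whl_table[] = {
   WHL_CHIPSETS(AS_ENTRY)
};
#undef AS_ENTRY

/* Linear search; returns NULL if the PCI ID is not in the table. */
static const struct chipset_entry *
lookup_chipset(uint16_t pci_id)
{
   for (size_t i = 0; i < sizeof(whl_table) / sizeof(whl_table[0]); i++) {
      if (whl_table[i].pci_id == pci_id)
         return &whl_table[i];
   }
   return NULL;
}

/* Family name for a PCI ID that is known to be listed. */
static const char *
chipset_family(uint16_t pci_id)
{
   const struct chipset_entry *e = lookup_chipset(pci_id);

   assert(e != NULL);
   return e->family;
}
```

Here chipset_family(0x3EA1) yields "cfl_gt1", which reflects the point of the
commit message: the Whiskey Lake IDs get their own names but reuse the Coffee
Lake device-info families.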





Re: [Mesa-dev] [PATCH v2] gallivm: Make it possible to disable some optimization shortcuts in release builds

2018-10-06 Thread Gert Wollny
On Friday, 05.10.2018, at 21:46 +, Roland Scheidegger wrote:
> Looks alright to me. I'm not quite sold on the "safemath" name
> though, since "safe math" is usually associated with floating point 
> optimizations, and this here is just filtering hacks. Maybe
> something like disable_filtering_hacks would be more fitting?
I think I'll push a shorter version: "no_filter_hacks"
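
How such a combined flag name could map onto the individual bits can be
sketched in plain C. The parser below is a hypothetical stand-in, not Mesa's
debug-option code. The bit values are taken from the GALLIVM_PERF_* defines in
the patch, the PERF_* names are shortened for the sketch, and the combined
entry uses the no_filter_hacks name proposed above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bit values as in the GALLIVM_PERF_* defines of the patch. */
#define PERF_NO_BRILINEAR  (1u << 0)
#define PERF_NO_RHO_APPROX (1u << 1)
#define PERF_NO_QUAD_LOD   (1u << 2)
#define PERF_NO_OPT        (1u << 3)

/* Hypothetical name/value pair, similar in spirit to a named-flag table. */
struct named_flag {
   const char *name;
   unsigned value;
};

static const struct named_flag perf_flags[] = {
   { "no_brilinear",    PERF_NO_BRILINEAR },
   { "no_rho_approx",   PERF_NO_RHO_APPROX },
   { "no_quad_lod",     PERF_NO_QUAD_LOD },
   { "nopt",            PERF_NO_OPT },
   { "no_filter_hacks", PERF_NO_BRILINEAR | PERF_NO_RHO_APPROX |
                        PERF_NO_QUAD_LOD },
};

/* Parse a comma-separated list such as "nopt,no_quad_lod" into a bitmask.
 * Unknown names are ignored; NULL or an empty string yields 0. */
static unsigned
parse_perf_flags(const char *str)
{
   unsigned result = 0;

   if (!str)
      return 0;

   while (*str) {
      size_t len = strcspn(str, ",");

      for (size_t i = 0; i < sizeof(perf_flags) / sizeof(perf_flags[0]); i++) {
         if (strlen(perf_flags[i].name) == len &&
             strncmp(str, perf_flags[i].name, len) == 0)
            result |= perf_flags[i].value;
      }

      str += len;
      if (*str == ',')
         str++;
   }
   return result;
}

/* Usage example: the combined name sets all three filtering bits. */
static void
perf_flags_example(void)
{
   assert(parse_perf_flags("no_filter_hacks") ==
          (PERF_NO_BRILINEAR | PERF_NO_RHO_APPROX | PERF_NO_QUAD_LOD));
   assert(parse_perf_flags("nopt,no_quad_lod") ==
          (PERF_NO_OPT | PERF_NO_QUAD_LOD));
   assert(parse_perf_flags("bogus") == 0);
   assert(parse_perf_flags(NULL) == 0);
}
```

Keeping nopt out of the combined entry matches the table in the patch, where
the combined flag only covers the three filtering shortcuts.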

best, 
Gert 
> 
> Reviewed-by: Roland Scheidegger 
> 
> On 10/05/2018 06:08 AM, Gert Wollny wrote:
> > From: Gert Wollny 
> > 
> > For testing it is of interest that all tests of dEQP pass, e.g. to test
> > virglrenderer on a host only providing software rendering like in a CI.
> > Hence make it possible to disable certain optimizations that make
> > tests fail.
> >
> > While we are there also add some documentation to the flags to make it
> > clear that this is opt-out.
> >
> > Setting the environment variable "GALLIVM_PERF=disable_all" can be used
> > to make the following tests pass in release mode:
> > 
> >dEQP-GLES2.functional.texture.mipmap.2d.affine.*_linear_*
> >dEQP-GLES2.functional.texture.mipmap.cube.generate.*
> >dEQP-GLES2.functional.texture.vertex.2d.filtering.*_mipmap_linear_*
> >dEQP-GLES2.functional.texture.vertex.2d.wrap.*
> > 
> > Related:
> >https://bugs.freedesktop.org/show_bug.cgi?id=94957
> > 
> > v2: rename optimization disabling flag to 'safemath' and also move the
> >     nopt flag to the perf flags.
> > 
> > Signed-off-by: Gert Wollny 
> > ---
> >   src/gallium/auxiliary/gallivm/lp_bld_debug.h  | 16 
> > ---
> >   src/gallium/auxiliary/gallivm/lp_bld_init.c   | 25
> > +++
> >   src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  6 +++---
> >   src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c   |  6 +++---
> >   4 files changed, 32 insertions(+), 21 deletions(-)
> > 
> > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_debug.h
> > b/src/gallium/auxiliary/gallivm/lp_bld_debug.h
> > index f96a1afa7a..eeef0d6ba6 100644
> > --- a/src/gallium/auxiliary/gallivm/lp_bld_debug.h
> > +++ b/src/gallium/auxiliary/gallivm/lp_bld_debug.h
> > @@ -39,20 +39,22 @@
> >   #define GALLIVM_DEBUG_TGSI          (1 << 0)
> >   #define GALLIVM_DEBUG_IR            (1 << 1)
> >   #define GALLIVM_DEBUG_ASM           (1 << 2)
> > -#define GALLIVM_DEBUG_NO_OPT        (1 << 3)
> > -#define GALLIVM_DEBUG_PERF          (1 << 4)
> > -#define GALLIVM_DEBUG_NO_BRILINEAR  (1 << 5)
> > -#define GALLIVM_DEBUG_NO_RHO_APPROX (1 << 6)
> > -#define GALLIVM_DEBUG_NO_QUAD_LOD   (1 << 7)
> > -#define GALLIVM_DEBUG_GC            (1 << 8)
> > -#define GALLIVM_DEBUG_DUMP_BC       (1 << 9)
> > +#define GALLIVM_DEBUG_PERF          (1 << 3)
> > +#define GALLIVM_DEBUG_GC            (1 << 4)
> > +#define GALLIVM_DEBUG_DUMP_BC       (1 << 5)
> >   
> > +#define GALLIVM_PERF_NO_BRILINEAR  (1 << 0)
> > +#define GALLIVM_PERF_NO_RHO_APPROX (1 << 1)
> > +#define GALLIVM_PERF_NO_QUAD_LOD   (1 << 2)
> > +#define GALLIVM_PERF_NO_OPT        (1 << 3)
> >   
> >   #ifdef __cplusplus
> >   extern "C" {
> >   #endif
> >   
> >   
> > +extern unsigned gallivm_perf;
> > +
> >   #ifdef DEBUG
> >   extern unsigned gallivm_debug;
> >   #else
> > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
> > index 1f0a01cde6..3f7c4d3154 100644
> > --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
> > +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
> > @@ -59,6 +59,17 @@ static const bool use_mcjit = USE_MCJIT;
> >   static bool use_mcjit = FALSE;
> >   #endif
> >   
> > +unsigned gallivm_perf = 0;
> > +
> > +static const struct debug_named_value lp_bld_perf_flags[] = {
> > +   { "no_brilinear", GALLIVM_PERF_NO_BRILINEAR, "disable brilinear optimization" },
> > +   { "no_rho_approx", GALLIVM_PERF_NO_RHO_APPROX, "disable rho_approx optimization" },
> > +   { "no_quad_lod", GALLIVM_PERF_NO_QUAD_LOD, "disable quad_lod optimization" },
> > +   { "nopt",   GALLIVM_PERF_NO_OPT, "disable optimization passes to speed up shader compilation" },
> > +   { "safemath", GALLIVM_PERF_NO_BRILINEAR | GALLIVM_PERF_NO_RHO_APPROX |
> > + GALLIVM_PERF_NO_QUAD_LOD, "disable unsafe optimizations" },
> > +   DEBUG_NAMED_VALUE_END
> > +};
> >   
> >   #ifdef DEBUG
> >   unsigned gallivm_debug = 0;
> > @@ -67,11 +78,7 @@ static const struct debug_named_value lp_bld_debug_flags[] = {
> >  { "tgsi",   GALLIVM_DEBUG_TGSI, NULL },
> >  { "ir", GALLIVM_DEBUG_IR, NULL },
> >  { "asm",GALLIVM_DEBUG_ASM, NULL },
> > -   { "nopt",   GALLIVM_DEBUG_NO_OPT, NULL },
> >  { "perf",   GALLIVM_DEBUG_PERF, NULL },
> > -   { "no_brilinear", 
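
A minimal sketch of how a comma-separated value such as "nopt,no_quad_lod" could be turned into a bitmask of opt-out flags. This is not the Mesa implementation: the PERF_* macros, struct named_flag, and parse_perf_flags() are hypothetical names that only mirror the table in the quoted patch, and ignoring unknown names is an assumption made for the sketch.

```c
#include <string.h>

/* Hypothetical flag bits mirroring the GALLIVM_PERF_* values in the patch. */
#define PERF_NO_BRILINEAR  (1u << 0)
#define PERF_NO_RHO_APPROX (1u << 1)
#define PERF_NO_QUAD_LOD   (1u << 2)
#define PERF_NO_OPT        (1u << 3)

/* Hypothetical name/value pair; the patch itself uses debug_named_value. */
struct named_flag {
   const char *name;
   unsigned value;
};

static const struct named_flag perf_flags[] = {
   { "no_brilinear",  PERF_NO_BRILINEAR },
   { "no_rho_approx", PERF_NO_RHO_APPROX },
   { "no_quad_lod",   PERF_NO_QUAD_LOD },
   { "nopt",          PERF_NO_OPT },
   /* Composite entry: one name sets several bits at once. */
   { "safemath",      PERF_NO_BRILINEAR | PERF_NO_RHO_APPROX | PERF_NO_QUAD_LOD },
   { NULL, 0 }
};

/* Parse a comma-separated list into a bitmask.
 * Unknown names are ignored; a NULL string yields 0 (nothing disabled),
 * which matches the opt-out default described in the commit message. */
static unsigned
parse_perf_flags(const char *str)
{
   unsigned result = 0;

   if (!str)
      return 0;

   while (*str) {
      /* Length of the current token, up to the next comma or the end. */
      size_t len = strcspn(str, ",");

      for (const struct named_flag *f = perf_flags; f->name; f++) {
         if (strlen(f->name) == len && strncmp(f->name, str, len) == 0)
            result |= f->value;
      }

      str += len;
      if (*str == ',')
         str++;
   }
   return result;
}
```

A caller would typically pass the result of getenv() for the variable in question and test individual bits, for example (mask & PERF_NO_QUAD_LOD), before enabling the corresponding optimization.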

Re: [Mesa-dev] [PATCH:mesa] util: Make xmlconfig.c build on Solaris without d_type in dirent (v2)

2018-10-06 Thread Roland Mainz
On Sat, 6 Oct 2018, 01:34 Alan Coopersmith wrote:

> v2: check for lstat() failing
>
> Fixes: 04bdbbcab3c "xmlconfig: read more config files from drirc.d/"
> Signed-off-by: Alan Coopersmith 
> Reviewed-by: Roland Mainz 
> Reviewed-by: Ian Romanick 
> ---
>  src/util/xmlconfig.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/util/xmlconfig.c b/src/util/xmlconfig.c
> index 5264f2598b..5907c68012 100644
> --- a/src/util/xmlconfig.c
> +++ b/src/util/xmlconfig.c
> @@ -938,8 +938,16 @@ parseOneConfigFile(struct OptConfData *data, const char *filename)
>  static int
>  scandir_filter(const struct dirent *ent)
>  {
> +#ifndef DT_REG /* systems without d_type in dirent results */
> +struct stat st;
> +
> +if ((lstat(ent->d_name, &st) != 0) ||
> +(!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)))
> +   return 0;
> +#else
>  if (ent->d_type != DT_REG && ent->d_type != DT_LNK)
> return 0;
> +#endif
>
>  if (fnmatch("*.conf", ent->d_name, 0))
> return 0;
>

Reviewed by roland.ma...@nrubsig.org



Bye,
Roland
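
For reference, the file-type check from the quoted patch can be sketched as a standalone helper. This is only an illustration under stated assumptions: is_conf_candidate() is a hypothetical name, and the sketch takes the directory explicitly because lstat() resolves a relative name against the current working directory, whereas a scandir() filter only receives the bare entry name. Whether that distinction matters for the patch depends on how scandir() is called, which is not shown in this message.

```c
#define _POSIX_C_SOURCE 200809L

#include <fnmatch.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if dir/name is a regular file or a
 * symlink whose name matches "*.conf", and 0 otherwise. It does not
 * rely on d_type, so it also works where dirent lacks that field. */
static int
is_conf_candidate(const char *dir, const char *name)
{
   char path[PATH_MAX];
   struct stat st;
   int n;

   /* Cheap name check first, so non-matching entries skip the syscall. */
   if (fnmatch("*.conf", name, 0) != 0)
      return 0;

   n = snprintf(path, sizeof(path), "%s/%s", dir, name);
   if (n < 0 || (size_t)n >= sizeof(path))
      return 0;   /* the joined path would have been truncated */

   /* lstat() does not follow symlinks, so S_ISLNK can be observed. */
   if (lstat(path, &st) != 0)
      return 0;

   return S_ISREG(st.st_mode) || S_ISLNK(st.st_mode);
}
```

Treating an lstat() failure as "not a candidate" follows the v2 note in the patch (check for lstat() failing).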



Re: [Mesa-dev] [PATCH 7/9] i965: Pack simple pipelined query objects into the same buffer

2018-10-06 Thread Chris Wilson
Quoting Kenneth Graunke (2018-10-06 02:57:29)
> On Tuesday, October 2, 2018 11:06:23 AM PDT Chris Wilson wrote:
> > Reuse the same query object buffer for multiple queries within the same
> > batch.
> > 
> > A task for the future is propagating the GL_NO_MEMORY errors.
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Kenneth Graunke 
> > Cc: Matt Turner 
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c   |  3 ++
> >  src/mesa/drivers/dri/i965/brw_context.h   | 10 +++--
> >  src/mesa/drivers/dri/i965/brw_queryobj.c  | 16 +++
> >  src/mesa/drivers/dri/i965/gen6_queryobj.c | 51 ++-
> >  4 files changed, 59 insertions(+), 21 deletions(-)
> 
> Don't want to do this.  This means that WaitQuery will wait on the whole
> group of packed queries instead of just the one the app asked about.
> 
> Vulkan has to pack queries by design, and it turns out this was a real
> issue there.  See b2c97bc789198427043cd902bc76e194e7e81c7d.  Jason ended
> up making it busy-wait to avoid bo_waiting on the entire pool, and it
> improved Serious Engine performance by 20%.
> 
> We could busy-wait in GL too, for lower latency but more CPU waste,
> but I think I prefer separate BOs + bo_wait.

It's the same latency for wait, as a new BO is used for each batch, and
waits are on the batch fence, not the individual write into the query
batch. (So no change whatsoever for waits from the current code.) Polling
is improved, as there we can check the individual fence. Pipelined is
unaffected.

Now we could keep the query buffer across multiple batches and use
fences (userspace seqno + batch handles) to poll or wait on the partial
writes.
-Chris


[Mesa-dev] [Bug 108135] AVX instructions leak outside of CPU feature check and cause SIGILL

2018-10-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108135

--- Comment #6 from Kenneth Graunke  ---
This does point out an even more serious issue though:

Megadrivers interacts very badly with drivers that have global C++
initializers.  When dlopen'ing a DRI driver, C++ global constructors will run
for ANY driver that is built in, even if that isn't the driver the user is
trying to use.

In other words, let's say you're on a Raspberry Pi and are trying to use the
vc4 driver.  If your distro happens to have built SWR, then you'll end up
running a bunch of SWR C++ constructors...when trying to dlopen
vc4_dri.so...because both live in the same .so file.

Such global initialization really ought to happen at driver screen init time,
not dlopen time.  IMO we should forbid C++ global objects in Mesa drivers.  (We
have a few in the compiler, but everybody uses the compiler.)  I don't know how
feasible that is, though.  If it isn't, maybe we need to exclude such drivers
from the megadrivers mechanism...
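
The load-time versus screen-init distinction can be illustrated with a small C analogue. This is a sketch, not Mesa code: it uses the GCC/Clang __attribute__((constructor)) extension as a stand-in for a C++ global constructor, and every function and variable name below is hypothetical.

```c
/* Counters so the ordering of the two kinds of initialization is visible. */
static int load_time_inits = 0;
static int screen_time_inits = 0;

/* GCC/Clang extension: runs when the containing object is loaded, that is
 * before main() for an executable or during dlopen() for a shared library.
 * It runs regardless of whether this "driver" is ever used. */
__attribute__((constructor))
static void
hypothetical_driver_b_global_init(void)
{
   load_time_inits++;
}

/* Alternative: defer the same work until the first screen is created, so
 * users of other drivers in the same binary never pay for it.
 * Not thread-safe as written; a real driver would need a once-style guard. */
static void
hypothetical_driver_b_screen_create(void)
{
   static int initialized = 0;

   if (!initialized) {
      initialized = 1;
      screen_time_inits++;
   }
}
```

In this sketch the constructor has already run by the time any other code executes, while the deferred path runs only when hypothetical_driver_b_screen_create() is first called.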
