[Mesa-dev] [Bug 108263] glGetTexImage with PBO is not accelerated on Gallium
https://bugs.freedesktop.org/show_bug.cgi?id=108263 --- Comment #1 from Andrew Wesie --- Created attachment 141926 --> https://bugs.freedesktop.org/attachment.cgi?id=141926=edit Patch for piglit for PBO texture download testcase -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 108263] glGetTexImage with PBO is not accelerated on Gallium
https://bugs.freedesktop.org/show_bug.cgi?id=108263

Bug ID: 108263
Summary: glGetTexImage with PBO is not accelerated on Gallium
Product: Mesa
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: enhancement
Priority: medium
Component: Mesa core
Assignee: mesa-dev@lists.freedesktop.org
Reporter: awe...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 141925 --> https://bugs.freedesktop.org/attachment.cgi?id=141925=edit
Mesa patch for accelerated PBO texture downloads

In May 2016, a patchset (https://lists.freedesktop.org/archives/mesa-dev/2016-May/117294.html) added acceleration for glReadPixels PBO downloads. Support for glGetTexImage and friends was left as future work.

As part of my efforts to find and fix performance hot spots in Wine's DirectX layer, I submitted patches to support texture downloads using PBOs in Wine. Unfortunately, on Mesa, this does not improve performance for the reason stated above.

It would be great if Mesa could add support for accelerated texture downloads using PBOs. To facilitate this, I put together a patch based on glReadPixels and a test case in piglit. I am not familiar with the Mesa code or conventions, but the patch passes the test case so it is probably close to correct.
Re: [Mesa-dev] Blits with huge widths/heights
We could certainly increase the width and height to 32 bits for pipe_blit_info, not pipe_box.

Marek

On Sat, Oct 6, 2018 at 4:01 PM Ilia Mirkin wrote:
>
> There's a WebGL test here
>
> https://www.khronos.org/registry/webgl/sdk/tests/conformance2/rendering/blitframebuffer-size-overflow.html
>
> which does a fairly ridiculous blit:
>
> srcX0=-1, srcY0=-1, srcX1=2147483647, srcY1=2147483647,
> dstX0=-1, dstY0=-1, dstX1=2147483647, dstY1=2147483647
>
> The underlying src and dst textures are 8x8. We hit some precision
> issues in _mesa_clip_blit, but after fixing those, I run into the
> st_BlitFramebuffer logic which has this comment:
>
>    /* NOTE: If the src and dst dimensions don't match, we cannot simply adjust
>     * the integer coordinates to account for clipping (or scissors) because that
>     * would make us cut off fractional parts, affecting the result of the blit.
>     */
>
> That means that the height gets set to srcY0 - srcY1, which obviously
> overflows the int16_t.
>
> Is there any reasonable way of clipping it down without running into
> the fraction parts issue mentioned in the comment? Otherwise we have
> to go back to 32-bit height. (Not 100% sure that most drivers would do
> anything particularly reasonable here either...)
>
> Cheers,
>
>   -ilia
[Mesa-dev] Blits with huge widths/heights
There's a WebGL test here

https://www.khronos.org/registry/webgl/sdk/tests/conformance2/rendering/blitframebuffer-size-overflow.html

which does a fairly ridiculous blit:

srcX0=-1, srcY0=-1, srcX1=2147483647, srcY1=2147483647,
dstX0=-1, dstY0=-1, dstX1=2147483647, dstY1=2147483647

The underlying src and dst textures are 8x8. We hit some precision issues in _mesa_clip_blit, but after fixing those, I run into the st_BlitFramebuffer logic which has this comment:

   /* NOTE: If the src and dst dimensions don't match, we cannot simply adjust
    * the integer coordinates to account for clipping (or scissors) because that
    * would make us cut off fractional parts, affecting the result of the blit.
    */

That means that the height gets set to srcY0 - srcY1, which obviously overflows the int16_t.

Is there any reasonable way of clipping it down without running into the fraction parts issue mentioned in the comment? Otherwise we have to go back to 32-bit height. (Not 100% sure that most drivers would do anything particularly reasonable here either...)

Cheers,

  -ilia
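The overflow described in this message can be reproduced in isolation. What follows is only a minimal sketch, assuming (as the message states) that the box height ends up in an int16_t field; blit_extent, fits_int16 and fits_int32 are hypothetical helpers written for this illustration and are not Mesa functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extent of a blit along one axis. The subtraction is
 * done in 64 bits so that it cannot overflow by itself. */
static int64_t blit_extent(int32_t c0, int32_t c1)
{
    return (int64_t)c1 - (int64_t)c0;
}

/* Hypothetical helper: would the extent survive being stored in int16_t? */
static bool fits_int16(int64_t v)
{
    return v >= INT16_MIN && v <= INT16_MAX;
}

/* Hypothetical helper: would the extent survive being stored in int32_t? */
static bool fits_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Self-check using the coordinates quoted from the WebGL test. */
static void blit_extent_self_check(void)
{
    int64_t height = blit_extent(-1, 2147483647);

    assert(!fits_int16(height));            /* far outside 16 bits */
    assert(fits_int16(blit_extent(0, 8)));  /* an 8x8 texture is fine */
}
```

With the coordinates from the test, the unclipped extent is 2147483647 - (-1) = 2^31. That is one more than INT32_MAX, so fits_int32() also rejects it; a wider signed field alone would presumably still need some clipping or an unsigned type for this particular input.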
Re: [Mesa-dev] [PATCH 05/12] util: Add power-of-two divisor support to compute_fast_udiv_info
On October 6, 2018 13:19:15 Marek Olšák wrote: On Sat, Oct 6, 2018 at 12:11 AM Jason Ekstrand wrote: From: Marek Olšák --- src/util/fast_idiv_by_const.c | 21 + src/util/fast_idiv_by_const.h | 5 +++-- 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/src/util/fast_idiv_by_const.c b/src/util/fast_idiv_by_const.c index 65a9e640789..7b93316268c 100644 --- a/src/util/fast_idiv_by_const.c +++ b/src/util/fast_idiv_by_const.c @@ -52,6 +52,27 @@ util_compute_fast_udiv_info(uint64_t D, unsigned num_bits, unsigned UINT_BITS) /* The eventual result */ struct util_fast_udiv_info result; + if (util_is_power_of_two_or_zero64(D)) { + unsigned div_shift = util_logbase2_64(D); + + if (div_shift) { + /* Dividing by a power of two. */ + result.multiplier = 1ull << (UINT_BITS - div_shift); + result.pre_shift = 0; + result.post_shift = 0; + result.increment = 0; + return result; + } else { + /* Dividing by 1. */ + /* Assuming: floor((num + 1) * (2^32 - 1) / 2^32) = num */ + result.multiplier = UINT_BITS == 64 ? UINT64_MAX : + (1ull << UINT_BITS) - 1; + result.pre_shift = 0; + result.post_shift = 0; + result.increment = 1; + return result; + } + } /* The extra shift implicit in the difference between UINT_BITS and num_bits */ diff --git a/src/util/fast_idiv_by_const.h b/src/util/fast_idiv_by_const.h index 231311f84be..3363fb9ee71 100644 --- a/src/util/fast_idiv_by_const.h +++ b/src/util/fast_idiv_by_const.h @@ -98,8 +98,9 @@ util_compute_fast_sdiv_info(int64_t D, unsigned SINT_BITS); * emit("result >>>= UINT_BITS") * if m.post_shift > 0: emit("result >>>= m.post_shift") * - * The shifts by UINT_BITS may be "free" if the high half of the full multiply - * is put in a separate register. + * This second version works even if D is a power of two. The shifts by I think you meant to say that the second version works even if D is 1. Correct. I'll fix that. --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Mesa-stable] [PATCH] Scons: swr: fix LLVM >= 7 build
I am more used to the merge / pull request model of sending patches, so I am going to link the patch instead of inlining it:

https://raw.githubusercontent.com/pal1000/mesa-dist-win/master/patches/upstream/scons-swr-llvm7.patch

This patch depends on series 50108 to be effective, but it can be safely merged either before or after it. Unfortunately, this patch doesn't help osmesa link with swr when using LLVM >= 7, which is also an issue unaddressed by series 50108. This patch cuts 41 unresolved symbols, resulting in a successful build when not building osmesa. If you build both swr and osmesa together with LLVM 7.0 and Scons, you get the following after applying this patch; without it there would be many more unresolved symbols.

Generating code
Finished generating code
Finished generating code
Finished generating code
Archiving build\windows-x86_64\gallium\drivers\swr\swr.lib ...
Linking build\windows-x86_64\gallium\targets\osmesa\osmesa.dll ...
Linking build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.dll ...
Creating library build\windows-x86_64\gallium\targets\osmesa\osmesa.lib and object build\windows-x86_64\gallium\targets\osmesa\osmesa.exp Creating library build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.lib and object build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.exp swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::ModulePass * __cdecl llvm::createPGOInstrumentationUseLegacyPass(class llvm::StringRef)" (?createPGOInstrumentationUseLegacyPass@llvm@@YAPEAVModulePass@1@VStringRef@1@@Z) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::ModulePass * __cdecl llvm::createPGOInstrumentationGenLegacyPass(void)" (?createPGOInstrumentationGenLegacyPass@llvm@@YAPEAVModulePass@1@XZ) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::FunctionPass * __cdecl llvm::createPGOMemOPSizeOptLegacyPass(void)" (?createPGOMemOPSizeOptLegacyPass@llvm@@YAPEAVFunctionPass@1@XZ) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::ModulePass * __cdecl llvm::createPGOIndirectCallPromotionLegacyPass(bool,bool)" (?createPGOIndirectCallPromotionLegacyPass@llvm@@YAPEAVModulePass@1@_N0@Z) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::ModulePass * __cdecl llvm::createInstrProfilingLegacyPass(struct llvm::InstrProfOptions const &)" (?createInstrProfilingLegacyPass@llvm@@YAPEAVModulePass@1@AEBUInstrProfOptions@1@@Z) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "public: static struct llvm::GCOVOptions __cdecl llvm::GCOVOptions::getDefault(void)" (?getDefault@GCOVOptions@llvm@@SA?AU12@XZ) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::FunctionPass * __cdecl llvm::createBoundsCheckingLegacyPass(void)" (?createBoundsCheckingLegacyPass@llvm@@YAPEAVFunctionPass@1@XZ) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::ModulePass * __cdecl 
llvm::createGCOVProfilerPass(struct llvm::GCOVOptions const &)" (?createGCOVProfilerPass@llvm@@YAPEAVModulePass@1@AEBUGCOVOptions@1@@Z) build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.dll : fatal error LNK1120: 8 unresolved externals scons: *** [build\windows-x86_64\gallium\targets\libgl-gdi\opengl32.dll] Error 1120 swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::ModulePass * __cdecl llvm::createPGOInstrumentationUseLegacyPass(class llvm::StringRef)" (?createPGOInstrumentationUseLegacyPass@llvm@@YAPEAVModulePass@1@VStringRef@1@@Z) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::ModulePass * __cdecl llvm::createPGOInstrumentationGenLegacyPass(void)" (?createPGOInstrumentationGenLegacyPass@llvm@@YAPEAVModulePass@1@XZ) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::FunctionPass * __cdecl llvm::createPGOMemOPSizeOptLegacyPass(void)" (?createPGOMemOPSizeOptLegacyPass@llvm@@YAPEAVFunctionPass@1@XZ) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::ModulePass * __cdecl llvm::createPGOIndirectCallPromotionLegacyPass(bool,bool)" (?createPGOIndirectCallPromotionLegacyPass@llvm@@YAPEAVModulePass@1@_N0@Z) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::ModulePass * __cdecl llvm::createInstrProfilingLegacyPass(struct llvm::InstrProfOptions const &)" (?createInstrProfilingLegacyPass@llvm@@YAPEAVModulePass@1@AEBUInstrProfOptions@1@@Z) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "public: static struct llvm::GCOVOptions __cdecl llvm::GCOVOptions::getDefault(void)" (?getDefault@GCOVOptions@llvm@@SA?AU12@XZ) swr.lib(JitManager.obj) : error LNK2001: unresolved external symbol "class llvm::FunctionPass * __cdecl llvm::createBoundsCheckingLegacyPass(void)" (?createBoundsCheckingLegacyPass@llvm@@YAPEAVFunctionPass@1@XZ)
Re: [Mesa-dev] [PATCH 06/12] util: Add tests for fast integer division by constants
With my comments addressed, patches 2 - 6 are: Reviewed-by: Marek Olšák Since I will need to compute the division terms during draw calls, I may need to switch the math to uint32_t for my case (e.g. via a C++ template). Marek On Sat, Oct 6, 2018 at 12:11 AM Jason Ekstrand wrote: >> While I generally trust rediculousfish to have done his homework, we've > made some adjustments to suite the needs of mesa and it'd be good to > test those. Also, there's no better place than unit tests to clearly > document the different edge cases of the different methods. > --- > configure.ac | 1 + > src/util/Makefile.am | 3 +- > src/util/meson.build | 1 + > src/util/tests/fast_idiv_by_const/Makefile.am | 43 ++ > .../fast_idiv_by_const_test.cpp | 472 ++ > src/util/tests/fast_idiv_by_const/meson.build | 30 ++ > 6 files changed, 549 insertions(+), 1 deletion(-) > create mode 100644 src/util/tests/fast_idiv_by_const/Makefile.am > create mode 100644 > src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp > create mode 100644 src/util/tests/fast_idiv_by_const/meson.build > > diff --git a/configure.ac b/configure.ac > index 34689826c98..7b0b2b20ba2 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -3198,6 +3198,7 @@ AC_CONFIG_FILES([Makefile > src/util/tests/hash_table/Makefile > src/util/tests/set/Makefile > src/util/tests/string_buffer/Makefile > + src/util/tests/uint_inverse/Makefile > src/util/tests/vma/Makefile > src/util/xmlpool/Makefile > src/vulkan/Makefile]) > diff --git a/src/util/Makefile.am b/src/util/Makefile.am > index d79f2b320be..9e633bf65d5 100644 > --- a/src/util/Makefile.am > +++ b/src/util/Makefile.am > @@ -24,7 +24,8 @@ SUBDIRS = . 
\ > tests/fast_idiv_by_const \ > tests/hash_table \ > tests/string_buffer \ > - tests/set > + tests/set \ > + tests/uint_inverse > > if HAVE_STD_CXX11 > SUBDIRS += tests/vma > diff --git a/src/util/meson.build b/src/util/meson.build > index cdbad98e7cb..49d84c16ebe 100644 > --- a/src/util/meson.build > +++ b/src/util/meson.build > @@ -170,6 +170,7 @@ if with_tests > ) >) > > + subdir('tests/fast_idiv_by_const') >subdir('tests/hash_table') >subdir('tests/string_buffer') >subdir('tests/vma') > diff --git a/src/util/tests/fast_idiv_by_const/Makefile.am > b/src/util/tests/fast_idiv_by_const/Makefile.am > new file mode 100644 > index 000..1ebee09f59b > --- /dev/null > +++ b/src/util/tests/fast_idiv_by_const/Makefile.am > @@ -0,0 +1,43 @@ > +# Copyright © 2018 Intel > +# > +# Permission is hereby granted, free of charge, to any person obtaining a > +# copy of this software and associated documentation files (the "Software"), > +# to deal in the Software without restriction, including without limitation > +# the rights to use, copy, modify, merge, publish, distribute, sublicense, > +# and/or sell copies of the Software, and to permit persons to whom the > +# Software is furnished to do so, subject to the following conditions: > +# > +# The above copyright notice and this permission notice (including the next > +# paragraph) shall be included in all copies or substantial portions of the > +# Software. > +# > +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > +# IN THE SOFTWARE. 
> + > +AM_CPPFLAGS = \ > + -I$(top_srcdir)/src \ > + -I$(top_srcdir)/include \ > + -I$(top_srcdir)/src/gallium/include \ > + -I$(top_srcdir)/src/gtest/include \ > + $(PTHREAD_CFLAGS) \ > + $(DEFINES) > + > +TESTS = fast_idiv_by_const_test > + > +check_PROGRAMS = $(TESTS) > + > +fast_idiv_by_const_test_SOURCES = \ > + fast_idiv_by_const_test.cpp > + > +fast_idiv_by_const_test_LDADD = \ > + $(top_builddir)/src/gtest/libgtest.la \ > + $(top_builddir)/src/util/libmesautil.la \ > + $(PTHREAD_LIBS) \ > + $(DLOPEN_LIBS) > + > +EXTRA_DIST = meson.build > diff --git a/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp > b/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp > new file mode 100644 > index 000..34b149e1c6f > --- /dev/null > +++ b/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp > @@ -0,0 +1,472 @@ > +/* > + * Copyright © 2018 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a >
Re: [Mesa-dev] [PATCH 06/12] util: Add tests for fast integer division by constants
On Sat, Oct 6, 2018 at 12:11 AM Jason Ekstrand wrote: > > While I generally trust rediculousfish to have done his homework, we've > made some adjustments to suite the needs of mesa and it'd be good to > test those. Also, there's no better place than unit tests to clearly > document the different edge cases of the different methods. > --- > configure.ac | 1 + > src/util/Makefile.am | 3 +- > src/util/meson.build | 1 + > src/util/tests/fast_idiv_by_const/Makefile.am | 43 ++ > .../fast_idiv_by_const_test.cpp | 472 ++ > src/util/tests/fast_idiv_by_const/meson.build | 30 ++ > 6 files changed, 549 insertions(+), 1 deletion(-) > create mode 100644 src/util/tests/fast_idiv_by_const/Makefile.am > create mode 100644 > src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp > create mode 100644 src/util/tests/fast_idiv_by_const/meson.build > > diff --git a/configure.ac b/configure.ac > index 34689826c98..7b0b2b20ba2 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -3198,6 +3198,7 @@ AC_CONFIG_FILES([Makefile > src/util/tests/hash_table/Makefile > src/util/tests/set/Makefile > src/util/tests/string_buffer/Makefile > + src/util/tests/uint_inverse/Makefile > src/util/tests/vma/Makefile > src/util/xmlpool/Makefile > src/vulkan/Makefile]) > diff --git a/src/util/Makefile.am b/src/util/Makefile.am > index d79f2b320be..9e633bf65d5 100644 > --- a/src/util/Makefile.am > +++ b/src/util/Makefile.am > @@ -24,7 +24,8 @@ SUBDIRS = . 
\ > tests/fast_idiv_by_const \ > tests/hash_table \ > tests/string_buffer \ > - tests/set > + tests/set \ > + tests/uint_inverse > > if HAVE_STD_CXX11 > SUBDIRS += tests/vma > diff --git a/src/util/meson.build b/src/util/meson.build > index cdbad98e7cb..49d84c16ebe 100644 > --- a/src/util/meson.build > +++ b/src/util/meson.build > @@ -170,6 +170,7 @@ if with_tests > ) >) > > + subdir('tests/fast_idiv_by_const') >subdir('tests/hash_table') >subdir('tests/string_buffer') >subdir('tests/vma') > diff --git a/src/util/tests/fast_idiv_by_const/Makefile.am > b/src/util/tests/fast_idiv_by_const/Makefile.am > new file mode 100644 > index 000..1ebee09f59b > --- /dev/null > +++ b/src/util/tests/fast_idiv_by_const/Makefile.am > @@ -0,0 +1,43 @@ > +# Copyright © 2018 Intel > +# > +# Permission is hereby granted, free of charge, to any person obtaining a > +# copy of this software and associated documentation files (the "Software"), > +# to deal in the Software without restriction, including without limitation > +# the rights to use, copy, modify, merge, publish, distribute, sublicense, > +# and/or sell copies of the Software, and to permit persons to whom the > +# Software is furnished to do so, subject to the following conditions: > +# > +# The above copyright notice and this permission notice (including the next > +# paragraph) shall be included in all copies or substantial portions of the > +# Software. > +# > +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > +# IN THE SOFTWARE. 
> + > +AM_CPPFLAGS = \ > + -I$(top_srcdir)/src \ > + -I$(top_srcdir)/include \ > + -I$(top_srcdir)/src/gallium/include \ > + -I$(top_srcdir)/src/gtest/include \ > + $(PTHREAD_CFLAGS) \ > + $(DEFINES) > + > +TESTS = fast_idiv_by_const_test > + > +check_PROGRAMS = $(TESTS) > + > +fast_idiv_by_const_test_SOURCES = \ > + fast_idiv_by_const_test.cpp > + > +fast_idiv_by_const_test_LDADD = \ > + $(top_builddir)/src/gtest/libgtest.la \ > + $(top_builddir)/src/util/libmesautil.la \ > + $(PTHREAD_LIBS) \ > + $(DLOPEN_LIBS) > + > +EXTRA_DIST = meson.build > diff --git a/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp > b/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp > new file mode 100644 > index 000..34b149e1c6f > --- /dev/null > +++ b/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp > @@ -0,0 +1,472 @@ > +/* > + * Copyright © 2018 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute,
Re: [Mesa-dev] [PATCH 05/12] util: Add power-of-two divisor support to compute_fast_udiv_info
On Sat, Oct 6, 2018 at 12:11 AM Jason Ekstrand wrote: > > From: Marek Olšák > > --- > src/util/fast_idiv_by_const.c | 21 + > src/util/fast_idiv_by_const.h | 5 +++-- > 2 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/src/util/fast_idiv_by_const.c b/src/util/fast_idiv_by_const.c > index 65a9e640789..7b93316268c 100644 > --- a/src/util/fast_idiv_by_const.c > +++ b/src/util/fast_idiv_by_const.c > @@ -52,6 +52,27 @@ util_compute_fast_udiv_info(uint64_t D, unsigned num_bits, > unsigned UINT_BITS) > /* The eventual result */ > struct util_fast_udiv_info result; > > + if (util_is_power_of_two_or_zero64(D)) { > + unsigned div_shift = util_logbase2_64(D); > + > + if (div_shift) { > + /* Dividing by a power of two. */ > + result.multiplier = 1ull << (UINT_BITS - div_shift); > + result.pre_shift = 0; > + result.post_shift = 0; > + result.increment = 0; > + return result; > + } else { > + /* Dividing by 1. */ > + /* Assuming: floor((num + 1) * (2^32 - 1) / 2^32) = num */ > + result.multiplier = UINT_BITS == 64 ? UINT64_MAX : > + (1ull << UINT_BITS) - 1; > + result.pre_shift = 0; > + result.post_shift = 0; > + result.increment = 1; > + return result; > + } > + } > > /* The extra shift implicit in the difference between UINT_BITS and > num_bits > */ > diff --git a/src/util/fast_idiv_by_const.h b/src/util/fast_idiv_by_const.h > index 231311f84be..3363fb9ee71 100644 > --- a/src/util/fast_idiv_by_const.h > +++ b/src/util/fast_idiv_by_const.h > @@ -98,8 +98,9 @@ util_compute_fast_sdiv_info(int64_t D, unsigned SINT_BITS); > * emit("result >>>= UINT_BITS") > * if m.post_shift > 0: emit("result >>>= m.post_shift") > * > - * The shifts by UINT_BITS may be "free" if the high half of the full > multiply > - * is put in a separate register. > + * This second version works even if D is a power of two. The shifts by I think you meant to say that the second version works even if D is 1. 
Marek
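The two special cases added by this patch can be checked with a few lines of plain C. This is a minimal sketch for UINT_BITS = 32 only: struct udiv_info, udiv_info_pow2, udiv_info_one and udiv_apply are hypothetical names chosen for the illustration, and the increment is applied in 64-bit arithmetic here, which is a simplification of my own and not necessarily how the code emitted from util_fast_udiv_info handles it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit-only subset of the fields set in the patch. */
struct udiv_info {
    uint64_t multiplier;
    unsigned increment;
};

/* Divisor D = 1 << div_shift, with 1 <= div_shift <= 31. */
static struct udiv_info udiv_info_pow2(unsigned div_shift)
{
    struct udiv_info info = { 1ull << (32 - div_shift), 0 };
    return info;
}

/* Divisor D = 1: multiplier 2^32 - 1 combined with an increment of 1. */
static struct udiv_info udiv_info_one(void)
{
    struct udiv_info info = { (1ull << 32) - 1, 1 };
    return info;
}

/* Add the increment (in 64 bits), multiply, keep the high 32 bits. */
static uint32_t udiv_apply(uint32_t n, struct udiv_info info)
{
    uint64_t wide = (uint64_t)n + info.increment;
    return (uint32_t)((wide * info.multiplier) >> 32);
}

/* Self-check: both paths must agree with the native division. */
static void udiv_self_check(void)
{
    assert(udiv_apply(100, udiv_info_pow2(3)) == 100 / 8);
    assert(udiv_apply(12345, udiv_info_one()) == 12345 / 1);
}
```

For the power-of-two case, n * 2^(32 - s) / 2^32 is exactly n >> s. For D = 1, (n + 1) * (2^32 - 1) = (n + 1) * 2^32 - (n + 1), and since 1 <= n + 1 <= 2^32 the floor after dividing by 2^32 is n, which matches the assumption stated in the patch comment.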
[Mesa-dev] [Bug 108135] AVX instructions leak outside of CPU feature check and cause SIGILL
https://bugs.freedesktop.org/show_bug.cgi?id=108135

--- Comment #7 from Thiago Macieira ---
You can scan for .o files that have initialisers by searching for .init_array sections.
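For readers unfamiliar with the term, here is a minimal sketch of the kind of initialiser the comment refers to. It uses the GCC/Clang constructor attribute, which is a compiler extension; on ELF targets the toolchain typically records such functions in .init_array. The names init_ran, early_init and initializer_has_run are hypothetical and the example is not taken from Mesa.

```c
#include <assert.h>

static int init_ran = 0;

/* Runs at load time, before main(). Nothing in the program has had a
 * chance to query CPU features yet when this executes. */
__attribute__((constructor))
static void early_init(void)
{
    init_ran = 1;
}

/* Reports whether the load-time initialiser has already executed. */
static int initializer_has_run(void)
{
    return init_ran;
}
```

If an object file containing such an initialiser were compiled with AVX enabled, the compiler would be free to use AVX instructions inside it, and they would execute regardless of any later runtime feature check. That is presumably why looking for .init_array sections is a useful way to narrow down the offending objects.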
Re: [Mesa-dev] [PATCH] intel: Introducing Whiskey Lake platform
On 05/10/2018 22:29, Rodrigo Vivi wrote: Whiskey Lake uses the same gen graphics as Coffe Lake, including some ids that were previously marked as reserved on Coffe Lake, but that now are moved to WHL page. This follows the ids and approach used on kernel's commit b9be78531d27 ("drm/i915/whl: Introducing Whiskey Lake platform") and commit c1c8f6fa731b ("drm/i915: Redefine some Whiskey Lake SKUs") v2: Lionel noticed that GT{1,2,3} on kernel wasn't following spec when looking to number of EUs, so kernel has been updated. Cc: Lionel Landwerlin Cc: José Roberto de Souza Cc: Anuj Phogat Signed-off-by: Rodrigo Vivi Looks good, thanks for following up on this : Reviewed-by: Lionel Landwerlin --- include/pci_ids/i965_pci_ids.h | 10 +- src/intel/compiler/test_eu_validate.cpp | 1 + src/intel/dev/gen_device_info.c | 1 + src/intel/tools/aubinator.c | 2 +- 4 files changed, 8 insertions(+), 6 deletions(-) diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h index 4efac638e9..cb33bea7d4 100644 --- a/include/pci_ids/i965_pci_ids.h +++ b/include/pci_ids/i965_pci_ids.h @@ -170,8 +170,6 @@ CHIPSET(0x3185, glk_2x6, "Intel(R) UHD Graphics 600 (Geminilake 2x6)") CHIPSET(0x3E90, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)") CHIPSET(0x3E93, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)") CHIPSET(0x3E99, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)") -CHIPSET(0x3EA1, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)") -CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)") CHIPSET(0x3E91, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)") CHIPSET(0x3E92, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)") CHIPSET(0x3E96, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)") @@ -179,14 +177,16 @@ CHIPSET(0x3E98, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)") CHIPSET(0x3E9A, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)") CHIPSET(0x3E9B, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 
GT2)") CHIPSET(0x3E94, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)") -CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)") -CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)") CHIPSET(0x3EA9, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)") -CHIPSET(0x3EA2, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)") CHIPSET(0x3EA5, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)") CHIPSET(0x3EA6, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)") CHIPSET(0x3EA7, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)") CHIPSET(0x3EA8, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)") +CHIPSET(0x3EA1, cfl_gt1, "Intel(R) HD Graphics (Whiskey Lake 2x6 GT1)") +CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT1)") +CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)") +CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)") +CHIPSET(0x3EA2, cfl_gt3, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT3)") CHIPSET(0x5A49, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)") CHIPSET(0x5A4A, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)") CHIPSET(0x5A41, cnl_3x8, "Intel(R) HD Graphics (Cannonlake 3x8 GT1)") diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp index 744ae5806d..73300b2312 100644 --- a/src/intel/compiler/test_eu_validate.cpp +++ b/src/intel/compiler/test_eu_validate.cpp @@ -43,6 +43,7 @@ static const struct gen_info { { "aml", }, { "glk", }, { "cfl", }, + { "whl", }, { "cnl", }, { "icl", }, }; diff --git a/src/intel/dev/gen_device_info.c b/src/intel/dev/gen_device_info.c index e2c6cbc710..5dbd060757 100644 --- a/src/intel/dev/gen_device_info.c +++ b/src/intel/dev/gen_device_info.c @@ -60,6 +60,7 @@ gen_device_name_to_pci_device_id(const char *name) { "aml", 0x591C }, { "glk", 0x3185 }, { "cfl", 0x3E9B }, + { "whl", 0x3EA1 }, { "cnl", 0x5a52 }, { "icl", 0x8a52 }, }; diff --git a/src/intel/tools/aubinator.c 
b/src/intel/tools/aubinator.c index ef0f7650b1..1458875a31 100644 --- a/src/intel/tools/aubinator.c +++ b/src/intel/tools/aubinator.c @@ -300,7 +300,7 @@ int main(int argc, char *argv[]) if (id < 0) { fprintf(stderr, "can't parse gen: '%s', expected brw, g4x, ilk, " "snb, ivb, hsw, byt, bdw, chv, skl, bxt, kbl, " -"aml, glk, cfl, cnl, icl", optarg); +"aml, glk, cfl, whl, cnl, icl", optarg); exit(EXIT_FAILURE); } else { pci_id = id; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
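The gen_device_info.c hunk above extends a name-to-PCI-ID table, and the aubinator hunk shows the caller treating a negative result as "unknown name". A minimal stand-alone sketch of that lookup follows; the table entries are the ones visible in the diff context, while struct name_to_id, name_map and lookup_pci_id are hypothetical names, not the identifiers used in Mesa.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct name_to_id {
    const char *name;
    int pci_id;
};

/* Entries copied from the diff context, including the new "whl" row. */
static const struct name_to_id name_map[] = {
    { "aml", 0x591C },
    { "glk", 0x3185 },
    { "cfl", 0x3E9B },
    { "whl", 0x3EA1 },
    { "cnl", 0x5a52 },
    { "icl", 0x8a52 },
};

/* Returns the PCI ID for a platform short name, or -1 if it is unknown. */
static int lookup_pci_id(const char *name)
{
    for (size_t i = 0; i < sizeof(name_map) / sizeof(name_map[0]); i++) {
        if (strcmp(name, name_map[i].name) == 0)
            return name_map[i].pci_id;
    }
    return -1;
}

/* Self-check: the new name resolves, an unknown one does not. */
static void lookup_self_check(void)
{
    assert(lookup_pci_id("whl") == 0x3EA1);
    assert(lookup_pci_id("nope") < 0);
}
```

Because Whiskey Lake reuses the cfl_gt1/gt2/gt3 device descriptions, only the short name and a representative ID are new here; the sketch does not model the device info itself.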
Re: [Mesa-dev] [PATCH v2] gallivm: Make it possible to disable some optimization shortcuts in release builds
On Friday, 05.10.2018, at 21:46 +, Roland Scheidegger wrote:
> Looks alright to me. I'm not quite sold on the "safemath" name
> though, since "safe math" is usually associated with floating point
> optimizations, and this here is just filtering hacks. Maybe
> something like disable_filtering_hacks would be more fitting?

I think I'll push a shorter version: "no_filter_hacks"

best,
Gert

> Reviewed-by: Roland Scheidegger
>
> On 10/05/2018 06:08 AM, Gert Wollny wrote:
> > From: Gert Wollny
> >
> > For testing it is of interest that all tests of dEQP pass, e.g. to test
> > virglrenderer on a host only providing software rendering like in a CI.
> > Hence make it possible to disable certain optimizations that make
> > tests fail.
> >
> > While we are there also add some documentation to the flags to make it
> > clear that this is opt-out.
> >
> > Setting the environment variable "GALLIVM_PERF=disable_all" can be used
> > to make the following tests pass in release mode:
> >
> >    dEQP-GLES2.functional.texture.mipmap.2d.affine.*_linear_*
> >    dEQP-GLES2.functional.texture.mipmap.cube.generate.*
> >    dEQP-GLES2.functional.texture.vertex.2d.filtering.*_mipmap_linear_*
> >    dEQP-GLES2.functional.texture.vertex.2d.wrap.*
> >
> > Related:
> >    https://bugs.freedesktop.org/show_bug.cgi?id=94957
> >
> > v2: rename optimization disabling flag to 'safemath' and also move the
> >     nopt flag to the perf flags.
> >
> > Signed-off-by: Gert Wollny
> > ---
> >  src/gallium/auxiliary/gallivm/lp_bld_debug.h      | 16 ---
> >  src/gallium/auxiliary/gallivm/lp_bld_init.c       | 25 +++
> >  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  6 +++---
> >  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c   |  6 +++---
> >  4 files changed, 32 insertions(+), 21 deletions(-)
> >
> > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_debug.h b/src/gallium/auxiliary/gallivm/lp_bld_debug.h
> > index f96a1afa7a..eeef0d6ba6 100644
> > --- a/src/gallium/auxiliary/gallivm/lp_bld_debug.h
> > +++ b/src/gallium/auxiliary/gallivm/lp_bld_debug.h
> > @@ -39,20 +39,22 @@
> >  #define GALLIVM_DEBUG_TGSI          (1 << 0)
> >  #define GALLIVM_DEBUG_IR            (1 << 1)
> >  #define GALLIVM_DEBUG_ASM           (1 << 2)
> > -#define GALLIVM_DEBUG_NO_OPT        (1 << 3)
> > -#define GALLIVM_DEBUG_PERF          (1 << 4)
> > -#define GALLIVM_DEBUG_NO_BRILINEAR  (1 << 5)
> > -#define GALLIVM_DEBUG_NO_RHO_APPROX (1 << 6)
> > -#define GALLIVM_DEBUG_NO_QUAD_LOD   (1 << 7)
> > -#define GALLIVM_DEBUG_GC            (1 << 8)
> > -#define GALLIVM_DEBUG_DUMP_BC       (1 << 9)
> > +#define GALLIVM_DEBUG_PERF          (1 << 3)
> > +#define GALLIVM_DEBUG_GC            (1 << 4)
> > +#define GALLIVM_DEBUG_DUMP_BC       (1 << 5)
> >
> > +#define GALLIVM_PERF_NO_BRILINEAR   (1 << 0)
> > +#define GALLIVM_PERF_NO_RHO_APPROX  (1 << 1)
> > +#define GALLIVM_PERF_NO_QUAD_LOD    (1 << 2)
> > +#define GALLIVM_PERF_NO_OPT         (1 << 3)
> >
> >  #ifdef __cplusplus
> >  extern "C" {
> >  #endif
> >
> >
> > +extern unsigned gallivm_perf;
> > +
> >  #ifdef DEBUG
> >  extern unsigned gallivm_debug;
> >  #else
> > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
> > index 1f0a01cde6..3f7c4d3154 100644
> > --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
> > +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
> > @@ -59,6 +59,17 @@ static const bool use_mcjit = USE_MCJIT;
> >  static bool use_mcjit = FALSE;
> >  #endif
> >
> > +unsigned gallivm_perf = 0;
> > +
> > +static const struct debug_named_value lp_bld_perf_flags[] = {
> > +   { "no_brilinear", GALLIVM_PERF_NO_BRILINEAR, "disable brilinear optimization" },
> > +   { "no_rho_approx", GALLIVM_PERF_NO_RHO_APPROX, "disable rho_approx optimization" },
> > +   { "no_quad_lod", GALLIVM_PERF_NO_QUAD_LOD, "disable quad_lod optimization" },
> > +   { "nopt", GALLIVM_PERF_NO_OPT, "disable optimization passes to speed up shader compilation" },
> > +   { "safemath", GALLIVM_PERF_NO_BRILINEAR | GALLIVM_PERF_NO_RHO_APPROX |
> > +     GALLIVM_PERF_NO_QUAD_LOD, "disable unsafe optimizations" },
> > +   DEBUG_NAMED_VALUE_END
> > +};
> >
> >  #ifdef DEBUG
> >  unsigned gallivm_debug = 0;
> > @@ -67,11 +78,7 @@ static const struct debug_named_value lp_bld_debug_flags[] = {
> >     { "tgsi", GALLIVM_DEBUG_TGSI, NULL },
> >     { "ir", GALLIVM_DEBUG_IR, NULL },
> >     { "asm", GALLIVM_DEBUG_ASM, NULL },
> > -   { "nopt", GALLIVM_DEBUG_NO_OPT, NULL },
> >     { "perf", GALLIVM_DEBUG_PERF, NULL },
> > -   { "no_brilinear",
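The quoted patch moves the performance switches into their own bitmask and adds "safemath" as a name that sets three bits at once. A minimal, standalone sketch of how such a named-flag table can be turned into a bitmask is given here. It does not use Mesa's own option-parsing helpers: struct named_flag, parse_perf_flags() and the PERF_* macros are hypothetical stand-ins that only mirror the names and bit positions in the quoted diff.

```c
#include <string.h>

/* Hypothetical stand-ins mirroring the bit layout in the quoted patch. */
#define PERF_NO_BRILINEAR  (1u << 0)
#define PERF_NO_RHO_APPROX (1u << 1)
#define PERF_NO_QUAD_LOD   (1u << 2)
#define PERF_NO_OPT        (1u << 3)

struct named_flag {
   const char *name;
   unsigned value;
};

static const struct named_flag perf_flags[] = {
   { "no_brilinear",  PERF_NO_BRILINEAR },
   { "no_rho_approx", PERF_NO_RHO_APPROX },
   { "no_quad_lod",   PERF_NO_QUAD_LOD },
   { "nopt",          PERF_NO_OPT },
   /* One name may expand to several bits. */
   { "safemath",      PERF_NO_BRILINEAR | PERF_NO_RHO_APPROX | PERF_NO_QUAD_LOD },
   { NULL, 0 }
};

/* Parse a comma-separated list of flag names into a bitmask.
 * Unknown names are silently ignored in this sketch. */
static unsigned
parse_perf_flags(const char *str)
{
   unsigned result = 0;

   while (str && *str) {
      size_t len = strcspn(str, ",");   /* length of the current token */

      for (const struct named_flag *f = perf_flags; f->name; f++) {
         if (strlen(f->name) == len && strncmp(f->name, str, len) == 0)
            result |= f->value;
      }

      str += len;
      if (*str == ',')
         str++;
   }
   return result;
}
```

With this table, parse_perf_flags("safemath") yields the same mask as "no_brilinear,no_rho_approx,no_quad_lod", which is the point of keeping the combined entry in the same table as the single-bit ones.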
Re: [Mesa-dev] [PATCH:mesa] util: Make xmlconfig.c build on Solaris without d_type in dirent (v2)
On Sat, 6 Oct 2018, 01:34 Alan Coopersmith wrote:
> v2: check for lstat() failing
>
> Fixes: 04bdbbcab3c "xmlconfig: read more config files from drirc.d/"
> Signed-off-by: Alan Coopersmith
> Reviewed-by: Roland Mainz
> Reviewed-by: Ian Romanick
> ---
>  src/util/xmlconfig.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/util/xmlconfig.c b/src/util/xmlconfig.c
> index 5264f2598b..5907c68012 100644
> --- a/src/util/xmlconfig.c
> +++ b/src/util/xmlconfig.c
> @@ -938,8 +938,16 @@ parseOneConfigFile(struct OptConfData *data, const char *filename)
>  static int
>  scandir_filter(const struct dirent *ent)
>  {
> +#ifndef DT_REG /* systems without d_type in dirent results */
> +   struct stat st;
> +
> +   if ((lstat(ent->d_name, &st) != 0) ||
> +       (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)))
> +      return 0;
> +#else
>     if (ent->d_type != DT_REG && ent->d_type != DT_LNK)
>        return 0;
> +#endif
>
>     if (fnmatch("*.conf", ent->d_name, 0))
>        return 0;
>

Reviewed by roland.ma...@nrubsig.org

Bye,
Roland
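For readers who want to try the d_type fallback outside of xmlconfig.c, a small sketch follows. It is not the code from the patch: is_conf_file() is a hypothetical helper that takes the directory as an extra argument, which a real scandir() filter cannot do since it only receives the dirent. The extra argument is there because lstat() on a bare d_name is resolved relative to the current working directory, so this sketch builds the full path itself; the 4096-byte buffer is likewise an arbitrary assumption.

```c
#include <dirent.h>
#include <fnmatch.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if the entry looks like a regular file or
 * symlink named *.conf inside `dir`, 0 otherwise. */
static int
is_conf_file(const char *dir, const struct dirent *ent)
{
#ifndef DT_REG /* no d_type in struct dirent: ask the filesystem instead */
   char path[4096];
   struct stat st;
   int n = snprintf(path, sizeof(path), "%s/%s", dir, ent->d_name);

   /* Reject entries whose full path does not fit in the buffer. */
   if (n < 0 || (size_t)n >= sizeof(path))
      return 0;
   if (lstat(path, &st) != 0 ||
       (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)))
      return 0;
#else
   (void)dir; /* the type is already in the dirent */
   if (ent->d_type != DT_REG && ent->d_type != DT_LNK)
      return 0;
#endif

   /* fnmatch() returns 0 on a match. */
   return fnmatch("*.conf", ent->d_name, 0) == 0;
}
```

A caller iterating with readdir() can pass the directory it opened, which keeps the check independent of the process's working directory.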
Re: [Mesa-dev] [PATCH 7/9] i965: Pack simple pipelined query objects into the same buffer
Quoting Kenneth Graunke (2018-10-06 02:57:29)
> On Tuesday, October 2, 2018 11:06:23 AM PDT Chris Wilson wrote:
> > Reuse the same query object buffer for multiple queries within the same
> > batch.
> >
> > A task for the future is propagating the GL_NO_MEMORY errors.
> >
> > Signed-off-by: Chris Wilson
> > Cc: Kenneth Graunke
> > Cc: Matt Turner
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c   |  3 ++
> >  src/mesa/drivers/dri/i965/brw_context.h   | 10 +++--
> >  src/mesa/drivers/dri/i965/brw_queryobj.c  | 16 +++
> >  src/mesa/drivers/dri/i965/gen6_queryobj.c | 51 ++-
> >  4 files changed, 59 insertions(+), 21 deletions(-)
>
> Don't want to do this. This means that WaitQuery will wait on the whole
> group of packed queries instead of just the one the app asked about.
>
> Vulkan has to pack queries by design, and it turns out this was a real
> issue there. See b2c97bc789198427043cd902bc76e194e7e81c7d. Jason ended
> up making it busy-wait to avoid bo_waiting on the entire pool, and it
> improved Serious Engine performance by 20%.
>
> We could busy-wait in GL too, for lower latency but more CPU waste,
> but I think I prefer separate BOs + bo_wait.

It's the same latency for wait, as a new BO is used for each batch, and
waits are on the batch fence, not the individual write into the query
batch. (So no change whatsoever for waits from the current code.)

Polling is improved, as there we can check the individual fence.

Pipelined is unaffected.

Now we could keep the query buffer across multiple batches and use
fences (userspace seqno + batch handles) to poll or wait on the partial
writes.
-Chris
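The distinction drawn in this exchange, between waiting on a whole batch and polling a single query's slot, can be pictured with a toy model. This is only a sketch under stated assumptions: struct query_buffer, query_poll() and the other names below are hypothetical, the booleans stand in for the batch fence and the per-query writes, and nothing here reflects the actual i965 data structures.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUERIES_PER_BUFFER 4 /* arbitrary size for the sketch */

/* One buffer shared by all queries issued within the same batch. */
struct query_buffer {
   bool batch_complete;                    /* stands in for the batch fence */
   bool slot_written[QUERIES_PER_BUFFER];  /* stands in for per-query writes */
   uint64_t results[QUERIES_PER_BUFFER];
};

struct query {
   struct query_buffer *buf;
   unsigned slot;
};

/* Non-blocking check: looks only at this query's own slot. */
static bool
query_poll(const struct query *q)
{
   return q->buf->slot_written[q->slot];
}

/* A blocking wait is on the batch as a whole, so it would stall until the
 * entire batch has finished, regardless of which slot was asked about. */
static bool
query_wait_would_block(const struct query *q)
{
   return !q->buf->batch_complete;
}

/* Stand-ins for the GPU making progress. */
static void
gpu_write_result(struct query_buffer *buf, unsigned slot, uint64_t value)
{
   buf->results[slot] = value;
   buf->slot_written[slot] = true;
}

static void
gpu_finish_batch(struct query_buffer *buf)
{
   buf->batch_complete = true;
}
```

In this model a poll on the first query succeeds as soon as its slot is written, while a wait on the same query still depends on the batch finishing, which is the behaviour both sides of the thread are describing from different angles.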
[Mesa-dev] [Bug 108135] AVX instructions leak outside of CPU feature check and cause SIGILL
https://bugs.freedesktop.org/show_bug.cgi?id=108135

--- Comment #6 from Kenneth Graunke ---
This does point out an even more serious issue though: Megadrivers interacts
very badly with drivers that have global C++ initializers.

When dlopen'ing a DRI driver, C++ global constructors will run for ANY driver
that is built in, even if that isn't the driver the user is trying to use.

In other words, let's say you're on a Raspberry Pi and are trying to use the
vc4 driver. If your distro happens to have built SWR, then you'll end up
running a bunch of SWR C++ constructors...when trying to dlopen
vc4_dri.so...because both live in the same .so file.

Such global initialization really ought to happen at driver screen init time,
not dlopen time. IMO we should forbid C++ global objects in Mesa drivers. (We
have a few in the compiler, but everybody uses the compiler.)

I don't know how feasible that is, though. If it isn't, maybe we need to
exclude such drivers from the megadrivers mechanism...
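The load-time versus screen-init-time difference can be reproduced in a few lines of C. The sketch below uses the GCC/Clang __attribute__((constructor)) extension as an analogue of a C++ global constructor, since both run when the shared object is loaded. All names (swr_global_init, vc4_screen_create, struct screen) are hypothetical labels borrowed from the example in the comment, not real Mesa entry points, and the lazy path is not thread-safe as written.

```c
#include <stdbool.h>

static int swr_global_init_runs; /* how often the eager init ran */
static int vc4_lazy_init_runs;   /* how often the lazy init ran */

/* GCC/Clang extension: runs when the object is loaded (before main(), or
 * during dlopen() for a shared library), whether or not this "driver" is
 * ever used. This mimics a C++ global constructor. */
__attribute__((constructor))
static void
swr_global_init(void)
{
   swr_global_init_runs++;
}

struct screen {
   const char *driver;
};

/* The alternative argued for in the comment: do the one-time setup when a
 * screen is first created, so unused drivers cost nothing at load time. */
static struct screen *
vc4_screen_create(void)
{
   static struct screen screen = { "vc4" };
   static bool initialized;

   if (!initialized) {
      vc4_lazy_init_runs++;
      initialized = true;
   }
   return &screen;
}
```

By the time any caller runs, the eager counter has already been bumped once, while the lazy counter stays at zero until vc4_screen_create() is first called. In a multi-threaded loader the lazy path would need something like pthread_once() in place of the plain boolean.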