[Mesa-dev] two radv regressions with latest addrlib import
Just documenting these here, I'm out for a few days so might not get time to fix them: I've sent a patch for the first. 1. Small 1D mipmapped textures tests all failed after Fixes: 36149998 amdgpu/addrlib: Rewrite tile mode optimization code This looks like the 1D base level gets reduced to a linear aligned, and things go wrong after that, the fix stops it. 2. S8_UINT is broken Since it looks like radeonsi never uses s8_uint amdgpu/addrlib: Add new flags minimizePadding and maxBaseAlign is what breaks it, it appears to be validating HwlGetAlignmentInfoMacroTiled that is failing as we loop through all the non-stencil levels, and we get a return and stop before calculating the image size fully. Even hacking that doesn't make it instantly fix it, but I'll try and track it down a bit more. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation
On 04/04/17 13:54, Bartosz Tomczyk wrote: Thank you Timothy. Sorry about that, I'm still quite to new git/Mesa workflow. I will do better in future. Not a problem :) Thanks for the patches. On Apr 4, 2017 01:54, "Timothy Arceri"> wrote: I've pushed this. Thanks! In future please add the version to the subject when sending new revisions. You can do this with the -v option when using git send-email e.g -v2 will result in [PATCH v2] Also please add changes to the patch since v1 in the commit message. e.g. V2: set batch->used = 0 rather than memsetting the batch to 0. This helps those reviewing to quickly compare the updates since the first revision. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation
Thank you Timothy. Sorry about that, I'm still quite to new git/Mesa workflow. I will do better in future. On Apr 4, 2017 01:54, "Timothy Arceri"wrote: I've pushed this. Thanks! In future please add the version to the subject when sending new revisions. You can do this with the -v option when using git send-email e.g -v2 will result in [PATCH v2] Also please add changes to the patch since v1 in the commit message. e.g. V2: set batch->used = 0 rather than memsetting the batch to 0. This helps those reviewing to quickly compare the updates since the first revision. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] addrlib: don't use linear aligned when pow2Pad is selected.
From: Dave AirlieThe below commit caused a regression on radv with 1D textures, this was because the base level for these small textures was being degraded to a linear level due to this code. This stops the degradation to linear when the pow2Pad bit it set (this is set when there are miplevels). I'm not sure if this is the correct fix, it works at least, maybe there is still a bug in radv somewhere. Fixes: 36149998 amdgpu/addrlib: Rewrite tile mode optmization code Signed-off-by: Dave Airlie --- src/amd/addrlib/core/addrlib1.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/amd/addrlib/core/addrlib1.cpp b/src/amd/addrlib/core/addrlib1.cpp index 2d640cf..9f783a5 100644 --- a/src/amd/addrlib/core/addrlib1.cpp +++ b/src/amd/addrlib/core/addrlib1.cpp @@ -3616,6 +3616,7 @@ VOID Lib::OptimizeTileMode( (ElemLib::IsBlockCompressed(pInOut->format) == FALSE) && (pInOut->flags.depth == FALSE) && (pInOut->flags.stencil == FALSE) && +(pInOut->flags.pow2Pad == FALSE) && (m_configFlags.disableLinearOpt == FALSE) && (pInOut->flags.disableLinearOpt == FALSE)) { -- 2.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/19] tgsi: add BALLOT/READ_* opcodes
On Mon, Apr 3, 2017 at 10:01 PM, Boyan Dingwrote: > 2017-04-01 1:14 GMT+08:00 Nicolai Hähnle : >> From: Ilia Mirkin >> >> v2 (Nicolai): >> - BALLOT isn't per-channel >> - expand the documentation (also for VOTE_*) >> >> Signed-off-by: Ilia Mirkin >> Signed-off-by: Nicolai Hähnle >> --- >> src/gallium/auxiliary/tgsi/tgsi_info.c | 6 +-- >> src/gallium/docs/source/tgsi.rst | 67 >> +- >> src/gallium/include/pipe/p_shader_tokens.h | 6 +-- >> 3 files changed, 62 insertions(+), 17 deletions(-) >> >> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c >> b/src/gallium/auxiliary/tgsi/tgsi_info.c >> index 5a6a9bc..30bad6d 100644 >> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c >> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c >> @@ -106,51 +106,51 @@ static const struct tgsi_opcode_info >> opcode_info[TGSI_OPCODE_LAST] = >> { 1, 3, 0, 0, 0, 0, 0, COMP, "CMP", TGSI_OPCODE_CMP }, >> { 1, 1, 0, 0, 0, 0, 0, CHAN, "SCS", TGSI_OPCODE_SCS }, >> { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXB", TGSI_OPCODE_TXB }, >> { 1, 1, 0, 0, 0, 0, 0, OTHR, "FBFETCH", TGSI_OPCODE_FBFETCH }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "DIV", TGSI_OPCODE_DIV }, >> { 1, 2, 0, 0, 0, 0, 0, REPL, "DP2", TGSI_OPCODE_DP2 }, >> { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXL", TGSI_OPCODE_TXL }, >> { 0, 0, 0, 0, 0, 0, 0, NONE, "BRK", TGSI_OPCODE_BRK }, >> { 0, 1, 0, 0, 1, 0, 1, NONE, "IF", TGSI_OPCODE_IF }, >> { 0, 1, 0, 0, 1, 0, 1, NONE, "UIF", TGSI_OPCODE_UIF }, >> - { 0, 1, 0, 0, 0, 0, 1, NONE, "", 76 }, /* removed */ >> + { 1, 2, 0, 0, 0, 0, 0, COMP, "READ_INVOC", TGSI_OPCODE_READ_INVOC }, >> { 0, 0, 0, 0, 1, 1, 1, NONE, "ELSE", TGSI_OPCODE_ELSE }, >> { 0, 0, 0, 0, 0, 1, 0, NONE, "ENDIF", TGSI_OPCODE_ENDIF }, >> { 1, 1, 0, 0, 0, 0, 0, COMP, "DDX_FINE", TGSI_OPCODE_DDX_FINE }, >> { 1, 1, 0, 0, 0, 0, 0, COMP, "DDY_FINE", TGSI_OPCODE_DDY_FINE }, >> { 0, 1, 0, 0, 0, 0, 0, NONE, "PUSHA", TGSI_OPCODE_PUSHA }, >> { 1, 0, 0, 0, 0, 0, 0, NONE, "POPA", TGSI_OPCODE_POPA }, >> { 1, 1, 0, 0, 0, 0, 0, COMP, "CEIL", TGSI_OPCODE_CEIL }, >> { 1, 1, 0, 0, 0, 0, 0, COMP, "I2F", TGSI_OPCODE_I2F }, >> { 1, 1, 0, 0, 0, 0, 0, COMP, "NOT", TGSI_OPCODE_NOT }, >> { 1, 1, 0, 0, 0, 0, 0, COMP, "TRUNC", TGSI_OPCODE_TRUNC }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "SHL", TGSI_OPCODE_SHL }, >> - { 0, 0, 0, 0, 0, 0, 0, NONE, "", 88 }, /* removed */ >> + { 1, 1, 0, 0, 0, 0, 0, OTHR, "BALLOT", TGSI_OPCODE_BALLOT }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "AND", TGSI_OPCODE_AND }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "OR", TGSI_OPCODE_OR }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "MOD", TGSI_OPCODE_MOD }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "XOR", TGSI_OPCODE_XOR }, >> { 1, 3, 0, 0, 0, 0, 0, COMP, "SAD", TGSI_OPCODE_SAD }, >> { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXF", TGSI_OPCODE_TXF }, >> { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXQ", TGSI_OPCODE_TXQ }, >> { 0, 0, 0, 0, 0, 0, 0, NONE, "CONT", TGSI_OPCODE_CONT }, >> { 0, 1, 0, 0, 0, 0, 0, NONE, "EMIT", TGSI_OPCODE_EMIT }, >> { 0, 1, 0, 0, 0, 0, 0, NONE, "ENDPRIM", TGSI_OPCODE_ENDPRIM }, >> { 0, 0, 0, 0, 1, 0, 1, NONE, "BGNLOOP", TGSI_OPCODE_BGNLOOP }, >> { 0, 0, 0, 0, 0, 0, 1, NONE, "BGNSUB", TGSI_OPCODE_BGNSUB }, >> { 0, 0, 0, 0, 1, 1, 0, NONE, "ENDLOOP", TGSI_OPCODE_ENDLOOP }, >> { 0, 0, 0, 0, 0, 1, 0, NONE, "ENDSUB", TGSI_OPCODE_ENDSUB }, >> { 1, 1, 1, 0, 0, 0, 0, OTHR, "TXQ_LZ", TGSI_OPCODE_TXQ_LZ }, >> { 1, 1, 1, 0, 0, 0, 0, OTHR, "TXQS", TGSI_OPCODE_TXQS }, >> { 1, 1, 0, 0, 0, 0, 0, OTHR, "RESQ", TGSI_OPCODE_RESQ }, >> - { 0, 0, 0, 0, 0, 0, 0, NONE, "", 106 }, /* removed */ >> + { 1, 1, 0, 0, 0, 0, 0, COMP, "READ_FIRST", TGSI_OPCODE_READ_FIRST }, >> { 0, 0, 0, 0, 0, 0, 0, NONE, "NOP", TGSI_OPCODE_NOP }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "FSEQ", TGSI_OPCODE_FSEQ }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "FSGE", TGSI_OPCODE_FSGE }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "FSLT", TGSI_OPCODE_FSLT }, >> { 1, 2, 0, 0, 0, 0, 0, COMP, "FSNE", TGSI_OPCODE_FSNE }, >> { 0, 1, 0, 0, 0, 0, 0, OTHR, "MEMBAR", TGSI_OPCODE_MEMBAR }, >> { 0, 1, 0, 0, 0, 0, 0, NONE, "CALLNZ", TGSI_OPCODE_CALLNZ }, >> { 0, 1, 0, 0, 0, 0, 0, NONE, "", 114 }, /* removed */ >> { 0, 1, 0, 0, 0, 0, 0, NONE, "BREAKC", TGSI_OPCODE_BREAKC }, >> { 0, 1, 0, 0, 0, 0, 0, NONE, "KILL_IF", TGSI_OPCODE_KILL_IF }, >> diff --git a/src/gallium/docs/source/tgsi.rst >> b/src/gallium/docs/source/tgsi.rst >> index 05b06ce..7e9b47c 100644 >> --- a/src/gallium/docs/source/tgsi.rst >> +++ b/src/gallium/docs/source/tgsi.rst >> @@ -2852,36 +2852,81 @@ only be used with 32-bit integer image formats. >> >>The following operation is performed atomically: >> >> .. math:: >> >>dst_x = resource[offset] >> >>resource[offset] = (dst_x > src_x ? dst_x : src_x) >> >> >> -.. _voteopcodes: >> +..
Re: [Mesa-dev] [PATCH 07/19] tgsi: add BALLOT/READ_* opcodes
2017-04-01 1:14 GMT+08:00 Nicolai Hähnle: > From: Ilia Mirkin > > v2 (Nicolai): > - BALLOT isn't per-channel > - expand the documentation (also for VOTE_*) > > Signed-off-by: Ilia Mirkin > Signed-off-by: Nicolai Hähnle > --- > src/gallium/auxiliary/tgsi/tgsi_info.c | 6 +-- > src/gallium/docs/source/tgsi.rst | 67 > +- > src/gallium/include/pipe/p_shader_tokens.h | 6 +-- > 3 files changed, 62 insertions(+), 17 deletions(-) > > diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c > b/src/gallium/auxiliary/tgsi/tgsi_info.c > index 5a6a9bc..30bad6d 100644 > --- a/src/gallium/auxiliary/tgsi/tgsi_info.c > +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c > @@ -106,51 +106,51 @@ static const struct tgsi_opcode_info > opcode_info[TGSI_OPCODE_LAST] = > { 1, 3, 0, 0, 0, 0, 0, COMP, "CMP", TGSI_OPCODE_CMP }, > { 1, 1, 0, 0, 0, 0, 0, CHAN, "SCS", TGSI_OPCODE_SCS }, > { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXB", TGSI_OPCODE_TXB }, > { 1, 1, 0, 0, 0, 0, 0, OTHR, "FBFETCH", TGSI_OPCODE_FBFETCH }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "DIV", TGSI_OPCODE_DIV }, > { 1, 2, 0, 0, 0, 0, 0, REPL, "DP2", TGSI_OPCODE_DP2 }, > { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXL", TGSI_OPCODE_TXL }, > { 0, 0, 0, 0, 0, 0, 0, NONE, "BRK", TGSI_OPCODE_BRK }, > { 0, 1, 0, 0, 1, 0, 1, NONE, "IF", TGSI_OPCODE_IF }, > { 0, 1, 0, 0, 1, 0, 1, NONE, "UIF", TGSI_OPCODE_UIF }, > - { 0, 1, 0, 0, 0, 0, 1, NONE, "", 76 }, /* removed */ > + { 1, 2, 0, 0, 0, 0, 0, COMP, "READ_INVOC", TGSI_OPCODE_READ_INVOC }, > { 0, 0, 0, 0, 1, 1, 1, NONE, "ELSE", TGSI_OPCODE_ELSE }, > { 0, 0, 0, 0, 0, 1, 0, NONE, "ENDIF", TGSI_OPCODE_ENDIF }, > { 1, 1, 0, 0, 0, 0, 0, COMP, "DDX_FINE", TGSI_OPCODE_DDX_FINE }, > { 1, 1, 0, 0, 0, 0, 0, COMP, "DDY_FINE", TGSI_OPCODE_DDY_FINE }, > { 0, 1, 0, 0, 0, 0, 0, NONE, "PUSHA", TGSI_OPCODE_PUSHA }, > { 1, 0, 0, 0, 0, 0, 0, NONE, "POPA", TGSI_OPCODE_POPA }, > { 1, 1, 0, 0, 0, 0, 0, COMP, "CEIL", TGSI_OPCODE_CEIL }, > { 1, 1, 0, 0, 0, 0, 0, COMP, "I2F", TGSI_OPCODE_I2F }, > { 1, 1, 0, 0, 0, 0, 0, COMP, "NOT", TGSI_OPCODE_NOT }, > { 1, 1, 0, 0, 0, 0, 0, COMP, "TRUNC", TGSI_OPCODE_TRUNC }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "SHL", TGSI_OPCODE_SHL }, > - { 0, 0, 0, 0, 0, 0, 0, NONE, "", 88 }, /* removed */ > + { 1, 1, 0, 0, 0, 0, 0, OTHR, "BALLOT", TGSI_OPCODE_BALLOT }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "AND", TGSI_OPCODE_AND }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "OR", TGSI_OPCODE_OR }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "MOD", TGSI_OPCODE_MOD }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "XOR", TGSI_OPCODE_XOR }, > { 1, 3, 0, 0, 0, 0, 0, COMP, "SAD", TGSI_OPCODE_SAD }, > { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXF", TGSI_OPCODE_TXF }, > { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXQ", TGSI_OPCODE_TXQ }, > { 0, 0, 0, 0, 0, 0, 0, NONE, "CONT", TGSI_OPCODE_CONT }, > { 0, 1, 0, 0, 0, 0, 0, NONE, "EMIT", TGSI_OPCODE_EMIT }, > { 0, 1, 0, 0, 0, 0, 0, NONE, "ENDPRIM", TGSI_OPCODE_ENDPRIM }, > { 0, 0, 0, 0, 1, 0, 1, NONE, "BGNLOOP", TGSI_OPCODE_BGNLOOP }, > { 0, 0, 0, 0, 0, 0, 1, NONE, "BGNSUB", TGSI_OPCODE_BGNSUB }, > { 0, 0, 0, 0, 1, 1, 0, NONE, "ENDLOOP", TGSI_OPCODE_ENDLOOP }, > { 0, 0, 0, 0, 0, 1, 0, NONE, "ENDSUB", TGSI_OPCODE_ENDSUB }, > { 1, 1, 1, 0, 0, 0, 0, OTHR, "TXQ_LZ", TGSI_OPCODE_TXQ_LZ }, > { 1, 1, 1, 0, 0, 0, 0, OTHR, "TXQS", TGSI_OPCODE_TXQS }, > { 1, 1, 0, 0, 0, 0, 0, OTHR, "RESQ", TGSI_OPCODE_RESQ }, > - { 0, 0, 0, 0, 0, 0, 0, NONE, "", 106 }, /* removed */ > + { 1, 1, 0, 0, 0, 0, 0, COMP, "READ_FIRST", TGSI_OPCODE_READ_FIRST }, > { 0, 0, 0, 0, 0, 0, 0, NONE, "NOP", TGSI_OPCODE_NOP }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "FSEQ", TGSI_OPCODE_FSEQ }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "FSGE", TGSI_OPCODE_FSGE }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "FSLT", TGSI_OPCODE_FSLT }, > { 1, 2, 0, 0, 0, 0, 0, COMP, "FSNE", TGSI_OPCODE_FSNE }, > { 0, 1, 0, 0, 0, 0, 0, OTHR, "MEMBAR", TGSI_OPCODE_MEMBAR }, > { 0, 1, 0, 0, 0, 0, 0, NONE, "CALLNZ", TGSI_OPCODE_CALLNZ }, > { 0, 1, 0, 0, 0, 0, 0, NONE, "", 114 }, /* removed */ > { 0, 1, 0, 0, 0, 0, 0, NONE, "BREAKC", TGSI_OPCODE_BREAKC }, > { 0, 1, 0, 0, 0, 0, 0, NONE, "KILL_IF", TGSI_OPCODE_KILL_IF }, > diff --git a/src/gallium/docs/source/tgsi.rst > b/src/gallium/docs/source/tgsi.rst > index 05b06ce..7e9b47c 100644 > --- a/src/gallium/docs/source/tgsi.rst > +++ b/src/gallium/docs/source/tgsi.rst > @@ -2852,36 +2852,81 @@ only be used with 32-bit integer image formats. > >The following operation is performed atomically: > > .. math:: > >dst_x = resource[offset] > >resource[offset] = (dst_x > src_x ? dst_x : src_x) > > > -.. _voteopcodes: > +.. _interlaneopcodes: > + > +Inter-lane opcodes > +^^ > + > +These opcodes reduce the given value across the shader invocations > +running in the current SIMD group.
Re: [Mesa-dev] [PATCH v2 10/18] anv: Move queues, events, and semaphores to their own file
On Mon 13 Mar 2017, Jason Ekstrand wrote: > Things are about to get more complicated, especially as far as > semaphores are concerned. > --- > src/intel/Makefile.sources| 1 + > src/intel/vulkan/Makefile.sources | 86 +++ > src/intel/vulkan/anv_device.c | 440 --- > src/intel/vulkan/anv_queue.c | 470 > ++ > 4 files changed, 557 insertions(+), 440 deletions(-) > create mode 100644 src/intel/vulkan/Makefile.sources > create mode 100644 src/intel/vulkan/anv_queue.c > > diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources > index 1337574..4382d79 100644 > --- a/src/intel/Makefile.sources > +++ b/src/intel/Makefile.sources > @@ -184,6 +184,7 @@ VULKAN_FILES := \ > vulkan/anv_pipeline.c \ > vulkan/anv_pipeline_cache.c \ > vulkan/anv_private.h \ > + vulkan/anv_queue.c \ > vulkan/anv_util.c \ > vulkan/anv_wsi.c \ > vulkan/vk_format_info.h > diff --git a/src/intel/vulkan/Makefile.sources > b/src/intel/vulkan/Makefile.sources > new file mode 100644 > index 000..aa243a1 > --- /dev/null > +++ b/src/intel/vulkan/Makefile.sources > @@ -0,0 +1,86 @@ > +# Copyright © 2015 Intel Corporation > +# > +# Permission is hereby granted, free of charge, to any person obtaining a > +# copy of this software and associated documentation files (the "Software"), > +# to deal in the Software without restriction, including without limitation > +# the rights to use, copy, modify, merge, publish, distribute, sublicense, > +# and/or sell copies of the Software, and to permit persons to whom the > +# Software is furnished to do so, subject to the following conditions: > +# > +# The above copyright notice and this permission notice (including the next > +# paragraph) shall be included in all copies or substantial portions of the > +# Software. > +# > +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > +# IN THE SOFTWARE. Oops. This patch adds an extra Makefile.sources. Other than the dup'd file, patch 10 is Reviewed-by: Chad Versace___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 09/18] anv: Implement VK_KHX_external_memory_fd
On Mon 13 Mar 2017, Jason Ekstrand wrote: > v2 (chadv): > - Rebase. > - Fix vkGetPhysicalDeviceImageFormatProperties2KHR when > handleType == 0. > - Move handleType-independency comments out of handleType-switch, in > vkGetPhysicalDeviceExternalBufferPropertiesKHX. Reduces diff in > future dma_buf patches. > > Co-authored-with: Chad Versace> --- > src/intel/vulkan/anv_device.c | 71 > - > src/intel/vulkan/anv_entrypoints_gen.py | 1 + > src/intel/vulkan/anv_formats.c | 59 +++ > 3 files changed, 113 insertions(+), 18 deletions(-) Patch 9 is Reviewed-by: Chad Versace ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache
On Mon, Apr 3, 2017 at 5:19 PM, Chad Versacewrote: > On Wed 15 Mar 2017, Jason Ekstrand wrote: > > This cache allows us to easily ensure that we have a unique anv_bo for > > each gem handle. We'll need this in order to support multiple-import of > > memory objects and semaphores. > > > > v2 (Jason Ekstrand): > > - Reject BO imports if the size doesn't match the prime fd size as > >reported by lseek(). > > > > v3 (Jason Ekstrand): > > - Fix reference counting around cache_release (Chris Willson) > > - Move the mutex_unlock() later in cache_release > > --- > > src/intel/vulkan/anv_allocator.c | 261 ++ > + > > src/intel/vulkan/anv_private.h | 26 > > 2 files changed, 287 insertions(+) > > > > > +VkResult anv_bo_cache_alloc(struct anv_device *device, > > +struct anv_bo_cache *cache, > > +uint64_t size, struct anv_bo **bo, > > +VkAllocationCallbacks *alloc); > > > +VkResult anv_bo_cache_import(struct anv_device *device, > > + struct anv_bo_cache *cache, > > + int fd, uint64_t size, struct anv_bo **bo, > > + VkAllocationCallbacks *alloc); > > > +void anv_bo_cache_release(struct anv_device *device, > > + struct anv_bo_cache *cache, > > + struct anv_bo *bo, > > + VkAllocationCallbacks *alloc); > > + > > The app may do this: > > // fd1 != fd2 > // fd1 and fd2 wrap the same gem handle > > mem1 = vkAllocateMemory(import fd1, pAllocator=alloc1); // anv_cached_bo > is allocated here > mem2 = vkAllocateMemory(import fd2, pAllocator=alloc2); > > ... > > vkFreeMemory(mem1, alloc1); > vkFreeMemory(mem2, alloc2); // anv_cached_bo is freed here > > So we can't use either pAllocator to allocate/free the anv_cached_bo, > unless anv_cached_bo also cached its original allocator. > If we don't cache the allocator (and please, let's not), then the > 'alloc' params to anv_bo_cache_alloc/import/release are dead params. > We must use an allocator that is not bound > to the anv_cached_bo, such as device->alloc. > > > Patch 8/18 correctly passes device->alloc into these functions. But, > since the 'alloc' params aren't really used, they should be dropped to > prevent accidental misuse. > I'm not quite sure what you're asking for here. Do you want me to just use device->alloc and drop the parameter? I'm happy to do that. --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache
On Wed 15 Mar 2017, Jason Ekstrand wrote: > This cache allows us to easily ensure that we have a unique anv_bo for > each gem handle. We'll need this in order to support multiple-import of > memory objects and semaphores. > > v2 (Jason Ekstrand): > - Reject BO imports if the size doesn't match the prime fd size as >reported by lseek(). > > v3 (Jason Ekstrand): > - Fix reference counting around cache_release (Chris Willson) > - Move the mutex_unlock() later in cache_release > --- > src/intel/vulkan/anv_allocator.c | 261 > +++ > src/intel/vulkan/anv_private.h | 26 > 2 files changed, 287 insertions(+) > +VkResult anv_bo_cache_alloc(struct anv_device *device, > +struct anv_bo_cache *cache, > +uint64_t size, struct anv_bo **bo, > +VkAllocationCallbacks *alloc); > +VkResult anv_bo_cache_import(struct anv_device *device, > + struct anv_bo_cache *cache, > + int fd, uint64_t size, struct anv_bo **bo, > + VkAllocationCallbacks *alloc); > +void anv_bo_cache_release(struct anv_device *device, > + struct anv_bo_cache *cache, > + struct anv_bo *bo, > + VkAllocationCallbacks *alloc); > + The app may do this: // fd1 != fd2 // fd1 and fd2 wrap the same gem handle mem1 = vkAllocateMemory(import fd1, pAllocator=alloc1); // anv_cached_bo is allocated here mem2 = vkAllocateMemory(import fd2, pAllocator=alloc2); ... vkFreeMemory(mem1, alloc1); vkFreeMemory(mem2, alloc2); // anv_cached_bo is freed here So we can't use either pAllocator to allocate/free the anv_cached_bo, unless anv_cached_bo also cached its original allocator. If we don't cache the allocator (and please, let's not), then the 'alloc' params to anv_bo_cache_alloc/import/release are dead params. We must use an allocator that is not bound to the anv_cached_bo, such as device->alloc. Patch 8/18 correctly passes device->alloc into these functions. But, since the 'alloc' params aren't really used, they should be dropped to prevent accidental misuse. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 08/18] anv: Use the BO cache for DeviceMemory allocations
On Mon 13 Mar 2017, Jason Ekstrand wrote: > --- > src/intel/vulkan/anv_device.c | 27 --- > src/intel/vulkan/anv_image.c | 2 +- > src/intel/vulkan/anv_intel.c | 14 ++ > src/intel/vulkan/anv_private.h | 4 +++- > src/intel/vulkan/anv_wsi.c | 6 +++--- > 5 files changed, 29 insertions(+), 24 deletions(-) Patch 8 is Reviewed-by: Chad Versace___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] intel/vec4: Add some fall through comments
On Mon, Apr 3, 2017 at 4:43 PM, Ilia Mirkinwrote: > On Mon, Apr 3, 2017 at 7:24 PM, Jason Ekstrand > wrote: > > --- > > src/intel/compiler/brw_vec4_nir.cpp | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/src/intel/compiler/brw_vec4_nir.cpp > b/src/intel/compiler/brw_vec4_nir.cpp > > index 2384265..613c695 100644 > > --- a/src/intel/compiler/brw_vec4_nir.cpp > > +++ b/src/intel/compiler/brw_vec4_nir.cpp > > @@ -1310,6 +1310,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > > > > case nir_op_iadd: > >assert(nir_dest_bit_size(instr->dest.dest) < 64); > > + /* fall through */ > > case nir_op_fadd: > >inst = emit(ADD(dst, op[0], op[1])); > >inst->saturate = instr->dest.saturate; > > @@ -1539,6 +1540,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > > > > case nir_op_imin: > > case nir_op_umin: > > + /* fall through */ > > Normally these are placed at the end rather than the beginning. Not > sure if this will shut coverity up or not. > I think it would but I fixed it locally anyway. Thanks! --Jason > >assert(nir_dest_bit_size(instr->dest.dest) < 64); > > case nir_op_fmin: > >inst = emit_minmax(BRW_CONDITIONAL_L, dst, op[0], op[1]); > > @@ -1548,6 +1550,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > > case nir_op_imax: > > case nir_op_umax: > >assert(nir_dest_bit_size(instr->dest.dest) < 64); > > + /* fall through */ > > case nir_op_fmax: > >inst = emit_minmax(BRW_CONDITIONAL_GE, dst, op[0], op[1]); > >inst->saturate = instr->dest.saturate; > > @@ -2054,6 +2057,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > > case nir_op_iabs: > > case nir_op_ineg: > >assert(nir_dest_bit_size(instr->dest.dest) < 64); > > + /* fall through */ > > case nir_op_fabs: > > case nir_op_fneg: > > case nir_op_fsat: > > -- > > 2.5.0.400.gff86faf > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation
I've pushed this. Thanks! In future please add the version to the subject when sending new revisions. You can do this with the -v option when using git send-email e.g -v2 will result in [PATCH v2] Also please add changes to the patch since v1 in the commit message. e.g. V2: set batch->used = 0 rather than memsetting the batch to 0. This helps those reviewing to quickly compare the updates since the first revision. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] intel/vec4: Add some fall through comments
On Mon, Apr 3, 2017 at 7:24 PM, Jason Ekstrandwrote: > --- > src/intel/compiler/brw_vec4_nir.cpp | 4 > 1 file changed, 4 insertions(+) > > diff --git a/src/intel/compiler/brw_vec4_nir.cpp > b/src/intel/compiler/brw_vec4_nir.cpp > index 2384265..613c695 100644 > --- a/src/intel/compiler/brw_vec4_nir.cpp > +++ b/src/intel/compiler/brw_vec4_nir.cpp > @@ -1310,6 +1310,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > > case nir_op_iadd: >assert(nir_dest_bit_size(instr->dest.dest) < 64); > + /* fall through */ > case nir_op_fadd: >inst = emit(ADD(dst, op[0], op[1])); >inst->saturate = instr->dest.saturate; > @@ -1539,6 +1540,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > > case nir_op_imin: > case nir_op_umin: > + /* fall through */ Normally these are placed at the end rather than the beginning. Not sure if this will shut coverity up or not. >assert(nir_dest_bit_size(instr->dest.dest) < 64); > case nir_op_fmin: >inst = emit_minmax(BRW_CONDITIONAL_L, dst, op[0], op[1]); > @@ -1548,6 +1550,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > case nir_op_imax: > case nir_op_umax: >assert(nir_dest_bit_size(instr->dest.dest) < 64); > + /* fall through */ > case nir_op_fmax: >inst = emit_minmax(BRW_CONDITIONAL_GE, dst, op[0], op[1]); >inst->saturate = instr->dest.saturate; > @@ -2054,6 +2057,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > case nir_op_iabs: > case nir_op_ineg: >assert(nir_dest_bit_size(instr->dest.dest) < 64); > + /* fall through */ > case nir_op_fabs: > case nir_op_fneg: > case nir_op_fsat: > -- > 2.5.0.400.gff86faf > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] intel/vec4: Add some fall through comments
Thank you. These coverity warnings have been coming back repeatedly it seems. Reviewed-by: Matt Turner___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [RFC] mesa/glthread: misaligned address access
On 04/04/17 05:12, Bartosz Tomczyk wrote: Address sanitizer reports lot of misaligned access: SUMMARY: AddressSanitizer: undefined-behavior main/marshal.c:276:31 in main/marshal.c:276:31: runtime error: load of misaligned address 0x631000104866 for type 'const GLuint' (aka 'const unsigned int'), which requires 4 byte alignment 0x631000104866: note: pointer points here 92 88 00 00 00 00 00 00 4a 03 0c 00 93 88 00 00 00 00 00 00 02 01 0c 00 40 8d 00 00 00 00 00 00 ^ SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28725:12 in main/marshal_generated.c:28726:12: runtime error: member access within misaligned address 0x6310003fc874 for type 'struct marshal_cmd_VertexAttribPointer', which requires 8 byte alignment 0x6310003fc874: note: pointer points here 01 00 00 00 7a 02 20 00 00 00 00 00 be be be be be be be be be be be be be be be be be be be be ^ SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28726:12 in main/marshal_generated.c:28726:12: runtime error: store to misaligned address 0x6310003fc87c for type 'GLint' (aka 'int'), which requires 8 byte alignment 0x6310003fc87c: note: pointer points here 00 00 00 00 be be be be be be be be be be be be be be be be be be be be be be be be be be be be This probably isn't issue for x86 as misaligned access shoul be as fast as aligned ones. But this could be issue for different architectures like ARM/MIPS, any exert here ? Any idea how to get correct aligment on different platforms ? Thanks. I've had the same patch locally for a while, I think it makes sense to push this as is for now (which I've done). We can adjust for other platforms later if needed. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache
On Mon, Apr 3, 2017 at 4:26 PM, Chad Versacewrote: > On Wed 15 Mar 2017, Jason Ekstrand wrote: > > This cache allows us to easily ensure that we have a unique anv_bo for > > each gem handle. We'll need this in order to support multiple-import of > > memory objects and semaphores. > > > > v2 (Jason Ekstrand): > > - Reject BO imports if the size doesn't match the prime fd size as > >reported by lseek(). > > > > v3 (Jason Ekstrand): > > - Fix reference counting around cache_release (Chris Willson) > > - Move the mutex_unlock() later in cache_release > > --- > > src/intel/vulkan/anv_allocator.c | 261 ++ > + > > src/intel/vulkan/anv_private.h | 26 > > 2 files changed, 287 insertions(+) > > > > > +VkResult > > +anv_bo_cache_import(struct anv_device *device, > > +struct anv_bo_cache *cache, > > +int fd, uint64_t size, struct anv_bo **bo_out, > > +VkAllocationCallbacks *alloc) > > +{ > > + pthread_mutex_lock(>mutex); > > + > > + /* The kernel is going to give us whole pages anyway */ > > + size = align_u64(size, 4096); > > + > > + uint32_t gem_handle = anv_gem_fd_to_handle(device, fd); > > + if (!gem_handle) { > > + pthread_mutex_unlock(>mutex); > > + return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX); > > + } > > + > > + struct anv_cached_bo *bo = anv_bo_cache_lookup_locked(cache, > gem_handle); > > + if (bo) { > > + assert(bo->bo.size == size); > > For security's sake, we must always check the size, even if the bo is in > the cache. An assertion isn't enough. A malicious client may tell the > truth the first time when sending the fd to the trusted app. The > malicious client may then dup the fd and send it to the trusted app with > a false size. > Yup. Fixed locally. --Jason > > + __sync_fetch_and_add(>refcount, 1); > > + } else { > > + /* For security purposes, we reject BO imports where the size > does not > > + * match exactly. This prevents a malicious client from passing a > > + * buffer to a trusted client, lying about the size, and telling > the > > + * trusted client to try and texture from an image that goes > > + * out-of-bounds. This sort of thing could lead to GPU hangs or > worse > > + * in the trusted client. The trusted client can protect itself > against > > + * this sort of attack but only if it can trust the buffer size. > > + */ > > > + off_t import_size = lseek(fd, 0, SEEK_END); > > + if (import_size == (off_t)-1 || import_size != size) { > > + anv_gem_close(device, gem_handle); > > + pthread_mutex_unlock(>mutex); > > + return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX); > > + } > > + > > + struct anv_cached_bo *bo = > > + vk_alloc(alloc, size, 8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT); > > + if (!bo) { > > + anv_gem_close(device, gem_handle); > > + pthread_mutex_unlock(>mutex); > > + return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY); > > + } > > + > > + bo->refcount = 1; > > + > > + anv_bo_init(>bo, gem_handle, size); > > + > > + _mesa_hash_table_insert(cache->bo_map, (void > *)(uintptr_t)gem_handle, bo); > > + } > > + > > + pthread_mutex_unlock(>mutex); > > + > > + /* From the Vulkan spec: > > +* > > +*"Importing memory from a file descriptor transfers ownership of > > +*the file descriptor from the application to the Vulkan > > +*implementation. The application must not perform any > operations on > > +*the file descriptor after a successful import." > > +* > > +* If the import fails, we leave the file descriptor open. > > +*/ > > + close(fd); > > + > > + *bo_out = >bo; > > + > > + return VK_SUCCESS; > > +} > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache
On Wed 15 Mar 2017, Jason Ekstrand wrote: > This cache allows us to easily ensure that we have a unique anv_bo for > each gem handle. We'll need this in order to support multiple-import of > memory objects and semaphores. > > v2 (Jason Ekstrand): > - Reject BO imports if the size doesn't match the prime fd size as >reported by lseek(). > > v3 (Jason Ekstrand): > - Fix reference counting around cache_release (Chris Willson) > - Move the mutex_unlock() later in cache_release > --- > src/intel/vulkan/anv_allocator.c | 261 > +++ > src/intel/vulkan/anv_private.h | 26 > 2 files changed, 287 insertions(+) > +VkResult > +anv_bo_cache_import(struct anv_device *device, > +struct anv_bo_cache *cache, > +int fd, uint64_t size, struct anv_bo **bo_out, > +VkAllocationCallbacks *alloc) > +{ > + pthread_mutex_lock(>mutex); > + > + /* The kernel is going to give us whole pages anyway */ > + size = align_u64(size, 4096); > + > + uint32_t gem_handle = anv_gem_fd_to_handle(device, fd); > + if (!gem_handle) { > + pthread_mutex_unlock(>mutex); > + return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX); > + } > + > + struct anv_cached_bo *bo = anv_bo_cache_lookup_locked(cache, gem_handle); > + if (bo) { > + assert(bo->bo.size == size); For security's sake, we must always check the size, even if the bo is in the cache. An assertion isn't enough. A malicious client may tell the truth the first time when sending the fd to the trusted app. The malicious client may then dup the fd and send it to the trusted app with a false size. > + __sync_fetch_and_add(>refcount, 1); > + } else { > + /* For security purposes, we reject BO imports where the size does not > + * match exactly. This prevents a malicious client from passing a > + * buffer to a trusted client, lying about the size, and telling the > + * trusted client to try and texture from an image that goes > + * out-of-bounds. This sort of thing could lead to GPU hangs or worse > + * in the trusted client. The trusted client can protect itself > against > + * this sort of attack but only if it can trust the buffer size. > + */ > + off_t import_size = lseek(fd, 0, SEEK_END); > + if (import_size == (off_t)-1 || import_size != size) { > + anv_gem_close(device, gem_handle); > + pthread_mutex_unlock(>mutex); > + return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX); > + } > + > + struct anv_cached_bo *bo = > + vk_alloc(alloc, size, 8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT); > + if (!bo) { > + anv_gem_close(device, gem_handle); > + pthread_mutex_unlock(>mutex); > + return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY); > + } > + > + bo->refcount = 1; > + > + anv_bo_init(>bo, gem_handle, size); > + > + _mesa_hash_table_insert(cache->bo_map, (void *)(uintptr_t)gem_handle, > bo); > + } > + > + pthread_mutex_unlock(>mutex); > + > + /* From the Vulkan spec: > +* > +*"Importing memory from a file descriptor transfers ownership of > +*the file descriptor from the application to the Vulkan > +*implementation. The application must not perform any operations on > +*the file descriptor after a successful import." > +* > +* If the import fails, we leave the file descriptor open. > +*/ > + close(fd); > + > + *bo_out = >bo; > + > + return VK_SUCCESS; > +} ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache
On Mon, Apr 3, 2017 at 3:45 PM, Chad Versacewrote: > On Mon 03 Apr 2017, Jason Ekstrand wrote: > > On Mon, Apr 3, 2017 at 12:31 PM, Chad Versace > > wrote: > > > > > On Fri 31 Mar 2017, Chad Versace wrote: > > > > On Wed 15 Mar 2017, Jason Ekstrand wrote: > > > > > This cache allows us to easily ensure that we have a unique anv_bo > for > > > > > each gem handle. We'll need this in order to support > multiple-import > > > of > > > > > memory objects and semaphores. > > > > > > > > > > v2 (Jason Ekstrand): > > > > > - Reject BO imports if the size doesn't match the prime fd size as > > > > >reported by lseek(). > > > > > > > > > > v3 (Jason Ekstrand): > > > > > - Fix reference counting around cache_release (Chris Willson) > > > > > - Move the mutex_unlock() later in cache_release > > > > > --- > > > > > src/intel/vulkan/anv_allocator.c | 261 > ++ > > > + > > > > > src/intel/vulkan/anv_private.h | 26 > > > > > 2 files changed, 287 insertions(+) > > > > > > > > > > > > > +static uint32_t > > > > > +hash_uint32_t(const void *key) > > > > > +{ > > > > > + return (uint32_t)(uintptr_t)key; > > > > > +} > > > > > > > > This hash function does not appear hashy. > > > > > > > > If I correctly understand the details of Mesa's struct hash_table, > > > > choosing the identify function for the hash function causes unwanted > > > > clustering when inserting consecutive gem handles. Since the kernel > does > > > > allocate gem handles consecutively, the problem is real. > > > > > > > > For proof, consider the following: > > > > > > > >- Suppose a long-running process (maybe the compositor) has > thrashed > > > on the > > > > hash table long enough that its bucket count > > > > is ht->size = hash_sizes[7].size = 283. Suppose a spike of > > > > compositor activity raises the hash table's density to about > 0.5. > > > > And suppose the hash table buckets are filled with the > consecutive > > > gem > > > > handles > > > > > > > > {0, 0, 0, 0, 4, 5, 6, 7, 8, 9, ..., 127, 128, 0, 0, 0, ..., 0 } > > > > > > > > The exact density is (128 - 4 + 1) / 283 = 0.4417. > > > > > > > >- Next, some other in-process activity (maybe OpenGL) generated > > > > a lot of gem handles after Vulkan's most recently imported > > > > gem handle, 128. > > > > > > This point in the example---the reason why the gem handles in the > > > anv_bo_cache skip from 128 to 287---is bogus in Vulkan. The problem > *is* > > > real for multiple in-process OpenGL contexts derived from the same > > > EGLDisplay, using EGL_EXT_image_dma_buf_import, because each context > > > shares the same intel_screen, and therefore the same drm device fd. But > > > in Vulkan, each VkDevice opens its own drm device id. So, bogus > example. > > > > > > BUT, that leads to a new question... > > > > > > Since each VkDevice has a unique drm device fd, and since the kernel > > > allocates gem handles consecutively on the fd, and since struct > > > hash_table only grows and never shrinks, and since patch 8/18 inserts > > > every VkDeviceMemory into the cache... I believe no collisions are > > > possible in anv_bo_cache. > > > > > > > Does this fall under the category of unbreakable kernel ABI or is it > just a > > side-effect of the implementation? If not, then I'm reluctant to trust > it. > > I'm not certain. But krh, sitting beside me, says "It's ABI at this point. > The kernel uses a 'idr' structure which guarantees that behavior". > If we think we can rely on it, I'm happy to make it into a flat array but it'll basically be a simplified hash table at that point. > > > If there are no collisions, then the hash table is only adding > overhead, > > > > Sure, but a no-collision hash table is pretty cheap... > > > > > and we should use a direct-addressing lookup table. The bo cache should > > > look like this: > > > > > > struct anv_bo_cache { > > >/* The array indices are gem handles. Null entries are legal. */ > > >struct anv_bo **bos; > > > > > >/* Length of the array. Because the array can have holes, this > > > * is *not* the number of gem handles in the array. > > > */ > > >size_t len; > > > > > >pthread_mutex_t mutex; > > > }; > > > > > > struct anv_bo * > > > anv_bo_cache_lookup(struct anv_bo_cache *cache, uint32_t > gem_handle) > > > { > > >struct anv_bo *bo = NULL; > > > > > >pthread_mutex_lock(>mutex); > > > > > >if (gem_handle < cache->len) > > > bo = cache->entries[gem_handle]; > > > > > >pthread_mutex_unlock(>mutex); > > > > > >return bo; > > > > > > } > > > > > > BUT, that leads to yet another question... > > > > > > Why is patch 8/18 inserting every VkDeviceMemory into the cache? If > > > I understand things correctly, we only *need* to insert a > VkDeviceMemory > > > into the
[Mesa-dev] [PATCH] intel/vec4: Add some fall through comments
--- src/intel/compiler/brw_vec4_nir.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/intel/compiler/brw_vec4_nir.cpp b/src/intel/compiler/brw_vec4_nir.cpp index 2384265..613c695 100644 --- a/src/intel/compiler/brw_vec4_nir.cpp +++ b/src/intel/compiler/brw_vec4_nir.cpp @@ -1310,6 +1310,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) case nir_op_iadd: assert(nir_dest_bit_size(instr->dest.dest) < 64); + /* fall through */ case nir_op_fadd: inst = emit(ADD(dst, op[0], op[1])); inst->saturate = instr->dest.saturate; @@ -1539,6 +1540,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) case nir_op_imin: case nir_op_umin: + /* fall through */ assert(nir_dest_bit_size(instr->dest.dest) < 64); case nir_op_fmin: inst = emit_minmax(BRW_CONDITIONAL_L, dst, op[0], op[1]); @@ -1548,6 +1550,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) case nir_op_imax: case nir_op_umax: assert(nir_dest_bit_size(instr->dest.dest) < 64); + /* fall through */ case nir_op_fmax: inst = emit_minmax(BRW_CONDITIONAL_GE, dst, op[0], op[1]); inst->saturate = instr->dest.saturate; @@ -2054,6 +2057,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) case nir_op_iabs: case nir_op_ineg: assert(nir_dest_bit_size(instr->dest.dest) < 64); + /* fall through */ case nir_op_fabs: case nir_op_fneg: case nir_op_fsat: -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] intel/gen_decoder: return -1 for unknown command formats
Decoding with aubinator encountered a command of 0x. With the previous code, it caused aubinator to jump 255 + 2 dwords to start decoding again. Instead we can attempt to detect the known instruction formats. If the format is not recognized, then we can advance just 1 dword. Signed-off-by: Jordan Justen--- src/intel/common/gen_decoder.c| 21 +++-- src/intel/tools/aubinator.c | 4 ++-- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 ++ 3 files changed, 19 insertions(+), 8 deletions(-) diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c index 1244f4c4480..d5ba3e6265e 100644 --- a/src/intel/common/gen_decoder.c +++ b/src/intel/common/gen_decoder.c @@ -692,28 +692,37 @@ gen_group_get_length(struct gen_group *group, const uint32_t *p) case 3: /* Render */ { uint32_t subtype = field(h, 27, 28); + uint32_t opcode = field(h, 24, 26); switch (subtype) { case 0: - return field(h, 0, 7) + 2; + if (opcode < 2) +return field(h, 0, 7) + 2; + else +return -1; case 1: - return 1; + if (opcode < 2) +return 1; + else +return -1; case 2: { - uint32_t opcode = field(h, 24, 26); - assert(opcode < 3 /* 3 and above currently reserved */); if (opcode == 0) return field(h, 0, 7) + 2; else if (opcode < 3) return field(h, 0, 15) + 2; else -return 1; /* FIXME: if more opcodes are added */ +return -1; } case 3: - return field(h, 0, 7) + 2; + if (opcode < 4) +return field(h, 0, 7) + 2; + else +return -1; } } } unreachable("bad opcode"); + return -1; } void diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c index cae578babac..a64bce7a536 100644 --- a/src/intel/tools/aubinator.c +++ b/src/intel/tools/aubinator.c @@ -687,12 +687,12 @@ parse_commands(struct gen_spec *spec, uint32_t *cmds, int size, int engine) for (p = cmds; p < end; p += length) { inst = gen_spec_find_instruction(spec, p); + length = gen_group_get_length(inst, p); if (inst == NULL) { fprintf(outfile, "unknown instruction %08x\n", p[0]); - length = (p[0] & 0xff) + 2; + length = MIN2(1, length); continue; } - length = gen_group_get_length(inst, p); const char *color, *reset_color = NORMAL; uint64_t offset; diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 54bab9efb02..d16b4622456 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -318,6 +318,8 @@ do_batch_dump(struct brw_context *brw) } length = gen_group_get_length(inst, p); + assert(length > 0); + length = MAX2(length, 1); } if (ret == 0) { -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Fix blob memory leak
Pushed, Thanks! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] intel/aubinator: Stop searching after a custom handler is found
Signed-off-by: Jordan Justen--- src/intel/tools/aubinator.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c index a64bce7a536..0d33c392384 100644 --- a/src/intel/tools/aubinator.c +++ b/src/intel/tools/aubinator.c @@ -724,8 +724,10 @@ parse_commands(struct gen_spec *spec, uint32_t *cmds, int size, int engine) decode_group(inst, p, 0); for (i = 0; i < ARRAY_LENGTH(custom_handlers); i++) { -if (gen_group_get_opcode(inst) == custom_handlers[i].opcode) +if (gen_group_get_opcode(inst) == custom_handlers[i].opcode) { custom_handlers[i].handle(spec, p); + break; +} } } -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] intel/gen_decoder: Fix length for Media State/Object commands
From BDW PRM, Volume 6: Command Stream Programming, 'Render Command Header Format'. Signed-off-by: Jordan Justen--- src/intel/common/gen_decoder.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c index 1ae78c88e3f..1244f4c4480 100644 --- a/src/intel/common/gen_decoder.c +++ b/src/intel/common/gen_decoder.c @@ -697,8 +697,16 @@ gen_group_get_length(struct gen_group *group, const uint32_t *p) return field(h, 0, 7) + 2; case 1: return 1; - case 2: - return 2; + case 2: { + uint32_t opcode = field(h, 24, 26); + assert(opcode < 3 /* 3 and above currently reserved */); + if (opcode == 0) +return field(h, 0, 7) + 2; + else if (opcode < 3) +return field(h, 0, 15) + 2; + else +return 1; /* FIXME: if more opcodes are added */ + } case 3: return field(h, 0, 7) + 2; } -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH] egl/android: Dequeue buffers inside EGL calls
Hi Tomasz, 2017-04-03 10:00 GMT+02:00 Tomasz Figa: > Hi Mauro, > > On Mon, Apr 3, 2017 at 2:48 AM, Mauro Rossi wrote: > > > > > > 2017-03-31 13:05 GMT+02:00 Tapani Pälli : > >> > >> > >> > >> On 03/31/2017 10:12 AM, Tapani Pälli wrote: > >>> > >>> > >>> > >>> On 03/31/2017 09:06 AM, Tapani Pälli wrote: > > > > On 03/31/2017 08:24 AM, Rob Clark wrote: > > > > On Fri, Mar 31, 2017 at 12:22 AM, Tapani Pälli > > wrote: > >> > >> > >> > >> On 03/30/2017 05:57 PM, Emil Velikov wrote: > >>> > >>> > >>> On 30 March 2017 at 15:30, Tomasz Figa wrote: > > > On Thu, Mar 30, 2017 at 11:17 PM, Emil Velikov > > wrote: > > > > > > > > On 30 March 2017 at 11:55, Tomasz Figa > wrote: > >> > >> > >> Android buffer queues can be abandoned, which results in failing > >> to > >> dequeue next buffer. Currently this would fail somewhere deep > >> within > >> the DRI stack calling loader's getBuffers*(), without any error > >> reporting to the client app. However Android framework code > >> relies on > >> proper signaling of this event, so we move buffer dequeue to > >> createWindowSurface() and swapBuffers() call, which can generate > >> proper > >> EGL errors. To keep the performance benefits of delayed buffer > >> handling, > >> if any, fence wait and DRI image creation is kept delayed until > >> getBuffers*() is called by the DRI driver. > >> > > Thank you Tomasz. > > > > I'm fairly confident that this should resolve the crash [in > > swap_buffers] that Mauro was seeing. > > Mauro can you give it a test ? > > > > Ah, I actually noticed a problem with existing code, supposedly > fixed > by [1], but I'm afraid it's still wrong. > > Current swap_buffers calls get_back_bo(), but doesn't call > update_buffers(), which is the function that should be called > before > to actually dequeue a buffer from Android's buffer queue. Given > that, > get_back_bo() would simply fail with !dri2_surf->buffer, because > no > buffer was dequeued. > > >>> Right - I was wondering why we don't hit that on EGL/GBM or > >>> EGL/Wayland. > >>> From a quick look - may be because EGL/Android drops the dpy mutex > in > >>> droid_window_enqueue_buffer(). > >>> > My patch removes update_buffers() and changes the buffer > management so > that there is always a buffer dequeued, starting from surface > creation, unless there was an error somewhere. > > >>> Of the top of your head - is there something stopping us from using > >>> the same method on $other platforms? > >>> > [1] > > https://cgit.freedesktop.org/mesa/mesa/commit/src/egl/ > drivers/dri2/platform_android.c?id=4d4558411db166d2d66f8cec9cb581 > 149dbe1597 > > > > > > > > > Not that huge of an expert on the Android specifics, so just a > > humble > > request: > > Can we seek the code resuffle (droid_{alloc,free}_local_buffer, > >>> > >>> > >>> Oops silly typo - s/seek/split/. > >>> > > other?) separate from the functionality changes ? > > > > Sure. Thanks for suggestion. > > >>> Please give it a day or two for others to comment. > >> > >> > >> > >> I'm trying to debug why this causes our homescreen (wallpaper) to be > >> black. > >> Otherwise I haven't seen any issues with these changes. > >> > > > > wallpaper seems to be a special sorta hell.. I wonder if there is > > somehow some sort of interaction with what I fixed / worked-around in > > a5e733c6b52e93de3000647d075f5ca2f55fcb71 ?? > > > > Maybe at least try commenting out the temp-pbuffer thing to get max > > texture size, and see if that "fixes" things > > > Can you give more details, I still live in la la land and don't know > about 'temp-pbuffer thing'? > > >>> > >>> aa I did not recall the problem, you mean the 'dummy pbuffer' in > >>> SurfaceFlinger .. yes I will check if this is related. > >>> > >> > >> If I take away that dummy pbuffer usage (which is useless anyway), > couple > >> of errors disappear from the log. They are: > >> > >> SurfaceFlinger: releasePendingBuffer failed: Unknown error -1 (1) > >> SurfaceFlinger: releasePendingBuffer failed: Unknown error -1 (1) > >> > >> but otherwise the desktop still stays black,
Re: [Mesa-dev] [PATCH 6/9] gallium: decrease the size of pipe_box - 24 -> 16 bytes
Am 03.04.2017 um 17:11 schrieb Alex Deucher: > On Sun, Apr 2, 2017 at 2:00 PM, Marek Olšákwrote: >> From: Marek Olšák >> >> Also: >> >> pipe_transfer: 48 -> 40 bytes. >> pipe_blit_info = 176 -> 160 bytes. >> --- >> src/gallium/include/pipe/p_state.h | 8 >> 1 file changed, 4 insertions(+), 4 deletions(-) >> >> diff --git a/src/gallium/include/pipe/p_state.h >> b/src/gallium/include/pipe/p_state.h >> index 392bb8b..6a147ef 100644 >> --- a/src/gallium/include/pipe/p_state.h >> +++ b/src/gallium/include/pipe/p_state.h >> @@ -472,25 +472,25 @@ struct pipe_image_view >> } u; >> }; >> >> >> /** >> * Subregion of 1D/2D/3D image resource. >> */ >> struct pipe_box >> { >> int x; >> - int y; >> - int z; >> + int16_t y; >> + int16_t z; >> int width; >> - int height; >> - int depth; >> + int16_t height; >> + int16_t depth; >> }; > > Not specific to this patch per se, but will this cause any issues in > the future as max surface sizes supported by hw increase? I feel like > we'll need to change this back at some point. I don't think this can increase easily. With 8 subpixel bits for rasterization (which is the standard nowadays) the absolute maximum you could theoretically support would be 64k when using float math, and there's probably reasons why you don't want to go quite to the aboluste limit. That said, with int16 the largest pot size possible would be 16k, which indeed is what everybody already supports, not leaving even a factor 2 increase... So I'm not sure it's really a good idea. > Alex > >> >> >> /** >> * A memory object/resource such as a vertex buffer or texture. >> */ >> struct pipe_resource >> { >> struct pipe_reference reference; >> struct pipe_screen *screen; /**< screen that this texture belongs to */ >> -- >> 2.7.4 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] util/u_atomic: provide 64bit atomics where they're missing
On Mon, Apr 3, 2017 at 1:37 PM, Mark Janeswrote: > > This commit appears to intermittently provoke gpu hangs on 32-bit Intel > systems when running > > piglit.shaders.glsl-max-varyings >max_varying_components > > Since the behavior is intermittent, I may have identified the wrong > patch. Please let me know if this patch seems unlikely to affect the > test. I think it's not this patch, but I haven't figured out which it is yet. Applying this patch doesn't change i965_dri.so (md5sum is the same). ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support
On Mon, Apr 3, 2017 at 4:57 PM, Rob Clarkwrote: > On Mon, Apr 3, 2017 at 4:06 PM, Thomas Hellstrom > wrote: >> On 04/03/2017 07:33 PM, Thomas Hellstrom wrote: >>> On 04/03/2017 07:13 PM, Rob Clark wrote: On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom wrote: > Hi, Rob, > > On 03/24/2017 10:21 PM, Rob Clark wrote: >> It's kinda sad that (a) we don't have debug_backtrace support on !X86 >> and that (b) we re-invent our own crude backtrace support in the first >> place. If available, use libunwind instead. The backtrace format is >> based on what xserver and weston use, since it is nice not to have to >> figure out a different format. >> >> Signed-off-by: Rob Clark > Did you consider glibc "backtrace()", I think it's also available on > ARM... I had not.. although xserver and weston are already using libunwind. I'm not sure about portability of libunwind to other libc implementations (but I guess it is at least not worse than using a glibc specific API). I suppose we could always add a fallback to backtrace(). >> Hmm. This commit (bisected) appears to break svga/vmwgfx in DEBUG mode: >> >> *** Error in `glxgears': malloc(): memory corruption: 0x025c09c0 *** >> Aborted (core dumped) >> >> The svga linux winsys makes extensive use of the backtrace functionality >> using u_debug_flush.c > > ok, I can reproduce.. hopefully patch following shortly.. > So, I found one other minor issue, but the big problem is debug_stack_frame size difference, because u_debug_flush.c doesn't #include config.h.. adding: #ifdef HAVE_CONFIG_H #include #endif in u_debug_stack.h solves that.. although I think usually folks prefer to have that in .c files. (Not sure if I remember the reason.) So we should either add that in u_debug_stack.h or track down all the .c files that #include it and add there. (Or is there a reason we don't just do -include config.h globally in the build system?) BR, -R ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache
On Mon, Apr 3, 2017 at 12:31 PM, Chad Versacewrote: > On Fri 31 Mar 2017, Chad Versace wrote: > > On Wed 15 Mar 2017, Jason Ekstrand wrote: > > > This cache allows us to easily ensure that we have a unique anv_bo for > > > each gem handle. We'll need this in order to support multiple-import > of > > > memory objects and semaphores. > > > > > > v2 (Jason Ekstrand): > > > - Reject BO imports if the size doesn't match the prime fd size as > > >reported by lseek(). > > > > > > v3 (Jason Ekstrand): > > > - Fix reference counting around cache_release (Chris Willson) > > > - Move the mutex_unlock() later in cache_release > > > --- > > > src/intel/vulkan/anv_allocator.c | 261 ++ > + > > > src/intel/vulkan/anv_private.h | 26 > > > 2 files changed, 287 insertions(+) > > > > > > > +static uint32_t > > > +hash_uint32_t(const void *key) > > > +{ > > > + return (uint32_t)(uintptr_t)key; > > > +} > > > > This hash function does not appear hashy. > > > > If I correctly understand the details of Mesa's struct hash_table, > > choosing the identify function for the hash function causes unwanted > > clustering when inserting consecutive gem handles. Since the kernel does > > allocate gem handles consecutively, the problem is real. > > > > For proof, consider the following: > > > >- Suppose a long-running process (maybe the compositor) has thrashed > on the > > hash table long enough that its bucket count > > is ht->size = hash_sizes[7].size = 283. Suppose a spike of > > compositor activity raises the hash table's density to about 0.5. > > And suppose the hash table buckets are filled with the consecutive > gem > > handles > > > > {0, 0, 0, 0, 4, 5, 6, 7, 8, 9, ..., 127, 128, 0, 0, 0, ..., 0 } > > > > The exact density is (128 - 4 + 1) / 283 = 0.4417. > > > >- Next, some other in-process activity (maybe OpenGL) generated > > a lot of gem handles after Vulkan's most recently imported > > gem handle, 128. > > This point in the example---the reason why the gem handles in the > anv_bo_cache skip from 128 to 287---is bogus in Vulkan. The problem *is* > real for multiple in-process OpenGL contexts derived from the same > EGLDisplay, using EGL_EXT_image_dma_buf_import, because each context > shares the same intel_screen, and therefore the same drm device fd. But > in Vulkan, each VkDevice opens its own drm device id. So, bogus example. > > BUT, that leads to a new question... > > Since each VkDevice has a unique drm device fd, and since the kernel > allocates gem handles consecutively on the fd, and since struct > hash_table only grows and never shrinks, and since patch 8/18 inserts > every VkDeviceMemory into the cache... I believe no collisions are > possible in anv_bo_cache. > Does this fall under the category of unbreakable kernel ABI or is it just a side-effect of the implementation? If not, then I'm reluctant to trust it. > If there are no collisions, then the hash table is only adding overhead, > Sure, but a no-collision hash table is pretty cheap... and we should use a direct-addressing lookup table. The bo cache should > look like this: > > struct anv_bo_cache { >/* The array indices are gem handles. Null entries are legal. */ >struct anv_bo **bos; > >/* Length of the array. Because the array can have holes, this > * is *not* the number of gem handles in the array. > */ >size_t len; > >pthread_mutex_t mutex; > }; > > struct anv_bo * > anv_bo_cache_lookup(struct anv_bo_cache *cache, uint32_t gem_handle) > { >struct anv_bo *bo = NULL; > >pthread_mutex_lock(>mutex); > >if (gem_handle < cache->len) > bo = cache->entries[gem_handle] == NULL) > >pthread_mutex_unlock(>mutex); > >return bo; > > } > > BUT, that leads to yet another question... > > Why is patch 8/18 inserting every VkDeviceMemory into the cache? If > I understand things correctly, we only *need* to insert a VkDeviceMemory > into the cache if, in vkAllocateMemory, either (1) > VkExportMemoryAllocateInfoKHX::handleTypes != 0 or (2) > VkMemoryAllocateInfo's pNext chain contains an import structure. > Because I'm lazy. In order to start using the bo cache, anv_device_memory::bo needs to be a pointer (well, it's makes the BO cache API simpler and more efficient if it's a pointer). This would mean that we would have to allocate an additional chunk of memory or go through some other hoops in order to make it work. At the end of the day, just stuffing everything in the cache was simpler and kept us to a single path. > If we insert into the cache only those VkDeviceMemories that are > imported or that will be exported, then the bo cache remains small, and > we *should* use a hash table. > Maybe. But the client isn't supposed to be allocating hundreds of VkDeviceMemory objects. It's supposed to
Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support
On Mon, Apr 3, 2017 at 4:06 PM, Thomas Hellstromwrote: > On 04/03/2017 07:33 PM, Thomas Hellstrom wrote: >> On 04/03/2017 07:13 PM, Rob Clark wrote: >>> On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom >>> wrote: Hi, Rob, On 03/24/2017 10:21 PM, Rob Clark wrote: > It's kinda sad that (a) we don't have debug_backtrace support on !X86 > and that (b) we re-invent our own crude backtrace support in the first > place. If available, use libunwind instead. The backtrace format is > based on what xserver and weston use, since it is nice not to have to > figure out a different format. > > Signed-off-by: Rob Clark Did you consider glibc "backtrace()", I think it's also available on ARM... >>> I had not.. although xserver and weston are already using libunwind. >>> I'm not sure about portability of libunwind to other libc >>> implementations (but I guess it is at least not worse than using a >>> glibc specific API). >>> >>> I suppose we could always add a fallback to backtrace(). >>> > Hmm. This commit (bisected) appears to break svga/vmwgfx in DEBUG mode: > > *** Error in `glxgears': malloc(): memory corruption: 0x025c09c0 *** > Aborted (core dumped) > > The svga linux winsys makes extensive use of the backtrace functionality > using u_debug_flush.c ok, I can reproduce.. hopefully patch following shortly.. BR, -R ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] util/u_atomic: provide 64bit atomics where they're missing
This commit appears to intermittently provoke gpu hangs on 32-bit Intel systems when running piglit.shaders.glsl-max-varyings >max_varying_components Since the behavior is intermittent, I may have identified the wrong patch. Please let me know if this patch seems unlikely to affect the test. Matt Turnerwrites: > On Thu, Mar 30, 2017 at 3:47 PM, Matt Turner wrote: >> On Thu, Mar 30, 2017 at 3:26 PM, Grazvydas Ignotas > wrote: >>> There are still some distributions trying to support unfortunate people >>> with old or exotic CPUs that don't have 64bit atomic operations. When >>> compiling for such a machine, gcc conveniently inserts a library call to >>> a helper, but it's implementation is missing and we get a linker error. >>> This allows us to provide our own implementation, which is marked weak >>> to prefer a better implementation, should one exist. >>> >>> v2: changed copyright, some style adjustments >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089 >>> Signed-off-by: Grazvydas Ignotas >>> Reviewed-by: Matt Turner >> >> Thanks, I'll commit this. >> >>> --- >>> no commit access, but request sent: >>> https://bugs.freedesktop.org/show_bug.cgi?id=100467 >> >> Thanks. I commented on the bug and said I approve :) > > I modified the patch slightly to print the results of the test with > AC_MSG_CHECKING/AC_MSG_RESULT and pushed it as commit a6a38a038bd62e. > > I'll do some build tests on various architectures. I think I like the idea > of this going to the stable 17.0 branch. > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radv: Enable VK_KHR_incremental_present.
Reviewed-by: Jason EkstrandOn April 3, 2017 12:36:16 PM Bas Nieuwenhuizen wrote: Just enabling the driver-independent implementation that Jason did. Signed-off-by: Bas Nieuwenhuizen --- src/amd/vulkan/radv_device.c | 4 src/amd/vulkan/radv_entrypoints_gen.py | 1 + src/amd/vulkan/radv_wsi.c | 11 ++- 3 files changed, 15 insertions(+), 1 deletion(-) diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c index 5c48be1d11a..6456eb17f56 100644 --- a/src/amd/vulkan/radv_device.c +++ b/src/amd/vulkan/radv_device.c @@ -92,6 +92,10 @@ static const VkExtensionProperties instance_extensions[] = { static const VkExtensionProperties common_device_extensions[] = { { + .extensionName = VK_KHR_INCREMENTAL_PRESENT_EXTENSION_NAME, + .specVersion = 1, + }, + { .extensionName = VK_KHR_MAINTENANCE1_EXTENSION_NAME, .specVersion = 1, }, diff --git a/src/amd/vulkan/radv_entrypoints_gen.py b/src/amd/vulkan/radv_entrypoints_gen.py index b7b2bcf97e4..bee29dc1202 100644 --- a/src/amd/vulkan/radv_entrypoints_gen.py +++ b/src/amd/vulkan/radv_entrypoints_gen.py @@ -31,6 +31,7 @@ supported_extensions = [ 'VK_AMD_draw_indirect_count', 'VK_NV_dedicated_allocation', 'VK_KHR_get_physical_device_properties2', + 'VK_KHR_incremental_present', 'VK_KHR_maintenance1', 'VK_KHR_sampler_mirror_clamp_to_edge', 'VK_KHR_shader_draw_parameters', diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c index 8b66095b263..b8999f4eb02 100644 --- a/src/amd/vulkan/radv_wsi.c +++ b/src/amd/vulkan/radv_wsi.c @@ -26,6 +26,7 @@ #include "radv_private.h" #include "radv_meta.h" #include "wsi_common.h" +#include "util/vk_util.h" static const struct wsi_callbacks wsi_cbs = { .get_phys_device_format_properties = radv_GetPhysicalDeviceFormatProperties, @@ -452,9 +453,14 @@ VkResult radv_QueuePresentKHR( RADV_FROM_HANDLE(radv_queue, queue, _queue); VkResult result = VK_SUCCESS; + const VkPresentRegionsKHR *regions = +vk_find_struct_const(pPresentInfo->pNext, PRESENT_REGIONS_KHR); + for (uint32_t i = 0; i < pPresentInfo->swapchainCount; i++) { RADV_FROM_HANDLE(wsi_swapchain, swapchain, pPresentInfo->pSwapchains[i]); struct radeon_winsys_cs *cs; + const VkPresentRegionKHR *region = NULL; + assert(radv_device_from_handle(swapchain->device) == queue->device); if (swapchain->fences[0] == VK_NULL_HANDLE) { result = radv_CreateFence(radv_device_to_handle(queue->device), @@ -484,9 +490,12 @@ VkResult radv_QueuePresentKHR( pPresentInfo->waitSemaphoreCount, NULL, 0, false, base_fence); fence->submitted = true; + if (regions && regions->pRegions) + region = >pRegions[i]; + result = swapchain->queue_present(swapchain, pPresentInfo->pImageIndices[i], - NULL); + region); /* TODO: What if one of them returns OUT_OF_DATE? */ if (result != VK_SUCCESS) return result; -- 2.12.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support
On Mon, Apr 3, 2017 at 4:06 PM, Thomas Hellstromwrote: > On 04/03/2017 07:33 PM, Thomas Hellstrom wrote: >> On 04/03/2017 07:13 PM, Rob Clark wrote: >>> On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom >>> wrote: Hi, Rob, On 03/24/2017 10:21 PM, Rob Clark wrote: > It's kinda sad that (a) we don't have debug_backtrace support on !X86 > and that (b) we re-invent our own crude backtrace support in the first > place. If available, use libunwind instead. The backtrace format is > based on what xserver and weston use, since it is nice not to have to > figure out a different format. > > Signed-off-by: Rob Clark Did you consider glibc "backtrace()", I think it's also available on ARM... >>> I had not.. although xserver and weston are already using libunwind. >>> I'm not sure about portability of libunwind to other libc >>> implementations (but I guess it is at least not worse than using a >>> glibc specific API). >>> >>> I suppose we could always add a fallback to backtrace(). >>> > Hmm. This commit (bisected) appears to break svga/vmwgfx in DEBUG mode: > > *** Error in `glxgears': malloc(): memory corruption: 0x025c09c0 *** > Aborted (core dumped) > > The svga linux winsys makes extensive use of the backtrace functionality > using u_debug_flush.c hmm, do you think I should be able to reproduce by sprinkling some calls to debug_flush_alert()? I was using it mostly w/ the refcnt logging before.. BR, -R ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3] intel: tools: add aubinator_error_decode tool
This is pretty much the same tool as what i-g-t has, only with a more fancy decoding of the instructions/registers. It also doesn't support anything before gen4. v2 (from Matt): Drop authors Remove undefined automake variable v3: Fix incorrect offsets for dword > 1 (Jordan) Signed-off-by: Lionel LandwerlinAcked-by: Matt Turner --- src/intel/Makefile.tools.am | 19 +- src/intel/common/gen_decoder.c | 11 + src/intel/common/gen_decoder.h | 1 + src/intel/tools/.gitignore | 1 + src/intel/tools/aubinator_error_decode.c | 733 +++ 5 files changed, 764 insertions(+), 1 deletion(-) create mode 100644 src/intel/tools/aubinator_error_decode.c diff --git a/src/intel/Makefile.tools.am b/src/intel/Makefile.tools.am index dc073e8daa..576beea4f2 100644 --- a/src/intel/Makefile.tools.am +++ b/src/intel/Makefile.tools.am @@ -19,7 +19,9 @@ # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS # IN THE SOFTWARE. -noinst_PROGRAMS += tools/aubinator +noinst_PROGRAMS += \ + tools/aubinator \ + tools/aubinator_error_decode tools_aubinator_SOURCES = \ tools/aubinator.c \ @@ -42,3 +44,18 @@ tools_aubinator_LDADD = \ $(EXPAT_LIBS) \ $(ZLIB_LIBS) \ -lm + + +tools_aubinator_error_decode_SOURCES = \ + tools/aubinator_error_decode.c + +tools_aubinator_error_decode_LDADD = \ + common/libintel_common.la \ + $(top_builddir)/src/util/libmesautil.la \ + $(EXPAT_LIBS) \ + $(ZLIB_LIBS) + +tools_aubinator_error_decode_CFLAGS = \ + $(AM_CFLAGS) \ + $(EXPAT_CFLAGS) \ + $(ZLIB_CFLAGS) diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c index 6cc6896b05..1ae78c88e3 100644 --- a/src/intel/common/gen_decoder.c +++ b/src/intel/common/gen_decoder.c @@ -108,6 +108,17 @@ gen_spec_find_register(struct gen_spec *spec, uint32_t offset) return NULL; } +struct gen_group * +gen_spec_find_register_by_name(struct gen_spec *spec, const char *name) +{ + for (int i = 0; i < spec->nregisters; i++) { + if (strcmp(spec->registers[i]->name, name) == 0) + return spec->registers[i]; + } + + return NULL; +} + struct gen_enum * gen_spec_find_enum(struct gen_spec *spec, const char *name) { diff --git a/src/intel/common/gen_decoder.h b/src/intel/common/gen_decoder.h index be37e8a542..870bd7f784 100644 --- a/src/intel/common/gen_decoder.h +++ b/src/intel/common/gen_decoder.h @@ -45,6 +45,7 @@ struct gen_spec *gen_spec_load_from_path(const struct gen_device_info *devinfo, uint32_t gen_spec_get_gen(struct gen_spec *spec); struct gen_group *gen_spec_find_instruction(struct gen_spec *spec, const uint32_t *p); struct gen_group *gen_spec_find_register(struct gen_spec *spec, uint32_t offset); +struct gen_group *gen_spec_find_register_by_name(struct gen_spec *spec, const char *name); int gen_group_get_length(struct gen_group *group, const uint32_t *p); const char *gen_group_get_name(struct gen_group *group); uint32_t gen_group_get_opcode(struct gen_group *group); diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore index 0c80a6fed2..27437f9eef 100644 --- a/src/intel/tools/.gitignore +++ b/src/intel/tools/.gitignore @@ -1 +1,2 @@ /aubinator +/aubinator_error_decode diff --git a/src/intel/tools/aubinator_error_decode.c b/src/intel/tools/aubinator_error_decode.c new file mode 100644 index 00..b55cc33761 --- /dev/null +++ b/src/intel/tools/aubinator_error_decode.c @@ -0,0 +1,733 @@ +/* + * Copyright © 2007-2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include
Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support
On 04/03/2017 07:33 PM, Thomas Hellstrom wrote: > On 04/03/2017 07:13 PM, Rob Clark wrote: >> On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom>> wrote: >>> Hi, Rob, >>> >>> On 03/24/2017 10:21 PM, Rob Clark wrote: It's kinda sad that (a) we don't have debug_backtrace support on !X86 and that (b) we re-invent our own crude backtrace support in the first place. If available, use libunwind instead. The backtrace format is based on what xserver and weston use, since it is nice not to have to figure out a different format. Signed-off-by: Rob Clark >>> Did you consider glibc "backtrace()", I think it's also available on ARM... >> I had not.. although xserver and weston are already using libunwind. >> I'm not sure about portability of libunwind to other libc >> implementations (but I guess it is at least not worse than using a >> glibc specific API). >> >> I suppose we could always add a fallback to backtrace(). >> Hmm. This commit (bisected) appears to break svga/vmwgfx in DEBUG mode: *** Error in `glxgears': malloc(): memory corruption: 0x025c09c0 *** Aborted (core dumped) The svga linux winsys makes extensive use of the backtrace functionality using u_debug_flush.c /Thomas ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: oa-counters: keep on reading reports until delimiting timestamp
On Mar 30, 2017 16:16, "Lionel Landwerlin"wrote: While exercising reading report with moderate load, we might have to wait for all the reports to land in the OA buffer, otherwise we might miss some reports. That means we need to keep on reading the OA stream until the last report we read has a timestamp older that the timestamp recorded by the MI_REPORT_PERF_COUNT at the end of the performance query. The expectation is that since we have received the second MI_REPORT_PERF_COUNT report that implies any periodic/automatic reports that the could possibly relate to a query must have been written to the OABUFFER. Since we read() as much as is available from i915 perf when we come to finish processing a query then we shouldn't miss anything. We shouldn't synchronously wait in a busy loop until we've found a report that has a timestamp >= our end MI_RPC because that could be a really significant amount of time (e.g. 50+ milliseconds on HSW). NB. the periodic samples are just to account for counter overflow. On Gen8+ the loop would likely block until the next context switch happens. Was this proposal based on a code inspection or did you maybe hit a problem correctly measuring something? Maybe if based on inspection we can find somewhere to put a comment to clarify the assumptions above? Br, - Robert Signed-off-by: Lionel Landwerlin Cc: Robert Bragg --- src/mesa/drivers/dri/i965/brw_performance_query.c | 51 ++- 1 file changed, 32 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_performance_query.c b/src/mesa/drivers/dri/i965/brw_performance_query.c index 4a94e4b3cc..076c59e633 100644 --- a/src/mesa/drivers/dri/i965/brw_performance_query.c +++ b/src/mesa/drivers/dri/i965/brw_performance_query.c @@ -647,38 +647,50 @@ discard_all_queries(struct brw_context *brw) } static bool -read_oa_samples(struct brw_context *brw) +read_oa_samples_until(struct brw_context *brw, + uint32_t start_timestamp, + uint32_t end_timestamp) { - while (1) { + uint32_t last_timestamp = start_timestamp; + + while (last_timestamp < end_timestamp) { struct brw_oa_sample_buf *buf = get_free_sample_buf(brw); + uint32_t offset; int len; while ((len = read(brw->perfquery.oa_stream_fd, buf->buf, - sizeof(buf->buf))) < 0 && errno == EINTR) + sizeof(buf->buf))) < 0 && + (errno == EINTR || errno == EAGAIN)) ; if (len <= 0) { exec_list_push_tail(>perfquery.free_sample_buffers, >link); - if (len < 0) { -if (errno == EAGAIN) - return true; -else { - DBG("Error reading i915 perf samples: %m\n"); - return false; -} - } else { + if (len < 0) +DBG("Error reading i915 perf samples: %m\n"); + else DBG("Spurious EOF reading i915 perf samples\n"); -return false; - } + + return false; } buf->len = len; exec_list_push_tail(>perfquery.sample_buffers, >link); + + /* Go through the reports and update the last timestamp. */ + while (offset < buf->len) { + const struct drm_i915_perf_record_header *header = +(const struct drm_i915_perf_record_header *) >buf[offset]; + uint32_t *report = (uint32_t *) (header + 1); + + if (header->type == DRM_I915_PERF_RECORD_SAMPLE) +last_timestamp = report[1]; + + offset += header->size; + } } - unreachable("not reached"); - return false; + return true; } /** @@ -709,10 +721,6 @@ accumulate_oa_reports(struct brw_context *brw, assert(o->Ready); - /* Collect the latest periodic OA reports from i915 perf */ - if (!read_oa_samples(brw)) - goto error; - drm_intel_bo_map(obj->oa.bo, false); query_buffer = obj->oa.bo->virtual; @@ -728,6 +736,11 @@ accumulate_oa_reports(struct brw_context *brw, goto error; } + /* Collect the latest periodic OA reports from i915 perf */ + if (!read_oa_samples_until(brw, start[1], end[1])) + goto error; + + /* See if we have any periodic reports to accumulate too... */ /* N.B. The oa.samples_head was set when the query began and -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radv: overhaul fragment shader sample positions.
Series is Reviewed-by: Bas NieuwenhuizenNot really a fan of yet another flag in the command buffer, but not sure what would be the best solution. On Mon, Apr 3, 2017 at 5:44 AM, Dave Airlie wrote: > From: Dave Airlie > > The current code was broken, and I decided to redesign it instead. > > This puts the sample positions for all samples into the queue > constant descriptor buffer after all the spill/ring descriptors. > > It then uses a single offset register to point how far into the > samples the samples for num_samples are. This saves one user sgpr > and means we only generate the sample position data in the rare > single case where we need it currently. > > This doesn't fix the failing CTS tests without the followup > fix. > > Signed-off-by: Dave Airlie > --- > src/amd/common/ac_nir_to_llvm.c | 31 +++- > src/amd/common/ac_nir_to_llvm.h | 4 ++- > src/amd/vulkan/radv_cmd_buffer.c | 61 > > src/amd/vulkan/radv_device.c | 40 ++ > src/amd/vulkan/radv_private.h| 2 ++ > 5 files changed, 87 insertions(+), 51 deletions(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c > index 520e4cf..aabcabe 100644 > --- a/src/amd/common/ac_nir_to_llvm.c > +++ b/src/amd/common/ac_nir_to_llvm.c > @@ -109,7 +109,7 @@ struct nir_to_llvm_context { > LLVMValueRef hs_ring_tess_factor; > > LLVMValueRef prim_mask; > - LLVMValueRef sample_positions; > + LLVMValueRef sample_pos_offset; > LLVMValueRef persp_sample, persp_center, persp_centroid; > LLVMValueRef linear_sample, linear_center, linear_centroid; > LLVMValueRef front_face; > @@ -573,6 +573,7 @@ static void create_function(struct nir_to_llvm_context > *ctx) > ctx->stage == MESA_SHADER_VERTEX || > ctx->stage == MESA_SHADER_TESS_CTRL || > ctx->stage == MESA_SHADER_TESS_EVAL || > + ctx->stage == MESA_SHADER_FRAGMENT || > ctx->is_gs_copy_shader) > need_ring_offsets = true; > > @@ -584,7 +585,7 @@ static void create_function(struct nir_to_llvm_context > *ctx) > need_push_constants = false; > > if (need_ring_offsets && !ctx->options->supports_spill) { > - arg_types[arg_idx++] = const_array(ctx->v16i8, 8); /* address > of rings */ > + arg_types[arg_idx++] = const_array(ctx->v16i8, 16); /* > address of rings */ > } > > /* 1 for each descriptor set */ > @@ -679,7 +680,7 @@ static void create_function(struct nir_to_llvm_context > *ctx) > arg_types[arg_idx++] = ctx->i32; // GS instance id > break; > case MESA_SHADER_FRAGMENT: > - arg_types[arg_idx++] = const_array(ctx->f32, 32); /* sample > positions */ > + arg_types[arg_idx++] = ctx->i32; /* sample position offset */ > user_sgpr_count = arg_idx; > arg_types[arg_idx++] = ctx->i32; /* prim mask */ > sgpr_count = arg_idx; > @@ -735,7 +736,7 @@ static void create_function(struct nir_to_llvm_context > *ctx) > > LLVMPointerType(ctx->i8, CONST_ADDR_SPACE), >NULL, 0, > AC_FUNC_ATTR_READNONE); > ctx->ring_offsets = LLVMBuildBitCast(ctx->builder, > ctx->ring_offsets, > - > const_array(ctx->v16i8, 8), ""); > + > const_array(ctx->v16i8, 16), ""); > } else > ctx->ring_offsets = LLVMGetParam(ctx->main_function, > arg_idx++); > } > @@ -844,9 +845,9 @@ static void create_function(struct nir_to_llvm_context > *ctx) > ctx->gs_invocation_id = LLVMGetParam(ctx->main_function, > arg_idx++); > break; > case MESA_SHADER_FRAGMENT: > - set_userdata_location_shader(ctx, AC_UD_PS_SAMPLE_POS, > user_sgpr_idx, 2); > - user_sgpr_idx += 2; > - ctx->sample_positions = LLVMGetParam(ctx->main_function, > arg_idx++); > + set_userdata_location_shader(ctx, AC_UD_PS_SAMPLE_POS_OFFSET, > user_sgpr_idx, 1); > + user_sgpr_idx += 1; > + ctx->sample_pos_offset = LLVMGetParam(ctx->main_function, > arg_idx++); > ctx->prim_mask = LLVMGetParam(ctx->main_function, arg_idx++); > ctx->persp_sample = LLVMGetParam(ctx->main_function, > arg_idx++); > ctx->persp_center = LLVMGetParam(ctx->main_function, > arg_idx++); > @@ -3518,15 +3519,17 @@ static LLVMValueRef lookup_interp_param(struct > nir_to_llvm_context *ctx, > static LLVMValueRef
[Mesa-dev] [PATCH] radv: Enable VK_KHR_incremental_present.
Just enabling the driver-independent implementation that Jason did. Signed-off-by: Bas Nieuwenhuizen--- src/amd/vulkan/radv_device.c | 4 src/amd/vulkan/radv_entrypoints_gen.py | 1 + src/amd/vulkan/radv_wsi.c | 11 ++- 3 files changed, 15 insertions(+), 1 deletion(-) diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c index 5c48be1d11a..6456eb17f56 100644 --- a/src/amd/vulkan/radv_device.c +++ b/src/amd/vulkan/radv_device.c @@ -92,6 +92,10 @@ static const VkExtensionProperties instance_extensions[] = { static const VkExtensionProperties common_device_extensions[] = { { + .extensionName = VK_KHR_INCREMENTAL_PRESENT_EXTENSION_NAME, + .specVersion = 1, + }, + { .extensionName = VK_KHR_MAINTENANCE1_EXTENSION_NAME, .specVersion = 1, }, diff --git a/src/amd/vulkan/radv_entrypoints_gen.py b/src/amd/vulkan/radv_entrypoints_gen.py index b7b2bcf97e4..bee29dc1202 100644 --- a/src/amd/vulkan/radv_entrypoints_gen.py +++ b/src/amd/vulkan/radv_entrypoints_gen.py @@ -31,6 +31,7 @@ supported_extensions = [ 'VK_AMD_draw_indirect_count', 'VK_NV_dedicated_allocation', 'VK_KHR_get_physical_device_properties2', + 'VK_KHR_incremental_present', 'VK_KHR_maintenance1', 'VK_KHR_sampler_mirror_clamp_to_edge', 'VK_KHR_shader_draw_parameters', diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c index 8b66095b263..b8999f4eb02 100644 --- a/src/amd/vulkan/radv_wsi.c +++ b/src/amd/vulkan/radv_wsi.c @@ -26,6 +26,7 @@ #include "radv_private.h" #include "radv_meta.h" #include "wsi_common.h" +#include "util/vk_util.h" static const struct wsi_callbacks wsi_cbs = { .get_phys_device_format_properties = radv_GetPhysicalDeviceFormatProperties, @@ -452,9 +453,14 @@ VkResult radv_QueuePresentKHR( RADV_FROM_HANDLE(radv_queue, queue, _queue); VkResult result = VK_SUCCESS; + const VkPresentRegionsKHR *regions = +vk_find_struct_const(pPresentInfo->pNext, PRESENT_REGIONS_KHR); + for (uint32_t i = 0; i < pPresentInfo->swapchainCount; i++) { RADV_FROM_HANDLE(wsi_swapchain, swapchain, pPresentInfo->pSwapchains[i]); struct radeon_winsys_cs *cs; + const VkPresentRegionKHR *region = NULL; + assert(radv_device_from_handle(swapchain->device) == queue->device); if (swapchain->fences[0] == VK_NULL_HANDLE) { result = radv_CreateFence(radv_device_to_handle(queue->device), @@ -484,9 +490,12 @@ VkResult radv_QueuePresentKHR( pPresentInfo->waitSemaphoreCount, NULL, 0, false, base_fence); fence->submitted = true; + if (regions && regions->pRegions) + region = >pRegions[i]; + result = swapchain->queue_present(swapchain, pPresentInfo->pImageIndices[i], - NULL); + region); /* TODO: What if one of them returns OUT_OF_DATE? */ if (result != VK_SUCCESS) return result; -- 2.12.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache
On Fri 31 Mar 2017, Chad Versace wrote: > On Wed 15 Mar 2017, Jason Ekstrand wrote: > > This cache allows us to easily ensure that we have a unique anv_bo for > > each gem handle. We'll need this in order to support multiple-import of > > memory objects and semaphores. > > > > v2 (Jason Ekstrand): > > - Reject BO imports if the size doesn't match the prime fd size as > >reported by lseek(). > > > > v3 (Jason Ekstrand): > > - Fix reference counting around cache_release (Chris Willson) > > - Move the mutex_unlock() later in cache_release > > --- > > src/intel/vulkan/anv_allocator.c | 261 > > +++ > > src/intel/vulkan/anv_private.h | 26 > > 2 files changed, 287 insertions(+) > > > > +static uint32_t > > +hash_uint32_t(const void *key) > > +{ > > + return (uint32_t)(uintptr_t)key; > > +} > > This hash function does not appear hashy. > > If I correctly understand the details of Mesa's struct hash_table, > choosing the identify function for the hash function causes unwanted > clustering when inserting consecutive gem handles. Since the kernel does > allocate gem handles consecutively, the problem is real. > > For proof, consider the following: > >- Suppose a long-running process (maybe the compositor) has thrashed on the > hash table long enough that its bucket count > is ht->size = hash_sizes[7].size = 283. Suppose a spike of > compositor activity raises the hash table's density to about 0.5. > And suppose the hash table buckets are filled with the consecutive gem > handles > > {0, 0, 0, 0, 4, 5, 6, 7, 8, 9, ..., 127, 128, 0, 0, 0, ..., 0 } > > The exact density is (128 - 4 + 1) / 283 = 0.4417. > >- Next, some other in-process activity (maybe OpenGL) generated > a lot of gem handles after Vulkan's most recently imported > gem handle, 128. This point in the example---the reason why the gem handles in the anv_bo_cache skip from 128 to 287---is bogus in Vulkan. The problem *is* real for multiple in-process OpenGL contexts derived from the same EGLDisplay, using EGL_EXT_image_dma_buf_import, because each context shares the same intel_screen, and therefore the same drm device fd. But in Vulkan, each VkDevice opens its own drm device id. So, bogus example. BUT, that leads to a new question... Since each VkDevice has a unique drm device fd, and since the kernel allocates gem handles consecutively on the fd, and since struct hash_table only grows and never shrinks, and since patch 8/18 inserts every VkDeviceMemory into the cache... I believe no collisions are possible in anv_bo_cache. If there are no collisions, then the hash table is only adding overhead, and we should use a direct-addressing lookup table. The bo cache should look like this: struct anv_bo_cache { /* The array indices are gem handles. Null entries are legal. */ struct anv_bo **bos; /* Length of the array. Because the array can have holes, this * is *not* the number of gem handles in the array. */ size_t len; pthread_mutex_t mutex; }; struct anv_bo * anv_bo_cache_lookup(struct anv_bo_cache *cache, uint32_t gem_handle) { struct anv_bo *bo = NULL; pthread_mutex_lock(>mutex); if (gem_handle < cache->len) bo = cache->entries[gem_handle] == NULL) pthread_mutex_unlock(>mutex); return bo; } BUT, that leads to yet another question... Why is patch 8/18 inserting every VkDeviceMemory into the cache? If I understand things correctly, we only *need* to insert a VkDeviceMemory into the cache if, in vkAllocateMemory, either (1) VkExportMemoryAllocateInfoKHX::handleTypes != 0 or (2) VkMemoryAllocateInfo's pNext chain contains an import structure. If we insert into the cache only those VkDeviceMemories that are imported or that will be exported, then the bo cache remains small, and we *should* use a hash table. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radv/ac: round cube array coordinate before fixup.
Series is: Reviewed-by: Bas NieuwenhuizenOn Mon, Apr 3, 2017 at 5:46 AM, Dave Airlie wrote: > From: Dave Airlie > > This fixes: > dEQP-VK.glsl.texture_functions.texture.samplercubearray* > > Signed-off-by: Dave Airlie > --- > src/amd/common/ac_nir_to_llvm.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c > index 6985371..adba539 100644 > --- a/src/amd/common/ac_nir_to_llvm.c > +++ b/src/amd/common/ac_nir_to_llvm.c > @@ -4237,6 +4237,8 @@ static void visit_tex(struct nir_to_llvm_context *ctx, > nir_tex_instr *instr) > } > > if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE && coord) { > + if (instr->is_array && instr->op != nir_texop_lod) > + coords[3] = apply_round_slice(ctx, coords[3]); > for (chan = 0; chan < instr->coord_components; chan++) > coords[chan] = to_float(ctx, coords[chan]); > if (instr->coord_components == 3) > @@ -4264,7 +4266,9 @@ static void visit_tex(struct nir_to_llvm_context *ctx, > nir_tex_instr *instr) > } > if (instr->coord_components > 2) { > /* This seems like a bit of a hack - but it passes > Vulkan CTS with it */ > - if (instr->sampler_dim != GLSL_SAMPLER_DIM_3D && > instr->op != nir_texop_txf) { > + if (instr->sampler_dim != GLSL_SAMPLER_DIM_3D && > + instr->sampler_dim != GLSL_SAMPLER_DIM_CUBE && > + instr->op != nir_texop_txf) { > coords[2] = apply_round_slice(ctx, coords[2]); > } > address[count++] = coords[2]; > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation
--- src/mesa/main/glthread.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c index 3f07c420d4..c4d3f4a434 100644 --- a/src/mesa/main/glthread.c +++ b/src/mesa/main/glthread.c @@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx) } static void -glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch) +glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch, + const bool release_batch) { size_t pos = 0; @@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch) assert(pos == batch->used); - free(batch); + if (release_batch) + free(batch); + else + batch->used = 0; } static void * @@ -103,7 +107,7 @@ glthread_worker(void *data) glthread->busy = true; pthread_mutex_unlock(>mutex); - glthread_unmarshal_batch(ctx, batch); + glthread_unmarshal_batch(ctx, batch, true); pthread_mutex_lock(>mutex); glthread->busy = false; @@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct gl_context *ctx) * need to restore it when it returns. */ if (false) { - glthread_unmarshal_batch(ctx, batch); + glthread_unmarshal_batch(ctx, batch, true); _glapi_set_dispatch(ctx->CurrentClientDispatch); return; } @@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx) if (!(glthread->batch_queue || glthread->busy)) { if (glthread->batch && glthread->batch->used) { struct _glapi_table *dispatch = _glapi_get_dispatch(); - glthread_unmarshal_batch(ctx, glthread->batch); + glthread_unmarshal_batch(ctx, glthread->batch, false); _glapi_set_dispatch(dispatch); - glthread_allocate_batch(ctx); } } else { _mesa_glthread_flush_batch_locked(ctx); -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radv: move to using common buffer load format.
Reviewed-by: Bas NieuwenhuizenOn Mon, Apr 3, 2017 at 8:57 PM, Dave Airlie wrote: > From: Dave Airlie > > Get rid of usage of SI.vs.load.input. > > Signed-off-by: Dave Airlie > --- > src/amd/common/ac_nir_to_llvm.c | 13 + > 1 file changed, 5 insertions(+), 8 deletions(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c > index 520e4cf..da38331 100644 > --- a/src/amd/common/ac_nir_to_llvm.c > +++ b/src/amd/common/ac_nir_to_llvm.c > @@ -4564,7 +4564,6 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx, > LLVMValueRef t_list_ptr = ctx->vertex_buffers; > LLVMValueRef t_offset; > LLVMValueRef t_list; > - LLVMValueRef args[3]; > LLVMValueRef input; > LLVMValueRef buffer_index; > int index = variable->data.location - VERT_ATTRIB_GENERIC0; > @@ -4586,13 +4585,11 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx, > t_offset = LLVMConstInt(ctx->i32, index + i, false); > > t_list = ac_build_indexed_load_const(>ac, t_list_ptr, > t_offset); > - args[0] = t_list; > - args[1] = LLVMConstInt(ctx->i32, 0, false); > - args[2] = buffer_index; > - input = ac_build_intrinsic(>ac, > - "llvm.SI.vs.load.input", ctx->v4f32, args, 3, > - AC_FUNC_ATTR_READNONE | AC_FUNC_ATTR_NOUNWIND | > - AC_FUNC_ATTR_LEGACY); > + > + input = ac_build_buffer_load_format(>ac, t_list, > + buffer_index, > + LLVMConstInt(ctx->i32, 0, > false), > + true); > > for (unsigned chan = 0; chan < 4; chan++) { > LLVMValueRef llvm_chan = LLVMConstInt(ctx->i32, chan, > false); > -- > 2.9.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] [RFC] mesa/glthread: misaligned address access
Address sanitizer reports lot of misaligned access: SUMMARY: AddressSanitizer: undefined-behavior main/marshal.c:276:31 in main/marshal.c:276:31: runtime error: load of misaligned address 0x631000104866 for type 'const GLuint' (aka 'const unsigned int'), which requires 4 byte alignment 0x631000104866: note: pointer points here 92 88 00 00 00 00 00 00 4a 03 0c 00 93 88 00 00 00 00 00 00 02 01 0c 00 40 8d 00 00 00 00 00 00 ^ SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28725:12 in main/marshal_generated.c:28726:12: runtime error: member access within misaligned address 0x6310003fc874 for type 'struct marshal_cmd_VertexAttribPointer', which requires 8 byte alignment 0x6310003fc874: note: pointer points here 01 00 00 00 7a 02 20 00 00 00 00 00 be be be be be be be be be be be be be be be be be be be be ^ SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28726:12 in main/marshal_generated.c:28726:12: runtime error: store to misaligned address 0x6310003fc87c for type 'GLint' (aka 'int'), which requires 8 byte alignment 0x6310003fc87c: note: pointer points here 00 00 00 00 be be be be be be be be be be be be be be be be be be be be be be be be be be be be This probably isn't issue for x86 as misaligned access shoul be as fast as aligned ones. But this could be issue for different architectures like ARM/MIPS, any exert here ? Any idea how to get correct aligment on different platforms ? --- src/mesa/main/marshal.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/marshal.h b/src/mesa/main/marshal.h index 2f1509b2d5..4842d27eeb 100644 --- a/src/mesa/main/marshal.h +++ b/src/mesa/main/marshal.h @@ -32,6 +32,7 @@ #include "main/glthread.h" #include "main/context.h" +#include "main/macros.h" struct marshal_cmd_base { @@ -55,15 +56,16 @@ _mesa_glthread_allocate_command(struct gl_context *ctx, { struct glthread_state *glthread = ctx->GLThread; struct marshal_cmd_base *cmd_base; + const size_t aligned_size = ALIGN(size, 8); if (unlikely(glthread->batch->used + size > MARSHAL_MAX_CMD_SIZE)) _mesa_glthread_flush_batch(ctx); cmd_base = (struct marshal_cmd_base *) >batch->buffer[glthread->batch->used]; - glthread->batch->used += size; + glthread->batch->used += aligned_size; cmd_base->cmd_id = cmd_id; - cmd_base->cmd_size = size; + cmd_base->cmd_size = aligned_size; return cmd_base; } -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation
On 03.04.2017 20:53, Bartosz Tomczyk wrote: Actually, I can just set only batch->used to 0, but it seems to error prone. When someone adds some fields to batch struct, it will be easy to miss that it should be initialized in glthread_unmarshal_batch. Better to have it fail early and loudly with garbage data, rather than silently if 0 happens to be an accepted value. I think it should be changed (unless there's a good reason to rely on it on purpose, but I don't see one...). Cheers, Nicolai Anyway I can change it if you want to. On Mon, Apr 3, 2017 at 8:42 PM, Nicolai Hähnle> wrote: On 03.04.2017 20:38, Bartosz Tomczyk wrote: --- src/mesa/main/glthread.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c index 3f07c420d4..aa14292e59 100644 --- a/src/mesa/main/glthread.c +++ b/src/mesa/main/glthread.c @@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx) } static void -glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch) +glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch, + const bool release_batch) { size_t pos = 0; @@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch) assert(pos == batch->used); - free(batch); + if (release_batch) + free(batch); + else + memset(batch, 0, offsetof(struct glthread_batch, buffer)); Hmm. Why do we memset the batch? Seems like a trivial optimization to just not do that... Apart from that, the patch looks good to me. Cheers, Nicolai } static void * @@ -103,7 +107,7 @@ glthread_worker(void *data) glthread->busy = true; pthread_mutex_unlock(>mutex); - glthread_unmarshal_batch(ctx, batch); + glthread_unmarshal_batch(ctx, batch, true); pthread_mutex_lock(>mutex); glthread->busy = false; @@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct gl_context *ctx) * need to restore it when it returns. */ if (false) { - glthread_unmarshal_batch(ctx, batch); + glthread_unmarshal_batch(ctx, batch, true); _glapi_set_dispatch(ctx->CurrentClientDispatch); return; } @@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx) if (!(glthread->batch_queue || glthread->busy)) { if (glthread->batch && glthread->batch->used) { struct _glapi_table *dispatch = _glapi_get_dispatch(); - glthread_unmarshal_batch(ctx, glthread->batch); + glthread_unmarshal_batch(ctx, glthread->batch, false); _glapi_set_dispatch(dispatch); - glthread_allocate_batch(ctx); } } else { _mesa_glthread_flush_batch_locked(ctx); -- Lerne, wie die Welt wirklich ist, Aber vergiss niemals, wie sie sein sollte. -- Lerne, wie die Welt wirklich ist, Aber vergiss niemals, wie sie sein sollte. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radv: move to using common buffer load format.
From: Dave AirlieGet rid of usage of SI.vs.load.input. Signed-off-by: Dave Airlie --- src/amd/common/ac_nir_to_llvm.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 520e4cf..da38331 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -4564,7 +4564,6 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx, LLVMValueRef t_list_ptr = ctx->vertex_buffers; LLVMValueRef t_offset; LLVMValueRef t_list; - LLVMValueRef args[3]; LLVMValueRef input; LLVMValueRef buffer_index; int index = variable->data.location - VERT_ATTRIB_GENERIC0; @@ -4586,13 +4585,11 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx, t_offset = LLVMConstInt(ctx->i32, index + i, false); t_list = ac_build_indexed_load_const(>ac, t_list_ptr, t_offset); - args[0] = t_list; - args[1] = LLVMConstInt(ctx->i32, 0, false); - args[2] = buffer_index; - input = ac_build_intrinsic(>ac, - "llvm.SI.vs.load.input", ctx->v4f32, args, 3, - AC_FUNC_ATTR_READNONE | AC_FUNC_ATTR_NOUNWIND | - AC_FUNC_ATTR_LEGACY); + + input = ac_build_buffer_load_format(>ac, t_list, + buffer_index, + LLVMConstInt(ctx->i32, 0, false), + true); for (unsigned chan = 0; chan < 4; chan++) { LLVMValueRef llvm_chan = LLVMConstInt(ctx->i32, chan, false); -- 2.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation
Actually, I can just set only batch->used to 0, but it seems to error prone. When someone adds some fields to batch struct, it will be easy to miss that it should be initialized in glthread_unmarshal_batch. Anyway I can change it if you want to. On Mon, Apr 3, 2017 at 8:42 PM, Nicolai Hähnlewrote: > On 03.04.2017 20:38, Bartosz Tomczyk wrote: > >> --- >> src/mesa/main/glthread.c | 15 +-- >> 1 file changed, 9 insertions(+), 6 deletions(-) >> >> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c >> index 3f07c420d4..aa14292e59 100644 >> --- a/src/mesa/main/glthread.c >> +++ b/src/mesa/main/glthread.c >> @@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx) >> } >> >> static void >> -glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch >> *batch) >> +glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch >> *batch, >> + const bool release_batch) >> { >> size_t pos = 0; >> >> @@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context *ctx, >> struct glthread_batch *batch) >> >> assert(pos == batch->used); >> >> - free(batch); >> + if (release_batch) >> + free(batch); >> + else >> + memset(batch, 0, offsetof(struct glthread_batch, buffer)); >> > > Hmm. Why do we memset the batch? Seems like a trivial optimization to just > not do that... > > Apart from that, the patch looks good to me. > > Cheers, > Nicolai > > > > } >> >> static void * >> @@ -103,7 +107,7 @@ glthread_worker(void *data) >>glthread->busy = true; >>pthread_mutex_unlock(>mutex); >> >> - glthread_unmarshal_batch(ctx, batch); >> + glthread_unmarshal_batch(ctx, batch, true); >> >>pthread_mutex_lock(>mutex); >>glthread->busy = false; >> @@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct gl_context >> *ctx) >> * need to restore it when it returns. >> */ >> if (false) { >> - glthread_unmarshal_batch(ctx, batch); >> + glthread_unmarshal_batch(ctx, batch, true); >>_glapi_set_dispatch(ctx->CurrentClientDispatch); >>return; >> } >> @@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx) >> if (!(glthread->batch_queue || glthread->busy)) { >>if (glthread->batch && glthread->batch->used) { >> struct _glapi_table *dispatch = _glapi_get_dispatch(); >> - glthread_unmarshal_batch(ctx, glthread->batch); >> + glthread_unmarshal_batch(ctx, glthread->batch, false); >> _glapi_set_dispatch(dispatch); >> - glthread_allocate_batch(ctx); >>} >> } else { >>_mesa_glthread_flush_batch_locked(ctx); >> >> > > -- > Lerne, wie die Welt wirklich ist, > Aber vergiss niemals, wie sie sein sollte. > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] radeonsi: use i32_0/1 instead of *int_bld.zero/one in most places
On 3 April 2017 at 19:52, Marek Olšákwrote: > From: Marek Olšák radv uses i32zero and i32one, it might be nice to be consistent, but I don't mind which way. Dave. > > --- > src/gallium/drivers/radeonsi/si_shader.c | 88 > ++ > .../drivers/radeonsi/si_shader_tgsi_setup.c| 14 ++-- > 2 files changed, 47 insertions(+), 55 deletions(-) > > diff --git a/src/gallium/drivers/radeonsi/si_shader.c > b/src/gallium/drivers/radeonsi/si_shader.c > index 21efd9a..a5d370b 100644 > --- a/src/gallium/drivers/radeonsi/si_shader.c > +++ b/src/gallium/drivers/radeonsi/si_shader.c > @@ -538,38 +538,38 @@ static void declare_input_vs( > break; > } > } > > static LLVMValueRef get_primitive_id(struct lp_build_tgsi_context *bld_base, > unsigned swizzle) > { > struct si_shader_context *ctx = si_shader_context(bld_base); > > if (swizzle > 0) > - return bld_base->uint_bld.zero; > + return ctx->i32_0; > > switch (ctx->type) { > case PIPE_SHADER_VERTEX: > return LLVMGetParam(ctx->main_fn, > ctx->param_vs_prim_id); > case PIPE_SHADER_TESS_CTRL: > return LLVMGetParam(ctx->main_fn, > SI_PARAM_PATCH_ID); > case PIPE_SHADER_TESS_EVAL: > return LLVMGetParam(ctx->main_fn, > ctx->param_tes_patch_id); > case PIPE_SHADER_GEOMETRY: > return LLVMGetParam(ctx->main_fn, > SI_PARAM_PRIMITIVE_ID); > default: > assert(0); > - return bld_base->uint_bld.zero; > + return ctx->i32_0; > } > } > > /** > * Return the value of tgsi_ind_register for indexing. > * This is the indirect index with the constant offset added to it. > */ > static LLVMValueRef get_indirect_index(struct si_shader_context *ctx, >const struct tgsi_ind_register *ind, >int rel_index) > @@ -1096,28 +1096,28 @@ static LLVMValueRef fetch_input_gs( > vtx_offset_param += SI_PARAM_VTX2_OFFSET - 2; > } > vtx_offset = lp_build_mul_imm(uint, > LLVMGetParam(ctx->main_fn, >vtx_offset_param), > 4); > > param = si_shader_io_get_unique_index(semantic_name, semantic_index); > soffset = LLVMConstInt(ctx->i32, (param * 4 + swizzle) * 256, 0); > > - value = ac_build_buffer_load(>ac, ctx->esgs_ring, 1, uint->zero, > + value = ac_build_buffer_load(>ac, ctx->esgs_ring, 1, ctx->i32_0, > vtx_offset, soffset, 0, 1, 0, true); > if (tgsi_type_is_64bit(type)) { > LLVMValueRef value2; > soffset = LLVMConstInt(ctx->i32, (param * 4 + swizzle + 1) * > 256, 0); > > value2 = ac_build_buffer_load(>ac, ctx->esgs_ring, 1, > - uint->zero, vtx_offset, soffset, > + ctx->i32_0, vtx_offset, soffset, > 0, 1, 0, true); > return si_llvm_emit_fetch_64bit(bld_base, type, > value, value2); > } > return LLVMBuildBitCast(gallivm->builder, > value, > tgsi2llvmtype(bld_base, type), ""); > } > > static int lookup_interp_param_index(unsigned interpolate, unsigned location) > @@ -1169,21 +1169,20 @@ static void interp_fs_input(struct si_shader_context > *ctx, > unsigned semantic_index, > unsigned num_interp_inputs, > unsigned colors_read_mask, > LLVMValueRef interp_param, > LLVMValueRef prim_mask, > LLVMValueRef face, > LLVMValueRef result[4]) > { > struct lp_build_tgsi_context *bld_base = >bld_base; > struct lp_build_context *base = _base->base; > - struct lp_build_context *uint = _base->uint_bld; > struct gallivm_state *gallivm = base->gallivm; > LLVMValueRef attr_number; > LLVMValueRef i, j; > > unsigned chan; > > /* fs.constant returns the param from the middle vertex, so it's not > * really useful for flat shading. It's meant to be used for custom > * interpolation (but the intrinsic can't fetch from the other two > * vertices). > @@ -1198,41 +1197,41 @@ static void interp_fs_input(struct si_shader_context > *ctx, >
Re: [Mesa-dev] [PATCH] travis: Support LLVM 3.8+ on Trusty-based Travis-CI via apt-get not apt addon
On 2 April 2017 at 21:48, Rhys Kiddwrote: > Per comments by Travis-CI, the apt addon is only really needed for the > container-based Precise builds, as they don't yet support Trusty on that > platform. > > Mesa currently uses Trusty fully-virtualized environment (due to sudo: > required). > > See further: > https://docs.travis-ci.com/user/trusty-ci-environment/#Fully-virtualized-via-sudo%3A-required > https://github.com/travis-ci/apt-source-whitelist/pull/205#issuecomment-216054237 > > Signed-off-by: Rhys Kidd You're a start Rhys, thank you ! Reviewed-by: Emil Velikov and pushed to master. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support
Hi Rob, On 24 March 2017 at 21:21, Rob Clarkwrote: > It's kinda sad that (a) we don't have debug_backtrace support on !X86 > and that (b) we re-invent our own crude backtrace support in the first > place. If available, use libunwind instead. The backtrace format is > based on what xserver and weston use, since it is nice not to have to > figure out a different format. > > Signed-off-by: Rob Clark > --- > configure.ac | 24 > src/gallium/Automake.inc | 1 + > src/gallium/auxiliary/util/u_debug_stack.c | 91 > ++ > src/gallium/auxiliary/util/u_debug_stack.h | 15 - > 4 files changed, 129 insertions(+), 2 deletions(-) > > diff --git a/configure.ac b/configure.ac > index a99684b..5046acb 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -1025,6 +1025,30 @@ AC_SUBST([LLVM_LIBS]) > AC_SUBST([LLVM_LDFLAGS]) > AC_SUBST([LLVM_INCLUDEDIR]) > > +dnl > +dnl libunwind > +dnl > +AC_ARG_ENABLE([libunwind], > +[AS_HELP_STRING([--enable-libunwind], > +[Use libunwind for backtracing (default: auto)])], > +[LIBUNWIND="$enableval"], > +[LIBUNWIND="auto"]) > + > +PKG_CHECK_MODULES(LIBUNWIND, libunwind, [HAVE_LIBUNWIND=yes], > [HAVE_LIBUNWIND=no]) > +if test "x$LIBUNWIND" = "xauto"; then > +LIBUNWIND="$HAVE_LIBUNWIND" > +fi > + > +if test "x$LIBUNWIND" = "xyes"; then > +if test "x$HAVE_LIBUNWIND" != "xyes"; then > +AC_MSG_ERROR([libunwind requested but not installed.]) > +fi > +AC_DEFINE(HAVE_LIBUNWIND, 1, [Have libunwind support]) > +fi > + > +AM_CONDITIONAL(HAVE_LIBUNWIND, [test "x$LIBUNWIND" = xyes]) > + > + > dnl Options for APIs > AC_ARG_ENABLE([opengl], > [AS_HELP_STRING([--disable-opengl], > diff --git a/src/gallium/Automake.inc b/src/gallium/Automake.inc > index a01fa54..48b5a44 100644 > --- a/src/gallium/Automake.inc > +++ b/src/gallium/Automake.inc > @@ -46,6 +46,7 @@ GALLIUM_TARGET_CFLAGS = \ > > GALLIUM_COMMON_LIB_DEPS = \ > -lm \ > + $(LIBUNWIND_LIBS) \ We're using the LIBUNWIND_LIBS but LIBUNWIND_CFLAGS is missing. Please it to src/gallium/auxiliary/Makefile.am's AM_CFLAGS > diff --git a/src/gallium/auxiliary/util/u_debug_stack.h > b/src/gallium/auxiliary/util/u_debug_stack.h > index 04eba08..0effcbe 100644 > --- a/src/gallium/auxiliary/util/u_debug_stack.h > +++ b/src/gallium/auxiliary/util/u_debug_stack.h > @@ -30,6 +30,11 @@ > > #include > > +#ifdef HAVE_LIBUNWIND > +#define UNW_LOCAL_ONLY > +#include > +#endif > + This hunk is not needed in the header. Please move it to the C file. With the above Reviewed-by: Emil Velikov -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation
On 03.04.2017 20:38, Bartosz Tomczyk wrote: --- src/mesa/main/glthread.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c index 3f07c420d4..aa14292e59 100644 --- a/src/mesa/main/glthread.c +++ b/src/mesa/main/glthread.c @@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx) } static void -glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch) +glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch, + const bool release_batch) { size_t pos = 0; @@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch) assert(pos == batch->used); - free(batch); + if (release_batch) + free(batch); + else + memset(batch, 0, offsetof(struct glthread_batch, buffer)); Hmm. Why do we memset the batch? Seems like a trivial optimization to just not do that... Apart from that, the patch looks good to me. Cheers, Nicolai } static void * @@ -103,7 +107,7 @@ glthread_worker(void *data) glthread->busy = true; pthread_mutex_unlock(>mutex); - glthread_unmarshal_batch(ctx, batch); + glthread_unmarshal_batch(ctx, batch, true); pthread_mutex_lock(>mutex); glthread->busy = false; @@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct gl_context *ctx) * need to restore it when it returns. */ if (false) { - glthread_unmarshal_batch(ctx, batch); + glthread_unmarshal_batch(ctx, batch, true); _glapi_set_dispatch(ctx->CurrentClientDispatch); return; } @@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx) if (!(glthread->batch_queue || glthread->busy)) { if (glthread->batch && glthread->batch->used) { struct _glapi_table *dispatch = _glapi_get_dispatch(); - glthread_unmarshal_batch(ctx, glthread->batch); + glthread_unmarshal_batch(ctx, glthread->batch, false); _glapi_set_dispatch(dispatch); - glthread_allocate_batch(ctx); } } else { _mesa_glthread_flush_batch_locked(ctx); -- Lerne, wie die Welt wirklich ist, Aber vergiss niemals, wie sie sein sollte. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] intel: gen_decoder: store pointer to current decoded field in iterator
Both are Reviewed-by: Matt Turner___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation
--- src/mesa/main/glthread.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c index 3f07c420d4..aa14292e59 100644 --- a/src/mesa/main/glthread.c +++ b/src/mesa/main/glthread.c @@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx) } static void -glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch) +glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch, + const bool release_batch) { size_t pos = 0; @@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch) assert(pos == batch->used); - free(batch); + if (release_batch) + free(batch); + else + memset(batch, 0, offsetof(struct glthread_batch, buffer)); } static void * @@ -103,7 +107,7 @@ glthread_worker(void *data) glthread->busy = true; pthread_mutex_unlock(>mutex); - glthread_unmarshal_batch(ctx, batch); + glthread_unmarshal_batch(ctx, batch, true); pthread_mutex_lock(>mutex); glthread->busy = false; @@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct gl_context *ctx) * need to restore it when it returns. */ if (false) { - glthread_unmarshal_batch(ctx, batch); + glthread_unmarshal_batch(ctx, batch, true); _glapi_set_dispatch(ctx->CurrentClientDispatch); return; } @@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx) if (!(glthread->batch_queue || glthread->busy)) { if (glthread->batch && glthread->batch->used) { struct _glapi_table *dispatch = _glapi_get_dispatch(); - glthread_unmarshal_batch(ctx, glthread->batch); + glthread_unmarshal_batch(ctx, glthread->batch, false); _glapi_set_dispatch(dispatch); - glthread_allocate_batch(ctx); } } else { _mesa_glthread_flush_batch_locked(ctx); -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] bin/get-fixes-pick-list.sh: fix typo
Reviewed-by: Emil Velikov-Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] util: fix MSVC warning in u_align_u32()
On 3 April 2017 at 17:09, Brian Paulwrote: > To silence > C:\Users\Brian\projects\mesa\src\util/u_vector.h(41) : warning C4146: unary > minus operator applied to unsigned type, result still unsigned For the series: Reviewed-by: Emil Velikov -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radv/winsys: only workout color/depth levels if we have color/depth
Reviewed-by: Bas NieuwenhuizenOn Mon, Apr 3, 2017 at 7:17 AM, Dave Airlie wrote: > From: Dave Airlie > > This fixes an old bug that seems to get triggered by > dEQP-VK.memory.requirements.image.sparse_tiling_optimal > > We return early when allocating S8_UINT due to there being > no color or depth, and end up with image size of 0. > > Signed-off-by: Dave Airlie > --- > src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c | 48 > -- > 1 file changed, 26 insertions(+), 22 deletions(-) > > diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c > b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c > index 0433952..53ca6ff 100644 > --- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c > +++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c > @@ -481,28 +481,32 @@ static int radv_amdgpu_winsys_surface_init(struct > radeon_winsys *_ws, > surf->htile_size = surf->htile_slice_size = 0; > surf->htile_alignment = 1; > > - /* Calculate texture layout information. */ > - for (level = 0; level <= surf->last_level; level++) { > - r = radv_compute_level(ws->addrlib, surf, false, level, type, > compressed, > - , , > , ); > - if (r) > - return r; > - > - if (level == 0) { > - surf->bo_alignment = AddrSurfInfoOut.baseAlign; > - surf->pipe_config = > AddrSurfInfoOut.pTileInfo->pipeConfig - 1; > - radv_set_micro_tile_mode(surf, >info); > - > - /* For 2D modes only. */ > - if (AddrSurfInfoOut.tileMode >= > ADDR_TM_2D_TILED_THIN1) { > - surf->bankw = > AddrSurfInfoOut.pTileInfo->bankWidth; > - surf->bankh = > AddrSurfInfoOut.pTileInfo->bankHeight; > - surf->mtilea = > AddrSurfInfoOut.pTileInfo->macroAspectRatio; > - surf->tile_split = > AddrSurfInfoOut.pTileInfo->tileSplitBytes; > - surf->num_banks = > AddrSurfInfoOut.pTileInfo->banks; > - surf->macro_tile_index = > AddrSurfInfoOut.macroModeIndex; > - } else { > - surf->macro_tile_index = 0; > + if (AddrSurfInfoIn.flags.color || AddrSurfInfoIn.flags.depth) { > + /* Calculate texture layout information. */ > + for (level = 0; level <= surf->last_level; level++) { > + r = radv_compute_level(ws->addrlib, surf, false, > level, type, compressed, > + , > , , ); > + if (r) { > + assert(0); > + return r; > + } > + > + if (level == 0) { > + surf->bo_alignment = > AddrSurfInfoOut.baseAlign; > + surf->pipe_config = > AddrSurfInfoOut.pTileInfo->pipeConfig - 1; > + radv_set_micro_tile_mode(surf, >info); > + > + /* For 2D modes only. */ > + if (AddrSurfInfoOut.tileMode >= > ADDR_TM_2D_TILED_THIN1) { > + surf->bankw = > AddrSurfInfoOut.pTileInfo->bankWidth; > + surf->bankh = > AddrSurfInfoOut.pTileInfo->bankHeight; > + surf->mtilea = > AddrSurfInfoOut.pTileInfo->macroAspectRatio; > + surf->tile_split = > AddrSurfInfoOut.pTileInfo->tileSplitBytes; > + surf->num_banks = > AddrSurfInfoOut.pTileInfo->banks; > + surf->macro_tile_index = > AddrSurfInfoOut.macroModeIndex; > + } else { > + surf->macro_tile_index = 0; > + } > } > } > } > -- > 2.9.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] util/u_atomic: provide 64bit atomics where they're missing
On Thu, Mar 30, 2017 at 3:47 PM, Matt Turnerwrote: > On Thu, Mar 30, 2017 at 3:26 PM, Grazvydas Ignotas wrote: >> There are still some distributions trying to support unfortunate people >> with old or exotic CPUs that don't have 64bit atomic operations. When >> compiling for such a machine, gcc conveniently inserts a library call to >> a helper, but it's implementation is missing and we get a linker error. >> This allows us to provide our own implementation, which is marked weak >> to prefer a better implementation, should one exist. >> >> v2: changed copyright, some style adjustments >> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089 >> Signed-off-by: Grazvydas Ignotas >> Reviewed-by: Matt Turner > > Thanks, I'll commit this. > >> --- >> no commit access, but request sent: >> https://bugs.freedesktop.org/show_bug.cgi?id=100467 > > Thanks. I commented on the bug and said I approve :) I modified the patch slightly to print the results of the test with AC_MSG_CHECKING/AC_MSG_RESULT and pushed it as commit a6a38a038bd62e. I'll do some build tests on various architectures. I think I like the idea of this going to the stable 17.0 branch. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radv: Increase descriptor limits.
We supported more generally. Decreased the dynamic buffers though, as we only support 16 for uniform+storage. Signed-off-by: Bas Nieuwenhuizen--- src/amd/vulkan/radv_device.c | 28 ++-- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c index 5c48be1d11a..d3aac90468c 100644 --- a/src/amd/vulkan/radv_device.c +++ b/src/amd/vulkan/radv_device.c @@ -545,21 +545,21 @@ void radv_GetPhysicalDeviceProperties( .bufferImageGranularity = 64, /* A cache line */ .sparseAddressSpaceSize = 0xu, /* buffer max size */ .maxBoundDescriptorSets = MAX_SETS, - .maxPerStageDescriptorSamplers= 64, - .maxPerStageDescriptorUniformBuffers = 64, - .maxPerStageDescriptorStorageBuffers = 64, - .maxPerStageDescriptorSampledImages = 64, - .maxPerStageDescriptorStorageImages = 64, - .maxPerStageDescriptorInputAttachments= 64, - .maxPerStageResources = 128, + .maxPerStageDescriptorSamplers= (1u << 31) / 16, + .maxPerStageDescriptorUniformBuffers = (1u << 31) / 16, + .maxPerStageDescriptorStorageBuffers = (1u << 31) / 16, + .maxPerStageDescriptorSampledImages = (1u << 31) / 96, + .maxPerStageDescriptorStorageImages = (1u << 31) / 64, + .maxPerStageDescriptorInputAttachments= (1u << 31) / 64, + .maxPerStageResources = (1u << 31) / 32, .maxDescriptorSetSamplers = 256, - .maxDescriptorSetUniformBuffers = 256, - .maxDescriptorSetUniformBuffersDynamic= 256, - .maxDescriptorSetStorageBuffers = 256, - .maxDescriptorSetStorageBuffersDynamic= 256, - .maxDescriptorSetSampledImages= 256, - .maxDescriptorSetStorageImages= 256, - .maxDescriptorSetInputAttachments = 256, + .maxDescriptorSetUniformBuffers = (1u << 31) / 16, + .maxDescriptorSetUniformBuffersDynamic= 8, + .maxDescriptorSetStorageBuffers = (1u << 31) / 16, + .maxDescriptorSetStorageBuffersDynamic= 8, + .maxDescriptorSetSampledImages= (1u << 31) / 96, + .maxDescriptorSetStorageImages= (1u << 31) / 64, + .maxDescriptorSetInputAttachments = (1u << 31) / 64, .maxVertexInputAttributes = 32, .maxVertexInputBindings = 32, .maxVertexInputAttributeOffset= 2047, -- 2.12.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: Fix blob memory leak
--- src/compiler/glsl/blob.h | 11 +++ src/compiler/glsl/shader_cache.cpp | 2 +- src/compiler/glsl/tests/blob_test.c | 8 src/mesa/state_tracker/st_shader_cache.c | 2 +- 4 files changed, 17 insertions(+), 6 deletions(-) diff --git a/src/compiler/glsl/blob.h b/src/compiler/glsl/blob.h index 9de12e6eb8..940c81e13b 100644 --- a/src/compiler/glsl/blob.h +++ b/src/compiler/glsl/blob.h @@ -27,6 +27,7 @@ #include #include #include +#include #ifdef __cplusplus extern "C" { @@ -80,6 +81,16 @@ struct blob * blob_create(void); /** + * Destroy a blob and free its memory. + */ +static inline void +blob_destroy(struct blob *blob) +{ + free(blob->data); + free(blob); +} + +/** * Add some unstructured, fixed-size data to a blob. * * \return True unless allocation failed. diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp index ea1bc01f02..f5e6a22bb9 100644 --- a/src/compiler/glsl/shader_cache.cpp +++ b/src/compiler/glsl/shader_cache.cpp @@ -1273,7 +1273,7 @@ shader_cache_write_program_metadata(struct gl_context *ctx, disk_cache_put(cache, prog->data->sha1, metadata->data, metadata->size); - free(metadata); + blob_destroy(metadata); if (ctx->_Shader->Flags & GLSL_CACHE_INFO) { _mesa_sha1_format(sha1_buf, prog->data->sha1); diff --git a/src/compiler/glsl/tests/blob_test.c b/src/compiler/glsl/tests/blob_test.c index 01b5ef0b2f..df0e91af35 100644 --- a/src/compiler/glsl/tests/blob_test.c +++ b/src/compiler/glsl/tests/blob_test.c @@ -184,7 +184,7 @@ test_write_and_read_functions (void) "read_consumes_all_bytes"); expect_equal(false, reader.overrun, "read_does_not_overrun"); - free(blob); + blob_destroy(blob); } /* Test that data values are written and read with proper alignment. */ @@ -242,7 +242,7 @@ test_alignment(void) "aligned read of intptr_t"); } - free(blob); + blob_destroy(blob); } /* Test that we detect overrun. */ @@ -264,7 +264,7 @@ test_overrun(void) expect_equal(0, blob_read_uint32(), "read at overrun"); expect_equal(true, reader.overrun, "overrun flag set"); - free(blob); + blob_destroy(blob); } /* Test that we can read and write some large objects, (exercising the code in @@ -308,7 +308,7 @@ test_big_objects(void) expect_equal(false, reader.overrun, "overrun flag not set reading large objects"); - free(blob); + blob_destroy(blob); ralloc_free(ctx); } diff --git a/src/mesa/state_tracker/st_shader_cache.c b/src/mesa/state_tracker/st_shader_cache.c index e8c7289ec6..1a11f1135d 100644 --- a/src/mesa/state_tracker/st_shader_cache.c +++ b/src/mesa/state_tracker/st_shader_cache.c @@ -135,7 +135,7 @@ st_store_tgsi_in_disk_cache(struct st_context *st, struct gl_program *prog, _mesa_shader_stage_to_string(prog->info.stage), sha1_buf); } - free(blob); + blob_destroy(blob); } static void -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support
On 04/03/2017 07:13 PM, Rob Clark wrote: > On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom> wrote: >> Hi, Rob, >> >> On 03/24/2017 10:21 PM, Rob Clark wrote: >>> It's kinda sad that (a) we don't have debug_backtrace support on !X86 >>> and that (b) we re-invent our own crude backtrace support in the first >>> place. If available, use libunwind instead. The backtrace format is >>> based on what xserver and weston use, since it is nice not to have to >>> figure out a different format. >>> >>> Signed-off-by: Rob Clark >> Did you consider glibc "backtrace()", I think it's also available on ARM... > I had not.. although xserver and weston are already using libunwind. > I'm not sure about portability of libunwind to other libc > implementations (but I guess it is at least not worse than using a > glibc specific API). > > I suppose we could always add a fallback to backtrace(). > >> Also is the output format the same as before, or at least compatible with >> gallium/tools/addr2line.sh? > quite possibly not.. I chose to align to the format that xserver and > weston was already using. Otoh, not sure if you would need to use > addr2line.sh since it already decodes things to human readable > fxn/file names. > > BR, > -R backtrace() (or the homebrew i386 implementation) + addr2line.sh gives you both function name, source file and line number, which IMO is pretty useful. I'll give the new format a shot. Perhaps if needed we could have a config option to choose the method. I guess this is mostly used in DEBUG builds anyway, so a developer can choose the desired method... /Thomas > >> Thanks, >> Thomas >> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support
On Mon, Apr 3, 2017 at 10:13 AM, Rob Clarkwrote: > On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom > wrote: >> Hi, Rob, >> >> On 03/24/2017 10:21 PM, Rob Clark wrote: >>> It's kinda sad that (a) we don't have debug_backtrace support on !X86 >>> and that (b) we re-invent our own crude backtrace support in the first >>> place. If available, use libunwind instead. The backtrace format is >>> based on what xserver and weston use, since it is nice not to have to >>> figure out a different format. >>> >>> Signed-off-by: Rob Clark >> >> Did you consider glibc "backtrace()", I think it's also available on ARM... > > I had not.. although xserver and weston are already using libunwind. > I'm not sure about portability of libunwind to other libc > implementations (but I guess it is at least not worse than using a > glibc specific API). We did use glibc backtrace initially in weston, but switched to libbacktrace to get better support for symbols in dlopened modules. Here's the commit message: compositor: Use libunwind if available for better backtraces libunwind has a dwarf parser and automatically queries the dlinfo for location of dlopened modules. The resulting backtrace is much better and includes stack frames in dynamically loaded modules. krh: Originally submitted for Xorg, adapted for weston: http://lists.x.org/archives/xorg-devel/2013-February/035493.html Note this require libunwind at least 1.1 to get the pkg-config files. Signed-off-by: Marcin Slusarz See https://cgit.freedesktop.org/wayland/weston/commit/?id=554a0da74a3f6fc945503a3eb1bfcc9038441b39 Kristian > I suppose we could always add a fallback to backtrace(). > >> Also is the output format the same as before, or at least compatible with >> gallium/tools/addr2line.sh? > > quite possibly not.. I chose to align to the format that xserver and > weston was already using. Otoh, not sure if you would need to use > addr2line.sh since it already decodes things to human readable > fxn/file names. > > BR, > -R > >> Thanks, >> Thomas >> > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support
On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstromwrote: > Hi, Rob, > > On 03/24/2017 10:21 PM, Rob Clark wrote: >> It's kinda sad that (a) we don't have debug_backtrace support on !X86 >> and that (b) we re-invent our own crude backtrace support in the first >> place. If available, use libunwind instead. The backtrace format is >> based on what xserver and weston use, since it is nice not to have to >> figure out a different format. >> >> Signed-off-by: Rob Clark > > Did you consider glibc "backtrace()", I think it's also available on ARM... I had not.. although xserver and weston are already using libunwind. I'm not sure about portability of libunwind to other libc implementations (but I guess it is at least not worse than using a glibc specific API). I suppose we could always add a fallback to backtrace(). > Also is the output format the same as before, or at least compatible with > gallium/tools/addr2line.sh? quite possibly not.. I chose to align to the format that xserver and weston was already using. Otoh, not sure if you would need to use addr2line.sh since it already decodes things to human readable fxn/file names. BR, -R > Thanks, > Thomas > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support
Hi, Rob, On 03/24/2017 10:21 PM, Rob Clark wrote: > It's kinda sad that (a) we don't have debug_backtrace support on !X86 > and that (b) we re-invent our own crude backtrace support in the first > place. If available, use libunwind instead. The backtrace format is > based on what xserver and weston use, since it is nice not to have to > figure out a different format. > > Signed-off-by: Rob ClarkDid you consider glibc "backtrace()", I think it's also available on ARM... Also is the output format the same as before, or at least compatible with gallium/tools/addr2line.sh? Thanks, Thomas > --- > configure.ac | 24 > src/gallium/Automake.inc | 1 + > src/gallium/auxiliary/util/u_debug_stack.c | 91 > ++ > src/gallium/auxiliary/util/u_debug_stack.h | 15 - > 4 files changed, 129 insertions(+), 2 deletions(-) > > diff --git a/configure.ac b/configure.ac > index a99684b..5046acb 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -1025,6 +1025,30 @@ AC_SUBST([LLVM_LIBS]) > AC_SUBST([LLVM_LDFLAGS]) > AC_SUBST([LLVM_INCLUDEDIR]) > > +dnl > +dnl libunwind > +dnl > +AC_ARG_ENABLE([libunwind], > +[AS_HELP_STRING([--enable-libunwind], > +[Use libunwind for backtracing (default: auto)])], > +[LIBUNWIND="$enableval"], > +[LIBUNWIND="auto"]) > + > +PKG_CHECK_MODULES(LIBUNWIND, libunwind, [HAVE_LIBUNWIND=yes], > [HAVE_LIBUNWIND=no]) > +if test "x$LIBUNWIND" = "xauto"; then > +LIBUNWIND="$HAVE_LIBUNWIND" > +fi > + > +if test "x$LIBUNWIND" = "xyes"; then > +if test "x$HAVE_LIBUNWIND" != "xyes"; then > +AC_MSG_ERROR([libunwind requested but not installed.]) > +fi > +AC_DEFINE(HAVE_LIBUNWIND, 1, [Have libunwind support]) > +fi > + > +AM_CONDITIONAL(HAVE_LIBUNWIND, [test "x$LIBUNWIND" = xyes]) > + > + > dnl Options for APIs > AC_ARG_ENABLE([opengl], > [AS_HELP_STRING([--disable-opengl], > diff --git a/src/gallium/Automake.inc b/src/gallium/Automake.inc > index a01fa54..48b5a44 100644 > --- a/src/gallium/Automake.inc > +++ b/src/gallium/Automake.inc > @@ -46,6 +46,7 @@ GALLIUM_TARGET_CFLAGS = \ > > GALLIUM_COMMON_LIB_DEPS = \ > -lm \ > + $(LIBUNWIND_LIBS) \ > $(LIBSENSORS_LIBS) \ > $(CLOCK_LIB) \ > $(PTHREAD_LIBS) \ > diff --git a/src/gallium/auxiliary/util/u_debug_stack.c > b/src/gallium/auxiliary/util/u_debug_stack.c > index f941234..cf05f13 100644 > --- a/src/gallium/auxiliary/util/u_debug_stack.c > +++ b/src/gallium/auxiliary/util/u_debug_stack.c > @@ -36,6 +36,95 @@ > #include "u_debug_symbol.h" > #include "u_debug_stack.h" > > +#if defined(HAVE_LIBUNWIND) > + > +#ifndef _GNU_SOURCE > +#define _GNU_SOURCE > +#endif > +#include > + > +void > +debug_backtrace_capture(struct debug_stack_frame *backtrace, > +unsigned start_frame, > +unsigned nr_frames) > +{ > + unw_cursor_t cursor; > + unw_context_t context; > + unw_proc_info_t pip; > + unsigned i = 0; > + int ret; > + > + pip.unwind_info = NULL; > + > + unw_getcontext(); > + unw_init_local(, ); > + > + while ((start_frame > 0) && (unw_step() > 0)) > + start_frame--; > + > + while (unw_step() > 0) { > + char procname[256]; > + const char *filename; > + unw_word_t off; > + Dl_info dlinfo; > + > + unw_get_proc_info(, ); > + > + ret = unw_get_proc_name(, procname, 256, ); > + if (ret && ret != -UNW_ENOMEM) { > + procname[0] = '?'; > + procname[1] = 0; > + } > + > + if (dladdr((void *)(uintptr_t)(pip.start_ip + off), ) && > dlinfo.dli_fname && > + *dlinfo.dli_fname) > + filename = dlinfo.dli_fname; > + else > + filename = "?"; > + > + snprintf(backtrace[i].buf, sizeof(backtrace[i].buf), > +"%u: %s (%s%s+0x%x) [%p]", i, filename, procname, > +ret == -UNW_ENOMEM ? "..." : "", (int)off, > +(void *)(uintptr_t)(pip.start_ip + off)); > + > + i++; > + } > + > + while (i < nr_frames) { > + backtrace[i].buf[0] = '\0'; > + i++; > + } > +} > + > +void > +debug_backtrace_dump(const struct debug_stack_frame *backtrace, > + unsigned nr_frames) > +{ > + unsigned i; > + > + for (i = 0; i < nr_frames; ++i) { > + if (backtrace[i].buf[0] == '\0') > + break; > + debug_printf("\t%s\n", backtrace[i].buf); > + } > +} > + > +void > +debug_backtrace_print(FILE *f, > + const struct debug_stack_frame *backtrace, > + unsigned nr_frames) > +{ > + unsigned i; > + > + for (i = 0; i < nr_frames; ++i) { > + if (backtrace[i].buf[0] == '\0') > + break; > + fprintf(f, "\t%s\n", backtrace[i].buf); > + } > +} > + > +#else /* ! HAVE_LIBUNWIND */ > + > #if defined(PIPE_OS_WINDOWS) > #include > #endif > @@ -179,3 +268,5 @@ debug_backtrace_print(FILE
Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory
On Mon, Apr 3, 2017 at 7:44 AM, Jason Ekstrandwrote: > On Mon, Apr 3, 2017 at 5:02 AM, Juan A. Suarez Romero > wrote: > >> On Wed, 2017-03-29 at 12:06 -0700, Jason Ekstrand wrote: >> > Looking over the patch, I think I've convinced myself that it's >> correct. (I honestly wasn't expecting to come to that conclusion without >> more iteration.) That said, this raises some interesting questions. I >> added Kristian to the Cc in case he has any input. >> > >> > 1. Should we do powers of two or linear. I'm still a fan of powers of >> two. >> >> In which specific part do you mean? In free block lists? or in the >> allocator_new? >> > > In the state pool, we just round all allocation sizes up to the nearest > power-of-two, and then the index into the array of free lists isn't "size - > base", it's "ilog2(size) - base". > > >> > >> > 2. Should block pools even have a block size at all? We could just >> make every block pool allow any power-of-two size from 4 KiB up to. say, 1 >> MiB and then make the block size part of the state pool or stream that's >> allocating from it. At the moment, I like this idea, but I've given it >> very little thought. >> > >> So IIUC, the idea would be the block pool is just a flat chunk of >> memory, where we later fetch blocks of memory from, as requested by >> applications. Is that correct? >> > Thinking about this again, I think your statement may have been more correct than I thought. If we make the state_stream chain off of a state_pool rather than the block_pool, we could make the block pool structure a simple bi-directional growing BO and trust in the state pool for 100% of the re-use. That would probably be a way simpler structure. For that matter, the state pool could just own the block_pool and setup/teardown would be easier. > Sorry, but this patch gave me some sudden revelations and things are still > in the process of reforming in my brain. In the past, we assumed a > two-layered allocation strategy where we had a block pool which was the > base and then the state pool and state stream allocators sat on top of it. > Originally, the state pool allocator was just for a single size as well. > > Now that the block pool is going to be capable of allocating multiple > sizes, the whole mental model of the separation falls apart. The new > future that I see is one where the block pool and state pool aren't > separate. Instead, we have a single thing which I'll call state_pool (we > have to pick one of the names) which lets you allocate a state of any size > from 32B to 2MB. The state stream will then allocate off of a state_pool > instead of a block_pool since they're now the same. For smaller states, we > still want to allocate reasonably large chunks (probably 4K) so that we > ensure that things are nicely aligned. I think I've got a pretty good idea > of how it should work at this point and can write more if you'd like. > > Before we dive in and do a pile of refactoring, I think this patch is > pretty much good-to-go assuming we fix the power-of-two thing and it fixes > a bug so let's focus there. Are you interested in doing the refactoring? > If not, that's ok, I'm happy to do it and it wouldn't take me long now that > I've gotten a chance to think about it. If you are interested, go for it! > > >> > 3. If we go with the idea in 2. should we still call it block_pool? I >> think we can keep the name but it doesn't it as well as it once did. >> > >> > Thanks for working on this! I'm sorry it's taken so long to respond. >> Every time I've looked at it, my brain hasn't been in the right state to >> think about lock-free code. :-/ >> > >> > On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero < >> jasua...@igalia.com> wrote: >> > > Current Anv allocator assign memory in terms of a fixed block size. >> > > >> > > But there can be cases where this block is not enough for a memory >> > > request, and thus several blocks must be assigned in a row. >> > > >> > > This commit adds support for specifying how many blocks of memory must >> > > be assigned. >> > > >> > > This fixes a number dEQP-VK.pipeline.render_to_image.* tests that >> crash. >> > > >> > > v2: lock-free free-list is not handled correctly (Jason) >> > > --- >> > > src/intel/vulkan/anv_allocator.c | 81 >> +++--- >> > > src/intel/vulkan/anv_batch_chain.c | 4 +- >> > > src/intel/vulkan/anv_private.h | 7 +++- >> > > 3 files changed, 66 insertions(+), 26 deletions(-) >> > > >> > > diff --git a/src/intel/vulkan/anv_allocator.c >> b/src/intel/vulkan/anv_allocator.c >> > > index 45c663b..3924551 100644 >> > > --- a/src/intel/vulkan/anv_allocator.c >> > > +++ b/src/intel/vulkan/anv_allocator.c >> > > @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool, >> > > pool->device = device; >> > > anv_bo_init(>bo, 0, 0); >> > > pool->block_size = block_size; >> > > - pool->free_list =
Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory
On Mon, Apr 3, 2017 at 9:40 AM, Kristian Høgsbergwrote: > On Wed, Mar 29, 2017 at 12:06 PM, Jason Ekstrand > wrote: > > Looking over the patch, I think I've convinced myself that it's > correct. (I > > honestly wasn't expecting to come to that conclusion without more > > iteration.) That said, this raises some interesting questions. I added > > Kristian to the Cc in case he has any input. > > I haven't looked closely, and I agree it's increasingly tricky code to > review. I'd be careful about making this a fully generic any-block > size allocator. The premise, when we first designed this, was that for > something like a fixed-size, power-of-two pool, we could write a fast, > lock-less and fragmentation free allocator without getting in over our > heads. Yeah, it's getting tricky but I don't think it's outside the realm of humans yet. :-) > However, if this evolves (devolves?) into "malloc, but for > bos", it may be time to take a step back and look if something like > jemalloc with bo backing is a better choice. > Yeah, it may be time to start considering that. Unfortunately, I don't think we can use jemalloc for this in a safe way. jemalloc does provide a way to manage pools but it does so, not with a pool pointer, but with an arena number in a global namespace. If an app were to use jemalloc (I wouldn't be surprised in the gaming world if jemalloc is fairly common-place) and uses arenas, we could end up with silent collisions. At some point (probably later this year), I hope to look into pulling in a foreign memory allocator and use that for BO placement and start softpinning the universe (no relocations!). That may be a good time to revisit allocation strategies a bit. Kristian > > > > > 1. Should we do powers of two or linear. I'm still a fan of powers of > two. > > > > 2. Should block pools even have a block size at all? We could just make > > every block pool allow any power-of-two size from 4 KiB up to. say, 1 MiB > > and then make the block size part of the state pool or stream that's > > allocating from it. At the moment, I like this idea, but I've given it > very > > little thought. > > > > 3. If we go with the idea in 2. should we still call it block_pool? I > > think we can keep the name but it doesn't it as well as it once did. > > > > Thanks for working on this! I'm sorry it's taken so long to respond. > Every > > time I've looked at it, my brain hasn't been in the right state to think > > about lock-free code. :-/ > > > > On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero < > jasua...@igalia.com> > > wrote: > >> > >> Current Anv allocator assign memory in terms of a fixed block size. > >> > >> But there can be cases where this block is not enough for a memory > >> request, and thus several blocks must be assigned in a row. > >> > >> This commit adds support for specifying how many blocks of memory must > >> be assigned. > >> > >> This fixes a number dEQP-VK.pipeline.render_to_image.* tests that > crash. > >> > >> v2: lock-free free-list is not handled correctly (Jason) > >> --- > >> src/intel/vulkan/anv_allocator.c | 81 > >> +++--- > >> src/intel/vulkan/anv_batch_chain.c | 4 +- > >> src/intel/vulkan/anv_private.h | 7 +++- > >> 3 files changed, 66 insertions(+), 26 deletions(-) > >> > >> diff --git a/src/intel/vulkan/anv_allocator.c > >> b/src/intel/vulkan/anv_allocator.c > >> index 45c663b..3924551 100644 > >> --- a/src/intel/vulkan/anv_allocator.c > >> +++ b/src/intel/vulkan/anv_allocator.c > >> @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool, > >> pool->device = device; > >> anv_bo_init(>bo, 0, 0); > >> pool->block_size = block_size; > >> - pool->free_list = ANV_FREE_LIST_EMPTY; > >> + for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++) > >> + pool->free_list[i] = ANV_FREE_LIST_EMPTY; > >> pool->back_free_list = ANV_FREE_LIST_EMPTY; > >> > >> pool->fd = memfd_create("block pool", MFD_CLOEXEC); > >> @@ -500,30 +501,35 @@ fail: > >> > >> static uint32_t > >> anv_block_pool_alloc_new(struct anv_block_pool *pool, > >> - struct anv_block_state *pool_state) > >> + struct anv_block_state *pool_state, > >> + uint32_t n_blocks) > > > > > > Maybe have this take a size rather than n_blocks? It's only ever called > by > > stuff in the block pool so the caller can do the multiplication. It > would > > certainly make some of the math below easier. > > > >> > >> { > >> struct anv_block_state state, old, new; > >> > >> while (1) { > >> - state.u64 = __sync_fetch_and_add(_state->u64, > >> pool->block_size); > >> - if (state.next < state.end) { > >> + state.u64 = __sync_fetch_and_add(_state->u64, n_blocks * > >> pool->block_size); > >> + if (state.next > state.end) { > >> + futex_wait(_state->end, state.end); > >> + continue; > >>
[Mesa-dev] [PATCH] bin/get-fixes-pick-list.sh: fix typo
Replace "nore" by "more". --- bin/get-fixes-pick-list.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/bin/get-fixes-pick-list.sh b/bin/get-fixes-pick-list.sh index 59bcae4..75242a2 100755 --- a/bin/get-fixes-pick-list.sh +++ b/bin/get-fixes-pick-list.sh @@ -27,7 +27,7 @@ do # For each one try to extract the tag fixes_count=`git show $sha | grep -i "fixes:" | wc -l` if [ "x$fixes_count" != x1 ] ; then - echo WARNING: Commit $sha has nore than one Fixes tag + echo WARNING: Commit $sha has more than one Fixes tag fi fixes=`git show $sha | grep -i "fixes:" | head -n 1` # The following sed/cut combination is borrowed from GregKH -- 2.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/9] gallium: decrease the size of pipe_box - 24 -> 16 bytes
On Apr 3, 2017 5:11 PM, "Alex Deucher"wrote: On Sun, Apr 2, 2017 at 2:00 PM, Marek Olšák wrote: > From: Marek Olšák > > Also: > > pipe_transfer: 48 -> 40 bytes. > pipe_blit_info = 176 -> 160 bytes. > --- > src/gallium/include/pipe/p_state.h | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h > index 392bb8b..6a147ef 100644 > --- a/src/gallium/include/pipe/p_state.h > +++ b/src/gallium/include/pipe/p_state.h > @@ -472,25 +472,25 @@ struct pipe_image_view > } u; > }; > > > /** > * Subregion of 1D/2D/3D image resource. > */ > struct pipe_box > { > int x; > - int y; > - int z; > + int16_t y; > + int16_t z; > int width; > - int height; > - int depth; > + int16_t height; > + int16_t depth; > }; Not specific to this patch per se, but will this cause any issues in the future as max surface sizes supported by hw increase? I feel like we'll need to change this back at some point. We should be fine as long as OpenGL and/or games don't require bigger sizes. Marek Alex > > > /** > * A memory object/resource such as a vertex buffer or texture. > */ > struct pipe_resource > { > struct pipe_reference reference; > struct pipe_screen *screen; /**< screen that this texture belongs to */ > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory
On Wed, Mar 29, 2017 at 12:06 PM, Jason Ekstrandwrote: > Looking over the patch, I think I've convinced myself that it's correct. (I > honestly wasn't expecting to come to that conclusion without more > iteration.) That said, this raises some interesting questions. I added > Kristian to the Cc in case he has any input. I haven't looked closely, and I agree it's increasingly tricky code to review. I'd be careful about making this a fully generic any-block size allocator. The premise, when we first designed this, was that for something like a fixed-size, power-of-two pool, we could write a fast, lock-less and fragmentation free allocator without getting in over our heads. However, if this evolves (devolves?) into "malloc, but for bos", it may be time to take a step back and look if something like jemalloc with bo backing is a better choice. Kristian > > 1. Should we do powers of two or linear. I'm still a fan of powers of two. > > 2. Should block pools even have a block size at all? We could just make > every block pool allow any power-of-two size from 4 KiB up to. say, 1 MiB > and then make the block size part of the state pool or stream that's > allocating from it. At the moment, I like this idea, but I've given it very > little thought. > > 3. If we go with the idea in 2. should we still call it block_pool? I > think we can keep the name but it doesn't it as well as it once did. > > Thanks for working on this! I'm sorry it's taken so long to respond. Every > time I've looked at it, my brain hasn't been in the right state to think > about lock-free code. :-/ > > On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero > wrote: >> >> Current Anv allocator assign memory in terms of a fixed block size. >> >> But there can be cases where this block is not enough for a memory >> request, and thus several blocks must be assigned in a row. >> >> This commit adds support for specifying how many blocks of memory must >> be assigned. >> >> This fixes a number dEQP-VK.pipeline.render_to_image.* tests that crash. >> >> v2: lock-free free-list is not handled correctly (Jason) >> --- >> src/intel/vulkan/anv_allocator.c | 81 >> +++--- >> src/intel/vulkan/anv_batch_chain.c | 4 +- >> src/intel/vulkan/anv_private.h | 7 +++- >> 3 files changed, 66 insertions(+), 26 deletions(-) >> >> diff --git a/src/intel/vulkan/anv_allocator.c >> b/src/intel/vulkan/anv_allocator.c >> index 45c663b..3924551 100644 >> --- a/src/intel/vulkan/anv_allocator.c >> +++ b/src/intel/vulkan/anv_allocator.c >> @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool, >> pool->device = device; >> anv_bo_init(>bo, 0, 0); >> pool->block_size = block_size; >> - pool->free_list = ANV_FREE_LIST_EMPTY; >> + for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++) >> + pool->free_list[i] = ANV_FREE_LIST_EMPTY; >> pool->back_free_list = ANV_FREE_LIST_EMPTY; >> >> pool->fd = memfd_create("block pool", MFD_CLOEXEC); >> @@ -500,30 +501,35 @@ fail: >> >> static uint32_t >> anv_block_pool_alloc_new(struct anv_block_pool *pool, >> - struct anv_block_state *pool_state) >> + struct anv_block_state *pool_state, >> + uint32_t n_blocks) > > > Maybe have this take a size rather than n_blocks? It's only ever called by > stuff in the block pool so the caller can do the multiplication. It would > certainly make some of the math below easier. > >> >> { >> struct anv_block_state state, old, new; >> >> while (1) { >> - state.u64 = __sync_fetch_and_add(_state->u64, >> pool->block_size); >> - if (state.next < state.end) { >> + state.u64 = __sync_fetch_and_add(_state->u64, n_blocks * >> pool->block_size); >> + if (state.next > state.end) { >> + futex_wait(_state->end, state.end); >> + continue; >> + } else if ((state.next + (n_blocks - 1) * pool->block_size) < >> state.end) { > > > First off, please keep the if's in the same order unless we have a reason to > re-arrange them. It would make this way easier to review. :-) > > Second, I think this would be much easier to read as: > > if (state.next + size <= state.end) { >/* Success */ > } else if (state.next <= state.end) { >/* Our block is the one that crosses the line */ > } else { >/* Wait like everyone else */ > } > >> >> assert(pool->map); >> return state.next; >> - } else if (state.next == state.end) { >> - /* We allocated the first block outside the pool, we have to >> grow it. >> - * pool_state->next acts a mutex: threads who try to allocate >> now will >> - * get block indexes above the current limit and hit futex_wait >> - * below. */ >> - new.next = state.next + pool->block_size; >> + } else { >> + /* We allocated the firsts blocks outside the pool, we have to >> grow >> +
[Mesa-dev] [Bug 100259] [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'
https://bugs.freedesktop.org/show_bug.cgi?id=100259 --- Comment #14 from Emil Velikov--- Ouch that is some nasty bug in Slackware packaging. Note you want /usr/local/ and /usr/ in the same order across --with-pkg-config-dir and --with-system-libdir. -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 9/9] gallium: decrease the size of pipe_draw_info - 88 -> 80 bytes
On Apr 3, 2017 4:04 PM, "Brian Paul"wrote: I need to test this series on Windows and with MinGW first. I'm worried about enums with bitfields. Hopefully it'll work. Enums without bitfields always occupy 4 bytes. If bitfields don't work with those, we'll have to stop using enums in these cases. Marek -Brian On Mon, Apr 3, 2017 at 2:46 AM, Nicolai Hähnle wrote: > Some of these are a bit subtle, but I think they're fine. Series is: > > Reviewed-by: Nicolai Hähnle > > > On 02.04.2017 20:00, Marek Olšák wrote: > >> From: Marek Olšák >> >> --- >> src/gallium/auxiliary/indices/u_primconvert.c | 10 -- >> src/gallium/include/pipe/p_state.h| 2 +- >> 2 files changed, 9 insertions(+), 3 deletions(-) >> >> diff --git a/src/gallium/auxiliary/indices/u_primconvert.c >> b/src/gallium/auxiliary/indices/u_primconvert.c >> index 2bdfade..1ffca4b 100644 >> --- a/src/gallium/auxiliary/indices/u_primconvert.c >> +++ b/src/gallium/auxiliary/indices/u_primconvert.c >> @@ -121,39 +121,45 @@ util_primconvert_draw_vbo(struct >> primconvert_context *pc, >> util_draw_init_info(_info); >> new_info.indexed = true; >> new_info.min_index = info->min_index; >> new_info.max_index = info->max_index; >> new_info.index_bias = info->index_bias; >> new_info.start_instance = info->start_instance; >> new_info.instance_count = info->instance_count; >> new_info.primitive_restart = info->primitive_restart; >> new_info.restart_index = info->restart_index; >> if (info->indexed) { >> + enum pipe_prim_type mode = 0; >> + >>u_index_translator(pc->primtypes_mask, >> info->mode, pc->saved_ib.index_size, >> info->count, >> pc->api_pv, pc->api_pv, >> info->primitive_restart ? PR_ENABLE : >> PR_DISABLE, >> - _info.mode, _ib.index_size, >> _info.count, >> + , _ib.index_size, _info.count, >> _func); >> + new_info.mode = mode; >>src = ib->user_buffer; >>if (!src) { >> src = pipe_buffer_map(pc->pipe, ib->buffer, >> PIPE_TRANSFER_READ, _transfer); >>} >>src = (const uint8_t *)src + ib->offset; >> } >> else { >> + enum pipe_prim_type mode = 0; >> + >>u_index_generator(pc->primtypes_mask, >> info->mode, info->start, info->count, >> pc->api_pv, pc->api_pv, >> -_info.mode, _ib.index_size, >> _info.count, >> +, _ib.index_size, _info.count, >> _func); >> + new_info.mode = mode; >> } >> >> u_upload_alloc(pc->pipe->stream_uploader, 0, new_ib.index_size * >> new_info.count, 4, >>_ib.offset, _ib.buffer, ); >> >> if (info->indexed) { >>trans_func(src, info->start, info->count, new_info.count, >> info->restart_index, dst); >> } >> else { >>gen_func(info->start, new_info.count, dst); >> diff --git a/src/gallium/include/pipe/p_state.h >> b/src/gallium/include/pipe/p_state.h >> index d68a4d4..eeabf8b 100644 >> --- a/src/gallium/include/pipe/p_state.h >> +++ b/src/gallium/include/pipe/p_state.h >> @@ -634,21 +634,21 @@ struct pipe_index_buffer >> const void *user_buffer; /**< pointer to a user buffer if buffer == >> NULL */ >> }; >> >> >> /** >> * Information to describe a draw_vbo call. >> */ >> struct pipe_draw_info >> { >> boolean indexed; /**< use index buffer */ >> - enum pipe_prim_type mode; /**< the mode of the primitive */ >> + enum pipe_prim_type mode:8; /**< the mode of the primitive */ >> boolean primitive_restart; >> ubyte vertices_per_patch; /**< the number of vertices per patch */ >> >> unsigned start; /**< the index of the first vertex */ >> unsigned count; /**< number of vertices */ >> >> unsigned start_instance; /**< first instance id */ >> unsigned instance_count; /**< number of instances */ >> >> unsigned drawid; /**< id of this draw in a multidraw */ >> >> > > -- > Lerne, wie die Welt wirklich ist, > Aber vergiss niemals, wie sie sein sollte. > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] util: #include "c99_compat.h" to fix Windows build
On 03/04/17 17:09, Brian Paul wrote: Otherwise, we were getting the definition for 'inline' by chance from some other preceeding #include. --- src/util/list.h | 1 + src/util/mesa-sha1.h | 1 + 2 files changed, 2 insertions(+) diff --git a/src/util/list.h b/src/util/list.h index 07eb9f3..6edb750 100644 --- a/src/util/list.h +++ b/src/util/list.h @@ -41,6 +41,7 @@ #include #include #include +#include "c99_compat.h" struct list_head diff --git a/src/util/mesa-sha1.h b/src/util/mesa-sha1.h index d3f7aff..bde50ba 100644 --- a/src/util/mesa-sha1.h +++ b/src/util/mesa-sha1.h @@ -24,6 +24,7 @@ #define MESA_SHA1_H #include +#include "c99_compat.h" #include "sha1/sha1.h" #ifdef __cplusplus Looks good. Series is Reviewed-by: Jose Fonseca___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] anv/pipeline: do not disable depth writes if depth testing is disabled
On Mon, Apr 3, 2017 at 8:55 AM, Nanley Cherywrote: > On Mon, Apr 03, 2017 at 08:02:54AM +0200, Iago Toral wrote: > > Can anyone review this one? > > > > On Wed, 2017-03-29 at 08:58 +0200, Iago Toral Quiroga wrote: > > > Writing and testing are two different things and they can be set > > > separately > > > by the application. If an application wants to record depth data > > > without > > > caring for the depth test, it can enable depth test and set the depth > > > compare function to VK_COMPARE_OP_ALWAYS or it can simply disable > > > depth testing altogether. Some CTS tests do the latter. > > > > > Doesn't disabling the depth test prevent any updates to the depth > buffer? > > Section 25.10 Depth Test from the Vulkan spec says: > >When disabled, the depth comparison and subsequent possible updates >to the value of the depth component of the depth/stencil attachment >are bypassed and the fragment is passed to the next operation. > Thanks! Maybe I wasn't crazy when I wrote that line of code. Consider my review retracted until we're sure this isn't a test bug. > -Nanley > > > > Fixes all multisample tests with depth-only formats in: > > > dEQP-VK.renderpass.multisample.* > > > --- > > > src/intel/vulkan/genX_pipeline.c | 4 > > > 1 file changed, 4 deletions(-) > > > > > > diff --git a/src/intel/vulkan/genX_pipeline.c > > > b/src/intel/vulkan/genX_pipeline.c > > > index 85a9e4f..dc393cb 100644 > > > --- a/src/intel/vulkan/genX_pipeline.c > > > +++ b/src/intel/vulkan/genX_pipeline.c > > > @@ -728,10 +728,6 @@ > > > sanitize_ds_state(VkPipelineDepthStencilStateCreateInfo *state, > > > { > > > *stencilWriteEnable = state->stencilTestEnable; > > > > > > - /* If the depth test is disabled, we won't be writing anything. > > > */ > > > - if (!state->depthTestEnable) > > > - state->depthWriteEnable = false; > > > - > > > /* The Vulkan spec requires that if either depth or stencil is > > > not present, > > > * the pipeline is to act as if the test silently passes. > > > */ > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] util: fix MSVC warning in u_align_u32()
To silence C:\Users\Brian\projects\mesa\src\util/u_vector.h(41) : warning C4146: unary minus operator applied to unsigned type, result still unsigned --- src/util/u_vector.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/u_vector.h b/src/util/u_vector.h index f97a8b4..c7fcb37 100644 --- a/src/util/u_vector.h +++ b/src/util/u_vector.h @@ -38,7 +38,7 @@ static inline uint32_t u_align_u32(uint32_t v, uint32_t a) { - assert(a != 0 && a == (a & -a)); + assert(a != 0 && a == (a & -((int32_t) a))); return (v + a - 1) & ~(a - 1); } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] util: s/SHA1_H/MESA_SHA1_H/
To follow the convention of other header include guards. --- src/util/mesa-sha1.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/util/mesa-sha1.h b/src/util/mesa-sha1.h index a81aba9..d3f7aff 100644 --- a/src/util/mesa-sha1.h +++ b/src/util/mesa-sha1.h @@ -20,8 +20,8 @@ * DEALINGS IN THE SOFTWARE. */ -#ifndef SHA1_H -#define SHA1_H +#ifndef MESA_SHA1_H +#define MESA_SHA1_H #include #include "sha1/sha1.h" -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] util: #include "c99_compat.h" to fix Windows build
Otherwise, we were getting the definition for 'inline' by chance from some other preceeding #include. --- src/util/list.h | 1 + src/util/mesa-sha1.h | 1 + 2 files changed, 2 insertions(+) diff --git a/src/util/list.h b/src/util/list.h index 07eb9f3..6edb750 100644 --- a/src/util/list.h +++ b/src/util/list.h @@ -41,6 +41,7 @@ #include #include #include +#include "c99_compat.h" struct list_head diff --git a/src/util/mesa-sha1.h b/src/util/mesa-sha1.h index d3f7aff..bde50ba 100644 --- a/src/util/mesa-sha1.h +++ b/src/util/mesa-sha1.h @@ -24,6 +24,7 @@ #define MESA_SHA1_H #include +#include "c99_compat.h" #include "sha1/sha1.h" #ifdef __cplusplus -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] anv/nir: Delete the apply_dynamic_offsets prototype
On Wed, Mar 22, 2017 at 03:13:56PM -0700, Jason Ekstrand wrote: > That pass hasn't existed since dd4db84640bbb694f180dd50850c3388f67228be > but the prototype stuck around for no reason. > --- > src/intel/vulkan/anv_nir.h | 3 --- > 1 file changed, 3 deletions(-) > This patch is Reviewed-by: Nanley Chery> diff --git a/src/intel/vulkan/anv_nir.h b/src/intel/vulkan/anv_nir.h > index a929dd9..3f97701 100644 > --- a/src/intel/vulkan/anv_nir.h > +++ b/src/intel/vulkan/anv_nir.h > @@ -35,9 +35,6 @@ void anv_nir_lower_input_attachments(nir_shader *shader); > > void anv_nir_lower_push_constants(nir_shader *shader); > > -void anv_nir_apply_dynamic_offsets(struct anv_pipeline *pipeline, > - nir_shader *shader, > - struct brw_stage_prog_data *prog_data); > void anv_nir_apply_pipeline_layout(struct anv_pipeline *pipeline, > nir_shader *shader, > struct brw_stage_prog_data *prog_data, > -- > 2.5.0.400.gff86faf > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods
Signed-off-by: Karol Herbst--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 4c92a1efb5..bd60a84998 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -1797,10 +1797,10 @@ AlgebraicOpt::handleSLCT(Instruction *slct) if (slct->getSrc(2)->asImm()->compare(slct->asCmp()->setCond, 0.0f)) slct->setSrc(0, slct->getSrc(1)); } else - if (slct->getSrc(0) != slct->getSrc(1)) { + if (slct->getSrc(0) != slct->getSrc(1) || slct->src(0).mod != slct->src(1).mod) return; - } - slct->op = OP_MOV; + slct->op = slct->src(0).mod.getOp(); + slct->src(0).mod = slct->src(0).mod ^ Modifier(slct->op); slct->setSrc(1, NULL); slct->setSrc(2, NULL); } -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt
Signed-off-by: Karol Herbst--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index bd60a84998..0de84fe9fc 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -1856,6 +1856,12 @@ AlgebraicOpt::handleLOGOP(Instruction *logop) set0 = cloneForward(func, set0); set1 = cloneShallow(func, set1); + + if (logop->src(0).mod == Modifier(NV50_IR_MOD_NOT)) + set0->asCmp()->setCond = inverseCondCode(set0->asCmp()->setCond); + if (logop->src(1).mod == Modifier(NV50_IR_MOD_NOT)) + set1->asCmp()->setCond = inverseCondCode(set1->asCmp()->setCond); + logop->bb->insertAfter(logop, set1); logop->bb->insertAfter(logop, set0); -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 3/3] nv50/ir: run some passes multiple times
With the shader cache, compilation time matters less. As a side effect we can write more optimizations to produce better optimized code. total instructions in shared programs : 3931743 -> 3917512 (-0.36%) total gprs used in shared programs: 481460 -> 481680 (0.05%) total local used in shared programs : 27481 -> 26761 (-2.62%) total bytes used in shared programs : 36032672 -> 35902648 (-0.36%) localgpr inst bytes helped 48 13338433843 hurt 1 295 75 75 Signed-off-by: Karol Herbst--- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp| 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 0de84fe9fc..505de08573 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -3729,12 +3729,17 @@ Program::optimizeSSA(int level) RUN_PASS(1, CopyPropagation, run); RUN_PASS(1, MergeSplits, run); RUN_PASS(2, GlobalCSE, run); - RUN_PASS(1, LocalCSE, run); - RUN_PASS(2, AlgebraicOpt, run); - RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks - RUN_PASS(1, ConstantFolding, foldAll); - RUN_PASS(2, LateAlgebraicOpt, run); - RUN_PASS(1, Split64BitOpPreRA, run); + for (int i = 0; i < 2; ++i) { + RUN_PASS(1, LocalCSE, run); + RUN_PASS(2, AlgebraicOpt, run); + RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks + RUN_PASS(1, ConstantFolding, foldAll); + RUN_PASS(2, LateAlgebraicOpt, run); + // only once + if (i == 0) + RUN_PASS(1, Split64BitOpPreRA, run); + RUN_PASS(1, DeadCodeElim, buryAll); + } RUN_PASS(1, LoadPropagation, run); RUN_PASS(1, IndirectPropagation, run); RUN_PASS(2, MemoryOpt, run); -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 0/3] nv50/ir: Preapre for running Opts inside a loop
Slowly we are getting to the point, that we miss enough optimization opportunities as the result of our own passes. For this we need to fix AlgebraicOpt to be able to handle mods on sources without creating new issues. The last patch enables looping opts. v2: update commit author Karol Herbst (3): nv50/ir: fix AlgebraicOpt for slcts with mods nv50/ir: handle logops with NOT in AlgebraicOpt nv50/ir: run some passes multiple times .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 +++--- 1 file changed, 20 insertions(+), 9 deletions(-) -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/3] nv50/ir: Preapre for running Opts inside a loop
Slowly we are getting to the point, that we miss enough optimization opportunities as the result of our own passes. For this we need to fix AlgebraicOpt to be able to handle mods on sources without creating new issues. The last patch enables looping opts. Karol Herbst (3): nv50/ir: fix AlgebraicOpt for slcts with mods nv50/ir: handle logops with NOT in AlgebraicOpt nv50/ir: run some passes multiple times .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 +++--- 1 file changed, 20 insertions(+), 9 deletions(-) -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt
Signed-off-by: Karol Herbst--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index bd60a84998..0de84fe9fc 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -1856,6 +1856,12 @@ AlgebraicOpt::handleLOGOP(Instruction *logop) set0 = cloneForward(func, set0); set1 = cloneShallow(func, set1); + + if (logop->src(0).mod == Modifier(NV50_IR_MOD_NOT)) + set0->asCmp()->setCond = inverseCondCode(set0->asCmp()->setCond); + if (logop->src(1).mod == Modifier(NV50_IR_MOD_NOT)) + set1->asCmp()->setCond = inverseCondCode(set1->asCmp()->setCond); + logop->bb->insertAfter(logop, set1); logop->bb->insertAfter(logop, set0); -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] nv50/ir: run some passes multiple times
From: Karol HerbstWith the shader cache, compilation time matters less. As a side effect we can write more optimizations to produce better optimized code. total instructions in shared programs : 3931743 -> 3917512 (-0.36%) total gprs used in shared programs: 481460 -> 481680 (0.05%) total local used in shared programs : 27481 -> 26761 (-2.62%) total bytes used in shared programs : 36032672 -> 35902648 (-0.36%) localgpr inst bytes helped 48 13338433843 hurt 1 295 75 75 Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp| 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 0de84fe9fc..505de08573 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -3729,12 +3729,17 @@ Program::optimizeSSA(int level) RUN_PASS(1, CopyPropagation, run); RUN_PASS(1, MergeSplits, run); RUN_PASS(2, GlobalCSE, run); - RUN_PASS(1, LocalCSE, run); - RUN_PASS(2, AlgebraicOpt, run); - RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks - RUN_PASS(1, ConstantFolding, foldAll); - RUN_PASS(2, LateAlgebraicOpt, run); - RUN_PASS(1, Split64BitOpPreRA, run); + for (int i = 0; i < 2; ++i) { + RUN_PASS(1, LocalCSE, run); + RUN_PASS(2, AlgebraicOpt, run); + RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks + RUN_PASS(1, ConstantFolding, foldAll); + RUN_PASS(2, LateAlgebraicOpt, run); + // only once + if (i == 0) + RUN_PASS(1, Split64BitOpPreRA, run); + RUN_PASS(1, DeadCodeElim, buryAll); + } RUN_PASS(1, LoadPropagation, run); RUN_PASS(1, IndirectPropagation, run); RUN_PASS(2, MemoryOpt, run); -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods
From: Karol HerbstSigned-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 4c92a1efb5..bd60a84998 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -1797,10 +1797,10 @@ AlgebraicOpt::handleSLCT(Instruction *slct) if (slct->getSrc(2)->asImm()->compare(slct->asCmp()->setCond, 0.0f)) slct->setSrc(0, slct->getSrc(1)); } else - if (slct->getSrc(0) != slct->getSrc(1)) { + if (slct->getSrc(0) != slct->getSrc(1) || slct->src(0).mod != slct->src(1).mod) return; - } - slct->op = OP_MOV; + slct->op = slct->src(0).mod.getOp(); + slct->src(0).mod = slct->src(0).mod ^ Modifier(slct->op); slct->setSrc(1, NULL); slct->setSrc(2, NULL); } -- 2.12.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] anv/pipeline: do not disable depth writes if depth testing is disabled
On Mon, Apr 03, 2017 at 08:02:54AM +0200, Iago Toral wrote: > Can anyone review this one? > > On Wed, 2017-03-29 at 08:58 +0200, Iago Toral Quiroga wrote: > > Writing and testing are two different things and they can be set > > separately > > by the application. If an application wants to record depth data > > without > > caring for the depth test, it can enable depth test and set the depth > > compare function to VK_COMPARE_OP_ALWAYS or it can simply disable > > depth testing altogether. Some CTS tests do the latter. > > Doesn't disabling the depth test prevent any updates to the depth buffer? Section 25.10 Depth Test from the Vulkan spec says: When disabled, the depth comparison and subsequent possible updates to the value of the depth component of the depth/stencil attachment are bypassed and the fragment is passed to the next operation. -Nanley > > Fixes all multisample tests with depth-only formats in: > > dEQP-VK.renderpass.multisample.* > > --- > > src/intel/vulkan/genX_pipeline.c | 4 > > 1 file changed, 4 deletions(-) > > > > diff --git a/src/intel/vulkan/genX_pipeline.c > > b/src/intel/vulkan/genX_pipeline.c > > index 85a9e4f..dc393cb 100644 > > --- a/src/intel/vulkan/genX_pipeline.c > > +++ b/src/intel/vulkan/genX_pipeline.c > > @@ -728,10 +728,6 @@ > > sanitize_ds_state(VkPipelineDepthStencilStateCreateInfo *state, > > { > > *stencilWriteEnable = state->stencilTestEnable; > > > > - /* If the depth test is disabled, we won't be writing anything. > > */ > > - if (!state->depthTestEnable) > > - state->depthWriteEnable = false; > > - > > /* The Vulkan spec requires that if either depth or stencil is > > not present, > > * the pipeline is to act as if the test silently passes. > > */ > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/9] gallium: decrease the size of pipe_box - 24 -> 16 bytes
On Sun, Apr 2, 2017 at 2:00 PM, Marek Olšákwrote: > From: Marek Olšák > > Also: > > pipe_transfer: 48 -> 40 bytes. > pipe_blit_info = 176 -> 160 bytes. > --- > src/gallium/include/pipe/p_state.h | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/src/gallium/include/pipe/p_state.h > b/src/gallium/include/pipe/p_state.h > index 392bb8b..6a147ef 100644 > --- a/src/gallium/include/pipe/p_state.h > +++ b/src/gallium/include/pipe/p_state.h > @@ -472,25 +472,25 @@ struct pipe_image_view > } u; > }; > > > /** > * Subregion of 1D/2D/3D image resource. > */ > struct pipe_box > { > int x; > - int y; > - int z; > + int16_t y; > + int16_t z; > int width; > - int height; > - int depth; > + int16_t height; > + int16_t depth; > }; Not specific to this patch per se, but will this cause any issues in the future as max surface sizes supported by hw increase? I feel like we'll need to change this back at some point. Alex > > > /** > * A memory object/resource such as a vertex buffer or texture. > */ > struct pipe_resource > { > struct pipe_reference reference; > struct pipe_screen *screen; /**< screen that this texture belongs to */ > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/9] gallium: decrease the size of pipe_box - 24 -> 16 bytes
On 04/02/2017 12:00 PM, Marek Olšák wrote: From: Marek OlšákAlso: pipe_transfer: 48 -> 40 bytes. pipe_blit_info = 176 -> 160 bytes. --- src/gallium/include/pipe/p_state.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h index 392bb8b..6a147ef 100644 --- a/src/gallium/include/pipe/p_state.h +++ b/src/gallium/include/pipe/p_state.h @@ -472,25 +472,25 @@ struct pipe_image_view } u; }; /** * Subregion of 1D/2D/3D image resource. */ struct pipe_box { int x; - int y; - int z; + int16_t y; + int16_t z; int width; - int height; - int depth; + int16_t height; + int16_t depth; I think a comment explaining why x/width are int but y/z/height/depth are int16_t (texture buffer objects, right?) would be good. Otherwise, someone's going to wonder about that and maybe submit a patch to change it. Same thing for the pipe_resource change. I tested the series on Windows/MSVC and MinGW and didn't see any problems. With the above comments, Reviewed-by: Brian Paul BTW, some years ago I looked at using bitfields in gl_texture_image and gl_texture_object and it looked like a substantial size reduction was possible. Maybe you or someone else would like to look into that. -Brian }; /** * A memory object/resource such as a vertex buffer or texture. */ struct pipe_resource { struct pipe_reference reference; struct pipe_screen *screen; /**< screen that this texture belongs to */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] vulkan/wsi/wayland: Pass damage through to the compositor
Hi, On 3 April 2017 at 15:45, Jason Ekstrandwrote: > On Mon, Apr 3, 2017 at 1:54 AM, Daniel Stone wrote: >> Very scrupulous version check, but you forgot to actually use >> wl_surface_damage_buffer. ;) > > Gah!!! I'll get that fixed. Assuming that change, now that you've looked > at it all, would you mind reviewing at least this patch? Yeah, with a trivial change to use surface_buffer, the series is: Reviewed-by: Daniel Stone Cheers, Daniel ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] anv/pipeline: do not disable depth writes if depth testing is disabled
Sorry. I thought I had. Reviewed-by: Jason EkstrandOn Sun, Apr 2, 2017 at 11:02 PM, Iago Toral wrote: > Can anyone review this one? > > On Wed, 2017-03-29 at 08:58 +0200, Iago Toral Quiroga wrote: > > Writing and testing are two different things and they can be set > > separately > > by the application. If an application wants to record depth data > > without > > caring for the depth test, it can enable depth test and set the depth > > compare function to VK_COMPARE_OP_ALWAYS or it can simply disable > > depth testing altogether. Some CTS tests do the latter. > > > > Fixes all multisample tests with depth-only formats in: > > dEQP-VK.renderpass.multisample.* > > --- > > src/intel/vulkan/genX_pipeline.c | 4 > > 1 file changed, 4 deletions(-) > > > > diff --git a/src/intel/vulkan/genX_pipeline.c > > b/src/intel/vulkan/genX_pipeline.c > > index 85a9e4f..dc393cb 100644 > > --- a/src/intel/vulkan/genX_pipeline.c > > +++ b/src/intel/vulkan/genX_pipeline.c > > @@ -728,10 +728,6 @@ > > sanitize_ds_state(VkPipelineDepthStencilStateCreateInfo *state, > > { > > *stencilWriteEnable = state->stencilTestEnable; > > > > - /* If the depth test is disabled, we won't be writing anything. > > */ > > - if (!state->depthTestEnable) > > - state->depthWriteEnable = false; > > - > > /* The Vulkan spec requires that if either depth or stencil is > > not present, > > * the pipeline is to act as if the test silently passes. > > */ > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] vulkan/wsi/wayland: Pass damage through to the compositor
On Mon, Apr 3, 2017 at 1:54 AM, Daniel Stonewrote: > Hi Jason, > > On 1 April 2017 at 06:37, Jason Ekstrand wrote: > > @@ -594,7 +595,19 @@ wsi_wl_swapchain_queue_present(struct > wsi_swapchain *wsi_chain, > > > > assert(image_index < chain->base.image_count); > > wl_surface_attach(chain->surface, chain->images[image_index].buffer, > 0, 0); > > - wl_surface_damage(chain->surface, 0, 0, INT32_MAX, INT32_MAX); > > + > > + if (chain->surface_version >= 4 && damage && > > + damage->pRectangles && damage->rectangleCount > 0) { > > + for (unsigned i = 0; i < damage->rectangleCount; i++) { > > + const VkRectLayerKHR *rect = >pRectangles[i]; > > + assert(rect->layer == 0); > > + wl_surface_damage(chain->surface, > > + rect->offset.x, rect->offset.y, > > + rect->extent.width, rect->extent.height); > > Very scrupulous version check, but you forgot to actually use > wl_surface_damage_buffer. ;) > Gah!!! I'll get that fixed. Assuming that change, now that you've looked at it all, would you mind reviewing at least this patch? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory
On Mon, Apr 3, 2017 at 5:02 AM, Juan A. Suarez Romerowrote: > On Wed, 2017-03-29 at 12:06 -0700, Jason Ekstrand wrote: > > Looking over the patch, I think I've convinced myself that it's > correct. (I honestly wasn't expecting to come to that conclusion without > more iteration.) That said, this raises some interesting questions. I > added Kristian to the Cc in case he has any input. > > > > 1. Should we do powers of two or linear. I'm still a fan of powers of > two. > > In which specific part do you mean? In free block lists? or in the > allocator_new? > In the state pool, we just round all allocation sizes up to the nearest power-of-two, and then the index into the array of free lists isn't "size - base", it's "ilog2(size) - base". > > > > 2. Should block pools even have a block size at all? We could just make > every block pool allow any power-of-two size from 4 KiB up to. say, 1 MiB > and then make the block size part of the state pool or stream that's > allocating from it. At the moment, I like this idea, but I've given it > very little thought. > > > So IIUC, the idea would be the block pool is just a flat chunk of > memory, where we later fetch blocks of memory from, as requested by > applications. Is that correct? > Sorry, but this patch gave me some sudden revelations and things are still in the process of reforming in my brain. In the past, we assumed a two-layered allocation strategy where we had a block pool which was the base and then the state pool and state stream allocators sat on top of it. Originally, the state pool allocator was just for a single size as well. Now that the block pool is going to be capable of allocating multiple sizes, the whole mental model of the separation falls apart. The new future that I see is one where the block pool and state pool aren't separate. Instead, we have a single thing which I'll call state_pool (we have to pick one of the names) which lets you allocate a state of any size from 32B to 2MB. The state stream will then allocate off of a state_pool instead of a block_pool since they're now the same. For smaller states, we still want to allocate reasonably large chunks (probably 4K) so that we ensure that things are nicely aligned. I think I've got a pretty good idea of how it should work at this point and can write more if you'd like. Before we dive in and do a pile of refactoring, I think this patch is pretty much good-to-go assuming we fix the power-of-two thing and it fixes a bug so let's focus there. Are you interested in doing the refactoring? If not, that's ok, I'm happy to do it and it wouldn't take me long now that I've gotten a chance to think about it. If you are interested, go for it! > > 3. If we go with the idea in 2. should we still call it block_pool? I > think we can keep the name but it doesn't it as well as it once did. > > > > Thanks for working on this! I'm sorry it's taken so long to respond. > Every time I've looked at it, my brain hasn't been in the right state to > think about lock-free code. :-/ > > > > On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero < > jasua...@igalia.com> wrote: > > > Current Anv allocator assign memory in terms of a fixed block size. > > > > > > But there can be cases where this block is not enough for a memory > > > request, and thus several blocks must be assigned in a row. > > > > > > This commit adds support for specifying how many blocks of memory must > > > be assigned. > > > > > > This fixes a number dEQP-VK.pipeline.render_to_image.* tests that > crash. > > > > > > v2: lock-free free-list is not handled correctly (Jason) > > > --- > > > src/intel/vulkan/anv_allocator.c | 81 > +++--- > > > src/intel/vulkan/anv_batch_chain.c | 4 +- > > > src/intel/vulkan/anv_private.h | 7 +++- > > > 3 files changed, 66 insertions(+), 26 deletions(-) > > > > > > diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_ > allocator.c > > > index 45c663b..3924551 100644 > > > --- a/src/intel/vulkan/anv_allocator.c > > > +++ b/src/intel/vulkan/anv_allocator.c > > > @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool, > > > pool->device = device; > > > anv_bo_init(>bo, 0, 0); > > > pool->block_size = block_size; > > > - pool->free_list = ANV_FREE_LIST_EMPTY; > > > + for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++) > > > + pool->free_list[i] = ANV_FREE_LIST_EMPTY; > > > pool->back_free_list = ANV_FREE_LIST_EMPTY; > > > > > > pool->fd = memfd_create("block pool", MFD_CLOEXEC); > > > @@ -500,30 +501,35 @@ fail: > > > > > > static uint32_t > > > anv_block_pool_alloc_new(struct anv_block_pool *pool, > > > - struct anv_block_state *pool_state) > > > + struct anv_block_state *pool_state, > > > + uint32_t n_blocks) > > > > Maybe have this take a size rather than n_blocks? It's only ever called > by stuff in
Re: [Mesa-dev] [PATCH 9/9] gallium: decrease the size of pipe_draw_info - 88 -> 80 bytes
I need to test this series on Windows and with MinGW first. I'm worried about enums with bitfields. -Brian On Mon, Apr 3, 2017 at 2:46 AM, Nicolai Hähnlewrote: > Some of these are a bit subtle, but I think they're fine. Series is: > > Reviewed-by: Nicolai Hähnle > > > On 02.04.2017 20:00, Marek Olšák wrote: > >> From: Marek Olšák >> >> --- >> src/gallium/auxiliary/indices/u_primconvert.c | 10 -- >> src/gallium/include/pipe/p_state.h| 2 +- >> 2 files changed, 9 insertions(+), 3 deletions(-) >> >> diff --git a/src/gallium/auxiliary/indices/u_primconvert.c >> b/src/gallium/auxiliary/indices/u_primconvert.c >> index 2bdfade..1ffca4b 100644 >> --- a/src/gallium/auxiliary/indices/u_primconvert.c >> +++ b/src/gallium/auxiliary/indices/u_primconvert.c >> @@ -121,39 +121,45 @@ util_primconvert_draw_vbo(struct >> primconvert_context *pc, >> util_draw_init_info(_info); >> new_info.indexed = true; >> new_info.min_index = info->min_index; >> new_info.max_index = info->max_index; >> new_info.index_bias = info->index_bias; >> new_info.start_instance = info->start_instance; >> new_info.instance_count = info->instance_count; >> new_info.primitive_restart = info->primitive_restart; >> new_info.restart_index = info->restart_index; >> if (info->indexed) { >> + enum pipe_prim_type mode = 0; >> + >>u_index_translator(pc->primtypes_mask, >> info->mode, pc->saved_ib.index_size, >> info->count, >> pc->api_pv, pc->api_pv, >> info->primitive_restart ? PR_ENABLE : >> PR_DISABLE, >> - _info.mode, _ib.index_size, >> _info.count, >> + , _ib.index_size, _info.count, >> _func); >> + new_info.mode = mode; >>src = ib->user_buffer; >>if (!src) { >> src = pipe_buffer_map(pc->pipe, ib->buffer, >> PIPE_TRANSFER_READ, _transfer); >>} >>src = (const uint8_t *)src + ib->offset; >> } >> else { >> + enum pipe_prim_type mode = 0; >> + >>u_index_generator(pc->primtypes_mask, >> info->mode, info->start, info->count, >> pc->api_pv, pc->api_pv, >> -_info.mode, _ib.index_size, >> _info.count, >> +, _ib.index_size, _info.count, >> _func); >> + new_info.mode = mode; >> } >> >> u_upload_alloc(pc->pipe->stream_uploader, 0, new_ib.index_size * >> new_info.count, 4, >>_ib.offset, _ib.buffer, ); >> >> if (info->indexed) { >>trans_func(src, info->start, info->count, new_info.count, >> info->restart_index, dst); >> } >> else { >>gen_func(info->start, new_info.count, dst); >> diff --git a/src/gallium/include/pipe/p_state.h >> b/src/gallium/include/pipe/p_state.h >> index d68a4d4..eeabf8b 100644 >> --- a/src/gallium/include/pipe/p_state.h >> +++ b/src/gallium/include/pipe/p_state.h >> @@ -634,21 +634,21 @@ struct pipe_index_buffer >> const void *user_buffer; /**< pointer to a user buffer if buffer == >> NULL */ >> }; >> >> >> /** >> * Information to describe a draw_vbo call. >> */ >> struct pipe_draw_info >> { >> boolean indexed; /**< use index buffer */ >> - enum pipe_prim_type mode; /**< the mode of the primitive */ >> + enum pipe_prim_type mode:8; /**< the mode of the primitive */ >> boolean primitive_restart; >> ubyte vertices_per_patch; /**< the number of vertices per patch */ >> >> unsigned start; /**< the index of the first vertex */ >> unsigned count; /**< number of vertices */ >> >> unsigned start_instance; /**< first instance id */ >> unsigned instance_count; /**< number of instances */ >> >> unsigned drawid; /**< id of this draw in a multidraw */ >> >> > > -- > Lerne, wie die Welt wirklich ist, > Aber vergiss niemals, wie sie sein sollte. > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory
On Wed, 2017-03-29 at 12:06 -0700, Jason Ekstrand wrote: > Looking over the patch, I think I've convinced myself that it's correct. (I > honestly wasn't expecting to come to that conclusion without more iteration.) > That said, this raises some interesting questions. I added Kristian to the > Cc in case he has any input. > > 1. Should we do powers of two or linear. I'm still a fan of powers of two. In which specific part do you mean? In free block lists? or in the allocator_new? > > 2. Should block pools even have a block size at all? We could just make > every block pool allow any power-of-two size from 4 KiB up to. say, 1 MiB and > then make the block size part of the state pool or stream that's allocating > from it. At the moment, I like this idea, but I've given it very little > thought. > So IIUC, the idea would be the block pool is just a flat chunk of memory, where we later fetch blocks of memory from, as requested by applications. Is that correct? > 3. If we go with the idea in 2. should we still call it block_pool? I think > we can keep the name but it doesn't it as well as it once did. > > Thanks for working on this! I'm sorry it's taken so long to respond. Every > time I've looked at it, my brain hasn't been in the right state to think > about lock-free code. :-/ > > On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero> wrote: > > Current Anv allocator assign memory in terms of a fixed block size. > > > > But there can be cases where this block is not enough for a memory > > request, and thus several blocks must be assigned in a row. > > > > This commit adds support for specifying how many blocks of memory must > > be assigned. > > > > This fixes a number dEQP-VK.pipeline.render_to_image.* tests that crash. > > > > v2: lock-free free-list is not handled correctly (Jason) > > --- > > src/intel/vulkan/anv_allocator.c | 81 > > +++--- > > src/intel/vulkan/anv_batch_chain.c | 4 +- > > src/intel/vulkan/anv_private.h | 7 +++- > > 3 files changed, 66 insertions(+), 26 deletions(-) > > > > diff --git a/src/intel/vulkan/anv_allocator.c > > b/src/intel/vulkan/anv_allocator.c > > index 45c663b..3924551 100644 > > --- a/src/intel/vulkan/anv_allocator.c > > +++ b/src/intel/vulkan/anv_allocator.c > > @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool, > > pool->device = device; > > anv_bo_init(>bo, 0, 0); > > pool->block_size = block_size; > > - pool->free_list = ANV_FREE_LIST_EMPTY; > > + for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++) > > + pool->free_list[i] = ANV_FREE_LIST_EMPTY; > > pool->back_free_list = ANV_FREE_LIST_EMPTY; > > > > pool->fd = memfd_create("block pool", MFD_CLOEXEC); > > @@ -500,30 +501,35 @@ fail: > > > > static uint32_t > > anv_block_pool_alloc_new(struct anv_block_pool *pool, > > - struct anv_block_state *pool_state) > > + struct anv_block_state *pool_state, > > + uint32_t n_blocks) > > Maybe have this take a size rather than n_blocks? It's only ever called by > stuff in the block pool so the caller can do the multiplication. It would > certainly make some of the math below easier. > > > { > > struct anv_block_state state, old, new; > > > > while (1) { > > - state.u64 = __sync_fetch_and_add(_state->u64, pool->block_size); > > - if (state.next < state.end) { > > + state.u64 = __sync_fetch_and_add(_state->u64, n_blocks * > > pool->block_size); > > + if (state.next > state.end) { > > + futex_wait(_state->end, state.end); > > + continue; > > + } else if ((state.next + (n_blocks - 1) * pool->block_size) < > > state.end) { > > First off, please keep the if's in the same order unless we have a reason to > re-arrange them. It would make this way easier to review. :-) > > Second, I think this would be much easier to read as: > > if (state.next + size <= state.end) { > /* Success */ > } else if (state.next <= state.end) { > /* Our block is the one that crosses the line */ > } else { > /* Wait like everyone else */ > } > > > assert(pool->map); > > return state.next; > > - } else if (state.next == state.end) { > > - /* We allocated the first block outside the pool, we have to grow > > it. > > - * pool_state->next acts a mutex: threads who try to allocate now > > will > > - * get block indexes above the current limit and hit futex_wait > > - * below. */ > > - new.next = state.next + pool->block_size; > > + } else { > > + /* We allocated the firsts blocks outside the pool, we have to > > grow > > + * it. pool_state->next acts a mutex: threads who try to allocate > > + * now will get block indexes above the current limit and hit > > + * futex_wait below. > > + */ > > +
[Mesa-dev] [PATCH v2] docs: mark GL_ARB_vertex_attrib_64bit and OpenGL 4.2 as supported by i965/gen7+
v2 (Andreas Boll): - Mark GL 4.1 as supported by i965/gen7+ - Mark GL_ARB_shader_precision as supported by i965/gen7+ - Update release notes --- docs/features.txt | 4 ++-- docs/relnotes/17.1.0.html | 3 +++ 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/features.txt b/docs/features.txt index 5c22acf..e71b197 100644 --- a/docs/features.txt +++ b/docs/features.txt @@ -136,12 +136,12 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi GL_ARB_transform_feedback3DONE (i965/gen7+, llvmpipe, softpipe, swr) -GL 4.1, GLSL 4.10 --- all DONE: i965/hsw+, nvc0, r600, radeonsi +GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi GL_ARB_ES2_compatibility DONE (i965, nv50, llvmpipe, softpipe, swr) GL_ARB_get_program_binary DONE (0 binary formats) GL_ARB_separate_shader_objectsDONE (all drivers) - GL_ARB_shader_precision DONE (i965/hsw+, all drivers that support GLSL 4.10) + GL_ARB_shader_precision DONE (i965/gen7+, all drivers that support GLSL 4.10) GL_ARB_vertex_attrib_64bitDONE (i965/gen7+, llvmpipe, softpipe) GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe) diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html index ada1e38..dbf8379 100644 --- a/docs/relnotes/17.1.0.html +++ b/docs/relnotes/17.1.0.html @@ -44,9 +44,12 @@ Note: some of the new features are only available with certain drivers. +OpenGL 4.2 on i965/ivb GL_ARB_gpu_shader_int64 on i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe +GL_ARB_shader_precision on i965/ivb GL_ARB_transform_feedback2 on i965/gen6 GL_ARB_transform_feedback_overflow_query on i965/gen6+ +GL_ARB_vertex_attrib_64bit on i965/ivb Geometry shaders enabled on swr -- 2.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 100402] [d3d9 bisected] Diablo III fails to start after commit 0630d3600bfb770cf3b23761c45b3add3b277c6b
https://bugs.freedesktop.org/show_bug.cgi?id=100402 Vedran Miletićchanged: What|Removed |Added CC||ved...@miletic.net -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] targets/va: export radeon winsys_create functions
Am 03.04.2017 um 12:08 schrieb Marek Olšák: From: Marek OlšákIt silences the following radeonsi LLVM warning due to a previous commit adding an LLVM workaround: "mesa: for the -simplifycfg-sink-common option: may only occur zero or one times!" Cc: 17.0 Reviewed-by: Christian König --- src/gallium/targets/omx/omx.sym | 5 + src/gallium/targets/pipe-loader/pipe.sym | 5 + src/gallium/targets/va/va.sym| 5 + 3 files changed, 15 insertions(+) diff --git a/src/gallium/targets/omx/omx.sym b/src/gallium/targets/omx/omx.sym index af22aed..e8a2876 100644 --- a/src/gallium/targets/omx/omx.sym +++ b/src/gallium/targets/omx/omx.sym @@ -1,6 +1,11 @@ { global: omx_component_library_Setup; + + # Workaround for an LLVM warning with -simplifycfg-sink-common + # due to LLVM being initialized multiple times. + radeon_drm_winsys_create; + amdgpu_winsys_create; local: *; }; diff --git a/src/gallium/targets/pipe-loader/pipe.sym b/src/gallium/targets/pipe-loader/pipe.sym index b2fa619..605cb83 100644 --- a/src/gallium/targets/pipe-loader/pipe.sym +++ b/src/gallium/targets/pipe-loader/pipe.sym @@ -1,7 +1,12 @@ { global: driver_descriptor; swrast_driver_descriptor; + + # Workaround for an LLVM warning with -simplifycfg-sink-common + # due to LLVM being initialized multiple times. + radeon_drm_winsys_create; + amdgpu_winsys_create; local: *; }; diff --git a/src/gallium/targets/va/va.sym b/src/gallium/targets/va/va.sym index c925b2e..917c3d3 100644 --- a/src/gallium/targets/va/va.sym +++ b/src/gallium/targets/va/va.sym @@ -1,6 +1,11 @@ { global: __vaDriverInit_*_*; + + # Workaround for an LLVM warning with -simplifycfg-sink-common + # due to LLVM being initialized multiple times. + radeon_drm_winsys_create; + amdgpu_winsys_create; local: *; }; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] r600g: check rasterizer primitive states like in radeonsi
For the series: Reviewed-by: Marek OlšákMarek On Sun, Apr 2, 2017 at 7:33 PM, Constantine Kharlamov wrote: > Specifically, non-line primitives skipped, and defaulting to reset on > each packet. > > The skip of non-line primitives saves ≈110 resetting of > PA_SC_LINE_STIPPLE register per frame in Kane > > Signed-off-by: Constantine Kharlamov > --- > src/gallium/drivers/r600/r600_state_common.c | 21 + > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git a/src/gallium/drivers/r600/r600_state_common.c > b/src/gallium/drivers/r600/r600_state_common.c > index e4d1660933..4de2a7344b 100644 > --- a/src/gallium/drivers/r600/r600_state_common.c > +++ b/src/gallium/drivers/r600/r600_state_common.c > @@ -1670,19 +1670,24 @@ void r600_emit_clip_misc_state(struct r600_context > *rctx, struct r600_atom *atom > static inline void r600_emit_rasterizer_prim_state(struct r600_context *rctx) > { > struct radeon_winsys_cs *cs = rctx->b.gfx.cs; > - unsigned ls_mask = 0; > enum pipe_prim_type rast_prim = rctx->current_rast_prim; > - if (rast_prim == rctx->last_rast_prim) > + > + /* Skip this if not rendering lines. */ > + if (rast_prim != PIPE_PRIM_LINES && > + rast_prim != PIPE_PRIM_LINE_LOOP && > + rast_prim != PIPE_PRIM_LINE_STRIP && > + rast_prim != PIPE_PRIM_LINES_ADJACENCY && > + rast_prim != PIPE_PRIM_LINE_STRIP_ADJACENCY) > return; > > - if (rast_prim == PIPE_PRIM_LINES) > - ls_mask = 1; > - else if (rast_prim == PIPE_PRIM_LINE_STRIP || > -rast_prim == PIPE_PRIM_LINE_LOOP) > - ls_mask = 2; > + if (rast_prim == rctx->last_rast_prim) > + return; > > + /* For lines, reset the stipple pattern at each primitive. Otherwise, > +* reset the stipple pattern at each packet (line strips, line loops). > +*/ > radeon_set_context_reg(cs, R_028A0C_PA_SC_LINE_STIPPLE, > - S_028A0C_AUTO_RESET_CNTL(ls_mask) | > + S_028A0C_AUTO_RESET_CNTL(rast_prim == > PIPE_PRIM_LINES ? 1 : 2) | >(rctx->rasterizer ? > rctx->rasterizer->pa_sc_line_stipple : 0)); > rctx->last_rast_prim = rast_prim; > } > -- > 2.12.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] targets/va: export radeon winsys_create functions
I've changed the commit message to "targets: export radeon winsys_create functions to silence LLVM warning". Marek On Mon, Apr 3, 2017 at 12:08 PM, Marek Olšákwrote: > From: Marek Olšák > > It silences the following radeonsi LLVM warning due to a previous > commit adding an LLVM workaround: > "mesa: for the -simplifycfg-sink-common option: may only occur zero or one >times!" > > Cc: 17.0 > --- > src/gallium/targets/omx/omx.sym | 5 + > src/gallium/targets/pipe-loader/pipe.sym | 5 + > src/gallium/targets/va/va.sym| 5 + > 3 files changed, 15 insertions(+) > > diff --git a/src/gallium/targets/omx/omx.sym b/src/gallium/targets/omx/omx.sym > index af22aed..e8a2876 100644 > --- a/src/gallium/targets/omx/omx.sym > +++ b/src/gallium/targets/omx/omx.sym > @@ -1,6 +1,11 @@ > { > global: > omx_component_library_Setup; > + > + # Workaround for an LLVM warning with -simplifycfg-sink-common > + # due to LLVM being initialized multiple times. > + radeon_drm_winsys_create; > + amdgpu_winsys_create; > local: > *; > }; > diff --git a/src/gallium/targets/pipe-loader/pipe.sym > b/src/gallium/targets/pipe-loader/pipe.sym > index b2fa619..605cb83 100644 > --- a/src/gallium/targets/pipe-loader/pipe.sym > +++ b/src/gallium/targets/pipe-loader/pipe.sym > @@ -1,7 +1,12 @@ > { > global: > driver_descriptor; > swrast_driver_descriptor; > + > + # Workaround for an LLVM warning with -simplifycfg-sink-common > + # due to LLVM being initialized multiple times. > + radeon_drm_winsys_create; > + amdgpu_winsys_create; > local: > *; > }; > diff --git a/src/gallium/targets/va/va.sym b/src/gallium/targets/va/va.sym > index c925b2e..917c3d3 100644 > --- a/src/gallium/targets/va/va.sym > +++ b/src/gallium/targets/va/va.sym > @@ -1,6 +1,11 @@ > { > global: > __vaDriverInit_*_*; > + > + # Workaround for an LLVM warning with -simplifycfg-sink-common > + # due to LLVM being initialized multiple times. > + radeon_drm_winsys_create; > + amdgpu_winsys_create; > local: > *; > }; > -- > 2.7.4 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] targets/va: export radeon winsys_create functions
From: Marek OlšákIt silences the following radeonsi LLVM warning due to a previous commit adding an LLVM workaround: "mesa: for the -simplifycfg-sink-common option: may only occur zero or one times!" Cc: 17.0 --- src/gallium/targets/omx/omx.sym | 5 + src/gallium/targets/pipe-loader/pipe.sym | 5 + src/gallium/targets/va/va.sym| 5 + 3 files changed, 15 insertions(+) diff --git a/src/gallium/targets/omx/omx.sym b/src/gallium/targets/omx/omx.sym index af22aed..e8a2876 100644 --- a/src/gallium/targets/omx/omx.sym +++ b/src/gallium/targets/omx/omx.sym @@ -1,6 +1,11 @@ { global: omx_component_library_Setup; + + # Workaround for an LLVM warning with -simplifycfg-sink-common + # due to LLVM being initialized multiple times. + radeon_drm_winsys_create; + amdgpu_winsys_create; local: *; }; diff --git a/src/gallium/targets/pipe-loader/pipe.sym b/src/gallium/targets/pipe-loader/pipe.sym index b2fa619..605cb83 100644 --- a/src/gallium/targets/pipe-loader/pipe.sym +++ b/src/gallium/targets/pipe-loader/pipe.sym @@ -1,7 +1,12 @@ { global: driver_descriptor; swrast_driver_descriptor; + + # Workaround for an LLVM warning with -simplifycfg-sink-common + # due to LLVM being initialized multiple times. + radeon_drm_winsys_create; + amdgpu_winsys_create; local: *; }; diff --git a/src/gallium/targets/va/va.sym b/src/gallium/targets/va/va.sym index c925b2e..917c3d3 100644 --- a/src/gallium/targets/va/va.sym +++ b/src/gallium/targets/va/va.sym @@ -1,6 +1,11 @@ { global: __vaDriverInit_*_*; + + # Workaround for an LLVM warning with -simplifycfg-sink-common + # due to LLVM being initialized multiple times. + radeon_drm_winsys_create; + amdgpu_winsys_create; local: *; }; -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/omx/dec: Properly undefine DEBUG macro
Yes, that is correct. I wasn't thinking clearly. On Mon, Apr 3, 2017 at 1:57 PM Christian Königwrote: > Am 03.04.2017 um 06:35 schrieb Shaleen Jain: > > --- > > src/gallium/state_trackers/omx/vid_dec.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/src/gallium/state_trackers/omx/vid_dec.c > b/src/gallium/state_trackers/omx/vid_dec.c > > index 9a6efb8e28..94664eba04 100644 > > --- a/src/gallium/state_trackers/omx/vid_dec.c > > +++ b/src/gallium/state_trackers/omx/vid_dec.c > > @@ -37,7 +37,7 @@ > > #include > > > > /* bellagio defines a DEBUG macro that we don't want */ > > -#ifndef DEBUG > > +#ifdef DEBUG > > NAK, the logic locked correct as it was before. > > Why do you want to invert it? > > Regards, > Christian. > > > #include > > #undef DEBUG > > #else > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] radeonsi: use ctx->types instead of bld->types etc.
From: Marek Olšákeven vec_type is f32. --- src/gallium/drivers/radeonsi/si_shader.c | 28 ++ .../drivers/radeonsi/si_shader_tgsi_setup.c| 16 ++--- 2 files changed, 21 insertions(+), 23 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index a5d370b..0200172 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -3306,21 +3306,21 @@ static LLVMValueRef image_fetch_coords( struct gallivm_state *gallivm = bld_base->base.gallivm; LLVMBuilderRef builder = gallivm->builder; unsigned target = inst->Memory.Texture; unsigned num_coords = tgsi_util_get_texture_coord_dim(target); LLVMValueRef coords[4]; LLVMValueRef tmp; int chan; for (chan = 0; chan < num_coords; ++chan) { tmp = lp_build_emit_fetch(bld_base, inst, src, chan); - tmp = LLVMBuildBitCast(builder, tmp, bld_base->uint_bld.elem_type, ""); + tmp = LLVMBuildBitCast(builder, tmp, ctx->i32, ""); coords[chan] = tmp; } /* 1D textures are allocated and used as 2D on GFX9. */ if (ctx->screen->b.chip_class >= GFX9) { if (target == TGSI_TEXTURE_1D) { coords[1] = ctx->i32_0; num_coords++; } else if (target == TGSI_TEXTURE_1D_ARRAY) { coords[2] = coords[1]; @@ -3414,31 +3414,31 @@ static void buffer_append_args( static void load_fetch_args( struct lp_build_tgsi_context * bld_base, struct lp_build_emit_data * emit_data) { struct si_shader_context *ctx = si_shader_context(bld_base); struct gallivm_state *gallivm = bld_base->base.gallivm; const struct tgsi_full_instruction * inst = emit_data->inst; unsigned target = inst->Memory.Texture; LLVMValueRef rsrc; - emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4); + emit_data->dst_type = ctx->v4f32; if (inst->Src[0].Register.File == TGSI_FILE_BUFFER) { LLVMBuilderRef builder = gallivm->builder; LLVMValueRef offset; LLVMValueRef tmp; rsrc = shader_buffer_fetch_rsrc(ctx, >Src[0]); tmp = lp_build_emit_fetch(bld_base, inst, 1, 0); - offset = LLVMBuildBitCast(builder, tmp, bld_base->uint_bld.elem_type, ""); + offset = LLVMBuildBitCast(builder, tmp, ctx->i32, ""); buffer_append_args(ctx, emit_data, rsrc, ctx->i32_0, offset, false, false); } else if (inst->Src[0].Register.File == TGSI_FILE_IMAGE) { LLVMValueRef coords; image_fetch_rsrc(bld_base, >Src[0], false, target, ); coords = image_fetch_coords(bld_base, inst, 1); if (target == TGSI_TEXTURE_BUFFER) { @@ -3522,32 +3522,31 @@ static LLVMValueRef get_memory_ptr(struct si_shader_context *ctx, ptr = LLVMBuildBitCast(builder, ptr, LLVMPointerType(type, addr_space), ""); return ptr; } static void load_emit_memory( struct si_shader_context *ctx, struct lp_build_emit_data *emit_data) { const struct tgsi_full_instruction *inst = emit_data->inst; - struct lp_build_context *base = >bld_base.base; struct gallivm_state *gallivm = >gallivm; LLVMBuilderRef builder = gallivm->builder; unsigned writemask = inst->Dst[0].Register.WriteMask; LLVMValueRef channels[4], ptr, derived_ptr, index; int chan; - ptr = get_memory_ptr(ctx, inst, base->elem_type, 1); + ptr = get_memory_ptr(ctx, inst, ctx->f32, 1); for (chan = 0; chan < 4; ++chan) { if (!(writemask & (1 << chan))) { - channels[chan] = LLVMGetUndef(base->elem_type); + channels[chan] = LLVMGetUndef(ctx->f32); continue; } index = LLVMConstInt(ctx->i32, chan, 0); derived_ptr = LLVMBuildGEP(builder, ptr, , 1, ""); channels[chan] = LLVMBuildLoad(builder, derived_ptr, ""); } emit_data->output[emit_data->chan] = lp_build_gather_values(gallivm, channels, 4); } @@ -3692,21 +3691,21 @@ static void store_fetch_args( memory = tgsi_full_src_register_from_dst(>Dst[0]); if (inst->Dst[0].Register.File == TGSI_FILE_BUFFER) { LLVMValueRef offset; LLVMValueRef tmp; rsrc = shader_buffer_fetch_rsrc(ctx, ); tmp = lp_build_emit_fetch(bld_base, inst, 0, 0); - offset = LLVMBuildBitCast(builder, tmp, bld_base->uint_bld.elem_type, ""); + offset = LLVMBuildBitCast(builder, tmp,