Re: [Mesa-dev] [PATCH] AndroidIA: android: add libmesa_genxml as dep to libmesa_isl
doh sorry about that 'AndroidIA' there, we are using it to differentiate patches that we have in our tree and are not in Mesa master yet.

On 03/30/2017 08:51 AM, Tapani Pälli wrote:
> This is to fix following compile error with libmesa_isl:
>
> mesa/src/intel/isl/isl.c:28:10: fatal error: 'genxml/genX_bits.h' file not found
>
> Fixes: f0eaf38 ("genxml: New generated header genX_bits.h (v6)")
> Signed-off-by: Tapani Pälli
> ---
>  src/intel/Android.isl.mk | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk
> index bc58b97..67e6d2d 100644
> --- a/src/intel/Android.isl.mk
> +++ b/src/intel/Android.isl.mk
> @@ -186,7 +186,8 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \
>     libmesa_isl_gen7 \
>     libmesa_isl_gen75 \
>     libmesa_isl_gen8 \
> -   libmesa_isl_gen9
> +   libmesa_isl_gen9 \
> +   libmesa_genxml
>
>  # Autogenerated sources

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] AndroidIA: android: add libmesa_genxml as dep to libmesa_isl
This is to fix following compile error with libmesa_isl:

mesa/src/intel/isl/isl.c:28:10: fatal error: 'genxml/genX_bits.h' file not found

Fixes: f0eaf38 ("genxml: New generated header genX_bits.h (v6)")
Signed-off-by: Tapani Pälli
---
 src/intel/Android.isl.mk | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk
index bc58b97..67e6d2d 100644
--- a/src/intel/Android.isl.mk
+++ b/src/intel/Android.isl.mk
@@ -186,7 +186,8 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \
    libmesa_isl_gen7 \
    libmesa_isl_gen75 \
    libmesa_isl_gen8 \
-   libmesa_isl_gen9
+   libmesa_isl_gen9 \
+   libmesa_genxml

 # Autogenerated sources

-- 
2.9.3
[Mesa-dev] [PATCH] mesa: disable glthread when DEBUG_OUTPUT_SYNCHRONOUS is enabled
We could re-enable it also but I haven't tested that yet, and I'm not sure we care much anyway.
---
 src/mesa/main/debug_output.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/main/debug_output.c b/src/mesa/main/debug_output.c
index bc933db..2b22645 100644
--- a/src/mesa/main/debug_output.c
+++ b/src/mesa/main/debug_output.c
@@ -22,20 +22,21 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */

 #include
 #include
 #include "context.h"
 #include "debug_output.h"
 #include "dispatch.h"
 #include "enums.h"
+#include "glthread.h"
 #include "imports.h"
 #include "hash.h"
 #include "mtypes.h"
 #include "version.h"
 #include "util/hash_table.h"
 #include "util/simple_list.h"

 static mtx_t DynamicIDMutex = _MTX_INITIALIZER_NP;
 static GLuint NextDynamicID = 1;

@@ -741,20 +742,24 @@ _mesa_set_debug_state_int(struct gl_context *ctx, GLenum pname, GLint val)

    if (!debug)
       return false;

    switch (pname) {
    case GL_DEBUG_OUTPUT:
       debug->DebugOutput = (val != 0);
       break;
    case GL_DEBUG_OUTPUT_SYNCHRONOUS_ARB:
       debug->SyncOutput = (val != 0);
+      if (debug->SyncOutput) {
+         _mesa_glthread_finish(ctx);
+         _mesa_glthread_restore_dispatch(ctx);
+      }
       break;
    default:
       assert(!"unknown debug output param");
       break;
    }

    _mesa_unlock_debug_state(ctx);

    return true;
 }
-- 
2.9.3
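As an illustration of the idea in this patch (drain the worker thread, then switch back to direct dispatch once synchronous debug output is requested), here is a minimal toy model. All names below (toy_context, toy_thread_finish, toy_restore_dispatch, toy_set_sync_output) are hypothetical stand-ins and not Mesa's API; the real work is done by the _mesa_glthread_* calls in the diff above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names are Mesa's real API. */
enum dispatch_mode { DISPATCH_DIRECT, DISPATCH_MARSHAL };

struct toy_context {
   enum dispatch_mode dispatch;  /* which entry points the app currently hits */
   int queued_calls;             /* calls recorded but not yet executed */
   bool sync_output;             /* synchronous debug output requested */
};

/* Model of "wait until the worker thread has executed everything queued". */
static void
toy_thread_finish(struct toy_context *ctx)
{
   ctx->queued_calls = 0;
}

/* Model of "route GL calls straight to the driver again". */
static void
toy_restore_dispatch(struct toy_context *ctx)
{
   ctx->dispatch = DISPATCH_DIRECT;
}

/* Enabling synchronous output drains the queue and leaves threaded mode.
 * Disabling it does not switch threading back on, matching the commit
 * message's remark that re-enabling is untested. */
static void
toy_set_sync_output(struct toy_context *ctx, bool enable)
{
   ctx->sync_output = enable;
   if (enable) {
      toy_thread_finish(ctx);
      toy_restore_dispatch(ctx);
   }
}
```

The ordering (finish first, then restore dispatch) mirrors the diff; the sketch assumes nothing else about how glthread tracks its state.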
[Mesa-dev] [AppVeyor] mesa master #3903 completed
Build mesa 3903 completed

Commit 36cb2003f1 by Harish Krupo on 3/28/2017 6:38 PM:

android: pass sse4.1 flag as appropriate

We have functions which depend on sse4.1 support but we didnt pass
the right compile flag for it. This patch fixes it.

Signed-off-by: Kalyan Kondapally
Signed-off-by: Harish Krupo
Reviewed-by: Tapani Pälli
Re: [Mesa-dev] [PATCH] util/u_atomic: provide 64bit atomics where they're missing
On Wed, Mar 29, 2017 at 04:55:54PM -0700, Matt Turner wrote: > On Wed, Mar 29, 2017 at 4:13 PM, Grazvydas Ignotaswrote: > > There are still some distributions trying to support unfortunate people > > with old or exotic CPUs that don't have 64bit atomic operations. When > > compiling for such a machine, gcc conveniently inserts a library call to > > a helper, but it's implementation is missing and we get a linker error. > > This allows us to provide our implementation, which is marked weak to > > prefer a better implementation, should one exist. > > > > Cc: Matt Turner > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089 > > Signed-off-by: Grazvydas Ignotas > > --- > > Thanks, this is a really good idea. > > > configure.ac | 12 > > src/util/Makefile.sources | 1 + > > src/util/u_atomic.c | 71 > > +++ > > 3 files changed, 84 insertions(+) > > create mode 100644 src/util/u_atomic.c > > > > diff --git a/configure.ac b/configure.ac > > index ab9a91e..89b615b 100644 > > --- a/configure.ac > > +++ b/configure.ac > > @@ -413,10 +413,22 @@ int main() { > > if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then > > DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS" > > fi > > AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test > > x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1]) > > > > +dnl Check if host supports 64bit atomics > > +dnl note that lack of support usually results in link (not compile) error > > +AC_LINK_IFELSE([AC_LANG_SOURCE([[ > > +#include > > +uint64_t v; > > +int main() { > > +return __sync_add_and_fetch(, (uint64_t)1); > > +}]])], GCC_64BIT_ATOMICS_SUPPORTED=1) > > +if test "x$GCC_64BIT_ATOMICS_SUPPORTED" != x1; then > > +DEFINES="$DEFINES -DMISSING_64BIT_ATOMICS" > > +fi > > + > > dnl Check for Endianness > > AC_C_BIGENDIAN( > > little_endian=no, > > little_endian=yes, > > little_endian=no, > > diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources > > index 8ee45d5..e905734 100644 > > --- a/src/util/Makefile.sources > > +++ 
b/src/util/Makefile.sources > > @@ -41,10 +41,11 @@ MESA_UTIL_FILES := \ > > string_to_uint_map.h \ > > strndup.h \ > > strtod.c \ > > strtod.h \ > > texcompress_rgtc_tmp.h \ > > + u_atomic.c \ > > u_atomic.h \ > > u_endian.h \ > > u_queue.c \ > > u_queue.h \ > > u_string.h \ > > diff --git a/src/util/u_atomic.c b/src/util/u_atomic.c > > new file mode 100644 > > index 000..77ef119 > > --- /dev/null > > +++ b/src/util/u_atomic.c > > @@ -0,0 +1,71 @@ > > +/* > > + * Copyright © 2017 The Mesa Project > > The Mesa Project isn't something that can hold copyright. Your name > should be here. > > > + * > > + * Permission is hereby granted, free of charge, to any person obtaining a > > + * copy of this software and associated documentation files (the > > "Software"), > > + * to deal in the Software without restriction, including without > > limitation > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > > + * and/or sell copies of the Software, and to permit persons to whom the > > + * Software is furnished to do so, subject to the following conditions: > > + * > > + * The above copyright notice and this permission notice (including the > > next > > + * paragraph) shall be included in all copies or substantial portions of > > the > > + * Software. > > + * > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS > > OR > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > > OTHER > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > > DEALINGS > > + * IN THE SOFTWARE. 
> > + */ > > + > > +#if defined(MISSING_64BIT_ATOMICS) && defined(HAVE_PTHREAD) > > + > > +#include > > +#include > > + > > +#if defined(HAVE_FUNC_ATTRIBUTE_WEAK) && !defined(__CYGWIN__) > > +#define WEAK __attribute__((weak)) > > +#else > > +#define WEAK > > +#endif > > + > > +static pthread_mutex_t sync_mutex = PTHREAD_MUTEX_INITIALIZER; > > + > > +WEAK uint64_t __sync_add_and_fetch_8(uint64_t *ptr, uint64_t val) > > Let's do BSD-style function declarations, with the qualifiers and > return type on their own line. > > With those two trivial things changed, this is > > Reviewed-by: Matt Turner > > Grazvydas, if you have not already, please file a request for a > Freedesktop account [1] [2] and let's get you commit access. > > Jonathan, can you check whether this resolves the bug entirely? Or are > there some other __sync functions we need to implement? I see >
Re: [Mesa-dev] [PATCH] mesa/glthread: add custom marshalling for ClearBufferfv()
On 28/03/17 01:02, Gregory Hainaut wrote:
> Hello Timothy,
>
> 2 small questions:
>
> Will it work for DSA equivalent function, namely glClearNamedFramebufferfv ?

It looks like we don't currently even bother to implement glClearNamedFramebufferfv properly.

> Would it be interesting to also do the equivalent for
> glClearBufferiv/glClearBufferuiv ? Note the *uiv variant could be easier as
> the size is always 4 INT, so it can be done with a scale attribute on the XML.

Sure. For now I was seeing this one in use so I added it.

> Cheers,
> Gregory
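To make the size question above concrete: the reason glClearBufferfv needs custom marshalling is that the number of values read through the pointer depends on the buffer argument, while the *uiv variant only accepts GL_COLOR and so always reads four. A minimal sketch follows; the helper names are hypothetical and not part of Mesa, and the enum values are redefined locally so the snippet compiles without GL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Enum values as defined in the OpenGL headers, repeated here so the
 * sketch is self-contained. */
#define GL_COLOR   0x1800
#define GL_DEPTH   0x1801
#define GL_STENCIL 0x1802

/* Number of floats glClearBufferfv reads for a given buffer argument.
 * 0 means the enum is not valid for the fv variant. */
static size_t
clear_buffer_fv_count(unsigned buffer)
{
   switch (buffer) {
   case GL_COLOR: return 4;   /* RGBA */
   case GL_DEPTH: return 1;   /* single depth value */
   default:       return 0;
   }
}

/* Number of ints glClearBufferiv reads: colour or stencil. */
static size_t
clear_buffer_iv_count(unsigned buffer)
{
   switch (buffer) {
   case GL_COLOR:   return 4;
   case GL_STENCIL: return 1;
   default:         return 0;
   }
}

/* glClearBufferuiv only accepts GL_COLOR, so the count is fixed. */
static size_t
clear_buffer_uiv_count(unsigned buffer)
{
   return buffer == GL_COLOR ? 4 : 0;
}
```

A marshalling layer would multiply the count by the element size to know how many bytes to copy into the batch; the fixed count for *uiv is what makes the XML-attribute approach mentioned above possible.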
Re: [Mesa-dev] [PATCH] gbm/dri: Flush after unmap
On 30/03/17 12:56 AM, Thomas Hellstrom wrote:
> On 03/29/2017 02:34 PM, Emil Velikov wrote:
>> On 29 March 2017 at 13:02, Thomas Hellstrom wrote:
>>> On 03/29/2017 01:30 PM, Emil Velikov wrote:
>>>> On 28 March 2017 at 20:39, Thomas Hellstrom wrote:
>>>>> Signed-off-by: Thomas Hellstrom
>>>>> ---
>>>>>  src/gbm/backends/dri/gbm_dri.c | 9 -
>>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
>>>>> index ac7ede8..6c2244c 100644
>>>>> --- a/src/gbm/backends/dri/gbm_dri.c
>>>>> +++ b/src/gbm/backends/dri/gbm_dri.c
>>>>> @@ -243,7 +243,7 @@ struct dri_extension_match {
>>>>>  };
>>>>>
>>>>>  static struct dri_extension_match dri_core_extensions[] = {
>>>>> -   { __DRI2_FLUSH, 1, offsetof(struct gbm_dri_device, flush) },
>>>>> +   { __DRI2_FLUSH, 4, offsetof(struct gbm_dri_device, flush) },
>>>> Currently the classic nouveau, radeon/r200 and i915 drivers do not
>>>> support v4 of the extension. As-is this will 'break' them... if they
>>>> ever worked to begin with.
>>>>
>>>> One solution is to bail out (return -ENOSYS or similar) in the
>>>> map/unmap API when the DRI module is too old.
>>>>
>>>> Just some ^^ food for thought.
>>> Hmm. Is there even a use-case for gbm with those drivers? If so we
>>> should perhaps make them up-to-date with the flush extension.
>>>
>> Of the above:
>>
>> - nouveau: Does not support DRI_IMAGE, thus it doesn't work even
>> before the patch.
>> - i915: I have some untested ancient patches. Will see if I can rebase
>> + send out.
>> - radeons: ??
>>
>> If someone reports an issue we can ask them to write/test some code, I guess ;-)
>
> Indeed. It looks like gbm is mostly used together with KMS anyway...

All of the above drivers are KMS based, FWIW.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
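The alternative Emil suggests (keep accepting older drivers and refuse map/unmap at call time) could look roughly like the sketch below. The struct and function names are hypothetical simplifications, not the actual gbm or DRI definitions; only the idea of checking the extension version and returning -ENOSYS comes from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the flush extension and device. */
struct toy_flush_extension {
   int version;
};

struct toy_device {
   const struct toy_flush_extension *flush;  /* NULL if not advertised */
};

/* Assumed minimum version for map/unmap, taken from the "4" in the patch. */
#define TOY_FLUSH_MIN_VERSION_FOR_MAP 4

/* Returns 0 if mapping may proceed, -ENOSYS if the DRI module is too old. */
static int
toy_check_map_supported(const struct toy_device *dev)
{
   if (dev->flush == NULL ||
       dev->flush->version < TOY_FLUSH_MIN_VERSION_FOR_MAP)
      return -ENOSYS;

   return 0;
}
```

With a check like this the extension table could keep requiring version 1, so drivers that only implement the older interface would still load and merely lose map/unmap.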
Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish
On 30/03/17 02:31 AM, Bartosz Tomczyk wrote:
> Call it directly when batch queue is empty. This avoids costly thread
> synchronisation. With this fix games that previously regressed
> with mesa_glthread=true like xonotic or grid autosport.

The second sentence here is missing a verb (at least).

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
[Mesa-dev] [AppVeyor] mesa master #3902 failed
Build mesa 3902 failed

Commit a930c2c612 by Dave Airlie on 3/30/2017 3:09 AM:

radv: fix mask attribs properly.

some days it just doesn't pay to get out of bed.

Signed-off-by: Dave Airlie
Re: [Mesa-dev] [PATCH] drirc: Set glsl_zero_init for Kerbal Space Program.
On Wed, Mar 29, 2017 at 7:41 PM, Francisco Jerez wrote:
> This fixes the stripes of garbage rendered on the floor of the vehicle
> assembly building among other rendering issues. The reason for the
> misrendering seems to be that some of the GLSL shaders used by the
> application use variables before initializing them, incorrectly
> assuming that they will be implicitly set to zero by the
> implementation.

Sigh.

Acked-by: Matt Turner
[Mesa-dev] [PATCH] drirc: Set glsl_zero_init for Kerbal Space Program.
This fixes the stripes of garbage rendered on the floor of the vehicle assembly building among other rendering issues. The reason for the misrendering seems to be that some of the GLSL shaders used by the application use variables before initializing them, incorrectly assuming that they will be implicitly set to zero by the implementation. --- src/mesa/drivers/dri/common/drirc | 8 1 file changed, 8 insertions(+) diff --git a/src/mesa/drivers/dri/common/drirc b/src/mesa/drivers/dri/common/drirc index 494e9e1..f8babb7 100644 --- a/src/mesa/drivers/dri/common/drirc +++ b/src/mesa/drivers/dri/common/drirc @@ -120,5 +120,13 @@ TODO: document the other workarounds. + + + + + + + + -- 2.10.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] stash
---
 src/intel/blorp/blorp_gen4_exec_priv.h | 81 ++
 src/intel/blorp/blorp_priv.h | 1 +
 2 files changed, 82 insertions(+)

diff --git a/src/intel/blorp/blorp_gen4_exec_priv.h b/src/intel/blorp/blorp_gen4_exec_priv.h
index 90f9613..b0e4cba 100644
--- a/src/intel/blorp/blorp_gen4_exec_priv.h
+++ b/src/intel/blorp/blorp_gen4_exec_priv.h
@@ -40,13 +40,94 @@ blorp_emit_vs_state(struct blorp_batch *batch,
    return blorp_general_state_address(batch, offset);
 }

+struct blorp_sf_key {
+   enum blorp_shader_type shader_type; /* Must be BLORP_SHADER_TYPE_GEN4_SF */
+
+   struct brw_sf_prog_key key;
+};
+
+static bool
+blorp_get_gen4_sf_program(struct blorp_context *blorp,
+                          const struct brw_wm_prog_data *wm_prog_data,
+                          uint32_t *kernel,
+                          struct brw_sf_prog_data **prog_data)
+{
+   struct blorp_sf_key key = {
+      .shader_type = BLORP_SHADER_TYPE_GEN4_SF,
+   };
+
+   assert(wm_prog_data);
+
+   /* Everything gets compacted in vertex setup, so we just need a
+    * pass-through for the correct number of input varyings.
+    */
+   const uint64_t slots_valid = VARYING_BIT_POS |
+      ((1 << wm_prog_data->num_varying_inputs) - 1) << VARYING_SLOT_VAR0;
+
+   key.key.contains_flat_varying = wm_prog_data->contains_flat_varying;
+   key.key.attrs = slots_valid;
+
+   STATIC_ASSERT(sizeof(key.key.interp_mode) ==
+                 sizeof(wm_prog_data->interp_mode));
+   memcpy(key.key.interp_mode, wm_prog_data->interp_mode,
+          sizeof(key.key.interp_mode));
+
+   if (blorp->lookup_shader(blorp, &key, sizeof(key), kernel, prog_data))
+      return true;
+
+   void *mem_ctx = ralloc_context(NULL);
+
+   const unsigned *program;
+   unsigned program_size;
+
+   struct brw_vue_map vue_map;
+   brw_compute_vue_map(blorp->compiler->devinfo, &vue_map, slots_valid, false);
+
+   struct brw_sf_prog_data prog_data_tmp;
+   program = brw_compile_sf(blorp->compiler, mem_ctx, &key,
+                            &prog_data_tmp, &vue_map, &program_size);
+
+   bool result =
+      blorp->upload_shader(blorp, &key, sizeof(key), program, program_size,
+                           (void *)&prog_data_tmp, sizeof(prog_data_tmp),
+                           kernel, prog_data);
+
+   ralloc_free(mem_ctx);
+
+   return result;
+}
+
 static struct blorp_address
 blorp_emit_sf_state(struct blorp_batch *batch,
                     const struct blorp_params *params)
 {
+   uint32_t kernel;
+   struct brw_sf_prog_data *prog_data;
+   blorp_get_gen4_sf_program(batch->blorp, params->wm_prog_data,
+                             &kernel, &prog_data);
+   /* TODO: Handle error? */
+
    uint32_t offset;
    blorp_emit_dynamic(batch, GENX(SF_STATE), sf, AUB_TRACE_SF_STATE, 64, &offset) {
+      sf.KernelStartPointer = kernel;
+      sf.GRFRegisterCount = DIV_ROUND_UP(prog_data->total_grf, 16);
+      sf.VertexURBEntryReadLength = prog_data->urb_read_length;
+      sf.VertexURBEntryReadOffset = BRW_SF_URB_ENTRY_READ_OFFSET;
+      sf.DispatchGRFStartRegisterforURBData = 3;
+
+#if GEN_GEN == 5
+      sf.MaximumNumberofThreads = 48;
+#else
+      sf.MaximumNumberofThreads = 24;
+#endif
+
+      sf.URBEntryAllocationSize = prog_data->urb_entry_size;
+      sf.NumberofURBEntries;
+
+      sf.ViewportTransformEnable = false;
+
+      sf.CullMode = CULLMODE_NONE;
    }

    return blorp_general_state_address(batch, offset);
diff --git a/src/intel/blorp/blorp_priv.h b/src/intel/blorp/blorp_priv.h
index c61ab08..e7b3508 100644
--- a/src/intel/blorp/blorp_priv.h
+++ b/src/intel/blorp/blorp_priv.h
@@ -201,6 +201,7 @@ enum blorp_shader_type {
    BLORP_SHADER_TYPE_BLIT,
    BLORP_SHADER_TYPE_CLEAR,
    BLORP_SHADER_TYPE_LAYER_OFFSET_VS,
+   BLORP_SHADER_TYPE_GEN4_SF,
 };

 struct brw_blorp_blit_prog_key
-- 
2.5.0.400.gff86faf
Re: [Mesa-dev] [PATCH] glsl: allow glsl_type::sampler_index() with images
Reviewed-by: Timothy Arceri
Re: [Mesa-dev] [PATCH] st/glsl_to_tgsi: use glsl_type::sampler_index()
Reviewed-by: Timothy Arceri
Re: [Mesa-dev] [PATCH] util/u_atomic: provide 64bit atomics where they're missing
On Wed, Mar 29, 2017 at 4:13 PM, Grazvydas Ignotaswrote: > There are still some distributions trying to support unfortunate people > with old or exotic CPUs that don't have 64bit atomic operations. When > compiling for such a machine, gcc conveniently inserts a library call to > a helper, but it's implementation is missing and we get a linker error. > This allows us to provide our implementation, which is marked weak to > prefer a better implementation, should one exist. > > Cc: Matt Turner > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089 > Signed-off-by: Grazvydas Ignotas > --- Thanks, this is a really good idea. > configure.ac | 12 > src/util/Makefile.sources | 1 + > src/util/u_atomic.c | 71 > +++ > 3 files changed, 84 insertions(+) > create mode 100644 src/util/u_atomic.c > > diff --git a/configure.ac b/configure.ac > index ab9a91e..89b615b 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -413,10 +413,22 @@ int main() { > if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then > DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS" > fi > AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test > x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1]) > > +dnl Check if host supports 64bit atomics > +dnl note that lack of support usually results in link (not compile) error > +AC_LINK_IFELSE([AC_LANG_SOURCE([[ > +#include > +uint64_t v; > +int main() { > +return __sync_add_and_fetch(, (uint64_t)1); > +}]])], GCC_64BIT_ATOMICS_SUPPORTED=1) > +if test "x$GCC_64BIT_ATOMICS_SUPPORTED" != x1; then > +DEFINES="$DEFINES -DMISSING_64BIT_ATOMICS" > +fi > + > dnl Check for Endianness > AC_C_BIGENDIAN( > little_endian=no, > little_endian=yes, > little_endian=no, > diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources > index 8ee45d5..e905734 100644 > --- a/src/util/Makefile.sources > +++ b/src/util/Makefile.sources > @@ -41,10 +41,11 @@ MESA_UTIL_FILES := \ > string_to_uint_map.h \ > strndup.h \ > strtod.c \ > strtod.h \ > texcompress_rgtc_tmp.h \ > + u_atomic.c \ > 
u_atomic.h \ > u_endian.h \ > u_queue.c \ > u_queue.h \ > u_string.h \ > diff --git a/src/util/u_atomic.c b/src/util/u_atomic.c > new file mode 100644 > index 000..77ef119 > --- /dev/null > +++ b/src/util/u_atomic.c > @@ -0,0 +1,71 @@ > +/* > + * Copyright © 2017 The Mesa Project The Mesa Project isn't something that can hold copyright. Your name should be here. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > + * IN THE SOFTWARE. 
> + */ > + > +#if defined(MISSING_64BIT_ATOMICS) && defined(HAVE_PTHREAD) > + > +#include > +#include > + > +#if defined(HAVE_FUNC_ATTRIBUTE_WEAK) && !defined(__CYGWIN__) > +#define WEAK __attribute__((weak)) > +#else > +#define WEAK > +#endif > + > +static pthread_mutex_t sync_mutex = PTHREAD_MUTEX_INITIALIZER; > + > +WEAK uint64_t __sync_add_and_fetch_8(uint64_t *ptr, uint64_t val) Let's do BSD-style function declarations, with the qualifiers and return type on their own line. With those two trivial things changed, this is Reviewed-by: Matt Turner Grazvydas, if you have not already, please file a request for a Freedesktop account [1] [2] and let's get you commit access. Jonathan, can you check whether this resolves the bug entirely? Or are there some other __sync functions we need to implement? I see __sync_add_and_fetch_4, etc, in the bug report. [1] https://bugs.freedesktop.org/enter_bug.cgi?product=freedesktop.org=New%20Accounts [2] https://www.freedesktop.org/wiki/AccountRequests/ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] util/u_atomic: provide 64bit atomics where they're missing
There are still some distributions trying to support unfortunate people
with old or exotic CPUs that don't have 64bit atomic operations. When
compiling for such a machine, gcc conveniently inserts a library call to
a helper, but its implementation is missing and we get a linker error.
This allows us to provide our implementation, which is marked weak to
prefer a better implementation, should one exist.

Cc: Matt Turner
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
Signed-off-by: Grazvydas Ignotas
---
 configure.ac              | 12
 src/util/Makefile.sources |  1 +
 src/util/u_atomic.c       | 71 +++
 3 files changed, 84 insertions(+)
 create mode 100644 src/util/u_atomic.c

diff --git a/configure.ac b/configure.ac
index ab9a91e..89b615b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -413,10 +413,22 @@ int main() {
     if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
         DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
     fi
 AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])

+dnl Check if host supports 64bit atomics
+dnl note that lack of support usually results in link (not compile) error
+AC_LINK_IFELSE([AC_LANG_SOURCE([[
+#include <stdint.h>
+uint64_t v;
+int main() {
+    return __sync_add_and_fetch(&v, (uint64_t)1);
+}]])], GCC_64BIT_ATOMICS_SUPPORTED=1)
+if test "x$GCC_64BIT_ATOMICS_SUPPORTED" != x1; then
+    DEFINES="$DEFINES -DMISSING_64BIT_ATOMICS"
+fi
+
 dnl Check for Endianness
 AC_C_BIGENDIAN(
    little_endian=no,
    little_endian=yes,
    little_endian=no,
diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 8ee45d5..e905734 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -41,10 +41,11 @@ MESA_UTIL_FILES := \
    string_to_uint_map.h \
    strndup.h \
    strtod.c \
    strtod.h \
    texcompress_rgtc_tmp.h \
+   u_atomic.c \
    u_atomic.h \
    u_endian.h \
    u_queue.c \
    u_queue.h \
    u_string.h \
diff --git a/src/util/u_atomic.c b/src/util/u_atomic.c
new file mode 100644
index 000..77ef119
--- /dev/null
+++ b/src/util/u_atomic.c
@@ -0,0 +1,71 @@
+/*
+ * Copyright © 2017 The Mesa Project
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#if defined(MISSING_64BIT_ATOMICS) && defined(HAVE_PTHREAD)
+
+#include <stdint.h>
+#include <pthread.h>
+
+#if defined(HAVE_FUNC_ATTRIBUTE_WEAK) && !defined(__CYGWIN__)
+#define WEAK __attribute__((weak))
+#else
+#define WEAK
+#endif
+
+static pthread_mutex_t sync_mutex = PTHREAD_MUTEX_INITIALIZER;
+
+WEAK uint64_t __sync_add_and_fetch_8(uint64_t *ptr, uint64_t val)
+{
+   uint64_t r;
+
+   pthread_mutex_lock(&sync_mutex);
+   *ptr += val;
+   r = *ptr;
+   pthread_mutex_unlock(&sync_mutex);
+
+   return r;
+}
+
+WEAK uint64_t __sync_sub_and_fetch_8(uint64_t *ptr, uint64_t val)
+{
+   uint64_t r;
+
+   pthread_mutex_lock(&sync_mutex);
+   *ptr -= val;
+   r = *ptr;
+   pthread_mutex_unlock(&sync_mutex);
+
+   return r;
+}
+
+WEAK uint64_t __atomic_fetch_add_8(uint64_t *ptr, uint64_t val, int memorder)
+{
+   return __sync_add_and_fetch(ptr, val);
+}
+
+WEAK uint64_t __atomic_fetch_sub_8(uint64_t *ptr, uint64_t val, int memorder)
+{
+   return __sync_sub_and_fetch(ptr, val);
+}
+
+#endif
-- 
2.7.4
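For readers who want to try the mutex-based fallback outside of Mesa, here is a small standalone sketch of the same technique. The function names (fallback_add_and_fetch_u64, fallback_fetch_add_u64) are made up for illustration so they do not collide with the compiler's own helpers. GCC documents __sync_add_and_fetch as returning the new value and __atomic_fetch_add as returning the value that was previously in memory, so the sketch keeps the two conventions as separate helpers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* One global lock serialises every emulated 64-bit operation.  This is
 * simple but slow; it is only meant for CPUs without native support. */
static pthread_mutex_t fallback_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Add val to *ptr and return the NEW value (add-and-fetch convention). */
static uint64_t
fallback_add_and_fetch_u64(uint64_t *ptr, uint64_t val)
{
   uint64_t r;

   pthread_mutex_lock(&fallback_mutex);
   *ptr += val;
   r = *ptr;
   pthread_mutex_unlock(&fallback_mutex);

   return r;
}

/* Add val to *ptr and return the OLD value (fetch-and-add convention). */
static uint64_t
fallback_fetch_add_u64(uint64_t *ptr, uint64_t val)
{
   uint64_t old;

   pthread_mutex_lock(&fallback_mutex);
   old = *ptr;
   *ptr += val;
   pthread_mutex_unlock(&fallback_mutex);

   return old;
}
```

The emulation is only atomic with respect to other callers of these helpers; a plain store to the same variable from another thread would bypass the lock.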
Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses
On Wed, Mar 29, 2017 at 04:51:12PM +0100, Chris Wilson wrote: > On Wed, Mar 29, 2017 at 08:36:36AM -0700, Jason Ekstrand wrote: > >On Wed, Mar 29, 2017 at 1:51 AM, Chris Wilson > ><[1]ch...@chris-wilson.co.uk> wrote: > > > diff --git a/src/intel/vulkan/anv_private.h > > b/src/intel/vulkan/anv_private.h > > > index 27c887c..425e376 100644 > > > --- a/src/intel/vulkan/anv_private.h > > > +++ b/src/intel/vulkan/anv_private.h > > > @@ -299,11 +299,34 @@ struct anv_bo { > > > * writing to them and synchronize uses on other rings (eg if the > > display > > > * server uses the blitter ring). > > > */ > > > - bool is_winsys_bo; > > > + bool is_winsys_bo:1; > > > + > > > + /* Whether or not this BO supports having a 48-bit address. Not > > all > > > + * buffers support arbitrary 48-bit addresses. In particular, we > > need to > > > + * be careful with general and instruction state buffers because > > we set the > > > + * size in STATE_BASE_ADDRESS to 0xf (the maximum) even > > though > > the BO > > > + * is most likely significantly smaller. If we let the kernel > > place it > > > + * anywhere it wants, it will default to placing it as high up > > the > > address > > > + * space as possible, the range specified by STATE_BASE_ADDRESS > > will > > > + * over-flow the 48-bit address range, and the GPU will hang. In > > order to > > > + * avoid this problem, we tell the kernel that the buffer does > > not > > support > > > + * 48-bit addresses, and it places the buffer at a 32-bit > > address. While > > > + * this solution is probably overkill, it is effective. > > > > How about just setting the field to the bo->size? You must know the bo > > already at that point so that you can set the relocation target. > > > >Actually, we don't. We have a pointer to a thing that claims to be a BO > >but the actual GEM handle and size aren't known until execbuf time. > > (Yes, > >that's a bit weird but there are good reasons for it and it's not likely > >to change. 
When we stop doing relocations, there's a separate plan for > >how to handle that.) > > Hmm. I honestly didn't expect that. Since you have the machinery to resolve the relocations after the fact, you could treat the size field as a different type of patching. Just an idea to resolve the placement restriction later. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
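The overflow described in the quoted comment is plain address arithmetic: if the range programmed into STATE_BASE_ADDRESS is larger than the BO and the BO sits near the top of the address space, base plus range runs past the 48-bit limit. A rough sketch of that check, with the 4 GiB range in the usage note being an assumption for illustration rather than a value taken from the driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ADDR_SPACE_48BIT (UINT64_C(1) << 48)

/* True if [base, base + range) stays inside the 48-bit address space.
 * Written as a subtraction so the check itself cannot wrap around. */
static bool
range_fits_48bit(uint64_t base, uint64_t range)
{
   return base < ADDR_SPACE_48BIT && range <= ADDR_SPACE_48BIT - base;
}
```

For example, with an assumed 4 GiB range, a buffer placed 4 KiB below the 48-bit limit fails the check, while the same buffer placed below the 4 GiB mark passes. Patching the real BO size into the range, as suggested in the reply, would make the check pass wherever the BO itself fits.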
[Mesa-dev] [PATCH] glsl: allow glsl_type::sampler_index() with images
Signed-off-by: Samuel Pitoiset
---
 src/compiler/glsl_types.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index 405aa3679a..cf0fe71d1a 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -315,7 +315,7 @@ glsl_type::sampler_index() const
 {
    const glsl_type *const t = (this->is_array()) ? this->fields.array : this;

-   assert(t->is_sampler());
+   assert(t->is_sampler() || t->is_image());

    switch (t->sampler_dimensionality) {
    case GLSL_SAMPLER_DIM_1D:
-- 
2.12.1
[Mesa-dev] [PATCH] st/glsl_to_tgsi: use glsl_type::sampler_index()
Signed-off-by: Samuel Pitoiset--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 68 +- 1 file changed, 2 insertions(+), 66 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 46c97783d8..d70018c8a8 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -3883,39 +3883,7 @@ glsl_to_tgsi_visitor::visit_image_intrinsic(ir_call *ir) inst->sampler_array_size = sampler_array_size; inst->sampler_base = sampler_base; - switch (type->sampler_dimensionality) { - case GLSL_SAMPLER_DIM_1D: - inst->tex_target = (type->sampler_array) - ? TEXTURE_1D_ARRAY_INDEX : TEXTURE_1D_INDEX; - break; - case GLSL_SAMPLER_DIM_2D: - inst->tex_target = (type->sampler_array) - ? TEXTURE_2D_ARRAY_INDEX : TEXTURE_2D_INDEX; - break; - case GLSL_SAMPLER_DIM_3D: - inst->tex_target = TEXTURE_3D_INDEX; - break; - case GLSL_SAMPLER_DIM_CUBE: - inst->tex_target = (type->sampler_array) - ? TEXTURE_CUBE_ARRAY_INDEX : TEXTURE_CUBE_INDEX; - break; - case GLSL_SAMPLER_DIM_RECT: - inst->tex_target = TEXTURE_RECT_INDEX; - break; - case GLSL_SAMPLER_DIM_BUF: - inst->tex_target = TEXTURE_BUFFER_INDEX; - break; - case GLSL_SAMPLER_DIM_EXTERNAL: - inst->tex_target = TEXTURE_EXTERNAL_INDEX; - break; - case GLSL_SAMPLER_DIM_MS: - inst->tex_target = (type->sampler_array) - ? TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX : TEXTURE_2D_MULTISAMPLE_INDEX; - break; - default: - assert(!"Should not get here."); - } - + inst->tex_target = type->sampler_index(); inst->image_format = st_mesa_format_to_pipe_format(st_context(ctx), _mesa_get_shader_image_format(imgvar->data.image_format)); @@ -4425,39 +4393,7 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir) inst->tex_offset_num_offset = i; } - switch (sampler_type->sampler_dimensionality) { - case GLSL_SAMPLER_DIM_1D: - inst->tex_target = (sampler_type->sampler_array) - ? 
TEXTURE_1D_ARRAY_INDEX : TEXTURE_1D_INDEX; - break; - case GLSL_SAMPLER_DIM_2D: - inst->tex_target = (sampler_type->sampler_array) - ? TEXTURE_2D_ARRAY_INDEX : TEXTURE_2D_INDEX; - break; - case GLSL_SAMPLER_DIM_3D: - inst->tex_target = TEXTURE_3D_INDEX; - break; - case GLSL_SAMPLER_DIM_CUBE: - inst->tex_target = (sampler_type->sampler_array) - ? TEXTURE_CUBE_ARRAY_INDEX : TEXTURE_CUBE_INDEX; - break; - case GLSL_SAMPLER_DIM_RECT: - inst->tex_target = TEXTURE_RECT_INDEX; - break; - case GLSL_SAMPLER_DIM_BUF: - inst->tex_target = TEXTURE_BUFFER_INDEX; - break; - case GLSL_SAMPLER_DIM_EXTERNAL: - inst->tex_target = TEXTURE_EXTERNAL_INDEX; - break; - case GLSL_SAMPLER_DIM_MS: - inst->tex_target = (sampler_type->sampler_array) - ? TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX : TEXTURE_2D_MULTISAMPLE_INDEX; - break; - default: - assert(!"Should not get here."); - } - + inst->tex_target = sampler_type->sampler_index(); inst->tex_type = ir->type->base_type; this->result = result_src; -- 2.12.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
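The patch above replaces two identical switch statements with one call to a shared lookup. A minimal C sketch of that idea follows. It assumes simplified stand-in enums: `dim`, `tex_index` and `sampler_index_sketch` are illustrative names, not Mesa's definitions, and the real glsl_type::sampler_index() is a C++ method.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GLSL_SAMPLER_DIM_* (not Mesa's definitions). */
enum dim {
   DIM_1D, DIM_2D, DIM_3D, DIM_CUBE, DIM_RECT, DIM_BUF, DIM_EXTERNAL, DIM_MS
};

/* Hypothetical stand-ins for TEXTURE_*_INDEX. */
enum tex_index {
   TEX_1D, TEX_1D_ARRAY, TEX_2D, TEX_2D_ARRAY, TEX_3D,
   TEX_CUBE, TEX_CUBE_ARRAY, TEX_RECT, TEX_BUFFER, TEX_EXTERNAL,
   TEX_2D_MS, TEX_2D_MS_ARRAY
};

/* One shared mapping from (dimensionality, array flag) to a texture index,
 * so that callers no longer need their own copy of the switch. */
static enum tex_index
sampler_index_sketch(enum dim d, bool is_array)
{
   switch (d) {
   case DIM_1D:       return is_array ? TEX_1D_ARRAY : TEX_1D;
   case DIM_2D:       return is_array ? TEX_2D_ARRAY : TEX_2D;
   case DIM_3D:       return TEX_3D;
   case DIM_CUBE:     return is_array ? TEX_CUBE_ARRAY : TEX_CUBE;
   case DIM_RECT:     return TEX_RECT;
   case DIM_BUF:      return TEX_BUFFER;
   case DIM_EXTERNAL: return TEX_EXTERNAL;
   case DIM_MS:       return is_array ? TEX_2D_MS_ARRAY : TEX_2D_MS;
   }
   assert(!"Should not get here.");
   return TEX_2D;
}
```

Each call site then shrinks to a single assignment, as in the two hunks of the patch.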
Re: [Mesa-dev] Meson mesademos (Was: [RFC libdrm 0/2] Replace the build system with meson)
On 28/03/17 22:37, Dylan Baker wrote:
Quoting Jose Fonseca (2017-03-28 13:45:57)
On 28/03/17 21:32, Dylan Baker wrote:
Quoting Jose Fonseca (2017-03-28 09:19:48)
On 28/03/17 00:12, Dylan Baker wrote:
Quoting Jose Fonseca (2017-03-27 09:58:59)
On 27/03/17 17:42, Dylan Baker wrote:
Quoting Jose Fonseca (2017-03-27 09:31:04)
On 27/03/17 17:24, Dylan Baker wrote:
Quoting Jose Fonseca (2017-03-26 14:53:50)

I've pushed the branch to mesa/demos, so we can all collaborate without wasting time crossporting patches between private branches.

https://cgit.freedesktop.org/mesa/demos/commit/?h=meson

Unfortunately, I couldn't actually go very far until I hit a wall, as you can see in the last commit message. The issue is that Windows has no standard paths for dependencies' includes/libraries (like /usr/include or /usr/lib), nor a standard tool for dependencies (no pkgconfig). But it seems that Meson presumes any unknown dependency can be resolved with pkgconfig.

The question is: how do I tell Meson where the GLEW headers/library for MinGW are supposed to be found? I know one solution might be Meson Wraps. Is that the only way? CMake makes it very easy to do it (via Cache files as explained in my commit message.) Is there a way to achieve the same, perhaps via cross_file properties or something like that?

Jose

I think there are two ways you could solve this:

Wraps are probably the most generically correct method; what I mean by that is that a proper wrap would solve the problem for everyone, on every operating system, forever.

Yeah, that sounded like a good solution, particularly for Windows, where it's so much easier to just build the dependencies as a subproject rather than fetch dependencies from somewhere, since MSVC RT versions have to match and so on.

That said, I took a look at GLEW and it doesn't look like a straightforward project to port to meson, since it uses a huge pile of gnu makefiles for compilation, without any autoconf/cmake/etc. I still might take a swing at it since I want to know how hard it would be to write a wrap file for something like GLEW (and it would probably be a pretty useful project to wrap) where a meson build system is likely never going to go upstream.

BTW, regarding GLEW, some time ago I actually prototyped using GLAD instead of GLEW for mesademos:

https://cgit.freedesktop.org/~jrfonseca/mesademos/log/?h=glad

I find GLAD much nicer than GLEW: it's easier to build, it uses upstream XML files, it supports EGL, and it's easy to bundle. Maybe we could migrate mesademos to GLAD as part of this work instead of trying to get glew "mesonfied".

The other option I think you can use is cross properties[1], which I believe is the closest thing meson has to cmake's cache files. I've pushed a couple of commits, the last one implements the cross properties idea, which gets the build farther, but then it can't find the glut headers, and I don't understand why, since "cc.has_header('GL/glut')" returns true.

I still think that wraps are a better plan, but I'll have to spend some time today working on a glew wrap.

[1] https://github.com/mesonbuild/meson/wiki/Cross-compilation (at the bottom under the heading "Custom Data")

I'm running out of time today, but I'll try to take a look tomorrow.

Jose

I'd had a similar thought, but have you thought of libepoxy? It supports the platforms we want, and already has a meson build system that works for windows.

I have no experience with libepoxy. I know GLAD is really easy to understand, use and integrate. It's completely agnostic to toolkits like GLUT/GLFW/etc., and doesn't try to alias equivalent entrypoints, or anything smart like libepoxy. In particular I don't fully understand what libepoxy's behavior regarding wglMakeCurrent is, and whether that will create problems with GLUT, since GLUT will call wglMakeCurrent.

Jose

Okay, I have libepoxy working for windows. I also got libepoxy working as a subproject, but it took a bit of hacking on their build system (there's some things they're doing that make them non-subproject safe, I'll send patches and work that out with them).

https://github.com/dcbaker/libepoxy.git fix-suproject

Thanks. GLEW is not the only case, though. There's also FREEGLUT. So we can't really avoid the problem of external windows binaries/subprojects.

So I've been thinking, and I suspect it is better if we first get things working with binary GLEW / FREEGLUT projects, then try the glew -> libepoxy in a 2nd step, so there's less to take in to merge meson into master.

Clone that repo into $mesa-demos-root/subprojects and things should just work, or mostly work. I got epoxy compiling, but ran into some issues in the mingw glu header.

Dylan

I'm pretty sure the problem with MinGW glu is the lack of windows.h. We need to do the same as the CMakeLists.txt snippet quoted below. I'm running out of time today, but I'll look into porting this over to meson tomorrow if you don't beat me to it.

Jose

if (WIN32)
[Mesa-dev] [PATCH] i965/fs: Gracefully handle TXS on multisampled textures with no LOD
This can happen for multisampled textures since they are never mipmapped
and textureSize(gsampler2DMS*) does not take an LOD parameter. This
fixes a shader validation error in the new Sascha deferredmultisampling
demo.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391
Cc: "13.0 17.0"
---

We could also easily enough handle this in spirv_to_nir like we do with
GLSL. However, it seems perfectly reasonable that multisampled txs
should allow no LOD in NIR.

 src/intel/compiler/brw_fs_nir.cpp | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index bc1ccfb..60604e1 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4381,9 +4381,12 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr)
    srcs[TEX_LOGICAL_SRC_GRAD_COMPONENTS] = brw_imm_d(lod_components);
 
    if (instr->op == nir_texop_query_levels ||
+       (instr->op == nir_texop_txs &&
+        instr->sampler_dim == GLSL_SAMPLER_DIM_MS) ||
        (instr->op == nir_texop_tex && stage != MESA_SHADER_FRAGMENT)) {
-      /* textureQueryLevels() and texture() are implemented in terms of TXS
-       * and TXL respectively, so we need to pass a valid LOD argument.
+      /* textureQueryLevels(), textureSize(), and texture() are implemented in
+       * terms of TXS and TXL respectively, so we need to pass a valid LOD
+       * argument.
        */
       assert(srcs[TEX_LOGICAL_SRC_LOD].file == BAD_FILE);
       srcs[TEX_LOGICAL_SRC_LOD] = brw_imm_ud(0u);
-- 
2.5.0.400.gff86faf
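The condition this patch extends can be read as a small predicate: which texture operations arrive without an LOD source and therefore need a default LOD of 0. The following is only a sketch of that predicate in plain C. The enums and `needs_default_lod` are hypothetical stand-ins for the nir_texop_*, GLSL_SAMPLER_DIM_* and MESA_SHADER_* values, not the driver's code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NIR/Mesa enums used in the patch. */
enum tex_op   { OP_TEX, OP_TXS, OP_QUERY_LEVELS };
enum tex_dim  { TEX_DIM_2D, TEX_DIM_MS };
enum sh_stage { STAGE_VERTEX, STAGE_FRAGMENT };

/* Returns true when the operation is lowered to a message that requires an
 * LOD argument although the shader did not supply one:
 *  - textureQueryLevels() is implemented in terms of TXS,
 *  - textureSize() on a multisampled texture takes no LOD parameter,
 *  - texture() outside the fragment stage is implemented in terms of TXL. */
static bool
needs_default_lod(enum tex_op op, enum tex_dim dim, enum sh_stage stage)
{
   return op == OP_QUERY_LEVELS ||
          (op == OP_TXS && dim == TEX_DIM_MS) ||
          (op == OP_TEX && stage != STAGE_FRAGMENT);
}
```

In the patch, the caller first asserts that no LOD source was provided and then substitutes an immediate 0.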
Re: [Mesa-dev] [PATCH v3] swr: [rasterizer codegen] Fix windows build
Commit comment should not include “[rasterizer codegen]”, as it doesn’t modify that code. With that fixed, Reviewed-by: Tim Rowley> On Mar 28, 2017, at 4:44 PM, George Kyriazis > wrote: Fix codegen build break that was introduced earlier v2: update rules for gen_knobs.cpp and gen_knobs.h v3: Introduce bldroot and revert generator file changes, making patch simpler. --- src/gallium/drivers/swr/SConscript | 38 +++--- 1 file changed, 31 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/swr/SConscript b/src/gallium/drivers/swr/SConscript index ad16162..18d6c9b 100644 --- a/src/gallium/drivers/swr/SConscript +++ b/src/gallium/drivers/swr/SConscript @@ -47,20 +47,25 @@ if not env['msvc'] : ]) swrroot = '#src/gallium/drivers/swr/' +bldroot = Dir('.').abspath env.CodeGenerate( target = 'rasterizer/codegen/gen_knobs.cpp', script = swrroot + 'rasterizer/codegen/gen_knobs.py', -source = 'rasterizer/codegen/templates/gen_knobs.cpp', -command = python_cmd + ' $SCRIPT --input $SOURCE --output $TARGET --gen_cpp' +source = '', +command = python_cmd + ' $SCRIPT --output $TARGET --gen_cpp' ) +Depends('rasterizer/codegen/gen_knobs.cpp', +swrroot + 'rasterizer/codegen/templates/gen_knobs.cpp') env.CodeGenerate( target = 'rasterizer/codegen/gen_knobs.h', script = swrroot + 'rasterizer/codegen/gen_knobs.py', -source = 'rasterizer/codegen/templates/gen_knobs.cpp', -command = python_cmd + ' $SCRIPT --input $SOURCE --output $TARGET --gen_h' +source = '', +command = python_cmd + ' $SCRIPT --output $TARGET --gen_h' ) +Depends('rasterizer/codegen/gen_knobs.cpp', +swrroot + 'rasterizer/codegen/templates/gen_knobs.cpp') env.CodeGenerate( target = 'rasterizer/jitter/gen_state_llvm.h', @@ -68,20 +73,26 @@ env.CodeGenerate( source = 'rasterizer/core/state.h', command = python_cmd + ' $SCRIPT --input $SOURCE --output $TARGET' ) +Depends('rasterizer/jitter/gen_state_llvm.h', +swrroot + 'rasterizer/codegen/templates/gen_llvm.hpp') env.CodeGenerate( target = 
'rasterizer/jitter/gen_builder.hpp', script = swrroot + 'rasterizer/codegen/gen_llvm_ir_macros.py', source = os.path.join(llvm_includedir, 'llvm/IR/IRBuilder.h'), -command = python_cmd + ' $SCRIPT --input $SOURCE --output rasterizer/jitter --gen_h' +command = python_cmd + ' $SCRIPT --input $SOURCE --output ' + bldroot + '/rasterizer/jitter --gen_h' ) +Depends('rasterizer/jitter/gen_builder.hpp', +swrroot + 'rasterizer/codegen/templates/gen_builder.hpp') env.CodeGenerate( target = 'rasterizer/jitter/gen_builder_x86.hpp', script = swrroot + 'rasterizer/codegen/gen_llvm_ir_macros.py', source = '', -command = python_cmd + ' $SCRIPT --output rasterizer/jitter --gen_x86_h' +command = python_cmd + ' $SCRIPT --output ' + bldroot + '/rasterizer/jitter --gen_x86_h' ) +Depends('rasterizer/jitter/gen_builder.hpp', +swrroot + 'rasterizer/codegen/templates/gen_builder.hpp') env.CodeGenerate( target = './gen_swr_context_llvm.h', @@ -89,6 +100,8 @@ env.CodeGenerate( source = 'swr_context.h', command = python_cmd + ' $SCRIPT --input $SOURCE --output $TARGET' ) +Depends('rasterizer/jitter/gen_state_llvm.h', +swrroot + 'rasterizer/codegen/templates/gen_llvm.hpp') env.CodeGenerate( target = 'rasterizer/archrast/gen_ar_event.hpp', @@ -96,6 +109,8 @@ env.CodeGenerate( source = 'rasterizer/archrast/events.proto', command = python_cmd + ' $SCRIPT --proto $SOURCE --output $TARGET --gen_event_h' ) +Depends('rasterizer/jitter/gen_state_llvm.h', +swrroot + 'rasterizer/codegen/templates/gen_ar_event.hpp') env.CodeGenerate( target = 'rasterizer/archrast/gen_ar_event.cpp', @@ -103,6 +118,8 @@ env.CodeGenerate( source = 'rasterizer/archrast/events.proto', command = python_cmd + ' $SCRIPT --proto $SOURCE --output $TARGET --gen_event_cpp' ) +Depends('rasterizer/jitter/gen_state_llvm.h', +swrroot + 'rasterizer/codegen/templates/gen_ar_event.cpp') env.CodeGenerate( target = 'rasterizer/archrast/gen_ar_eventhandler.hpp', @@ -110,6 +127,8 @@ env.CodeGenerate( source = 
'rasterizer/archrast/events.proto', command = python_cmd + ' $SCRIPT --proto $SOURCE --output $TARGET --gen_eventhandler_h' ) +Depends('rasterizer/jitter/gen_state_llvm.h', +swrroot + 'rasterizer/codegen/templates/gen_ar_eventhandler.hpp') env.CodeGenerate( target = 'rasterizer/archrast/gen_ar_eventhandlerfile.hpp', @@ -117,6 +136,8 @@ env.CodeGenerate( source = 'rasterizer/archrast/events.proto', command = python_cmd + ' $SCRIPT --proto $SOURCE --output $TARGET --gen_eventhandlerfile_h' ) +Depends('rasterizer/jitter/gen_state_llvm.h', +swrroot + 'rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp') # 5
Re: [Mesa-dev] [PATCH 0/9] RadeonSI cleanups
Patches 1-4 & 7 are:

Reviewed-by: Samuel Pitoiset

On 03/29/2017 07:58 PM, Marek Olšák wrote:
> General cleanups and cleanups in preparation for threaded gallium.
>
> Please review.
>
> Thanks,
> Marek
Re: [Mesa-dev] [PATCH 0/9] RadeonSI cleanups
This series is

Tested-by: Edmondo Tommasina

On Wed, Mar 29, 2017 at 7:58 PM, Marek Olšák wrote:
> General cleanups and cleanups in preparation for threaded gallium.
>
> Please review.
>
> Thanks,
> Marek
[Mesa-dev] [PATCH 4/4] radv: Use the guard band.
Signed-off-by: Bas Nieuwenhuizen--- src/amd/vulkan/radv_cmd_buffer.c | 6 ++- src/amd/vulkan/radv_private.h| 3 +- src/amd/vulkan/si_cmd_buffer.c | 94 +++- 3 files changed, 90 insertions(+), 13 deletions(-) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c index 09ba7cf4e18..e50245251fb 100644 --- a/src/amd/vulkan/radv_cmd_buffer.c +++ b/src/amd/vulkan/radv_cmd_buffer.c @@ -750,7 +750,9 @@ radv_emit_scissor(struct radv_cmd_buffer *cmd_buffer) { uint32_t count = cmd_buffer->state.dynamic.scissor.count; si_write_scissors(cmd_buffer->cs, 0, count, - cmd_buffer->state.dynamic.scissor.scissors); + cmd_buffer->state.dynamic.scissor.scissors, + cmd_buffer->state.dynamic.viewport.viewports, + cmd_buffer->state.emitted_pipeline->graphics.can_use_guardband); radeon_set_context_reg(cmd_buffer->cs, R_028A48_PA_SC_MODE_CNTL_0, cmd_buffer->state.pipeline->graphics.ms.pa_sc_mode_cntl_0 | S_028A48_VPORT_SCISSOR_ENABLE(count ? 1 : 0)); } @@ -1281,7 +1283,7 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer, if (cmd_buffer->state.dirty & (RADV_CMD_DIRTY_DYNAMIC_VIEWPORT)) radv_emit_viewport(cmd_buffer); - if (cmd_buffer->state.dirty & (RADV_CMD_DIRTY_DYNAMIC_SCISSOR)) + if (cmd_buffer->state.dirty & (RADV_CMD_DIRTY_DYNAMIC_SCISSOR | RADV_CMD_DIRTY_DYNAMIC_VIEWPORT)) radv_emit_scissor(cmd_buffer); ia_multi_vgt_param = si_get_ia_multi_vgt_param(cmd_buffer, instanced_draw, indirect_draw, draw_vertex_count); diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h index 410e63ba413..e8f14dcfe02 100644 --- a/src/amd/vulkan/radv_private.h +++ b/src/amd/vulkan/radv_private.h @@ -758,7 +758,8 @@ void cik_create_gfx_config(struct radv_device *device); void si_write_viewport(struct radeon_winsys_cs *cs, int first_vp, int count, const VkViewport *viewports); void si_write_scissors(struct radeon_winsys_cs *cs, int first, - int count, const VkRect2D *scissors); + int count, const VkRect2D *scissors, + const VkViewport *viewports, bool 
can_use_guardband); uint32_t si_get_ia_multi_vgt_param(struct radv_cmd_buffer *cmd_buffer, bool instanced_draw, bool indirect_draw, uint32_t draw_vertex_count); diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c index 55c82a9a685..66a4681dad3 100644 --- a/src/amd/vulkan/si_cmd_buffer.c +++ b/src/amd/vulkan/si_cmd_buffer.c @@ -361,11 +361,6 @@ si_emit_config(struct radv_physical_device *physical_device, radeon_set_context_reg(cs, R_028234_PA_SU_HARDWARE_SCREEN_OFFSET, 0); radeon_set_context_reg(cs, R_028820_PA_CL_NANINF_CNTL, 0); - radeon_set_context_reg(cs, R_028BE8_PA_CL_GB_VERT_CLIP_ADJ, fui(1.0)); - radeon_set_context_reg(cs, R_028BEC_PA_CL_GB_VERT_DISC_ADJ, fui(1.0)); - radeon_set_context_reg(cs, R_028BF0_PA_CL_GB_HORZ_CLIP_ADJ, fui(1.0)); - radeon_set_context_reg(cs, R_028BF4_PA_CL_GB_HORZ_DISC_ADJ, fui(1.0)); - radeon_set_context_reg(cs, R_028AC0_DB_SRESULTS_COMPARE_STATE0, 0x0); radeon_set_context_reg(cs, R_028AC4_DB_SRESULTS_COMPARE_STATE1, 0x0); radeon_set_context_reg(cs, R_028AC8_DB_PRELOAD_CONTROL, 0x0); @@ -500,6 +495,22 @@ get_viewport_xform(const VkViewport *viewport, translate[2] = n; } +static void +get_viewport_xform_scissor(const VkRect2D *scissor, + float scale[2], float translate[2]) +{ + float x = scissor->offset.x; + float y = scissor->offset.y; + float half_width = 0.5f * scissor->extent.width; + float half_height = 0.5f * scissor->extent.height; + + scale[0] = half_width; + translate[0] = half_width + x; + scale[1] = half_height; + translate[1] = half_height + y; + +} + void si_write_viewport(struct radeon_winsys_cs *cs, int first_vp, int count, const VkViewport *viewports) @@ -533,21 +544,84 @@ si_write_viewport(struct radeon_winsys_cs *cs, int first_vp, } } +static VkRect2D si_scissor_from_viewport(const VkViewport *viewport) +{ + float scale[3], translate[3]; + VkRect2D rect; + + get_viewport_xform(viewport, scale, translate); + + rect.offset.x = translate[0] - abs(scale[0]); + rect.offset.y = translate[1] - 
abs(scale[1]); + rect.extent.width = ceilf(translate[0] + abs(scale[0])) - rect.offset.x; + rect.extent.height = ceilf(translate[1] + abs(scale[1])) - rect.offset.y; + + return rect; +} + +static VkRect2D si_intersect_scissor(const VkRect2D *a, const VkRect2D *b) { + VkRect2D ret; + ret.offset.x =
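The message above is cut off inside si_intersect_scissor(), so the patch's own body is not reproduced here. As a general illustration only, intersecting two axis-aligned rectangles can be sketched as below. `struct rect2d` and `rect_intersect` are hypothetical names that merely resemble VkRect2D; this is not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical rectangle type, loosely modelled on VkRect2D. */
struct rect2d {
   int32_t x, y;            /* top-left corner */
   uint32_t width, height;  /* extent */
};

static int32_t max_i32(int32_t a, int32_t b) { return a > b ? a : b; }
static int32_t min_i32(int32_t a, int32_t b) { return a < b ? a : b; }

/* Intersection of two rectangles; an empty result has zero width/height. */
static struct rect2d
rect_intersect(struct rect2d a, struct rect2d b)
{
   struct rect2d r;
   int32_t x0 = max_i32(a.x, b.x);
   int32_t y0 = max_i32(a.y, b.y);
   int32_t x1 = min_i32(a.x + (int32_t)a.width,  b.x + (int32_t)b.width);
   int32_t y1 = min_i32(a.y + (int32_t)a.height, b.y + (int32_t)b.height);

   r.x = x0;
   r.y = y0;
   r.width  = x1 > x0 ? (uint32_t)(x1 - x0) : 0;
   r.height = y1 > y0 ? (uint32_t)(y1 - y0) : 0;
   return r;
}
```

Clamping the extent to zero keeps disjoint inputs from producing a wrapped-around unsigned size.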
[Mesa-dev] [PATCH 1/4] radv: Set proper viewport & scissor for meta draws.
Signed-off-by: Bas Nieuwenhuizen--- src/amd/vulkan/radv_meta_blit.c | 53 -- src/amd/vulkan/radv_meta_blit2d.c | 52 +++-- src/amd/vulkan/radv_meta_clear.c | 54 +-- src/amd/vulkan/radv_meta_decompress.c | 39 +++-- src/amd/vulkan/radv_meta_fast_clear.c | 52 + src/amd/vulkan/radv_meta_resolve.c| 39 +++-- 6 files changed, 214 insertions(+), 75 deletions(-) diff --git a/src/amd/vulkan/radv_meta_blit.c b/src/amd/vulkan/radv_meta_blit.c index 9d4d3f02555..228aefaf4b6 100644 --- a/src/amd/vulkan/radv_meta_blit.c +++ b/src/amd/vulkan/radv_meta_blit.c @@ -246,8 +246,8 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer, unsigned vb_size = 3 * sizeof(*vb_data); vb_data[0] = (struct blit_vb_data) { .pos = { - dest_offset_0.x, - dest_offset_0.y, + -1.0, + -1.0, }, .tex_coord = { (float)src_offset_0.x / (float)src_iview->extent.width, @@ -258,8 +258,8 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer, vb_data[1] = (struct blit_vb_data) { .pos = { - dest_offset_0.x, - dest_offset_1.y, + -1.0, + 1.0, }, .tex_coord = { (float)src_offset_0.x / (float)src_iview->extent.width, @@ -270,8 +270,8 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer, vb_data[2] = (struct blit_vb_data) { .pos = { - dest_offset_1.x, - dest_offset_0.y, + 1.0, + -1.0, }, .tex_coord = { (float)src_offset_1.x / (float)src_iview->extent.width, @@ -444,6 +444,23 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer, device->meta_state.blit.pipeline_layout, 0, 1, , 0, NULL); + radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkViewport) { + .x = dest_offset_0.x, + .y = dest_offset_0.y, + .width = dest_offset_1.x - dest_offset_0.x, + .height = dest_offset_1.y - dest_offset_0.y, + .minDepth = 0.0f, + .maxDepth = 1.0f + }); + + radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkRect2D) { + .offset = (VkOffset2D) { MIN2(dest_offset_0.x, dest_offset_1.x), MIN2(dest_offset_0.y, dest_offset_1.y) }, + .extent = (VkExtent2D) { + abs(dest_offset_1.x - dest_offset_0.x), + 
abs(dest_offset_1.y - dest_offset_0.y) + }, + }); + radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0); radv_CmdEndRenderPass(radv_cmd_buffer_to_handle(cmd_buffer)); @@ -813,8 +830,8 @@ radv_device_init_meta_blit_color(struct radv_device *device, }, .pViewportState = &(VkPipelineViewportStateCreateInfo) { .sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO, - .viewportCount = 0, - .scissorCount = 0, + .viewportCount = 1, + .scissorCount = 1, }, .pRasterizationState = &(VkPipelineRasterizationStateCreateInfo) { .sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO, @@ -842,8 +859,10 @@ radv_device_init_meta_blit_color(struct radv_device *device, }, .pDynamicState = &(VkPipelineDynamicStateCreateInfo) { .sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO, - .dynamicStateCount = 2, + .dynamicStateCount = 4, .pDynamicStates = (VkDynamicState[]) { + VK_DYNAMIC_STATE_VIEWPORT, + VK_DYNAMIC_STATE_SCISSOR, VK_DYNAMIC_STATE_LINE_WIDTH, VK_DYNAMIC_STATE_BLEND_CONSTANTS, }, @@ -990,8 +1009,8 @@ radv_device_init_meta_blit_depth(struct radv_device *device, }, .pViewportState = &(VkPipelineViewportStateCreateInfo) { .sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO, - .viewportCount = 0, -
[Mesa-dev] [PATCH 3/4] radv: Prepare for not using the guard band for lines & points.
Vulkan clipping is defined in terms of vertices, while scissor-based
clipping happens on pixels. There is a difference for points and lines,
as a vertex can be outside the viewport while some pixels are inside.
On Vulkan those pixels shouldn't be drawn, while they would be with the
guard band.

Signed-off-by: Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_cmd_buffer.c |  5 +++++
 src/amd/vulkan/radv_pipeline.c   | 26 ++++++++++++++++++++++++++
 src/amd/vulkan/radv_private.h    |  1 +
 3 files changed, 32 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index e6f098c208d..09ba7cf4e18 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -730,6 +730,11 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer *cmd_buffer,
         radeon_set_context_reg(cmd_buffer->cs, R_0286E8_SPI_TMPRING_SIZE,
                                S_0286E8_WAVES(pipeline->max_waves) |
                                S_0286E8_WAVESIZE(pipeline->scratch_bytes_per_wave >> 10));
+
+        if (!cmd_buffer->state.emitted_pipeline ||
+            cmd_buffer->state.emitted_pipeline->graphics.can_use_guardband !=
+             pipeline->graphics.can_use_guardband)
+                cmd_buffer->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_SCISSOR;
         cmd_buffer->state.emitted_pipeline = pipeline;
 }
 
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 07020e8c387..a564085c884 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1214,6 +1214,28 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
         ms->pa_sc_aa_mask[1] = mask | (mask << 16);
 }
 
+static bool
+radv_prim_can_use_guardband(enum VkPrimitiveTopology topology)
+{
+        switch (topology) {
+        case VK_PRIMITIVE_TOPOLOGY_POINT_LIST:
+        case VK_PRIMITIVE_TOPOLOGY_LINE_LIST:
+        case VK_PRIMITIVE_TOPOLOGY_LINE_STRIP:
+        case VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY:
+        case VK_PRIMITIVE_TOPOLOGY_LINE_STRIP_WITH_ADJACENCY:
+                return false;
+        case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST:
+        case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP:
+        case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN:
+        case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY:
+        case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY:
+        case VK_PRIMITIVE_TOPOLOGY_PATCH_LIST:
+                return true;
+        default:
+                unreachable("unhandled primitive type");
+        }
+}
+
 static uint32_t
 si_translate_prim(enum VkPrimitiveTopology topology)
 {
@@ -1714,14 +1736,18 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
         radv_pipeline_init_raster_state(pipeline, pCreateInfo);
         radv_pipeline_init_multisample_state(pipeline, pCreateInfo);
         pipeline->graphics.prim = si_translate_prim(pCreateInfo->pInputAssemblyState->topology);
+        pipeline->graphics.can_use_guardband = radv_prim_can_use_guardband(pCreateInfo->pInputAssemblyState->topology);
+
         if (radv_pipeline_has_gs(pipeline)) {
                 pipeline->graphics.gs_out = si_conv_gl_prim_to_gs_out(pipeline->shaders[MESA_SHADER_GEOMETRY]->info.gs.output_prim);
+                pipeline->graphics.can_use_guardband = pipeline->graphics.gs_out == V_028A6C_OUTPRIM_TYPE_TRISTRIP;
         } else {
                 pipeline->graphics.gs_out = si_conv_prim_to_gs_out(pCreateInfo->pInputAssemblyState->topology);
         }
         if (extra && extra->use_rectlist) {
                 pipeline->graphics.prim = V_008958_DI_PT_RECTLIST;
                 pipeline->graphics.gs_out = V_028A6C_OUTPRIM_TYPE_TRISTRIP;
+                pipeline->graphics.can_use_guardband = true;
         }
         pipeline->graphics.prim_restart_enable = !!pCreateInfo->pInputAssemblyState->primitiveRestartEnable;
         /* prim vertex count will need TESS changes */
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 31e08287c9c..410e63ba413 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -968,6 +968,7 @@ struct radv_pipeline {
                         uint32_t pa_cl_vs_out_cntl;
                         uint32_t vgt_shader_stages_en;
                         struct radv_prim_vertex_count prim_vertex_count;
+                        bool can_use_guardband;
                 } graphics;
         };
 
-- 
2.12.1
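The vertex-versus-pixel distinction in the commit message can be illustrated with a one-dimensional toy model. This is a sketch under simplifying assumptions, not driver code: `struct span`, `point_survives_vertex_clip` and `point_touches_scissor` are hypothetical names, and a wide point is modelled as the interval it covers around its centre.

```c
#include <stdbool.h>

/* Hypothetical 1D extent, e.g. the horizontal range of a viewport. */
struct span {
   float min, max;
};

/* Vertex-based clipping: the point is kept only if its centre (the vertex)
 * lies inside the viewport. */
static bool
point_survives_vertex_clip(float center, struct span viewport)
{
   return center >= viewport.min && center <= viewport.max;
}

/* Pixel-based scissoring: the point is (partly) drawn if any of the pixels
 * it covers fall inside the scissor, wherever the centre is. */
static bool
point_touches_scissor(float center, float size, struct span scissor)
{
   float lo = center - 0.5f * size;
   float hi = center + 0.5f * size;
   return hi > scissor.min && lo < scissor.max;
}
```

For a point of size 4 centred at -1 and a viewport spanning 0 to 100, the first test rejects the point while the second still finds covered pixels. That is the case in which relying on the guard band plus scissor would draw pixels that, per the commit message, should not appear.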
[Mesa-dev] [PATCH 2/4] radv: Drop the default viewport when 0 viewports are given.
Signed-off-by: Bas Nieuwenhuizen
---
 src/amd/vulkan/si_cmd_buffer.c | 19 ++-----------------
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 6e50f64a29a..55c82a9a685 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -506,21 +506,7 @@ si_write_viewport(struct radeon_winsys_cs *cs, int first_vp,
 {
         int i;
 
-        if (count == 0) {
-                radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE, 6);
-                radeon_emit(cs, fui(1.0));
-                radeon_emit(cs, fui(0.0));
-                radeon_emit(cs, fui(1.0));
-                radeon_emit(cs, fui(0.0));
-                radeon_emit(cs, fui(1.0));
-                radeon_emit(cs, fui(0.0));
-
-                radeon_set_context_reg_seq(cs, R_0282D0_PA_SC_VPORT_ZMIN_0, 2);
-                radeon_emit(cs, fui(0.0));
-                radeon_emit(cs, fui(1.0));
-
-                return;
-        }
+        assert(count);
 
         radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE +
                                    first_vp * 4 * 6, count * 6);
@@ -552,8 +538,7 @@ si_write_scissors(struct radeon_winsys_cs *cs, int first,
                   int count, const VkRect2D *scissors)
 {
         int i;
-        if (count == 0)
-                return;
+        assert(count);
 
         radeon_set_context_reg_seq(cs, R_028250_PA_SC_VPORT_SCISSOR_0_TL + first * 4 * 2, count * 2);
         for (i = 0; i < count; i++) {
-- 
2.12.1
[Mesa-dev] [Bug 100259] [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'
https://bugs.freedesktop.org/show_bug.cgi?id=100259

ovarieg...@yahoo.com changed:

           What    |Removed    |Added
         Resolution|---        |FIXED
             Status|REOPENED   |RESOLVED

--- Comment #11 from ovarieg...@yahoo.com ---
It turns out this was all my fault and it was a bug in my
pkgconf.SlackBuild. I was lacking /usr/lib64 as a system libdir...

Sorry for the noise.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] anv/cmd_buffer: fix host memory leak
Reviewed-by: Jason Ekstrand

And pushed.

On Wed, Mar 29, 2017 at 12:14 PM, wrote:
> From: Craig Stout
>
> push_constants must be free'd.
>
> https://bugs.freedesktop.org/show_bug.cgi?id=100452
> ---
>  src/intel/vulkan/anv_cmd_buffer.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
> index 909bee2..c65eba2 100644
> --- a/src/intel/vulkan/anv_cmd_buffer.c
> +++ b/src/intel/vulkan/anv_cmd_buffer.c
> @@ -120,7 +120,12 @@ anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer)
>     cmd_buffer->batch.status = VK_SUCCESS;
>
>     memset(&state->descriptors, 0, sizeof(state->descriptors));
> -   memset(&state->push_constants, 0, sizeof(state->push_constants));
> +   for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
> +      if (state->push_constants[i] != NULL) {
> +         vk_free(&cmd_buffer->pool->alloc, state->push_constants[i]);
> +         state->push_constants[i] = NULL;
> +      }
> +   }
>     memset(state->binding_tables, 0, sizeof(state->binding_tables));
>     memset(state->samplers, 0, sizeof(state->samplers));
>
> @@ -193,6 +198,9 @@ static VkResult anv_create_cmd_buffer(
>
>     cmd_buffer->batch.status = VK_SUCCESS;
>
> +   for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
> +      cmd_buffer->state.push_constants[i] = NULL;
> +   }
>     cmd_buffer->_loader_data.loaderMagic = ICD_LOADER_MAGIC;
>     cmd_buffer->device = device;
>     cmd_buffer->pool = pool;
> --
> 2.7.4
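The leak fixed here follows a common pattern: zeroing an array of owning pointers with memset() discards the only references to the allocations. A minimal standalone C sketch of the corrected pattern follows. It uses plain malloc()/free() in place of the driver's allocator, and `cmd_state`, `NUM_STAGES`, `cmd_state_init` and `cmd_state_reset` are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

#define NUM_STAGES 6 /* hypothetical stand-in for MESA_SHADER_STAGES */

/* Hypothetical state holding one lazily allocated block per shader stage. */
struct cmd_state {
   void *push_constants[NUM_STAGES];
};

/* At creation time every slot starts out as NULL, so that a later reset can
 * tell allocated slots from unused ones. */
static void
cmd_state_init(struct cmd_state *state)
{
   for (size_t i = 0; i < NUM_STAGES; i++)
      state->push_constants[i] = NULL;
}

/* On reset, free each allocation and then clear the pointer. Calling this
 * twice in a row is harmless because freed slots are NULL afterwards. */
static void
cmd_state_reset(struct cmd_state *state)
{
   for (size_t i = 0; i < NUM_STAGES; i++) {
      if (state->push_constants[i] != NULL) {
         free(state->push_constants[i]);
         state->push_constants[i] = NULL;
      }
   }
}
```

The explicit NULL initialisation mirrors the second hunk of the patch: without it, the first reset could pass an indeterminate pointer to the allocator.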
Re: [Mesa-dev] [PATCH] gallium: remove support for predicates from TGSI
On 29/03/17 19:02, Roland Scheidegger wrote: [resend with snipped bits as it's too big] A couple comments inline. [snip] --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c @@ -746,39 +746,30 @@ static void lp_exec_default(struct lp_exec_mask *mask, } /* stores val into an address pointed to by dst_ptr. * mask->exec_mask is used to figure out which bits of val * should be stored into the address * (0 means don't store this bit, 1 means do store). */ static void lp_exec_mask_store(struct lp_exec_mask *mask, struct lp_build_context *bld_store, - LLVMValueRef pred, LLVMValueRef val, LLVMValueRef dst_ptr) { LLVMBuilderRef builder = mask->bld->gallivm->builder; + LLVMValueRef pred = mask->has_mask ? mask->exec_mask : NULL; Calling this "pred" now seems to be somewhat of a misnomer (wasn't all that great before because it then included exec_mask but it's worse now). assert(lp_check_value(bld_store->type, val)); assert(LLVMGetTypeKind(LLVMTypeOf(dst_ptr)) == LLVMPointerTypeKind); assert(LLVMGetElementType(LLVMTypeOf(dst_ptr)) == LLVMTypeOf(val)); - /* Mix the predicate and execution mask */ - if (mask->has_mask) { - if (pred) { - pred = LLVMBuildAnd(builder, pred, mask->exec_mask, ""); - } else { - pred = mask->exec_mask; - } - } - if (pred) { LLVMValueRef res, dst; dst = LLVMBuildLoad(builder, dst_ptr, ""); res = lp_build_select(bld_store, pred, val, dst); LLVMBuildStore(builder, res, dst_ptr); } else LLVMBuildStore(builder, val, dst_ptr); } @@ -1029,36 +1020,26 @@ build_gather(struct lp_build_tgsi_context *bld_base, /** * Scatter/store vector. 
*/ static void emit_mask_scatter(struct lp_build_tgsi_soa_context *bld, LLVMValueRef base_ptr, LLVMValueRef indexes, LLVMValueRef values, - struct lp_exec_mask *mask, - LLVMValueRef pred) + struct lp_exec_mask *mask) { struct gallivm_state *gallivm = bld->bld_base.base.gallivm; LLVMBuilderRef builder = gallivm->builder; unsigned i; - - /* Mix the predicate and execution mask */ - if (mask->has_mask) { - if (pred) { - pred = LLVMBuildAnd(builder, pred, mask->exec_mask, ""); - } - else { - pred = mask->exec_mask; - } - } + LLVMValueRef pred = mask->has_mask ? mask->exec_mask : NULL; same here. diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index 6a3fb98..87d2d92 100644 --- a/src/gallium/include/pipe/p_shader_tokens.h +++ b/src/gallium/include/pipe/p_shader_tokens.h @@ -62,21 +62,20 @@ struct tgsi_token enum tgsi_file_type { TGSI_FILE_NULL, TGSI_FILE_CONSTANT, TGSI_FILE_INPUT, TGSI_FILE_OUTPUT, TGSI_FILE_TEMPORARY, TGSI_FILE_SAMPLER, TGSI_FILE_ADDRESS, TGSI_FILE_IMMEDIATE, - TGSI_FILE_PREDICATE, TGSI_FILE_SYSTEM_VALUE, TGSI_FILE_IMAGE, TGSI_FILE_SAMPLER_VIEW, TGSI_FILE_BUFFER, TGSI_FILE_MEMORY, TGSI_FILE_COUNT, /**< how many TGSI_FILE_ types */ }; #define TGSI_WRITEMASK_NONE 0x00 @@ -609,34 +608,31 @@ struct tgsi_property_data { /** * Opcode is the operation code to execute. A given operation defines the * semantics how the source registers (if any) are interpreted and what is * written to the destination registers (if any) as a result of execution. * * NumDstRegs and NumSrcRegs is the number of destination and source registers, * respectively. For a given operation code, those numbers are fixed and are * present here only for convenience. * - * If Predicate is TRUE, tgsi_instruction_predicate token immediately follows. - * * Saturate controls how are final results in destination registers modified. 
*/ struct tgsi_instruction { unsigned Type : 4; /* TGSI_TOKEN_TYPE_INSTRUCTION */ unsigned NrTokens : 8; /* UINT */ unsigned Opcode : 8; /* TGSI_OPCODE_ */ unsigned Saturate : 1; /* BOOL */ unsigned NumDstRegs : 2; /* UINT */ unsigned NumSrcRegs : 4; /* UINT */ - unsigned Predicate : 1; /* BOOL */ unsigned Label : 1; unsigned Texture: 1; unsigned Memory : 1; unsigned Padding: 1; The Padding doesn't match. So, we still have code which uses this - however this code is only used for some testing, otherwise we translate this d3d9 stuff away like everybody else. Maybe it's time to ditch this stuff then - clearly no other drivers are ever going to support it and apis have moved away from it. Jose, any opinion on that? Yes, I agree. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
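A note on the lp_exec_mask_store hunk quoted above: with the predicate gone, the store reduces to a per-lane select against the execution mask (load the old value, pick new or old per lane, store the result). The following is only a scalar sketch of that idea, with plain arrays standing in for LLVM vectors. The function name masked_store and its parameters are made up for illustration and are not part of gallivm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a masked vector store (not gallivm code).
 * mask == NULL models "no execution mask": every lane is written.
 * Otherwise a lane is written only where its mask entry is non-zero,
 * which is what select(mask, val, old_dst) followed by a store does. */
static void
masked_store(const uint32_t *mask, const float *val, float *dst, size_t lanes)
{
   for (size_t i = 0; i < lanes; i++) {
      if (!mask || mask[i])
         dst[i] = val[i];   /* lane active: take the new value */
      /* lane inactive: dst[i] keeps its previous contents */
   }
}
```

Passing NULL for the mask corresponds to the plain LLVMBuildStore path in the patch. A non-NULL mask corresponds to the load/select/store path.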
Re: [Mesa-dev] [RFC 3/3] mesa: expose KHR_no_error for GL
On 03/29/2017 11:01 PM, Timothy Arceri wrote: On 30/03/17 06:53, Marek Olšák wrote: The series looks good to me except the "==" -> "&" in patch 2. The patches have no effect without the GLX extension, right? Correct. I was partly sending this out to see if anyone knew what was going on since Nvidia exposes this on their driver but I couldn't find the GLX extension anywhere. Anyway in the mean time we could add an environment variable to enable it. And a driconf option also? Marek On Tue, Mar 28, 2017 at 6:35 AM, Timothy Arceriwrote: There ES is no support for now as this requires EGL_KHR_create_context_no_error to be implemented. --- src/mesa/main/extensions_table.h | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h index ec71791..4439731 100644 --- a/src/mesa/main/extensions_table.h +++ b/src/mesa/main/extensions_table.h @@ -294,20 +294,21 @@ EXT(IBM_texture_mirrored_repeat , dummy_true EXT(INGR_blend_func_separate, EXT_blend_func_separate, GLL, x , x , x , 1999) EXT(INTEL_conservative_rasterization, INTEL_conservative_rasterization , x , GLC, x , 31, 2013) EXT(INTEL_performance_query , INTEL_performance_query, GLL, GLC, x , ES2, 2013) EXT(KHR_blend_equation_advanced , KHR_blend_equation_advanced, GLL, GLC, x , ES2, 2014) EXT(KHR_blend_equation_advanced_coherent, KHR_blend_equation_advanced_coherent , GLL, GLC, x , ES2, 2014) EXT(KHR_context_flush_control , dummy_true , GLL, GLC, x , ES2, 2014) EXT(KHR_debug , dummy_true , GLL, GLC, 11, ES2, 2012) +EXT(KHR_no_error, dummy_true , GLL, GLC, x , x , 2015) EXT(KHR_robust_buffer_access_behavior , ARB_robust_buffer_access_behavior , GLL, GLC, x , ES2, 2014) EXT(KHR_robustness , KHR_robustness , GLL, GLC, x , ES2, 2012) EXT(KHR_texture_compression_astc_hdr, KHR_texture_compression_astc_hdr , GLL, GLC, x , ES2, 2012) EXT(KHR_texture_compression_astc_ldr, KHR_texture_compression_astc_ldr , GLL, GLC, x , ES2, 2012) EXT(KHR_texture_compression_astc_sliced_3d 
, KHR_texture_compression_astc_sliced_3d , GLL, GLC, x , ES2, 2015) EXT(MESA_pack_invert, MESA_pack_invert , GLL, GLC, x , x , 2002) EXT(MESA_shader_integer_functions , MESA_shader_integer_functions , GLL, GLC, x , 30, 2016) EXT(MESA_texture_signed_rgba, EXT_texture_snorm , GLL, GLC, x , x , 2009) EXT(MESA_window_pos , dummy_true , GLL, x , x , x , 2000) -- 2.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 3/3] mesa: expose KHR_no_error for GL
On 30/03/17 06:53, Marek Olšák wrote: The series looks good to me except the "==" -> "&" in patch 2. The patches have no effect without the GLX extension, right? Correct. I was partly sending this out to see if anyone knew what was going on since Nvidia exposes this on their driver but I couldn't find the GLX extension anywhere. Anyway in the mean time we could add an environment variable to enable it. Marek On Tue, Mar 28, 2017 at 6:35 AM, Timothy Arceriwrote: There ES is no support for now as this requires EGL_KHR_create_context_no_error to be implemented. --- src/mesa/main/extensions_table.h | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h index ec71791..4439731 100644 --- a/src/mesa/main/extensions_table.h +++ b/src/mesa/main/extensions_table.h @@ -294,20 +294,21 @@ EXT(IBM_texture_mirrored_repeat , dummy_true EXT(INGR_blend_func_separate, EXT_blend_func_separate , GLL, x , x , x , 1999) EXT(INTEL_conservative_rasterization, INTEL_conservative_rasterization , x , GLC, x , 31, 2013) EXT(INTEL_performance_query , INTEL_performance_query , GLL, GLC, x , ES2, 2013) EXT(KHR_blend_equation_advanced , KHR_blend_equation_advanced , GLL, GLC, x , ES2, 2014) EXT(KHR_blend_equation_advanced_coherent, KHR_blend_equation_advanced_coherent , GLL, GLC, x , ES2, 2014) EXT(KHR_context_flush_control , dummy_true , GLL, GLC, x , ES2, 2014) EXT(KHR_debug , dummy_true , GLL, GLC, 11, ES2, 2012) +EXT(KHR_no_error, dummy_true , GLL, GLC, x , x , 2015) EXT(KHR_robust_buffer_access_behavior , ARB_robust_buffer_access_behavior , GLL, GLC, x , ES2, 2014) EXT(KHR_robustness , KHR_robustness , GLL, GLC, x , ES2, 2012) EXT(KHR_texture_compression_astc_hdr, KHR_texture_compression_astc_hdr , GLL, GLC, x , ES2, 2012) EXT(KHR_texture_compression_astc_ldr, KHR_texture_compression_astc_ldr , GLL, GLC, x , ES2, 2012) EXT(KHR_texture_compression_astc_sliced_3d , KHR_texture_compression_astc_sliced_3d , GLL, GLC, x , ES2, 2015) 
EXT(MESA_pack_invert, MESA_pack_invert , GLL, GLC, x , x , 2002) EXT(MESA_shader_integer_functions , MESA_shader_integer_functions , GLL, GLC, x , 30, 2016) EXT(MESA_texture_signed_rgba, EXT_texture_snorm , GLL, GLC, x , x , 2009) EXT(MESA_window_pos , dummy_true , GLL, x , x , x , 2000) -- 2.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] configure.ac: require libdrm_amdgpu 2.4.76 for Vega
Reviewed-by: Samuel Pitoiset

On 03/29/2017 08:23 PM, Marek Olšák wrote:
From: Marek Olšák
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index ab9a91e..70885fb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -67,21 +67,21 @@ OPENCL_VERSION=1
 AC_SUBST([OPENCL_VERSION])

 # The idea is that libdrm is distributed as one cohesive package, even
 # though it is composed of multiple libraries. However some drivers
 # may have different version requirements than others. This list
 # codifies which drivers need which version of libdrm. Any libdrm
 # version dependencies in non-driver-specific code should be reflected
 # in the first entry.
 LIBDRM_REQUIRED=2.4.75
 LIBDRM_RADEON_REQUIRED=2.4.71
-LIBDRM_AMDGPU_REQUIRED=2.4.63
+LIBDRM_AMDGPU_REQUIRED=2.4.76
 LIBDRM_INTEL_REQUIRED=2.4.75
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
 LIBDRM_FREEDRENO_REQUIRED=2.4.74
 LIBDRM_VC4_REQUIRED=2.4.69
 LIBDRM_ETNAVIV_REQUIRED=2.4.74

 dnl Versions for external dependencies
 DRI2PROTO_REQUIRED=2.8
 DRI3PROTO_REQUIRED=1.0
Re: [Mesa-dev] [PATCH] anv/cmd_buffer: fix dynamic state leak
Reviewed-by: Jason EkstrandOn Wed, Mar 29, 2017 at 12:11 PM, wrote: > From: Craig Stout > > anv_state_pool_alloc requires a matching free, whereas > anv_state_stream_alloc will be cleaned up on finish. > > Applies only to 13.0 branch. > x > https://bugs.freedesktop.org/show_bug.cgi?id=100365 > --- > src/intel/vulkan/anv_private.h | 12 > src/intel/vulkan/genX_cmd_buffer.c | 32 > 2 files changed, 28 insertions(+), 16 deletions(-) > > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_ > private.h > index dd67508..12a6aa1 100644 > --- a/src/intel/vulkan/anv_private.h > +++ b/src/intel/vulkan/anv_private.h > @@ -765,6 +765,18 @@ _anv_combine_address(struct anv_batch *batch, void > *location, >__state; \ > }) > > +#define anv_state_stream_emit(stream, cmd, align, ...) > \ > + ({ > \ > + const uint32_t __size = __anv_cmd_length(cmd) * 4; > \ > + struct anv_state __state = anv_state_stream_alloc((stream), > __size, align); \ > + struct cmd __template = {__VA_ARGS__}; > \ > + __anv_cmd_pack(cmd)(NULL, __state.map, &__template); > \ > + VG(VALGRIND_CHECK_MEM_IS_DEFINED(__state.map, > __anv_cmd_length(cmd) * 4)); \ > + if (!(stream)->block_pool->device->info.has_llc) >\ > + anv_state_clflush(__state); > \ > + __state; > \ > + }) > + > #define GEN7_MOCS (struct GEN7_MEMORY_OBJECT_CONTROL_STATE) { \ > .GraphicsDataTypeGFDT= 0, \ > .LLCCacheabilityControlLLCCC = 0, \ > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > b/src/intel/vulkan/genX_cmd_buffer.c > index 45fefc9..33db7ce 100644 > --- a/src/intel/vulkan/genX_cmd_buffer.c > +++ b/src/intel/vulkan/genX_cmd_buffer.c > @@ -1367,26 +1367,26 @@ flush_compute_descriptor_set(struct > anv_cmd_buffer *cmd_buffer) > const uint32_t slm_size = encode_slm_size(GEN_GEN, > prog_data->total_shared); > > struct anv_state state = > - anv_state_pool_emit(>dynamic_state_pool, > - GENX(INTERFACE_DESCRIPTOR_DATA), 64, > - .KernelStartPointer = pipeline->cs_simd, > - .BindingTablePointer = surfaces.offset, > - 
.BindingTableEntryCount = 0, > - .SamplerStatePointer = samplers.offset, > - .SamplerCount = 0, > + anv_state_stream_emit(_buffer->dynamic_state_stream, > +GENX(INTERFACE_DESCRIPTOR_DATA), 64, > +.KernelStartPointer = pipeline->cs_simd, > +.BindingTablePointer = surfaces.offset, > +.BindingTableEntryCount = 0, > +.SamplerStatePointer = samplers.offset, > +.SamplerCount = 0, > #if !GEN_IS_HASWELL > - .ConstantURBEntryReadOffset = 0, > +.ConstantURBEntryReadOffset = 0, > #endif > - .ConstantURBEntryReadLength = > - cs_prog_data->push.per_thread.regs, > +.ConstantURBEntryReadLength = > + cs_prog_data->push.per_thread.regs, > #if GEN_GEN >= 8 || GEN_IS_HASWELL > - .CrossThreadConstantDataReadLength = > - cs_prog_data->push.cross_thread.regs, > +.CrossThreadConstantDataReadLength = > + cs_prog_data->push.cross_thread.regs, > #endif > - .BarrierEnable = cs_prog_data->uses_barrier, > - .SharedLocalMemorySize = slm_size, > - .NumberofThreadsinGPGPUThreadGroup = > - cs_prog_data->threads); > +.BarrierEnable = cs_prog_data->uses_barrier, > +.SharedLocalMemorySize = slm_size, > +.NumberofThreadsinGPGPUThreadGroup = > + cs_prog_data->threads); > > uint32_t size = GENX(INTERFACE_DESCRIPTOR_DATA_length) * > sizeof(uint32_t); > anv_batch_emit(_buffer->batch, > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
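The commit message above turns on the difference between a pool allocation (each allocation needs its own matching free) and a stream allocation (everything is released together when the stream is finished). The sketch below shows a generic bump allocator with that second lifetime model. It is not the anv implementation: struct stream, stream_init, stream_alloc and stream_finish are hypothetical names, and the backing store here is plain malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump ("stream") allocator, for illustration only.
 * Individual allocations are never freed; stream_finish() releases
 * everything at once, so a forgotten per-allocation free cannot leak. */
struct stream {
   unsigned char *base;  /* backing block */
   size_t size;          /* size of the backing block in bytes */
   size_t next;          /* offset of the first unused byte */
};

static int
stream_init(struct stream *s, size_t size)
{
   s->base = malloc(size);
   s->size = size;
   s->next = 0;
   return s->base != NULL;
}

/* Returns 'size' bytes at an offset that is a multiple of 'align'
 * (a power of two) within the block, or NULL if the block is full. */
static void *
stream_alloc(struct stream *s, size_t size, size_t align)
{
   size_t start = (s->next + align - 1) & ~(align - 1);

   if (start > s->size || size > s->size - start)
      return NULL;

   s->next = start + size;
   return s->base + start;
}

/* Releases every allocation made from the stream in one call. */
static void
stream_finish(struct stream *s)
{
   free(s->base);
   s->base = NULL;
   s->size = 0;
   s->next = 0;
}
```

With this lifetime model, state emitted while recording a command buffer goes away when the command buffer's stream is finished, which is the behaviour the patch relies on.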
[Mesa-dev] [Bug 100259] [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'
https://bugs.freedesktop.org/show_bug.cgi?id=100259 --- Comment #10 from ovarieg...@yahoo.com --- It turns out that in my case this is an issue with using pkgconf (which worked previously) instead of pkg-config. It builds fine with pkg-config. I'd prefer to keep this open until the pkgconf devs have a chance to take a look, but it can be closed again if someone finds that preferable. As for my friend, apparently he was trying to use the Perl pkg-config from OpenBSD on Gentoo (don't ask...).
Re: [Mesa-dev] [PATCH] radv: move to using nir clip/cull merge pass.
Acked-by: Bas NieuwenhuizenOn Wed, Mar 29, 2017 at 7:14 AM, Dave Airlie wrote: > From: Dave Airlie > > Doing this before tessellation makes doing some bits of > tessellation a bit cleaner. It also cleans up a bit of the > llvm generator code. > > Signed-off-by: Dave Airlie > --- > src/amd/common/ac_nir_to_llvm.c | 144 > ++-- > src/amd/vulkan/radv_pipeline.c | 1 + > 2 files changed, 36 insertions(+), 109 deletions(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c > index f164d8f..78602fd 100644 > --- a/src/amd/common/ac_nir_to_llvm.c > +++ b/src/amd/common/ac_nir_to_llvm.c > @@ -144,8 +144,6 @@ struct nir_to_llvm_context { > int num_locals; > LLVMValueRef *locals; > bool has_ddxy; > - uint8_t num_input_clips; > - uint8_t num_input_culls; > uint8_t num_output_clips; > uint8_t num_output_culls; > > @@ -170,12 +168,9 @@ static unsigned > shader_io_get_unique_index(gl_varying_slot slot) > return 0; > if (slot == VARYING_SLOT_PSIZ) > return 1; > - if (slot == VARYING_SLOT_CLIP_DIST0 || > - slot == VARYING_SLOT_CULL_DIST0) > + if (slot == VARYING_SLOT_CLIP_DIST0) > return 2; > - if (slot == VARYING_SLOT_CLIP_DIST1 || > - slot == VARYING_SLOT_CULL_DIST1) > - return 3; > + /* 3 is reserved for clip dist as well */ > if (slot >= VARYING_SLOT_VAR0 && slot <= VARYING_SLOT_VAR31) > return 4 + (slot - VARYING_SLOT_VAR0); > unreachable("illegal slot in get unique index\n"); > @@ -2195,7 +2190,6 @@ load_gs_input(struct nir_to_llvm_context *ctx, > unsigned param, vtx_offset_param; > LLVMValueRef value[4], result; > unsigned vertex_index; > - unsigned cull_offset = 0; > radv_get_deref_offset(ctx, >variables[0]->deref, > false, _index, > _index, _index); > @@ -2205,13 +2199,11 @@ load_gs_input(struct nir_to_llvm_context *ctx, > LLVMConstInt(ctx->i32, 4, false), ""); > > param = > shader_io_get_unique_index(instr->variables[0]->var->data.location); > - if (instr->variables[0]->var->data.location == > VARYING_SLOT_CULL_DIST0) > - cull_offset += 
ctx->num_input_clips; > for (unsigned i = 0; i < instr->num_components; i++) { > > args[0] = ctx->esgs_ring; > args[1] = vtx_offset; > - args[2] = LLVMConstInt(ctx->i32, (param * 4 + i + const_index > + cull_offset) * 256, false); > + args[2] = LLVMConstInt(ctx->i32, (param * 4 + i + > const_index) * 256, false); > args[3] = ctx->i32zero; > args[4] = ctx->i32one; /* OFFEN */ > args[5] = ctx->i32zero; /* IDXEN */ > @@ -2366,8 +2358,7 @@ visit_store_var(struct nir_to_llvm_context *ctx, > > value = llvm_extract_elem(ctx, src, chan); > > - if (instr->variables[0]->var->data.location == > VARYING_SLOT_CLIP_DIST0 || > - instr->variables[0]->var->data.location == > VARYING_SLOT_CULL_DIST0) > + if (instr->variables[0]->var->data.compact) > stride = 1; > if (indir_index) { > unsigned count = glsl_count_attribute_slots( > @@ -3143,7 +3134,7 @@ visit_emit_vertex(struct nir_to_llvm_context *ctx, > LLVMValueRef gs_next_vertex; > LLVMValueRef can_emit, kill; > int idx; > - int clip_cull_slot = -1; > + > assert(instr->const_index[0] == 0); > /* Write vertex attribute values to GSVS ring */ > gs_next_vertex = LLVMBuildLoad(ctx->builder, > @@ -3175,27 +3166,11 @@ visit_emit_vertex(struct nir_to_llvm_context *ctx, > if (!(ctx->output_mask & (1ull << i))) > continue; > > - if (i == VARYING_SLOT_CLIP_DIST1 || > - i == VARYING_SLOT_CULL_DIST1) > - continue; > - > - if (i == VARYING_SLOT_CLIP_DIST0 || > - i == VARYING_SLOT_CULL_DIST0) { > + if (i == VARYING_SLOT_CLIP_DIST0) { > /* pack clip and cull into a single set of slots */ > - if (clip_cull_slot == -1) { > - clip_cull_slot = idx; > - if (ctx->num_output_clips + > ctx->num_output_culls > 4) > - slot_inc = 2; > - } else { > - slot = clip_cull_slot; > - slot_inc = 0; > - } > - if
Re: [Mesa-dev] [PATCH] winsys/amdgpu: remove AMDGPU_INFO_NUM_EVICTIONS
Reviewed-by: Marek Olšák

Marek

On Wed, Mar 29, 2017 at 9:06 PM, Samuel Pitoiset wrote:
> This is now exposed with libdrm_amdgpu 2.4.76.
>
> Signed-off-by: Samuel Pitoiset
> ---
>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> index 37e0140311..39a05d0f02 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> @@ -59,10 +59,6 @@
>  #define CIK__PIPE_CONFIG__ADDR_SURF_P16_32X32_8X16 16
>  #define CIK__PIPE_CONFIG__ADDR_SURF_P16_32X32_16X16 17
>
> -#ifndef AMDGPU_INFO_NUM_EVICTIONS
> -#define AMDGPU_INFO_NUM_EVICTIONS 0x18
> -#endif
> -
>  static struct util_hash_table *dev_tab = NULL;
>  static mtx_t dev_tab_mutex = _MTX_INITIALIZER_NP;
>
> --
> 2.12.1
Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish
2017-03-29 21:17 GMT+02:00 Thomas Helland:
> 2017-03-29 19:35 GMT+02:00 Bartosz Tomczyk :
>> I would be very grateful if someone could help with testing performance
>> impact of this change.
>>
>
> Currently prepping some tests on my HTPC, which is a bit CPU-bound.
> I'll report back in about an hour or so.
>

My HTPC has an RX460 and i3-6100 combination, so I was thinking that,
given the low number of threads on the processor, it could be impacted
by the threaded dispatch. However, I've tested Talos Principle, Dota 2,
and Metro Last Light, and none of these show any regressions like those
seen in Michael Larabel's tests on Phoronix back in mid-March. My system
is probably too GPU-limited for any possible bottleneck to show.

I'll see if I can get my workstation up and running. It has an RX480
and FX-8320 combination, so weaker cores and a stronger graphics card.
Hopefully I will be able to reproduce things there.

>> On Wed, Mar 29, 2017 at 7:31 PM, Bartosz Tomczyk
>> wrote:
>>>
>>> Call it directly when batch queue is empty. This avoids costly thread
>>> synchronisation. With this fix games that previously regressed
>>> with mesa_glthread=true like xonotic or grid autosport.
>>> ---
>>>  src/mesa/main/glthread.c | 47
>>> ++-
>>>  1 file changed, 34 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
>>> index 06115b916d..faf42c2b89 100644
>>> --- a/src/mesa/main/glthread.c
>>> +++ b/src/mesa/main/glthread.c
>>> @@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context
>>> *ctx)
>>>     }
>>>  }
>>>
>>> -void
>>> -_mesa_glthread_flush_batch(struct gl_context *ctx)
>>> +static void
>>> +_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
>>>  {
>>>     struct glthread_state *glthread = ctx->GLThread;
>>> -   struct glthread_batch *batch;
>>> -
>>> -   if (!glthread)
>>> -      return;
>>> -
>>> -   batch = glthread->batch;
>>> +   struct glthread_batch *batch = glthread->batch;
>>> +
>>>     if (!batch->used)
>>>        return;
>>>
>>> @@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>>>        return;
>>>     }
>>>
>>> -   pthread_mutex_lock(&glthread->mutex);
>>>     *glthread->batch_queue_tail = batch;
>>>     glthread->batch_queue_tail = &batch->next;
>>>     pthread_cond_broadcast(&glthread->new_work);
>>> +
>>> +}
>>> +void
>>> +_mesa_glthread_flush_batch(struct gl_context *ctx)
>>> +{
>>> +   struct glthread_state *glthread = ctx->GLThread;
>>> +   struct glthread_batch *batch;
>>> +
>>> +   if (!glthread)
>>> +      return;
>>> +
>>> +   batch = glthread->batch;
>>> +   if (!batch->used)
>>> +      return;
>>> +
>>> +   pthread_mutex_lock(&glthread->mutex);
>>> +   _mesa_glthread_flush_batch_locked(ctx);
>>>     pthread_mutex_unlock(&glthread->mutex);
>>>  }
>>>
>>> @@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
>>>     if (pthread_self() == glthread->thread)
>>>        return;
>>>
>>> -   _mesa_glthread_flush_batch(ctx);
>>> -
>>>     pthread_mutex_lock(&glthread->mutex);
>>>
>>> -   while (glthread->batch_queue || glthread->busy)
>>> -      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>>> +   if (!(glthread->batch_queue || glthread->busy)) {
>>> +      if (glthread->batch && glthread->batch->used) {
>>> +         struct _glapi_table *dispatch = _glapi_get_dispatch();
>>> +         glthread_unmarshal_batch(ctx, glthread->batch);
>>> +         _glapi_set_dispatch(dispatch);
>>> +         glthread_allocate_batch(ctx);
>>> +      }
>>> +   }
>>> +   else {
>>> +      _mesa_glthread_flush_batch_locked(ctx);
>>> +      while (glthread->batch_queue || glthread->busy)
>>> +         pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>>> +   }
>>>
>>>     pthread_mutex_unlock(&glthread->mutex);
>>>  }
>>> --
>>> 2.12.2
>>>
[Mesa-dev] [PATCH 7/7] intel: tools: add aubinator_error_decode tool
This is pretty much the same tool as what i-g-t has, only with a more fancy decoding of the instructions/registers. It also doesn't support anything before gen4. Signed-off-by: Lionel Landwerlin--- src/intel/Makefile.tools.am | 20 +- src/intel/common/gen_decoder.c | 10 + src/intel/common/gen_decoder.h | 1 + src/intel/tools/.gitignore | 1 + src/intel/tools/aubinator_error_decode.c | 783 +++ 5 files changed, 814 insertions(+), 1 deletion(-) create mode 100644 src/intel/tools/aubinator_error_decode.c diff --git a/src/intel/Makefile.tools.am b/src/intel/Makefile.tools.am index 245bd03eef..a3a917d50e 100644 --- a/src/intel/Makefile.tools.am +++ b/src/intel/Makefile.tools.am @@ -19,7 +19,9 @@ # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS # IN THE SOFTWARE. -noinst_PROGRAMS += tools/aubinator +noinst_PROGRAMS += \ + tools/aubinator \ + tools/aubinator_error_decode tools_aubinator_SOURCES = \ tools/aubinator.c \ @@ -41,3 +43,19 @@ tools_aubinator_LDADD = \ $(EXPAT_LIBS) \ $(ZLIB_LIBS) \ -lm + + +tools_aubinator_error_decode_SOURCES = \ + tools/aubinator_error_decode.c + +tools_aubinator_error_decode_LDADD = \ + common/libintel_common.la \ + $(top_builddir)/src/util/libmesautil.la \ + $(aubinator_DEPS) \ + $(EXPAT_LIBS) \ + $(ZLIB_LIBS) + +tools_aubinator_error_decode_CFLAGS = \ + $(AM_CFLAGS) \ + $(EXPAT_CFLAGS) \ + $(ZLIB_CFLAGS) diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c index 1c3246f265..3af472caef 100644 --- a/src/intel/common/gen_decoder.c +++ b/src/intel/common/gen_decoder.c @@ -112,6 +112,16 @@ gen_spec_find_register(struct gen_spec *spec, uint32_t offset) return NULL; } +struct gen_group * +gen_spec_find_register_by_name(struct gen_spec *spec, const char *name) +{ + for (int i = 0; i < spec->nregisters; i++) + if (strcmp(spec->registers[i]->name, name) == 0) + return spec->registers[i]; + + return NULL; +} + struct gen_enum * gen_spec_find_enum(struct gen_spec *spec, const char *name) { diff --git 
a/src/intel/common/gen_decoder.h b/src/intel/common/gen_decoder.h index 1c41de80a4..936b052455 100644 --- a/src/intel/common/gen_decoder.h +++ b/src/intel/common/gen_decoder.h @@ -45,6 +45,7 @@ struct gen_spec *gen_spec_load_from_path(const struct gen_device_info *devinfo, uint32_t gen_spec_get_gen(struct gen_spec *spec); struct gen_group *gen_spec_find_instruction(struct gen_spec *spec, const uint32_t *p); struct gen_group *gen_spec_find_register(struct gen_spec *spec, uint32_t offset); +struct gen_group *gen_spec_find_register_by_name(struct gen_spec *spec, const char *name); int gen_group_get_length(struct gen_group *group, const uint32_t *p); const char *gen_group_get_name(struct gen_group *group); uint32_t gen_group_get_opcode(struct gen_group *group); diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore index 0c80a6fed2..27437f9eef 100644 --- a/src/intel/tools/.gitignore +++ b/src/intel/tools/.gitignore @@ -1 +1,2 @@ /aubinator +/aubinator_error_decode diff --git a/src/intel/tools/aubinator_error_decode.c b/src/intel/tools/aubinator_error_decode.c new file mode 100644 index 00..a477086cd8 --- /dev/null +++ b/src/intel/tools/aubinator_error_decode.c @@ -0,0 +1,783 @@ +/* + * Copyright © 2007-2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + *Eric Anholt + *Carl Worth + *Chris Wilson + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include
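The gen_spec_find_register_by_name() helper added by this patch is a linear scan that compares names with strcmp and returns NULL when nothing matches. The same pattern is sketched below as a standalone example over a plain array. struct reg_desc, find_register_by_name, the register names and the offsets are placeholders invented for the example and do not come from genxml.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical register description; the real decoder uses struct gen_group. */
struct reg_desc {
   const char *name;
   unsigned offset;   /* placeholder value, not a real MMIO offset */
};

/* Linear search by exact name; returns NULL when the name is unknown. */
static const struct reg_desc *
find_register_by_name(const struct reg_desc *regs, size_t count,
                      const char *name)
{
   for (size_t i = 0; i < count; i++)
      if (strcmp(regs[i].name, name) == 0)
         return &regs[i];

   return NULL;
}
```

A linear scan is a reasonable choice when the table is small and lookups are rare, as in a command-line decoding tool. A caller has to handle the NULL result for register names the loaded description does not define.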
[Mesa-dev] [PATCH 2/7] intel: genxml: add GFX_ARB_ERROR_RPT register
Signed-off-by: Lionel Landwerlin--- src/intel/genxml/gen6.xml | 12 src/intel/genxml/gen7.xml | 12 src/intel/genxml/gen75.xml | 13 + src/intel/genxml/gen8.xml | 18 ++ src/intel/genxml/gen9.xml | 18 ++ 5 files changed, 73 insertions(+) diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml index 3ec13cd8fc..02ed465c5d 100644 --- a/src/intel/genxml/gen6.xml +++ b/src/intel/genxml/gen6.xml @@ -2075,4 +2075,16 @@ + + + + + + + + + + + + diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml index d79aad9d14..ba9c8e8154 100644 --- a/src/intel/genxml/gen7.xml +++ b/src/intel/genxml/gen7.xml @@ -2653,4 +2653,16 @@ + + + + + + + + + + + + diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml index 18481f1f50..979f1e3ee2 100644 --- a/src/intel/genxml/gen75.xml +++ b/src/intel/genxml/gen75.xml @@ -3076,4 +3076,17 @@ + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml index b6af98a194..91573ae73a 100644 --- a/src/intel/genxml/gen8.xml +++ b/src/intel/genxml/gen8.xml @@ -3330,4 +3330,22 @@ + + + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml index b4dc6c4966..448ac6c8ab 100644 --- a/src/intel/genxml/gen9.xml +++ b/src/intel/genxml/gen9.xml @@ -3614,4 +3614,22 @@ + + + + + + + + + + + + + + + + + + -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/7] intel: genxml: add ACTHD registers
Signed-off-by: Lionel Landwerlin--- src/intel/genxml/gen8.xml | 16 src/intel/genxml/gen9.xml | 16 2 files changed, 32 insertions(+) diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml index 91573ae73a..be54748876 100644 --- a/src/intel/genxml/gen8.xml +++ b/src/intel/genxml/gen8.xml @@ -3348,4 +3348,20 @@ + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml index 448ac6c8ab..7509e49236 100644 --- a/src/intel/genxml/gen9.xml +++ b/src/intel/genxml/gen9.xml @@ -3632,4 +3632,20 @@ + + + + + + + + + + + + + + + + -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/7] intel: genxml: add INSTDONE registers
Signed-off-by: Lionel Landwerlin--- src/intel/genxml/gen6.xml | 110 + src/intel/genxml/gen7.xml | 64 ++ src/intel/genxml/gen75.xml | 71 + src/intel/genxml/gen8.xml | 71 + src/intel/genxml/gen9.xml | 71 + 5 files changed, 387 insertions(+) diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml index 33969d937e..3ec13cd8fc 100644 --- a/src/intel/genxml/gen6.xml +++ b/src/intel/genxml/gen6.xml @@ -1965,4 +1965,114 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml index f46dae7ce0..d79aad9d14 100644 --- a/src/intel/genxml/gen7.xml +++ b/src/intel/genxml/gen7.xml @@ -2546,6 +2546,70 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml index 7fe9b02d6e..18481f1f50 100644 --- a/src/intel/genxml/gen75.xml +++ b/src/intel/genxml/gen75.xml @@ -2954,6 +2954,77 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml index 0ebf2aa9c0..b6af98a194 100644 --- a/src/intel/genxml/gen8.xml +++ b/src/intel/genxml/gen8.xml @@ -3214,6 +3214,77 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml index 79fad000b2..b4dc6c4966 100644 --- a/src/intel/genxml/gen9.xml +++ b/src/intel/genxml/gen9.xml @@ -3491,6 +3491,77 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + -- 2.11.0 ___ mesa-dev mailing 
list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/7] intel: genxml: add gen7 ERR_INT register
Signed-off-by: Lionel Landwerlin--- src/intel/genxml/gen7.xml | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml index ba9c8e8154..08307b3506 100644 --- a/src/intel/genxml/gen7.xml +++ b/src/intel/genxml/gen7.xml @@ -2665,4 +2665,15 @@ + + + + + + + + + + + -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/7] Aubinator error decode
Hi,

This series introduces a slightly enhanced version of intel_error_decode.
Most Mesa developers working on the i965/vulkan drivers may have to deal
with hangs related to a specific workload. Having the complete decoding
of the instruction stream is quite useful.

With the Anv driver, genxml files were introduced, and we have used them
successfully in aubinator to look at .aub files. With this change we can
apply the same decoding to the error states reported by the kernel driver.

Cheers,

Lionel Landwerlin (7):
  intel: genxml: add INSTDONE registers
  intel: genxml: add GFX_ARB_ERROR_RPT register
  intel: genxml: add ACTHD registers
  intel: genxml: add gen7 ERR_INT register
  intel: genxml: add FAULT_REG register
  intel: genxml: add RING_BUFFER_CTL registers
  intel: tools: add aubinator_error_decode tool

 src/intel/Makefile.tools.am              |  20 +-
 src/intel/common/gen_decoder.c           |  10 +
 src/intel/common/gen_decoder.h           |   1 +
 src/intel/genxml/gen6.xml                | 210 +
 src/intel/genxml/gen7.xml                | 175 +++
 src/intel/genxml/gen75.xml               | 202
 src/intel/genxml/gen8.xml                | 197
 src/intel/genxml/gen9.xml                | 197
 src/intel/tools/.gitignore               |   1 +
 src/intel/tools/aubinator_error_decode.c | 783 +++
 10 files changed, 1795 insertions(+), 1 deletion(-)
 create mode 100644 src/intel/tools/aubinator_error_decode.c

--
2.11.0
[Mesa-dev] [PATCH 6/7] intel: genxml: add RING_BUFFER_CTL registers
Signed-off-by: Lionel Landwerlin--- src/intel/genxml/gen6.xml | 40 +++ src/intel/genxml/gen7.xml | 40 +++ src/intel/genxml/gen75.xml | 54 src/intel/genxml/gen8.xml | 69 ++ src/intel/genxml/gen9.xml | 69 ++ 5 files changed, 272 insertions(+) diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml index 99683ceed5..5083f074a1 100644 --- a/src/intel/genxml/gen6.xml +++ b/src/intel/genxml/gen6.xml @@ -2135,4 +2135,44 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml index cbd5bbbf5a..ada8f74396 100644 --- a/src/intel/genxml/gen7.xml +++ b/src/intel/genxml/gen7.xml @@ -2724,4 +2724,44 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml index 9137e6f460..50d6d8d8aa 100644 --- a/src/intel/genxml/gen75.xml +++ b/src/intel/genxml/gen75.xml @@ -3153,4 +3153,58 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml index 8835cb99f7..1390fe68c1 100644 --- a/src/intel/genxml/gen8.xml +++ b/src/intel/genxml/gen8.xml @@ -3387,4 +3387,73 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml index 26e6459e4d..4bf0fb6199 100644 --- a/src/intel/genxml/gen9.xml +++ b/src/intel/genxml/gen9.xml @@ -3671,4 +3671,73 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/7] intel: genxml: add FAULT_REG register
Signed-off-by: Lionel Landwerlin
---
 src/intel/genxml/gen6.xml  | 48 ++
 src/intel/genxml/gen7.xml  | 48 ++
 src/intel/genxml/gen75.xml | 64 ++
 src/intel/genxml/gen8.xml  | 23 +
 src/intel/genxml/gen9.xml  | 23 +
 5 files changed, 206 insertions(+)

diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 02ed465c5d..99683ceed5 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -2087,4 +2087,52 @@
[48 added lines; XML content stripped by the list archive]

diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 08307b3506..cbd5bbbf5a 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2676,4 +2676,52 @@
[48 added lines; XML content stripped by the list archive]

diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 979f1e3ee2..9137e6f460 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -3089,4 +3089,68 @@
[64 added lines; XML content stripped by the list archive]

diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index be54748876..8835cb99f7 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3364,4 +3364,27 @@
[23 added lines; XML content stripped by the list archive]

diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 7509e49236..26e6459e4d 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3648,4 +3648,27 @@
[23 added lines; XML content stripped by the list archive]

-- 
2.11.0
Re: [Mesa-dev] [PATCH] st: Add cubeMapFace parameter to st_finalize_texture.
I'm OK with this patch.

Marek

On Wed, Mar 29, 2017 at 12:57 PM, Nicolai Hähnle wrote:
> Hi Michal,
>
> thanks for the patch. That piglit test actually fails on radeonsi as well.
>
>
> On 28.03.2017 22:39, Michal Srb wrote:
>>
>> st_finalize_texture always accesses image at face 0, but it may not be set
>> if we are working with cubemap that had other face set.
>>
>> This fixes crash in piglit
>> same-attachment-glFramebufferTexture2D-GL_DEPTH_STENCIL_ATTACHMENT.
>
>
> Please make sure commit messages are wrapped to <75 characters.
>
> Also:
>
> Cc: mesa-sta...@lists.freedesktop.org
>
>
>> ---
>> Hi, this is my attempt to fix crash in piglit test
>> same-attachment-glFramebufferTexture2D-GL_DEPTH_STENCIL_ATTACHMENT ran with
>> LIBGL_ALWAYS_INDIRECT=1.
>> I am not sure if it is the right approach. From what I found online
>> rendering into a face of a cube texture that doesn't have all faces set
>> would be invalid, but the test passes with other drivers, so maybe it's ok.
>> This makes it pass with software rendering as well.
>
>
> I actually don't see anything in the spec that would require texture
> completeness. That makes sense, since rendering into one image of a texture
> doesn't imply using sampler state. So allowing the test to pass is good.
>
> The flip-side is that this means calling st_finalize_texture at all may not
> be the right thing to do in the FBO code (except perhaps as an opportunistic
> optimization). After all, we could have a messed up situation where there
> are incompatible mip level in a texture, and we render to one of them
> anyway.
>
> Cleaning that up would be quite involved. I think this fix is fine for now,
> since it does improve the situation:
>
> Reviewed-by: Nicolai Hähnle
>
> Let's see if there are any other opinions...
>
> Cheers,
> Nicolai
>
>
>
>>
>>  src/gallium/state_trackers/dri/dri2.c    | 2 +-
>>  src/mesa/state_tracker/st_atom_image.c   | 2 +-
>>  src/mesa/state_tracker/st_atom_texture.c | 2 +-
>>  src/mesa/state_tracker/st_cb_fbo.c       | 2 +-
>>  src/mesa/state_tracker/st_cb_texture.c   | 5 +++--
>>  src/mesa/state_tracker/st_cb_texture.h   | 3 ++-
>>  src/mesa/state_tracker/st_gen_mipmap.c   | 2 +-
>>  7 files changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/dri/dri2.c
>> b/src/gallium/state_trackers/dri/dri2.c
>> index b50e096..ed6004f 100644
>> --- a/src/gallium/state_trackers/dri/dri2.c
>> +++ b/src/gallium/state_trackers/dri/dri2.c
>> @@ -1808,7 +1808,7 @@ dri2_interop_export_object(__DRIcontext *_ctx,
>>           return MESA_GLINTEROP_INVALID_MIP_LEVEL;
>>        }
>>
>> -      if (!st_finalize_texture(ctx, st->pipe, obj)) {
>> +      if (!st_finalize_texture(ctx, st->pipe, obj, 0)) {
>>           mtx_unlock(&ctx->Shared->Mutex);
>>           return MESA_GLINTEROP_OUT_OF_RESOURCES;
>>        }
>> diff --git a/src/mesa/state_tracker/st_atom_image.c
>> b/src/mesa/state_tracker/st_atom_image.c
>> index 5dd2cd6..4101552 100644
>> --- a/src/mesa/state_tracker/st_atom_image.c
>> +++ b/src/mesa/state_tracker/st_atom_image.c
>> @@ -64,7 +64,7 @@ st_bind_images(struct st_context *st, struct gl_program
>> *prog,
>>        struct pipe_image_view *img = [i];
>>
>>        if (!_mesa_is_image_unit_valid(st->ctx, u) ||
>> -          !st_finalize_texture(st->ctx, st->pipe, u->TexObj) ||
>> +          !st_finalize_texture(st->ctx, st->pipe, u->TexObj, 0) ||
>>            !stObj->pt) {
>>           memset(img, 0, sizeof(*img));
>>           continue;
>> diff --git a/src/mesa/state_tracker/st_atom_texture.c
>> b/src/mesa/state_tracker/st_atom_texture.c
>> index 92023e0..5b481ec 100644
>> --- a/src/mesa/state_tracker/st_atom_texture.c
>> +++ b/src/mesa/state_tracker/st_atom_texture.c
>> @@ -73,7 +73,7 @@ update_single_texture(struct st_context *st,
>>     }
>>     stObj = st_texture_object(texObj);
>>
>> -   retval = st_finalize_texture(ctx, st->pipe, texObj);
>> +   retval = st_finalize_texture(ctx, st->pipe, texObj, 0);
>>     if (!retval) {
>>        /* out of mem */
>>        return GL_FALSE;
>> diff --git a/src/mesa/state_tracker/st_cb_fbo.c
>> b/src/mesa/state_tracker/st_cb_fbo.c
>> index 78433bf..dce4239 100644
>> --- a/src/mesa/state_tracker/st_cb_fbo.c
>> +++ b/src/mesa/state_tracker/st_cb_fbo.c
>> @@ -488,7 +488,7 @@ st_render_texture(struct gl_context *ctx,
>>     struct st_renderbuffer *strb = st_renderbuffer(rb);
>>     struct pipe_resource *pt;
>>
>> -   if (!st_finalize_texture(ctx, pipe, att->Texture))
>> +   if (!st_finalize_texture(ctx, pipe, att->Texture, att->CubeMapFace))
>>        return;
>>
>>     pt = st_get_texobj_resource(att->Texture);
>> diff --git a/src/mesa/state_tracker/st_cb_texture.c
>> b/src/mesa/state_tracker/st_cb_texture.c
>> index bc6f108..1b486d7 100644
>> --- a/src/mesa/state_tracker/st_cb_texture.c
>> +++ b/src/mesa/state_tracker/st_cb_texture.c
>> @@ -2434,7 +2434,8 @@ copy_image_data_to_texture(struct st_context *st,
>> GLboolean
>>
Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish
This patch helps against the massive performance drop of glthread with Two Worlds. The performance boost in Civ5 is not hurt by this patch.

It looks good. Some trivial comments in the patch:

On Wed, Mar 29, 2017 at 7:35 PM, Bartosz Tomczyk wrote:
> I would be very grateful if someone could help with testing performance
> impact of this change.
>
> On Wed, Mar 29, 2017 at 7:31 PM, Bartosz Tomczyk
> wrote:
>>
>> Call it directly when batch queue is empty. This avoids costly thread
>> synchronisation. With this fix games that previously regressed
>> with mesa_glthread=true like xonotic or grid autosport.
>> ---
>>  src/mesa/main/glthread.c | 47
>> ++-
>>  1 file changed, 34 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
>> index 06115b916d..faf42c2b89 100644
>> --- a/src/mesa/main/glthread.c
>> +++ b/src/mesa/main/glthread.c
>> @@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context
>> *ctx)
>>     }
>>  }
>>
>> -void
>> -_mesa_glthread_flush_batch(struct gl_context *ctx)
>> +static void
>> +_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
>>  {
>>     struct glthread_state *glthread = ctx->GLThread;
>> -   struct glthread_batch *batch;
>> -
>> -   if (!glthread)
>> -      return;
>> -
>> -   batch = glthread->batch;
>> +   struct glthread_batch *batch = glthread->batch;
>> +

Trailing whitespace.

>>     if (!batch->used)
>>        return;
>>
>> @@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>>        return;
>>     }
>>
>> -   pthread_mutex_lock(&glthread->mutex);
>>     *glthread->batch_queue_tail = batch;
>>     glthread->batch_queue_tail = &batch->next;
>>     pthread_cond_broadcast(&glthread->new_work);
>> +
>> +}

Move the bracket one line up.

Thanks
edmondo

>> +void
>> +_mesa_glthread_flush_batch(struct gl_context *ctx)
>> +{
>> +   struct glthread_state *glthread = ctx->GLThread;
>> +   struct glthread_batch *batch;
>> +
>> +   if (!glthread)
>> +      return;
>> +
>> +   batch = glthread->batch;
>> +   if (!batch->used)
>> +      return;
>> +
>> +   pthread_mutex_lock(&glthread->mutex);
>> +   _mesa_glthread_flush_batch_locked(ctx);
>>     pthread_mutex_unlock(&glthread->mutex);
>>  }
>>
>> @@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
>>     if (pthread_self() == glthread->thread)
>>        return;
>>
>> -   _mesa_glthread_flush_batch(ctx);
>> -
>>     pthread_mutex_lock(&glthread->mutex);
>>
>> -   while (glthread->batch_queue || glthread->busy)
>> -      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>> +   if (!(glthread->batch_queue || glthread->busy)) {
>> +      if (glthread->batch && glthread->batch->used) {
>> +         struct _glapi_table *dispatch = _glapi_get_dispatch();
>> +         glthread_unmarshal_batch(ctx, glthread->batch);
>> +         _glapi_set_dispatch(dispatch);
>> +         glthread_allocate_batch(ctx);
>> +      }
>> +   }
>> +   else {
>> +      _mesa_glthread_flush_batch_locked(ctx);
>> +      while (glthread->batch_queue || glthread->busy)
>> +         pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>> +   }
>>
>>     pthread_mutex_unlock(&glthread->mutex);
>>  }
>> --
>> 2.12.2
>>
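The fast path this patch adds to _mesa_glthread_finish comes down to one decision taken while holding the mutex: if nothing is queued and the worker is idle, run the pending batch on the calling thread; otherwise hand it to the worker and wait. A minimal single-threaded sketch of that decision follows. The names finish_state, finish_path and choose_finish_path are hypothetical and are not Mesa's API; the real code makes this choice under glthread->mutex using pthreads.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state _mesa_glthread_finish inspects while
 * holding the mutex; these are not Mesa's real types. */
struct finish_state {
   int queued_batches;   /* batches already handed to the worker thread */
   bool worker_busy;     /* worker is currently executing a batch */
   int pending_commands; /* commands recorded in the not-yet-flushed batch */
};

enum finish_path {
   FINISH_NOTHING_TO_DO,  /* worker idle and no pending commands */
   FINISH_RUN_INLINE,     /* execute the pending batch on the caller */
   FINISH_QUEUE_AND_WAIT  /* flush to the worker, then wait for it to drain */
};

static enum finish_path
choose_finish_path(const struct finish_state *s)
{
   if (s->queued_batches == 0 && !s->worker_busy) {
      /* Nothing is in flight, so command order is preserved even if the
       * pending batch skips the queue and runs on the calling thread. */
      return s->pending_commands > 0 ? FINISH_RUN_INLINE
                                     : FINISH_NOTHING_TO_DO;
   }

   /* Earlier batches are still in flight: the pending batch (if any) has
    * to be queued behind them, and the caller waits for the worker. */
   return FINISH_QUEUE_AND_WAIT;
}
```

The inline path is where the saving comes from: it avoids waking the worker and sleeping on the condition variable when the caller would only be waiting for its own batch.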
Re: [Mesa-dev] [RFC 3/3] mesa: expose KHR_no_error for GL
The series looks good to me except the "==" -> "&" in patch 2.

The patches have no effect without the GLX extension, right?

Marek

On Tue, Mar 28, 2017 at 6:35 AM, Timothy Arceri wrote:
> There is no ES support for now as this requires
> EGL_KHR_create_context_no_error to be implemented.
> ---
>  src/mesa/main/extensions_table.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/main/extensions_table.h
> b/src/mesa/main/extensions_table.h
> index ec71791..4439731 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -294,20 +294,21 @@ EXT(IBM_texture_mirrored_repeat , dummy_true
>
>  EXT(INGR_blend_func_separate               , EXT_blend_func_separate                , GLL,  x ,  x ,  x , 1999)
>
>  EXT(INTEL_conservative_rasterization       , INTEL_conservative_rasterization       ,  x , GLC,  x , 31, 2013)
>  EXT(INTEL_performance_query                , INTEL_performance_query                , GLL, GLC,  x , ES2, 2013)
>
>  EXT(KHR_blend_equation_advanced            , KHR_blend_equation_advanced            , GLL, GLC,  x , ES2, 2014)
>  EXT(KHR_blend_equation_advanced_coherent   , KHR_blend_equation_advanced_coherent   , GLL, GLC,  x , ES2, 2014)
>  EXT(KHR_context_flush_control              , dummy_true                             , GLL, GLC,  x , ES2, 2014)
>  EXT(KHR_debug                              , dummy_true                             , GLL, GLC, 11, ES2, 2012)
> +EXT(KHR_no_error                           , dummy_true                             , GLL, GLC,  x ,  x , 2015)
>  EXT(KHR_robust_buffer_access_behavior      , ARB_robust_buffer_access_behavior      , GLL, GLC,  x , ES2, 2014)
>  EXT(KHR_robustness                         , KHR_robustness                         , GLL, GLC,  x , ES2, 2012)
>  EXT(KHR_texture_compression_astc_hdr       , KHR_texture_compression_astc_hdr       , GLL, GLC,  x , ES2, 2012)
>  EXT(KHR_texture_compression_astc_ldr       , KHR_texture_compression_astc_ldr       , GLL, GLC,  x , ES2, 2012)
>  EXT(KHR_texture_compression_astc_sliced_3d , KHR_texture_compression_astc_sliced_3d , GLL, GLC,  x , ES2, 2015)
>
>  EXT(MESA_pack_invert                       , MESA_pack_invert                       , GLL, GLC,  x ,  x , 2002)
>  EXT(MESA_shader_integer_functions          , MESA_shader_integer_functions          , GLL, GLC,  x , 30, 2016)
>  EXT(MESA_texture_signed_rgba               , EXT_texture_snorm                      , GLL, GLC,  x ,  x , 2009)
>  EXT(MESA_window_pos                        , dummy_true                             , GLL,  x ,  x ,  x , 2000)
> --
> 2.9.3
>
[Mesa-dev] [Bug 100424] X hang (in kernel) after some event in Serious Sam Fusion using radv. 4.9/amd-staging-4.9
https://bugs.freedesktop.org/show_bug.cgi?id=100424

--- Comment #4 from Darren Salt ---
… okay, it's looking like the Steam overlay has a lot to do with this problem.

(Tested with current Mesa git, but the same LLVM as before.)
Re: [Mesa-dev] [PATCH v2 25/25] radeonsi: enable ARB_sparse_buffer
For patches 13-25:

Reviewed-by: Marek Olšák

I think the series will also need a newer libdrm than the one required by configure.ac, but my latest configure.ac patch for Vega should address that.

Marek

On Tue, Mar 28, 2017 at 11:12 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> TODO add features.txt and ChangeLog
>
> v2:
> - fill in DRM version requirement
> - disable on SI due to CP DMA faults
> ---
>  src/gallium/drivers/radeonsi/si_pipe.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 277fa28..9096f16 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -461,20 +461,30 @@ static int si_get_param(struct pipe_screen* pscreen,
> enum pipe_cap param)
>
>         case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
>         case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
>         case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
>                 /* SI doesn't support unaligned loads.
>                  * CIK needs DRM 2.50.0 on radeon. */
>                 return sscreen->b.chip_class == SI ||
>                        (sscreen->b.info.drm_major == 2 &&
>                         sscreen->b.info.drm_minor < 50);
>
> +       case PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE:
> +               /* Disable on SI due to VM faults in CP DMA. Enable once these
> +                * faults are mitigated in software.
> +                */
> +               if (sscreen->b.chip_class >= CIK &&
> +                   sscreen->b.info.drm_major == 3 &&
> +                   sscreen->b.info.drm_minor >= 13)
> +                       return RADEON_SPARSE_PAGE_SIZE;
> +               return 0;
> +
>         /* Unsupported features. */
>         case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
>         case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
>         case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
>         case PIPE_CAP_USER_VERTEX_BUFFERS:
>         case PIPE_CAP_FAKE_SW_MSAA:
>         case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
>         case PIPE_CAP_VERTEXID_NOBASE:
>         case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
>         case PIPE_CAP_TGSI_VOTE:
> --
> 2.9.3
>
[Mesa-dev] [PATCH v3 2/2] anv: Query the kernel for reset status
When a client causes a GPU hang (or experiences issues due to a hang in another client) we want to let it know as soon as possible. In particular, if it submits work with a fence and calls vkWaitForFences or vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be able to trust the results of that rendering. In order to provide this guarantee, we have to ask the kernel for context status in a few key locations.
---
 src/intel/vulkan/anv_device.c  | 114 +
 src/intel/vulkan/anv_gem.c     |  17 ++
 src/intel/vulkan/anv_private.h |   5 ++
 src/intel/vulkan/genX_query.c  |  11 ++--
 4 files changed, 107 insertions(+), 40 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 5f0d00f..bc3be23 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -884,8 +884,6 @@ anv_device_submit_simple_batch(struct anv_device *device,
    struct anv_bo bo, *exec_bos[1];
    VkResult result = VK_SUCCESS;
    uint32_t size;
-   int64_t timeout;
-   int ret;
 
    /* Kernel driver requires 8 byte aligned batch length */
    size = align_u32(batch->next - batch->start, 8);
@@ -925,14 +923,7 @@ anv_device_submit_simple_batch(struct anv_device *device,
    if (result != VK_SUCCESS)
       goto fail;
 
-   timeout = INT64_MAX;
-   ret = anv_gem_wait(device, bo.gem_handle, &timeout);
-   if (ret != 0) {
-      /* We don't know the real error. */
-      device->lost = true;
-      result = vk_errorf(VK_ERROR_DEVICE_LOST, "execbuf2 failed: %m");
-      goto fail;
-   }
+   result = anv_device_wait(device, &bo, INT64_MAX);
 
  fail:
    anv_bo_pool_free(&device->batch_bo_pool, &bo);
@@ -1264,6 +1255,58 @@ anv_device_execbuf(struct anv_device *device,
    return VK_SUCCESS;
 }
 
+VkResult
+anv_device_query_status(struct anv_device *device)
+{
+   /* This isn't likely as most of the callers of this function already check
+    * for it.  However, it doesn't hurt to check and it potentially lets us
+    * avoid an ioctl.
+    */
+   if (unlikely(device->lost))
+      return VK_ERROR_DEVICE_LOST;
+
+   uint32_t active, pending;
+   int ret = anv_gem_gpu_get_reset_stats(device, &active, &pending);
+   if (ret == -1) {
+      /* We don't know the real error. */
+      device->lost = true;
+      return vk_errorf(VK_ERROR_DEVICE_LOST, "get_reset_stats failed: %m");
+   }
+
+   if (active) {
+      device->lost = true;
+      return vk_errorf(VK_ERROR_DEVICE_LOST,
+                       "GPU hung on one of our command buffers");
+   } else if (pending) {
+      device->lost = true;
+      return vk_errorf(VK_ERROR_DEVICE_LOST,
+                       "GPU hung with commands in-flight");
+   }
+
+   return VK_SUCCESS;
+}
+
+VkResult
+anv_device_wait(struct anv_device *device, struct anv_bo *bo,
+                int64_t timeout)
+{
+   int ret = anv_gem_wait(device, bo->gem_handle, &timeout);
+   if (ret == -1 && errno == ETIME) {
+      return VK_TIMEOUT;
+   } else if (ret == -1) {
+      /* We don't know the real error. */
+      device->lost = true;
+      return vk_errorf(VK_ERROR_DEVICE_LOST, "gem wait failed: %m");
+   }
+
+   /* Query for device status after the wait.  If the BO we're waiting on got
+    * caught in a GPU hang we don't want to return VK_SUCCESS to the client
+    * because it clearly doesn't have valid data.  Yes, this most likely means
+    * an ioctl, but we just did an ioctl to wait so it's no great loss.
+    */
+   return anv_device_query_status(device);
+}
+
 VkResult anv_QueueSubmit(
     VkQueue                                     _queue,
     uint32_t                                    submitCount,
@@ -1273,10 +1316,17 @@ VkResult anv_QueueSubmit(
    ANV_FROM_HANDLE(anv_queue, queue, _queue);
    ANV_FROM_HANDLE(anv_fence, fence, _fence);
    struct anv_device *device = queue->device;
-   if (unlikely(device->lost))
-      return VK_ERROR_DEVICE_LOST;
 
-   VkResult result = VK_SUCCESS;
+   /* Query for device status prior to submitting.  Technically, we don't need
+    * to do this.  However, if we have a client that's submitting piles of
+    * garbage, we would rather break as early as possible to keep the GPU
+    * hanging contained.  If we don't check here, we'll either be waiting for
+    * the kernel to kick us or we'll have to wait until the client waits on a
+    * fence before we actually know whether or not we've hung.
+    */
+   VkResult result = anv_device_query_status(device);
+   if (result != VK_SUCCESS)
+      return result;
 
    /* We lock around QueueSubmit for three main reasons:
     *
@@ -1802,9 +1852,6 @@ VkResult anv_GetFenceStatus(
    if (unlikely(device->lost))
       return VK_ERROR_DEVICE_LOST;
 
-   int64_t t = 0;
-   int ret;
-
    switch (fence->state) {
    case ANV_FENCE_STATE_RESET:
       /* If it hasn't even been sent off to the GPU yet, it's not ready */
@@ -1814,15 +1861,18 @@ VkResult anv_GetFenceStatus(
       /* It's been signaled, return success */
       return VK_SUCCESS;
 
-   case
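The ordering the commit message argues for (wait first, then ask whether the GPU was reset before reporting success) can be sketched without any kernel interface. In the sketch below the wait outcome and reset counters are passed in as plain values; fake_device, fake_query_status and fake_wait are hypothetical stand-ins, not the anv functions, and the enum values only mirror the three Vulkan results the patch distinguishes.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

enum wait_result { WAIT_SUCCESS, WAIT_TIMEOUT, WAIT_DEVICE_LOST };

/* Hypothetical device model: the reset counters stand in for what the
 * kernel would report through its reset-stats query. */
struct fake_device {
   bool lost;
   uint32_t active_resets;   /* hangs caused by our own work */
   uint32_t pending_resets;  /* hangs that hit our in-flight work */
};

static enum wait_result
fake_query_status(struct fake_device *dev)
{
   /* Once lost, always lost: skip the (simulated) kernel query. */
   if (dev->lost)
      return WAIT_DEVICE_LOST;

   if (dev->active_resets || dev->pending_resets) {
      dev->lost = true;
      return WAIT_DEVICE_LOST;
   }

   return WAIT_SUCCESS;
}

/* wait_ret and wait_errno model the return value and errno of the
 * underlying wait call. */
static enum wait_result
fake_wait(struct fake_device *dev, int wait_ret, int wait_errno)
{
   if (wait_ret == -1 && wait_errno == ETIME)
      return WAIT_TIMEOUT;

   if (wait_ret == -1) {
      /* Unknown failure: treat the device as lost. */
      dev->lost = true;
      return WAIT_DEVICE_LOST;
   }

   /* The wait itself succeeded, but the work may have been caught in a
    * hang, so success is only reported after the status check. */
   return fake_query_status(dev);
}
```

A timeout deliberately leaves the device usable, while any other wait failure or a reported reset marks it lost for all later calls.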
[Mesa-dev] [Bug 100259] [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'
https://bugs.freedesktop.org/show_bug.cgi?id=100259

ovarieg...@yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #9 from ovarieg...@yahoo.com ---
I am still experiencing this issue with the current mesa git master in both my multilib 64-bit system and my clean 32-bit chroot, both with Slackware installed. I also have the latest libdrm git master installed. A friend who runs Gentoo also experienced this issue.

drivers/dri2/.libs/platform_drm.o: In function `get_back_bo':
platform_drm.c:(.text+0x1d4): undefined reference to `gbm_bo_create_with_modifiers'
collect2: error: ld returned 1 exit status
libtool: error: error: relink 'libEGL.la' with the above command before installing it
make[4]: *** [Makefile:910: install-libLTLIBRARIES] Error 1
make[4]: Leaving directory '/tmp/SBo/mesa/src/egl'
make[3]: *** [Makefile:1385: install-am] Error 2
make[3]: Leaving directory '/tmp/SBo/mesa/src/egl'
make[2]: *** [Makefile:852: install-recursive] Error 1
make[2]: Leaving directory '/tmp/SBo/mesa/src'
make[1]: *** [Makefile:1009: install] Error 2
make[1]: Leaving directory '/tmp/SBo/mesa/src'
make: *** [Makefile:643: install-recursive] Error 1
Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish
2017-03-29 19:35 GMT+02:00 Bartosz Tomczyk:
> I would be very grateful if someone could help with testing performance
> impact of this change.
>

Currently prepping some tests on my HTPC, which is a bit CPU-bound. I'll report back in about an hour or so.

> On Wed, Mar 29, 2017 at 7:31 PM, Bartosz Tomczyk
> wrote:
>>
>> Call it directly when batch queue is empty. This avoids costly thread
>> synchronisation. With this fix games that previously regressed
>> with mesa_glthread=true like xonotic or grid autosport.
>> ---
>>  src/mesa/main/glthread.c | 47
>> ++-
>>  1 file changed, 34 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
>> index 06115b916d..faf42c2b89 100644
>> --- a/src/mesa/main/glthread.c
>> +++ b/src/mesa/main/glthread.c
>> @@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context
>> *ctx)
>>     }
>>  }
>>
>> -void
>> -_mesa_glthread_flush_batch(struct gl_context *ctx)
>> +static void
>> +_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
>>  {
>>     struct glthread_state *glthread = ctx->GLThread;
>> -   struct glthread_batch *batch;
>> -
>> -   if (!glthread)
>> -      return;
>> -
>> -   batch = glthread->batch;
>> +   struct glthread_batch *batch = glthread->batch;
>> +
>>     if (!batch->used)
>>        return;
>>
>> @@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>>        return;
>>     }
>>
>> -   pthread_mutex_lock(&glthread->mutex);
>>     *glthread->batch_queue_tail = batch;
>>     glthread->batch_queue_tail = &batch->next;
>>     pthread_cond_broadcast(&glthread->new_work);
>> +
>> +}
>> +void
>> +_mesa_glthread_flush_batch(struct gl_context *ctx)
>> +{
>> +   struct glthread_state *glthread = ctx->GLThread;
>> +   struct glthread_batch *batch;
>> +
>> +   if (!glthread)
>> +      return;
>> +
>> +   batch = glthread->batch;
>> +   if (!batch->used)
>> +      return;
>> +
>> +   pthread_mutex_lock(&glthread->mutex);
>> +   _mesa_glthread_flush_batch_locked(ctx);
>>     pthread_mutex_unlock(&glthread->mutex);
>>  }
>>
>> @@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
>>     if (pthread_self() == glthread->thread)
>>        return;
>>
>> -   _mesa_glthread_flush_batch(ctx);
>> -
>>     pthread_mutex_lock(&glthread->mutex);
>>
>> -   while (glthread->batch_queue || glthread->busy)
>> -      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>> +   if (!(glthread->batch_queue || glthread->busy)) {
>> +      if (glthread->batch && glthread->batch->used) {
>> +         struct _glapi_table *dispatch = _glapi_get_dispatch();
>> +         glthread_unmarshal_batch(ctx, glthread->batch);
>> +         _glapi_set_dispatch(dispatch);
>> +         glthread_allocate_batch(ctx);
>> +      }
>> +   }
>> +   else {
>> +      _mesa_glthread_flush_batch_locked(ctx);
>> +      while (glthread->batch_queue || glthread->busy)
>> +         pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>> +   }
>>
>>     pthread_mutex_unlock(&glthread->mutex);
>>  }
>> --
>> 2.12.2
>>
[Mesa-dev] [PATCH] anv/cmd_buffer: fix host memory leak
From: Craig Stout

push_constants must be freed.

https://bugs.freedesktop.org/show_bug.cgi?id=100452
---
 src/intel/vulkan/anv_cmd_buffer.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
index 909bee2..c65eba2 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -120,7 +120,12 @@ anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer)
    cmd_buffer->batch.status = VK_SUCCESS;
 
    memset(&state->descriptors, 0, sizeof(state->descriptors));
-   memset(&state->push_constants, 0, sizeof(state->push_constants));
+   for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
+      if (state->push_constants[i] != NULL) {
+         vk_free(&cmd_buffer->pool->alloc, state->push_constants[i]);
+         state->push_constants[i] = NULL;
+      }
+   }
    memset(state->binding_tables, 0, sizeof(state->binding_tables));
    memset(state->samplers, 0, sizeof(state->samplers));
 
@@ -193,6 +198,9 @@ static VkResult anv_create_cmd_buffer(
 
    cmd_buffer->batch.status = VK_SUCCESS;
 
+   for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
+      cmd_buffer->state.push_constants[i] = NULL;
+   }
    cmd_buffer->_loader_data.loaderMagic = ICD_LOADER_MAGIC;
    cmd_buffer->device = device;
    cmd_buffer->pool = pool;
-- 
2.7.4
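The leak fixed here is the usual one for an array of lazily allocated pointers: clearing the array with memset drops the only references to the allocations. A minimal sketch of the corrected pattern, using plain malloc/free instead of vk_free and a hypothetical cmd_state struct (STAGE_COUNT stands in for MESA_SHADER_STAGES and its value is an assumption), could look like this:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for MESA_SHADER_STAGES. */
#define STAGE_COUNT 6

/* Hypothetical reduced command-buffer state: one lazily allocated
 * push-constant block per shader stage. */
struct cmd_state {
   void *push_constants[STAGE_COUNT];
};

/* At creation every slot must start out NULL so that a later reset can
 * tell which slots actually own memory. */
static void
cmd_state_init(struct cmd_state *state)
{
   for (size_t i = 0; i < STAGE_COUNT; i++)
      state->push_constants[i] = NULL;
}

/* Free each allocated slot before clearing it, instead of zeroing the
 * whole array. Returns the number of blocks released. */
static int
cmd_state_reset(struct cmd_state *state)
{
   int freed = 0;

   for (size_t i = 0; i < STAGE_COUNT; i++) {
      if (state->push_constants[i] != NULL) {
         free(state->push_constants[i]);
         state->push_constants[i] = NULL;
         freed++;
      }
   }

   return freed;
}
```

Resetting twice in a row is safe because each freed slot is set back to NULL.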
[Mesa-dev] [PATCH] anv/cmd_buffer: fix dynamic state leak
From: Craig Stout

anv_state_pool_alloc requires a matching free, whereas anv_state_stream_alloc will be cleaned up on finish.

Applies only to 13.0 branch.

https://bugs.freedesktop.org/show_bug.cgi?id=100365
---
 src/intel/vulkan/anv_private.h     | 12
 src/intel/vulkan/genX_cmd_buffer.c | 32
 2 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index dd67508..12a6aa1 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -765,6 +765,18 @@ _anv_combine_address(struct anv_batch *batch, void *location,
       __state; \
    })
 
+#define anv_state_stream_emit(stream, cmd, align, ...) \
+   ({ \
+      const uint32_t __size = __anv_cmd_length(cmd) * 4; \
+      struct anv_state __state = anv_state_stream_alloc((stream), __size, align); \
+      struct cmd __template = {__VA_ARGS__}; \
+      __anv_cmd_pack(cmd)(NULL, __state.map, &__template); \
+      VG(VALGRIND_CHECK_MEM_IS_DEFINED(__state.map, __anv_cmd_length(cmd) * 4)); \
+      if (!(stream)->block_pool->device->info.has_llc) \
+         anv_state_clflush(__state); \
+      __state; \
+   })
+
 #define GEN7_MOCS (struct GEN7_MEMORY_OBJECT_CONTROL_STATE) { \
    .GraphicsDataTypeGFDT = 0, \
    .LLCCacheabilityControlLLCCC = 0, \
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 45fefc9..33db7ce 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1367,26 +1367,26 @@ flush_compute_descriptor_set(struct anv_cmd_buffer *cmd_buffer)
    const uint32_t slm_size = encode_slm_size(GEN_GEN, prog_data->total_shared);
 
    struct anv_state state =
-      anv_state_pool_emit(>dynamic_state_pool,
-                          GENX(INTERFACE_DESCRIPTOR_DATA), 64,
-                          .KernelStartPointer = pipeline->cs_simd,
-                          .BindingTablePointer = surfaces.offset,
-                          .BindingTableEntryCount = 0,
-                          .SamplerStatePointer = samplers.offset,
-                          .SamplerCount = 0,
+      anv_state_stream_emit(&cmd_buffer->dynamic_state_stream,
+                            GENX(INTERFACE_DESCRIPTOR_DATA), 64,
+                            .KernelStartPointer = pipeline->cs_simd,
+                            .BindingTablePointer = surfaces.offset,
+                            .BindingTableEntryCount = 0,
+                            .SamplerStatePointer = samplers.offset,
+                            .SamplerCount = 0,
 #if !GEN_IS_HASWELL
-                          .ConstantURBEntryReadOffset = 0,
+                            .ConstantURBEntryReadOffset = 0,
 #endif
-                          .ConstantURBEntryReadLength =
-                             cs_prog_data->push.per_thread.regs,
+                            .ConstantURBEntryReadLength =
+                               cs_prog_data->push.per_thread.regs,
 #if GEN_GEN >= 8 || GEN_IS_HASWELL
-                          .CrossThreadConstantDataReadLength =
-                             cs_prog_data->push.cross_thread.regs,
+                            .CrossThreadConstantDataReadLength =
+                               cs_prog_data->push.cross_thread.regs,
 #endif
-                          .BarrierEnable = cs_prog_data->uses_barrier,
-                          .SharedLocalMemorySize = slm_size,
-                          .NumberofThreadsinGPGPUThreadGroup =
-                             cs_prog_data->threads);
+                            .BarrierEnable = cs_prog_data->uses_barrier,
+                            .SharedLocalMemorySize = slm_size,
+                            .NumberofThreadsinGPGPUThreadGroup =
+                               cs_prog_data->threads);
 
    uint32_t size = GENX(INTERFACE_DESCRIPTOR_DATA_length) * sizeof(uint32_t);
    anv_batch_emit(&cmd_buffer->batch,
-- 
2.7.4
[Mesa-dev] [PATCH] winsys/amdgpu: remove AMDGPU_INFO_NUM_EVICTIONS
This is now exposed with libdrm_amdgpu 2.4.76.

Signed-off-by: Samuel Pitoiset
---
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index 37e0140311..39a05d0f02 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -59,10 +59,6 @@
 #define CIK__PIPE_CONFIG__ADDR_SURF_P16_32X32_8X16   16
 #define CIK__PIPE_CONFIG__ADDR_SURF_P16_32X32_16X16  17
 
-#ifndef AMDGPU_INFO_NUM_EVICTIONS
-#define AMDGPU_INFO_NUM_EVICTIONS   0x18
-#endif
-
 static struct util_hash_table *dev_tab = NULL;
 static mtx_t dev_tab_mutex = _MTX_INITIALIZER_NP;
 
-- 
2.12.1
Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory
Looking over the patch, I think I've convinced myself that it's correct. (I honestly wasn't expecting to come to that conclusion without more iteration.) That said, this raises some interesting questions. I added Kristian to the Cc in case he has any input. 1. Should we do powers of two or linear. I'm still a fan of powers of two. 2. Should block pools even have a block size at all? We could just make every block pool allow any power-of-two size from 4 KiB up to. say, 1 MiB and then make the block size part of the state pool or stream that's allocating from it. At the moment, I like this idea, but I've given it very little thought. 3. If we go with the idea in 2. should we still call it block_pool? I think we can keep the name but it doesn't it as well as it once did. Thanks for working on this! I'm sorry it's taken so long to respond. Every time I've looked at it, my brain hasn't been in the right state to think about lock-free code. :-/ On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romerowrote: > Current Anv allocator assign memory in terms of a fixed block size. > > But there can be cases where this block is not enough for a memory > request, and thus several blocks must be assigned in a row. > > This commit adds support for specifying how many blocks of memory must > be assigned. > > This fixes a number dEQP-VK.pipeline.render_to_image.* tests that crash. 
>
> v2: lock-free free-list is not handled correctly (Jason)
> ---
>  src/intel/vulkan/anv_allocator.c | 81 +++---
>
>  src/intel/vulkan/anv_batch_chain.c | 4 +-
>  src/intel/vulkan/anv_private.h | 7 +++-
>  3 files changed, 66 insertions(+), 26 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_
> allocator.c
> index 45c663b..3924551 100644
> --- a/src/intel/vulkan/anv_allocator.c
> +++ b/src/intel/vulkan/anv_allocator.c
> @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool,
>     pool->device = device;
>     anv_bo_init(&pool->bo, 0, 0);
>     pool->block_size = block_size;
> -   pool->free_list = ANV_FREE_LIST_EMPTY;
> +   for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++)
> +      pool->free_list[i] = ANV_FREE_LIST_EMPTY;
>     pool->back_free_list = ANV_FREE_LIST_EMPTY;
>
>     pool->fd = memfd_create("block pool", MFD_CLOEXEC);
> @@ -500,30 +501,35 @@ fail:
>
>  static uint32_t
>  anv_block_pool_alloc_new(struct anv_block_pool *pool,
> -                         struct anv_block_state *pool_state)
> +                         struct anv_block_state *pool_state,
> +                         uint32_t n_blocks)
>

Maybe have this take a size rather than n_blocks? It's only ever called by stuff in the block pool so the caller can do the multiplication. It would certainly make some of the math below easier.

>  {
>     struct anv_block_state state, old, new;
>
>     while (1) {
> -      state.u64 = __sync_fetch_and_add(&pool_state->u64,
> pool->block_size);
> -      if (state.next < state.end) {
> +      state.u64 = __sync_fetch_and_add(&pool_state->u64, n_blocks *
> pool->block_size);
> +      if (state.next > state.end) {
> +         futex_wait(&pool_state->end, state.end);
> +         continue;
> +      } else if ((state.next + (n_blocks - 1) * pool->block_size) <
> state.end) {
>

First off, please keep the if's in the same order unless we have a reason to re-arrange them. It would make this way easier to review. :-)

Second, I think this would be much easier to read as:

   if (state.next + size <= state.end) {
      /* Success */
   } else if (state.next <= state.end) {
      /* Our block is the one that crosses the line */
   } else {
      /* Wait like everyone else */
   }

>           assert(pool->map);
>           return state.next;
> -      } else if (state.next == state.end) {
> -         /* We allocated the first block outside the pool, we have to
> grow it.
> -          * pool_state->next acts a mutex: threads who try to allocate
> now will
> -          * get block indexes above the current limit and hit futex_wait
> -          * below. */
> -         new.next = state.next + pool->block_size;
> +      } else {
> +         /* We allocated the firsts blocks outside the pool, we have to
> grow
> +          * it. pool_state->next acts a mutex: threads who try to allocate
> +          * now will get block indexes above the current limit and hit
> +          * futex_wait below.
> +          */
> +         new.next = state.next + n_blocks * pool->block_size;
>           new.end = anv_block_pool_grow(pool, pool_state);
> +         /* We assume that just growing once the pool is enough to fulfil
> the
> +          * memory requirements
> +          */
>

I think this is probably a reasonable assumption. That said, it wouldn't hurt to add a size parameter to block_pool_grow but I don't know that it's needed.

>           assert(new.end >= new.next && new.end % pool->block_size == 0);
>           old.u64 = __sync_lock_test_and_set(&pool_state->u64, new.u64);
>
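
To make the suggested three-way branch concrete, here is a minimal standalone sketch. It is not the anv code: pack_state(), try_alloc() and the alloc_outcome enum are hypothetical names made up for illustration, the state is packed by hand into a uint64_t instead of using anv_block_state, and the grow and futex_wait steps are reduced to return values. It only shows how one fetch-and-add of a byte size sorts callers into "fits", "crosses the end" and "wait".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packing for this sketch: low 32 bits hold `next`, high 32
 * bits hold `end`. This mirrors the idea behind anv_block_state, not its
 * exact layout. */
static uint64_t pack_state(uint32_t next, uint32_t end)
{
   return (uint64_t)next | ((uint64_t)end << 32);
}

enum alloc_outcome {
   ALLOC_SUCCESS,   /* the whole request fits below end */
   ALLOC_MUST_GROW, /* this request is the one that crosses end */
   ALLOC_MUST_WAIT, /* an earlier request already crossed end */
};

/* Reserve `size` bytes and classify the result. The sketch assumes `next`
 * never overflows 32 bits; real code has to guarantee that, because a
 * carry would corrupt `end`. */
static enum alloc_outcome
try_alloc(uint64_t *pool_state, uint32_t size, uint32_t *offset)
{
   assert(size > 0);

   /* Atomically bump `next`; the old value tells us where we landed. */
   uint64_t old = __sync_fetch_and_add(pool_state, (uint64_t)size);
   uint32_t next = (uint32_t)old;
   uint32_t end = (uint32_t)(old >> 32);

   *offset = next;
   if ((uint64_t)next + size <= end)
      return ALLOC_SUCCESS;
   else if (next <= end)
      return ALLOC_MUST_GROW;
   else
      return ALLOC_MUST_WAIT;
}
```

With an 8 KiB pool and 4 KiB requests, the first two calls succeed at offsets 0 and 4096, the third is the one that must grow the pool, and the fourth has to wait. Since exactly one caller sees next <= end while its request crosses end, only one thread ever grows the pool, which is what lets `next` act as the mutex.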
[Mesa-dev] [PATCH] configure.ac: require libdrm_amdgpu 2.4.76 for Vega
From: Marek Olšák --- configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index ab9a91e..70885fb 100644 --- a/configure.ac +++ b/configure.ac @@ -67,21 +67,21 @@ OPENCL_VERSION=1 AC_SUBST([OPENCL_VERSION]) # The idea is that libdrm is distributed as one cohesive package, even # though it is composed of multiple libraries. However some drivers # may have different version requirements than others. This list # codifies which drivers need which version of libdrm. Any libdrm # version dependencies in non-driver-specific code should be reflected # in the first entry. LIBDRM_REQUIRED=2.4.75 LIBDRM_RADEON_REQUIRED=2.4.71 -LIBDRM_AMDGPU_REQUIRED=2.4.63 +LIBDRM_AMDGPU_REQUIRED=2.4.76 LIBDRM_INTEL_REQUIRED=2.4.75 LIBDRM_NVVIEUX_REQUIRED=2.4.66 LIBDRM_NOUVEAU_REQUIRED=2.4.66 LIBDRM_FREEDRENO_REQUIRED=2.4.74 LIBDRM_VC4_REQUIRED=2.4.69 LIBDRM_ETNAVIV_REQUIRED=2.4.74 dnl Versions for external dependencies DRI2PROTO_REQUIRED=2.8 DRI3PROTO_REQUIRED=1.0 -- 2.7.4
Re: [Mesa-dev] [PATCH] gallium: remove support for predicates from TGSI
[resend with snipped bits as it's too big]

A couple comments inline.

[snip]

> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -746,39 +746,30 @@ static void lp_exec_default(struct lp_exec_mask *mask,
>  }
>
>
>  /* stores val into an address pointed to by dst_ptr.
>   * mask->exec_mask is used to figure out which bits of val
>   * should be stored into the address
>   * (0 means don't store this bit, 1 means do store).
>   */
>  static void lp_exec_mask_store(struct lp_exec_mask *mask,
>                                 struct lp_build_context *bld_store,
> -                               LLVMValueRef pred,
>                                 LLVMValueRef val,
>                                 LLVMValueRef dst_ptr)
>  {
>     LLVMBuilderRef builder = mask->bld->gallivm->builder;
> +   LLVMValueRef pred = mask->has_mask ? mask->exec_mask : NULL;

Calling this "pred" now seems to be somewhat of a misnomer (wasn't all that great before because it then included exec_mask but it's worse now).

>
>     assert(lp_check_value(bld_store->type, val));
>     assert(LLVMGetTypeKind(LLVMTypeOf(dst_ptr)) == LLVMPointerTypeKind);
>     assert(LLVMGetElementType(LLVMTypeOf(dst_ptr)) == LLVMTypeOf(val));
>
> -   /* Mix the predicate and execution mask */
> -   if (mask->has_mask) {
> -      if (pred) {
> -         pred = LLVMBuildAnd(builder, pred, mask->exec_mask, "");
> -      } else {
> -         pred = mask->exec_mask;
> -      }
> -   }
> -
>     if (pred) {
>        LLVMValueRef res, dst;
>
>        dst = LLVMBuildLoad(builder, dst_ptr, "");
>        res = lp_build_select(bld_store, pred, val, dst);
>        LLVMBuildStore(builder, res, dst_ptr);
>     } else
>        LLVMBuildStore(builder, val, dst_ptr);
>  }
>
> @@ -1029,36 +1020,26 @@ build_gather(struct lp_build_tgsi_context *bld_base,
>
>
>  /**
>   * Scatter/store vector.
>   */
>  static void
>  emit_mask_scatter(struct lp_build_tgsi_soa_context *bld,
>                    LLVMValueRef base_ptr,
>                    LLVMValueRef indexes,
>                    LLVMValueRef values,
> -                  struct lp_exec_mask *mask,
> -                  LLVMValueRef pred)
> +                  struct lp_exec_mask *mask)
>  {
>     struct gallivm_state *gallivm = bld->bld_base.base.gallivm;
>     LLVMBuilderRef builder = gallivm->builder;
>     unsigned i;
> -
> -   /* Mix the predicate and execution mask */
> -   if (mask->has_mask) {
> -      if (pred) {
> -         pred = LLVMBuildAnd(builder, pred, mask->exec_mask, "");
> -      }
> -      else {
> -         pred = mask->exec_mask;
> -      }
> -   }
> +   LLVMValueRef pred = mask->has_mask ? mask->exec_mask : NULL;

same here.

> diff --git a/src/gallium/include/pipe/p_shader_tokens.h
> b/src/gallium/include/pipe/p_shader_tokens.h
> index 6a3fb98..87d2d92 100644
> --- a/src/gallium/include/pipe/p_shader_tokens.h
> +++ b/src/gallium/include/pipe/p_shader_tokens.h
> @@ -62,21 +62,20 @@ struct tgsi_token
>
>  enum tgsi_file_type {
>     TGSI_FILE_NULL,
>     TGSI_FILE_CONSTANT,
>     TGSI_FILE_INPUT,
>     TGSI_FILE_OUTPUT,
>     TGSI_FILE_TEMPORARY,
>     TGSI_FILE_SAMPLER,
>     TGSI_FILE_ADDRESS,
>     TGSI_FILE_IMMEDIATE,
> -   TGSI_FILE_PREDICATE,
>     TGSI_FILE_SYSTEM_VALUE,
>     TGSI_FILE_IMAGE,
>     TGSI_FILE_SAMPLER_VIEW,
>     TGSI_FILE_BUFFER,
>     TGSI_FILE_MEMORY,
>     TGSI_FILE_COUNT, /**< how many TGSI_FILE_ types */
>  };
>
>
>  #define TGSI_WRITEMASK_NONE 0x00
> @@ -609,34 +608,31 @@ struct tgsi_property_data {
>
>  /**
>   * Opcode is the operation code to execute. A given operation defines the
>   * semantics how the source registers (if any) are interpreted and what is
>   * written to the destination registers (if any) as a result of execution.
>   *
>   * NumDstRegs and NumSrcRegs is the number of destination and source
> registers,
>   * respectively. For a given operation code, those numbers are fixed and are
>   * present here only for convenience.
>   *
> - * If Predicate is TRUE, tgsi_instruction_predicate token immediately
> follows.
> - *
>   * Saturate controls how are final results in destination registers modified.
>   */
>
>  struct tgsi_instruction
>  {
>     unsigned Type : 4; /* TGSI_TOKEN_TYPE_INSTRUCTION */
>     unsigned NrTokens : 8; /* UINT */
>     unsigned Opcode : 8; /* TGSI_OPCODE_ */
>     unsigned Saturate : 1; /* BOOL */
>     unsigned NumDstRegs : 2; /* UINT */
>     unsigned NumSrcRegs : 4; /* UINT */
> -   unsigned Predicate : 1; /* BOOL */
>     unsigned Label : 1;
>     unsigned Texture: 1;
>     unsigned Memory : 1;
>     unsigned Padding: 1;

The Padding doesn't match. So, we still have code which uses this - however this code is only used for some testing, otherwise we translate this d3d9 stuff away like everybody else. Maybe it's time to ditch this stuff then - clearly no other drivers are ever going to support
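
For reference, the Padding remark is plain arithmetic on the field widths quoted above. A small sketch, assuming the instruction token is meant to stay exactly 32 bits wide; the enum and tgsi_instruction_bits() are hypothetical helpers written for this note, not part of p_shader_tokens.h:

```c
#include <assert.h>

/* Field widths of struct tgsi_instruction as quoted in the patch, with the
 * 1-bit Predicate field already removed and Padding left out. */
enum {
   TYPE_BITS = 4,
   NR_TOKENS_BITS = 8,
   OPCODE_BITS = 8,
   SATURATE_BITS = 1,
   NUM_DST_REGS_BITS = 2,
   NUM_SRC_REGS_BITS = 4,
   LABEL_BITS = 1,
   TEXTURE_BITS = 1,
   MEMORY_BITS = 1,
   BITS_WITHOUT_PADDING = TYPE_BITS + NR_TOKENS_BITS + OPCODE_BITS +
                          SATURATE_BITS + NUM_DST_REGS_BITS +
                          NUM_SRC_REGS_BITS + LABEL_BITS + TEXTURE_BITS +
                          MEMORY_BITS,
};

/* Hypothetical helper: total token width for a given Padding width. */
static unsigned tgsi_instruction_bits(unsigned padding_bits)
{
   return BITS_WITHOUT_PADDING + padding_bits;
}

/* Under the 32-bit assumption, Padding has to grow to 2 bits once
 * Predicate is gone. */
static_assert(BITS_WITHOUT_PADDING + 2 == 32,
              "fields plus 2 bits of padding should fill one 32-bit token");
```

Keeping Padding at 1 bit leaves the fields at 31 bits in total, so widening it by the bit that Predicate used to occupy keeps the declared widths in sync with the token size.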
[Mesa-dev] [PATCH 6/9] radeonsi: handle incompatible DCC formats in resource_copy_region
From: Marek Olšák Required because of later commits. --- src/gallium/drivers/radeonsi/si_blit.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers/radeonsi/si_blit.c index bc5c2d6..ded8beb 100644 --- a/src/gallium/drivers/radeonsi/si_blit.c +++ b/src/gallium/drivers/radeonsi/si_blit.c @@ -934,20 +934,25 @@ void si_resource_copy_region(struct pipe_context *ctx, src_templ.format = PIPE_FORMAT_R32G32B32A32_UINT; break; default: fprintf(stderr, "Unhandled format %s with blocksize %u\n", util_format_short_name(src->format), blocksize); assert(0); } } } + vi_dcc_disable_if_incompatible_format(&sctx->b, dst, dst_level, + dst_templ.format); + vi_dcc_disable_if_incompatible_format(&sctx->b, src, src_level, + src_templ.format); + /* Initialize the surface. */ dst_view = r600_create_surface_custom(ctx, dst, &dst_templ, dst_width, dst_height); /* Initialize the sampler view. */ src_view = si_create_sampler_view_custom(ctx, src, &src_templ, src_width0, src_height0, src_force_level); u_box_3d(dstx, dsty, dstz, abs(src_box->width), abs(src_box->height), -- 2.7.4
[Mesa-dev] [PATCH 7/9] gallium/radeon: s/dcc_disable/disable_dcc/
From: Marek Olšák--- src/gallium/drivers/radeon/r600_pipe_common.h | 2 +- src/gallium/drivers/radeon/r600_texture.c | 4 ++-- src/gallium/drivers/radeonsi/si_blit.c| 10 +- src/gallium/drivers/radeonsi/si_state.c | 2 +- 4 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index 53fce50..035ab1c 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -784,21 +784,21 @@ void r600_texture_get_cmask_info(struct r600_common_screen *rscreen, struct r600_texture *rtex, struct r600_cmask_info *out); bool r600_init_flushed_depth_texture(struct pipe_context *ctx, struct pipe_resource *texture, struct r600_texture **staging); void r600_print_texture_info(struct r600_texture *rtex, FILE *f); struct pipe_resource *r600_texture_create(struct pipe_screen *screen, const struct pipe_resource *templ); bool vi_dcc_formats_compatible(enum pipe_format format1, enum pipe_format format2); -void vi_dcc_disable_if_incompatible_format(struct r600_common_context *rctx, +void vi_disable_dcc_if_incompatible_format(struct r600_common_context *rctx, struct pipe_resource *tex, unsigned level, enum pipe_format view_format); struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe, struct pipe_resource *texture, const struct pipe_surface *templ, unsigned width, unsigned height); unsigned r600_translate_colorswap(enum pipe_format format, bool do_endian_swap); void vi_separate_dcc_start_query(struct pipe_context *ctx, struct r600_texture *tex); diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c index 94024c8..783f50c 100644 --- a/src/gallium/drivers/radeon/r600_texture.c +++ b/src/gallium/drivers/radeon/r600_texture.c @@ -1733,21 +1733,21 @@ bool vi_dcc_formats_compatible(enum pipe_format format1, return false; type1 = vi_get_dcc_channel_type(desc1); type2 = 
vi_get_dcc_channel_type(desc2); return type1 != dcc_channel_incompatible && type2 != dcc_channel_incompatible && type1 == type2; } -void vi_dcc_disable_if_incompatible_format(struct r600_common_context *rctx, +void vi_disable_dcc_if_incompatible_format(struct r600_common_context *rctx, struct pipe_resource *tex, unsigned level, enum pipe_format view_format) { struct r600_texture *rtex = (struct r600_texture *)tex; if (vi_dcc_enabled(rtex, level) && !vi_dcc_formats_compatible(tex->format, view_format)) if (!r600_texture_disable_dcc(rctx, (struct r600_texture*)tex)) rctx->decompress_dcc(>b, rtex); @@ -1769,21 +1769,21 @@ struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe, pipe_reference_init(>base.reference, 1); pipe_resource_reference(>base.texture, texture); surface->base.context = pipe; surface->base.format = templ->format; surface->base.width = width; surface->base.height = height; surface->base.u = templ->u; if (texture->target != PIPE_BUFFER) - vi_dcc_disable_if_incompatible_format(rctx, texture, + vi_disable_dcc_if_incompatible_format(rctx, texture, templ->u.tex.level, templ->format); return >base; } static struct pipe_surface *r600_create_surface(struct pipe_context *pipe, struct pipe_resource *tex, const struct pipe_surface *templ) { diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers/radeonsi/si_blit.c index ded8beb..06a28f4 100644 --- a/src/gallium/drivers/radeonsi/si_blit.c +++ b/src/gallium/drivers/radeonsi/si_blit.c @@ -934,23 +934,23 @@ void si_resource_copy_region(struct pipe_context *ctx, src_templ.format = PIPE_FORMAT_R32G32B32A32_UINT; break; default: fprintf(stderr, "Unhandled format %s with blocksize %u\n",
[Mesa-dev] [PATCH 8/9] radeonsi: decompress DCC in set_framebuffer_state instead of create_surface
From: Marek Olšákfor threaded gallium, which can't use pipe_context in create_surface --- src/gallium/drivers/radeon/r600_pipe_common.h | 8 +++ src/gallium/drivers/radeon/r600_texture.c | 33 +++ src/gallium/drivers/radeonsi/si_state.c | 26 + 3 files changed, 62 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index 035ab1c..c9cb586 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -276,20 +276,21 @@ struct r600_surface { struct pipe_surface base; bool color_initialized; bool depth_initialized; /* Misc. color flags. */ bool alphatest_bypass; bool export_16bpc; bool color_is_int8; bool color_is_int10; + bool dcc_incompatible; /* Color registers. */ unsigned cb_color_info; unsigned cb_color_base; unsigned cb_color_view; unsigned cb_color_size; /* R600 only */ unsigned cb_color_dim; /* EG only */ unsigned cb_color_pitch;/* EG and later */ unsigned cb_color_slice;/* EG and later */ unsigned cb_color_attrib; /* EG and later */ @@ -784,20 +785,27 @@ void r600_texture_get_cmask_info(struct r600_common_screen *rscreen, struct r600_texture *rtex, struct r600_cmask_info *out); bool r600_init_flushed_depth_texture(struct pipe_context *ctx, struct pipe_resource *texture, struct r600_texture **staging); void r600_print_texture_info(struct r600_texture *rtex, FILE *f); struct pipe_resource *r600_texture_create(struct pipe_screen *screen, const struct pipe_resource *templ); bool vi_dcc_formats_compatible(enum pipe_format format1, enum pipe_format format2); +bool vi_dcc_formats_are_incompatible(struct pipe_resource *tex, +unsigned level, +enum pipe_format view_format); +void vi_disable_dcc_if_incompatible_flag(struct r600_common_context *rctx, +struct pipe_resource *tex, +unsigned level, +bool dcc_incompatible); void vi_disable_dcc_if_incompatible_format(struct r600_common_context *rctx, struct pipe_resource *tex, unsigned level, 
enum pipe_format view_format); struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe, struct pipe_resource *texture, const struct pipe_surface *templ, unsigned width, unsigned height); unsigned r600_translate_colorswap(enum pipe_format format, bool do_endian_swap); void vi_separate_dcc_start_query(struct pipe_context *ctx, diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c index 783f50c..1191a74 100644 --- a/src/gallium/drivers/radeon/r600_texture.c +++ b/src/gallium/drivers/radeon/r600_texture.c @@ -1733,59 +1733,82 @@ bool vi_dcc_formats_compatible(enum pipe_format format1, return false; type1 = vi_get_dcc_channel_type(desc1); type2 = vi_get_dcc_channel_type(desc2); return type1 != dcc_channel_incompatible && type2 != dcc_channel_incompatible && type1 == type2; } +bool vi_dcc_formats_are_incompatible(struct pipe_resource *tex, +unsigned level, +enum pipe_format view_format) +{ + struct r600_texture *rtex = (struct r600_texture *)tex; + + return vi_dcc_enabled(rtex, level) && + !vi_dcc_formats_compatible(tex->format, view_format); +} + +void vi_disable_dcc_if_incompatible_flag(struct r600_common_context *rctx, +struct pipe_resource *tex, +unsigned level, +bool dcc_incompatible) +{ + struct r600_texture *rtex = (struct r600_texture *)tex; + + if (vi_dcc_enabled(rtex, level) && dcc_incompatible) + if (!r600_texture_disable_dcc(rctx, (struct r600_texture*)tex)) + rctx->decompress_dcc(>b, rtex); +} + +/* This can't be merged with the above function, because + * vi_dcc_formats_compatible should be called only when DCC is enabled. */ void vi_disable_dcc_if_incompatible_format(struct
[Mesa-dev] [PATCH 9/9] radeonsi: decompress DCC in set_sampler_view instead of create_sampler_view
From: Marek Olšák--- src/gallium/drivers/radeonsi/si_descriptors.c | 14 +++--- src/gallium/drivers/radeonsi/si_pipe.h| 1 + src/gallium/drivers/radeonsi/si_state.c | 7 --- 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index 8010e59..9b1d1f4 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -422,47 +422,55 @@ static void si_set_sampler_view(struct si_context *sctx, struct si_sampler_views *views = >samplers[shader].views; struct si_sampler_view *rview = (struct si_sampler_view*)view; struct si_descriptors *descs = si_sampler_descriptors(sctx, shader); uint32_t *desc = descs->list + slot * 16; if (views->views[slot] == view && !disallow_early_out) return; if (view) { struct r600_texture *rtex = (struct r600_texture *)view->texture; + bool is_buffer = rtex->resource.b.b.target == PIPE_BUFFER; + + if (unlikely(!is_buffer && rview->dcc_incompatible)) { + vi_disable_dcc_if_incompatible_flag(>b, + >resource.b.b, + view->u.tex.first_level, + rview->dcc_incompatible); + rview->dcc_incompatible = false; + } assert(rtex); /* views with texture == NULL aren't supported */ pipe_sampler_view_reference(>views[slot], view); memcpy(desc, rview->state, 8*4); - if (rtex->resource.b.b.target == PIPE_BUFFER) { + if (is_buffer) { rtex->resource.bind_history |= PIPE_BIND_SAMPLER_VIEW; si_set_buf_desc_address(>resource, view->u.buf.offset, desc + 4); } else { bool is_separate_stencil = rtex->db_compatible && rview->is_stencil_sampler; si_set_mutable_tex_desc_fields(rtex, rview->base_level_info, rview->base_level, rview->base.u.tex.first_level, rview->block_width, is_separate_stencil, desc); } - if (rtex->resource.b.b.target != PIPE_BUFFER && - rtex->fmask.size) { + if (!is_buffer && rtex->fmask.size) { memcpy(desc + 8, rview->fmask_state, 8*4); } else { /* Disable FMASK and bind sampler state in [12:15]. 
*/ memcpy(desc + 8, null_texture_descriptor, 4*4); if (views->sampler_states[slot]) memcpy(desc + 12, views->sampler_states[slot]->val, 4*4); diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h index 617ec20..d1a8393 100644 --- a/src/gallium/drivers/radeonsi/si_pipe.h +++ b/src/gallium/drivers/radeonsi/si_pipe.h @@ -120,20 +120,21 @@ struct si_blend_color { struct si_sampler_view { struct pipe_sampler_viewbase; /* [0..7] = image descriptor * [4..7] = buffer descriptor */ uint32_tstate[8]; uint32_tfmask_state[8]; const struct radeon_surf_level *base_level_info; unsignedbase_level; unsignedblock_width; bool is_stencil_sampler; + bool dcc_incompatible; }; #define SI_SAMPLER_STATE_MAGIC 0x34f1c35a struct si_sampler_state { #ifdef DEBUG unsignedmagic; #endif uint32_tval[4]; }; diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 39b9152..23b6473 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -3185,23 +3185,24 @@ si_create_sampler_view_custom(struct pipe_context *ctx, case PIPE_FORMAT_X24S8_UINT: case PIPE_FORMAT_S8X24_UINT: case PIPE_FORMAT_X32_S8X24_UINT: pipe_format = PIPE_FORMAT_S8_UINT;
[Mesa-dev] [PATCH 3/9] gallium/radeon: formalize that r600_query_hw_add_result doesn't need a context
From: Marek Olšák--- src/gallium/drivers/radeon/r600_perfcounter.c | 2 +- src/gallium/drivers/radeon/r600_query.c | 13 +++-- src/gallium/drivers/radeon/r600_query.h | 2 +- 3 files changed, 9 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_perfcounter.c b/src/gallium/drivers/radeon/r600_perfcounter.c index bf24aab..48f609b 100644 --- a/src/gallium/drivers/radeon/r600_perfcounter.c +++ b/src/gallium/drivers/radeon/r600_perfcounter.c @@ -189,21 +189,21 @@ static void r600_pc_query_emit_stop(struct r600_common_context *ctx, } static void r600_pc_query_clear_result(struct r600_query_hw *hwquery, union pipe_query_result *result) { struct r600_query_pc *query = (struct r600_query_pc *)hwquery; memset(result, 0, sizeof(result->batch[0]) * query->num_counters); } -static void r600_pc_query_add_result(struct r600_common_context *ctx, +static void r600_pc_query_add_result(struct r600_common_screen *rscreen, struct r600_query_hw *hwquery, void *buffer, union pipe_query_result *result) { struct r600_query_pc *query = (struct r600_query_pc *)hwquery; uint64_t *results = buffer; unsigned i, j; for (i = 0; i < query->num_counters; ++i) { struct r600_pc_counter *counter = >counters[i]; diff --git a/src/gallium/drivers/radeon/r600_query.c b/src/gallium/drivers/radeon/r600_query.c index e269c39..b4e36c8 100644 --- a/src/gallium/drivers/radeon/r600_query.c +++ b/src/gallium/drivers/radeon/r600_query.c @@ -508,21 +508,21 @@ static struct r600_query_ops query_hw_ops = { }; static void r600_query_hw_do_emit_start(struct r600_common_context *ctx, struct r600_query_hw *query, struct r600_resource *buffer, uint64_t va); static void r600_query_hw_do_emit_stop(struct r600_common_context *ctx, struct r600_query_hw *query, struct r600_resource *buffer, uint64_t va); -static void r600_query_hw_add_result(struct r600_common_context *ctx, +static void r600_query_hw_add_result(struct r600_common_screen *rscreen, struct r600_query_hw *, void *buffer, union 
pipe_query_result *result); static void r600_query_hw_clear_result(struct r600_query_hw *, union pipe_query_result *); static struct r600_query_hw_ops query_hw_default_hw_ops = { .prepare_buffer = r600_query_hw_prepare_buffer, .emit_start = r600_query_hw_do_emit_start, .emit_stop = r600_query_hw_do_emit_stop, .clear_result = r600_query_hw_clear_result, @@ -1030,26 +1030,26 @@ static unsigned r600_query_read_result(void *map, unsigned start_index, unsigned end = (uint64_t)current_result[end_index] | (uint64_t)current_result[end_index+1] << 32; if (!test_status_bit || ((start & 0x8000UL) && (end & 0x8000UL))) { return end - start; } return 0; } -static void r600_query_hw_add_result(struct r600_common_context *ctx, +static void r600_query_hw_add_result(struct r600_common_screen *rscreen, struct r600_query_hw *query, void *buffer, union pipe_query_result *result) { - unsigned max_rbs = ctx->screen->info.num_render_backends; + unsigned max_rbs = rscreen->info.num_render_backends; switch (query->b.type) { case PIPE_QUERY_OCCLUSION_COUNTER: { for (unsigned i = 0; i < max_rbs; ++i) { unsigned results_base = i * 16; result->u64 += r600_query_read_result(buffer + results_base, 0, 2, true); } break; } @@ -1085,21 +1085,21 @@ static void r600_query_hw_add_result(struct r600_common_context *ctx, r600_query_read_result(buffer, 2, 6, true); result->so_statistics.primitives_storage_needed += r600_query_read_result(buffer, 0, 4, true); break; case PIPE_QUERY_SO_OVERFLOW_PREDICATE: result->b = result->b || r600_query_read_result(buffer, 2, 6, true) != r600_query_read_result(buffer, 0, 4, true); break; case PIPE_QUERY_PIPELINE_STATISTICS: - if (ctx->chip_class >= EVERGREEN) { + if (rscreen->chip_class >= EVERGREEN) {
[Mesa-dev] [PATCH 0/9] RadeonSI cleanups
General cleanups and cleanups in preparation for threaded gallium. Please review. Thanks, Marek
[Mesa-dev] [PATCH 5/9] radeonsi: remove a workaround for inexact *8_SNORM blits
From: Marek Olšák All tests pass on Fiji now. This prevents DCC disablement due to incompatible DCC formats due to the fallback. --- src/gallium/drivers/radeonsi/si_blit.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers/radeonsi/si_blit.c index 0466f19..bc5c2d6 100644 --- a/src/gallium/drivers/radeonsi/si_blit.c +++ b/src/gallium/drivers/radeonsi/si_blit.c @@ -888,23 +888,21 @@ void si_resource_copy_region(struct pipe_context *ctx, sbox.x = util_format_get_nblocksx(src->format, src_box->x); sbox.y = util_format_get_nblocksy(src->format, src_box->y); sbox.z = src_box->z; sbox.width = util_format_get_nblocksx(src->format, src_box->width); sbox.height = util_format_get_nblocksy(src->format, src_box->height); sbox.depth = src_box->depth; src_box = &sbox; src_force_level = src_level; - } else if (!util_blitter_is_copy_supported(sctx->blitter, dst, src) || - /* also *8_SNORM has precision issues, use UNORM instead */ - util_format_is_snorm8(src->format)) { + } else if (!util_blitter_is_copy_supported(sctx->blitter, dst, src)) { if (util_format_is_subsampled_422(src->format)) { src_templ.format = PIPE_FORMAT_R8G8B8A8_UINT; dst_templ.format = PIPE_FORMAT_R8G8B8A8_UINT; dst_width = util_format_get_nblocksx(dst->format, dst_width); src_width0 = util_format_get_nblocksx(src->format, src_width0); dstx = util_format_get_nblocksx(dst->format, dstx); sbox = *src_box; -- 2.7.4
[Mesa-dev] [PATCH 4/9] gallium/radeon: add and use a new helper vi_dcc_enabled
From: Marek Olšák--- src/gallium/drivers/radeon/r600_pipe_common.h | 6 ++ src/gallium/drivers/radeon/r600_texture.c | 11 +-- src/gallium/drivers/radeonsi/si_blit.c| 6 ++ src/gallium/drivers/radeonsi/si_descriptors.c | 5 ++--- src/gallium/drivers/radeonsi/si_state.c | 2 +- 5 files changed, 16 insertions(+), 14 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index 3516884..53fce50 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -941,20 +941,26 @@ r600_get_sampler_view_priority(struct r600_resource *res) return RADEON_PRIO_SAMPLER_TEXTURE; } static inline bool r600_can_sample_zs(struct r600_texture *tex, bool stencil_sampler) { return (stencil_sampler && tex->can_sample_s) || (!stencil_sampler && tex->can_sample_z); } +static inline bool +vi_dcc_enabled(struct r600_texture *tex, unsigned level) +{ + return tex->dcc_offset && level < tex->surface.num_dcc_levels; +} + #define COMPUTE_DBG(rscreen, fmt, args...) \ do { \ if ((rscreen->b.debug_flags & DBG_COMPUTE)) fprintf(stderr, fmt, ##args); \ } while (0); #define R600_ERR(fmt, args...) \ fprintf(stderr, "EE %s:%d %s - " fmt, __FILE__, __LINE__, __func__, ##args) /* For MSAA sample positions. */ #define FILL_SREG(s0x, s0y, s1x, s1y, s2x, s2y, s3x, s3y) \ diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c index ec7a325..94024c8 100644 --- a/src/gallium/drivers/radeon/r600_texture.c +++ b/src/gallium/drivers/radeon/r600_texture.c @@ -65,22 +65,22 @@ bool r600_prepare_for_dma_blit(struct r600_common_context *rctx, * When dst is linear, the DB->CB copy preserves HTILE. * When dst is tiled, the 3D path must be used to update HTILE. */ if (rsrc->is_depth || rdst->is_depth) return false; /* DCC as: * src: Use the 3D path. DCC decompression is expensive. * dst: Use the 3D path to compress the pixels with DCC. 
*/ - if ((rsrc->dcc_offset && src_level < rsrc->surface.num_dcc_levels) || - (rdst->dcc_offset && dst_level < rdst->surface.num_dcc_levels)) + if (vi_dcc_enabled(rsrc, src_level) || + vi_dcc_enabled(rdst, dst_level)) return false; /* CMASK as: * src: Both texture and SDMA paths need decompression. Use SDMA. * dst: If overwriting the whole texture, discard CMASK and use *SDMA. Otherwise, use the 3D path. */ if (rdst->cmask.size && rdst->dirty_level_mask & (1 << dst_level)) { /* The CMASK clear is only enabled for the first level. */ assert(dst_level == 0); @@ -1740,22 +1740,21 @@ bool vi_dcc_formats_compatible(enum pipe_format format1, type1 == type2; } void vi_dcc_disable_if_incompatible_format(struct r600_common_context *rctx, struct pipe_resource *tex, unsigned level, enum pipe_format view_format) { struct r600_texture *rtex = (struct r600_texture *)tex; - if (rtex->dcc_offset && - level < rtex->surface.num_dcc_levels && + if (vi_dcc_enabled(rtex, level) && !vi_dcc_formats_compatible(tex->format, view_format)) if (!r600_texture_disable_dcc(rctx, (struct r600_texture*)tex)) rctx->decompress_dcc(>b, rtex); } struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe, struct pipe_resource *texture, const struct pipe_surface *templ, unsigned width, unsigned height) { @@ -2307,21 +2306,21 @@ static bool vi_get_fast_clear_parameters(enum pipe_format surface_format, return true; } void vi_dcc_clear_level(struct r600_common_context *rctx, struct r600_texture *rtex, unsigned level, unsigned clear_value) { struct pipe_resource *dcc_buffer; uint64_t dcc_offset; - assert(rtex->dcc_offset && level < rtex->surface.num_dcc_levels); + assert(vi_dcc_enabled(rtex, level)); if (rtex->dcc_separate_buffer) { dcc_buffer = >dcc_separate_buffer->b.b; dcc_offset = 0; } else { dcc_buffer = >resource.b.b; dcc_offset = rtex->dcc_offset; } dcc_offset += rtex->surface.level[level].dcc_offset; @@ -2478,21 +2477,21 @@ void evergreen_do_fast_color_clear(struct r600_common_context 
*rctx, /* Stoney can't do
[Mesa-dev] [PATCH 1/9] gallium/util: use const in u_index_modify helpers
From: Marek Olšák--- src/gallium/auxiliary/util/u_index_modify.c | 6 +++--- src/gallium/auxiliary/util/u_index_modify.h | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/gallium/auxiliary/util/u_index_modify.c b/src/gallium/auxiliary/util/u_index_modify.c index 7b072b2..d86be24 100644 --- a/src/gallium/auxiliary/util/u_index_modify.c +++ b/src/gallium/auxiliary/util/u_index_modify.c @@ -20,21 +20,21 @@ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE * USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include "pipe/p_context.h" #include "util/u_index_modify.h" #include "util/u_inlines.h" /* Ubyte indices. */ void util_shorten_ubyte_elts_to_userptr(struct pipe_context *context, - struct pipe_index_buffer *ib, + const struct pipe_index_buffer *ib, unsigned add_transfer_flags, int index_bias, unsigned start, unsigned count, void *out) { struct pipe_transfer *src_transfer = NULL; const unsigned char *in_map; unsigned short *out_map = out; unsigned i; @@ -55,21 +55,21 @@ void util_shorten_ubyte_elts_to_userptr(struct pipe_context *context, out_map++; } if (src_transfer) pipe_buffer_unmap(context, src_transfer); } /* Ushort indices. */ void util_rebuild_ushort_elts_to_userptr(struct pipe_context *context, -struct pipe_index_buffer *ib, +const struct pipe_index_buffer *ib, unsigned add_transfer_flags, int index_bias, unsigned start, unsigned count, void *out) { struct pipe_transfer *in_transfer = NULL; const unsigned short *in_map; unsigned short *out_map = out; unsigned i; @@ -89,21 +89,21 @@ void util_rebuild_ushort_elts_to_userptr(struct pipe_context *context, out_map++; } if (in_transfer) pipe_buffer_unmap(context, in_transfer); } /* Uint indices. 
*/ void util_rebuild_uint_elts_to_userptr(struct pipe_context *context, - struct pipe_index_buffer *ib, + const struct pipe_index_buffer *ib, unsigned add_transfer_flags, int index_bias, unsigned start, unsigned count, void *out) { struct pipe_transfer *in_transfer = NULL; const unsigned int *in_map; unsigned int *out_map = out; unsigned i; diff --git a/src/gallium/auxiliary/util/u_index_modify.h b/src/gallium/auxiliary/util/u_index_modify.h index 0cfc189..d009199 100644 --- a/src/gallium/auxiliary/util/u_index_modify.h +++ b/src/gallium/auxiliary/util/u_index_modify.h @@ -21,32 +21,32 @@ * USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef UTIL_INDEX_MODIFY_H #define UTIL_INDEX_MODIFY_H struct pipe_context; struct pipe_resource; struct pipe_index_buffer; void util_shorten_ubyte_elts_to_userptr(struct pipe_context *context, - struct pipe_index_buffer *ib, + const struct pipe_index_buffer *ib, unsigned add_transfer_flags, int index_bias, unsigned start, unsigned count, void *out); void util_rebuild_ushort_elts_to_userptr(struct pipe_context *context, -struct pipe_index_buffer *ib, +const struct pipe_index_buffer *ib, unsigned add_transfer_flags, int index_bias, unsigned start, unsigned count, void *out); void util_rebuild_uint_elts_to_userptr(struct pipe_context *context, - struct pipe_index_buffer *ib, + const struct pipe_index_buffer *ib, unsigned add_transfer_flags, int index_bias, unsigned start, unsigned count, void *out); #endif -- 2.7.4 ___ mesa-dev mailing list
[Mesa-dev] [PATCH 2/9] radeonsi: don't make a copy of pipe_index_buffer in draw_vbo
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_state_draw.c | 59 +---
 1 file changed, 27 insertions(+), 32 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 1ff1547..6882ff4 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -991,21 +991,22 @@ void si_ce_post_draw_synchronization(struct si_context *sctx)
 		radeon_emit(sctx->b.gfx.cs, 0);
 
 		sctx->ce_need_synchronization = false;
 	}
 }
 
 void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 {
 	struct si_context *sctx = (struct si_context *)ctx;
 	struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
-	struct pipe_index_buffer ib = {};
+	const struct pipe_index_buffer *ib = &sctx->index_buffer;
+	struct pipe_index_buffer ib_tmp; /* for index buffer uploads only */
 	unsigned mask, dirty_tex_counter, rast_prim;
 
 	if (likely(!info->indirect)) {
 		/* SI-CI treat instance_count==0 as instance_count==1. There is
 		 * no workaround for indirect draws, but we can at least skip
 		 * direct draws.
 		 */
 		if (unlikely(!info->instance_count))
 			return;
 
@@ -1076,78 +1077,72 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 			sctx->do_update_shaders = true;
 		}
 	}
 
 	if (sctx->do_update_shaders && !si_update_shaders(sctx))
 		return;
 
 	if (!si_upload_graphics_shader_descriptors(sctx))
 		return;
 
-	if (info->indexed) {
-		/* Initialize the index buffer struct. */
-		pipe_resource_reference(&ib.buffer, sctx->index_buffer.buffer);
-		ib.user_buffer = sctx->index_buffer.user_buffer;
-		ib.index_size = sctx->index_buffer.index_size;
-		ib.offset = sctx->index_buffer.offset;
+	ib_tmp.buffer = NULL;
 
+	if (info->indexed) {
 		/* Translate or upload, if needed. */
 		/* 8-bit indices are supported on VI. */
-		if (sctx->b.chip_class <= CIK && ib.index_size == 1) {
-			struct pipe_resource *out_buffer = NULL;
-			unsigned out_offset, start, count, start_offset, size;
+		if (sctx->b.chip_class <= CIK && ib->index_size == 1) {
+			unsigned start, count, start_offset, size;
 			void *ptr;
 
 			si_get_draw_start_count(sctx, info, &start, &count);
 			start_offset = start * 2;
 			size = count * 2;
 
 			u_upload_alloc(ctx->stream_uploader, start_offset,
 				       size,
 				       si_optimal_tcc_alignment(sctx, size),
-				       &out_offset, &out_buffer, &ptr);
-			if (!out_buffer) {
-				pipe_resource_reference(&ib.buffer, NULL);
+				       &ib_tmp.offset, &ib_tmp.buffer, &ptr);
+			if (!ib_tmp.buffer)
 				return;
-			}
 
-			util_shorten_ubyte_elts_to_userptr(&sctx->b.b, &ib, 0, 0,
-							   ib.offset + start,
+			util_shorten_ubyte_elts_to_userptr(&sctx->b.b, ib, 0, 0,
+							   ib->offset + start,
 							   count, ptr);
 
-			pipe_resource_reference(&ib.buffer, NULL);
-			ib.user_buffer = NULL;
-			ib.buffer = out_buffer;
 			/* info->start will be added by the drawing code */
-			ib.offset = out_offset - start_offset;
-			ib.index_size = 2;
-		} else if (ib.user_buffer && !ib.buffer) {
+			ib_tmp.offset -= start_offset;
+			ib_tmp.index_size = 2;
+			ib = &ib_tmp;
+		} else if (ib->user_buffer && !ib->buffer) {
 			unsigned start, count, start_offset;
 
 			si_get_draw_start_count(sctx, info, &start, &count);
-			start_offset = start * ib.index_size;
+			start_offset = start * ib->index_size;
 
 			u_upload_data(ctx->stream_uploader, start_offset,
-				      count * ib.index_size,
+				      count * ib->index_size,
 				      sctx->screen->b.info.tcc_cache_line_size,
-				      (char*)ib.user_buffer + start_offset,
-				      &ib.offset, &ib.buffer);
Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses
On Wed, Mar 29, 2017 at 8:59 AM, Kristian H. Kristensen wrote:

> Jason Ekstrand writes:
>
> > This commit adds support for using the full 48-bit address space on
> > Broadwell and newer hardware.  Thanks to certain limitations, not all
> > objects can be placed above the 32-bit boundary.  In particular, general
> > and state base address need to live within 32 bits.  (See also
> > Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
> > to handle this, we add a supports_48bit_address field to anv_bo and only
> > set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the bit
> > for all client-allocated memory objects but leave it false for
> > driver-allocated objects.  While this is more conservative than needed,
> > all driver allocations should easily fit in the first 32 bits of address
> > space and keeps things simple because we don't have to think about
> > whether or not any given one of our allocation data structures will be
> > used in a 48-bit-unsafe way.
> > ---
> >  src/intel/vulkan/anv_allocator.c   | 10 --
> >  src/intel/vulkan/anv_batch_chain.c | 14 ++
> >  src/intel/vulkan/anv_device.c      | 4 +++-
> >  src/intel/vulkan/anv_gem.c         | 18 ++
> >  src/intel/vulkan/anv_intel.c       | 2 +-
> >  src/intel/vulkan/anv_private.h     | 29 +++--
> >  6 files changed, 67 insertions(+), 10 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
> > index 45c663b..88c9c13 100644
> > --- a/src/intel/vulkan/anv_allocator.c
> > +++ b/src/intel/vulkan/anv_allocator.c
> > @@ -255,7 +255,7 @@ anv_block_pool_init(struct anv_block_pool *pool,
> >     assert(util_is_power_of_two(block_size));
> >
> >     pool->device = device;
> > -   anv_bo_init(&pool->bo, 0, 0);
> > +   anv_bo_init(&pool->bo, 0, 0, false);
> >     pool->block_size = block_size;
> >     pool->free_list = ANV_FREE_LIST_EMPTY;
> >     pool->back_free_list = ANV_FREE_LIST_EMPTY;
> > @@ -475,7 +475,13 @@ anv_block_pool_grow(struct anv_block_pool *pool, struct anv_block_state *state)
> >      * values back into pool. */
> >     pool->map = map + center_bo_offset;
> >     pool->center_bo_offset = center_bo_offset;
> > -   anv_bo_init(&pool->bo, gem_handle, size);
> > +
> > +   /* Block pool BOs are marked as not supporting 48-bit addresses because
> > +    * they are used to back STATE_BASE_ADDRESS.
> > +    *
> > +    * See also anv_bo::supports_48bit_address.
> > +    */
> > +   anv_bo_init(&pool->bo, gem_handle, size, false);
> >     pool->bo.map = map;
> >
> >  done:
> > diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c
> > index 5d7abc6..b098e4b 100644
> > --- a/src/intel/vulkan/anv_batch_chain.c
> > +++ b/src/intel/vulkan/anv_batch_chain.c
> > @@ -979,7 +979,8 @@ anv_execbuf_finish(struct anv_execbuf *exec,
> >  }
> >
> >  static VkResult
> > -anv_execbuf_add_bo(struct anv_execbuf *exec,
> > +anv_execbuf_add_bo(struct anv_device *device,
> > +                   struct anv_execbuf *exec,
> >                     struct anv_bo *bo,
> >                     struct anv_reloc_list *relocs,
> >                     const VkAllocationCallbacks *alloc)
> > @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> >        obj->flags = bo->is_winsys_bo ? EXEC_OBJECT_WRITE : 0;
> >        obj->rsvd1 = 0;
> >        obj->rsvd2 = 0;
> > +
> > +      if (device->instance->physicalDevice.supports_48bit_addresses &&
> > +          bo->supports_48bit_address)
> > +         obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> >     }
> >
> >     if (relocs != NULL && obj->relocation_count == 0) {
> > @@ -1052,7 +1057,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> >        for (size_t i = 0; i < relocs->num_relocs; i++) {
> >           /* A quick sanity check on relocations */
> >           assert(relocs->relocs[i].offset < bo->size);
> > -         anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL, alloc);
> > +         anv_execbuf_add_bo(device, exec, relocs->reloc_bos[i], NULL, alloc);
> >        }
> >     }
> >
> > @@ -1264,7 +1269,8 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> >     adjust_relocations_from_state_pool(ss_pool, &cmd_buffer->surface_relocs,
> >                                        cmd_buffer->last_ss_pool_center);
> >     VkResult result =
> > -      anv_execbuf_add_bo(&execbuf, &ss_pool->bo, &cmd_buffer->surface_relocs,
> > +      anv_execbuf_add_bo(device, &execbuf, &ss_pool->bo,
> > +                         &cmd_buffer->surface_relocs,
> >                           &cmd_buffer->pool->alloc);
> >     if (result != VK_SUCCESS)
> >        return result;
> > @@ -1277,7 +1283,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> >        adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs,
> >                                         cmd_buffer->last_ss_pool_center);
> >
> > -      anv_execbuf_add_bo(&execbuf, &(*bbo)->bo,
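The flag selection described in the quoted commit message can be sketched in isolation as below. This is only an illustration under stated assumptions: `sketch_bo`, `sketch_exec_flags` and the `SKETCH_*` constants are hypothetical stand-ins rather than the real anv or i915 definitions, and the bit values are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; the real EXEC_OBJECT_* values live in the kernel
 * headers and are not reproduced here. */
#define SKETCH_OBJECT_WRITE        (UINT64_C(1) << 0)
#define SKETCH_OBJECT_SUPPORTS_48B (UINT64_C(1) << 1)

/* Hypothetical, trimmed-down buffer object. */
struct sketch_bo {
   bool is_winsys_bo;
   bool supports_48bit_address;
};

/* The 48-bit flag is only set when both the device and the individual
 * buffer object allow placement above the 32-bit boundary. */
static uint64_t
sketch_exec_flags(bool device_supports_48bit, const struct sketch_bo *bo)
{
   assert(bo != NULL);

   uint64_t flags = bo->is_winsys_bo ? SKETCH_OBJECT_WRITE : 0;

   if (device_supports_48bit && bo->supports_48bit_address)
      flags |= SKETCH_OBJECT_SUPPORTS_48B;

   return flags;
}
```

With this shape, a driver-allocated pool BO (flag left false) stays in the low 32 bits even on hardware that could address more, which is the conservative behaviour the commit message argues for.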
[Mesa-dev] [PATCH v2 2/2] anv: Query the kernel for reset status
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible.  In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be able
to trust the results of that rendering.  In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.

v2 (Jason Ekstrand):
 - Slight restructuring and much better error logging
---
 src/intel/vulkan/anv_device.c  | 114 +
 src/intel/vulkan/anv_gem.c     | 17 ++
 src/intel/vulkan/anv_private.h | 5 ++
 src/intel/vulkan/genX_query.c  | 11 ++--
 4 files changed, 107 insertions(+), 40 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 5f0d00f..109a2a1 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -884,8 +884,6 @@ anv_device_submit_simple_batch(struct anv_device *device,
    struct anv_bo bo, *exec_bos[1];
    VkResult result = VK_SUCCESS;
    uint32_t size;
-   int64_t timeout;
-   int ret;
 
    /* Kernel driver requires 8 byte aligned batch length */
    size = align_u32(batch->next - batch->start, 8);
@@ -925,14 +923,7 @@ anv_device_submit_simple_batch(struct anv_device *device,
    if (result != VK_SUCCESS)
       goto fail;
 
-   timeout = INT64_MAX;
-   ret = anv_gem_wait(device, bo.gem_handle, &timeout);
-   if (ret != 0) {
-      /* We don't know the real error. */
-      device->lost = true;
-      result = vk_errorf(VK_ERROR_DEVICE_LOST, "execbuf2 failed: %m");
-      goto fail;
-   }
+   result = anv_device_wait(device, &bo, INT64_MAX);
 
  fail:
    anv_bo_pool_free(&device->batch_bo_pool, &bo);
@@ -1264,6 +1255,58 @@ anv_device_execbuf(struct anv_device *device,
    return VK_SUCCESS;
 }
 
+VkResult
+anv_device_query_status(struct anv_device *device)
+{
+   /* This isn't likely as most of the callers of this function already check
+    * for it.  However, it doesn't hurt to check and it potentially lets us
+    * avoid an ioctl.
+    */
+   if (unlikely(device->lost))
+      return VK_ERROR_DEVICE_LOST;
+
+   uint32_t active, pending;
+   int ret = anv_gem_gpu_get_reset_stats(device, &active, &pending);
+   if (ret == -1) {
+      /* We don't know the real error. */
+      device->lost = true;
+      return vk_errorf(VK_ERROR_DEVICE_LOST, "get_reset_stats failed: %m");
+   }
+
+   if (active) {
+      device->lost = true;
+      return vk_errorf(VK_ERROR_DEVICE_LOST,
+                       "GPU hung on one of our command buffers");
+   } else if (pending) {
+      device->lost = true;
+      return vk_errorf(VK_ERROR_DEVICE_LOST,
+                       "GPU hung with commands in-flight");
+   }
+
+   return VK_SUCCESS;
+}
+
+VkResult
+anv_device_wait(struct anv_device *device, struct anv_bo *bo,
+                int64_t timeout)
+{
+   int ret = anv_gem_wait(device, bo->gem_handle, &timeout);
+   if (ret == -1 && errno == ETIME) {
+      return VK_TIMEOUT;
+   } else if (ret == -1) {
+      /* We don't know the real error. */
+      device->lost = true;
+      return vk_errorf(VK_ERROR_DEVICE_LOST, "gem wait failed: %m");
+   }
+
+   /* Query for device status after the wait.  If the BO we're waiting on got
+    * caught in a GPU hang we don't want to return VK_SUCCESS to the client
+    * because it clearly doesn't have valid data.  Yes, this most likely means
+    * an ioctl, but we just did an ioctl to wait so it's no great loss.
+    */
+   return anv_device_query_status(device);
+}
+
 VkResult anv_QueueSubmit(
     VkQueue                                     _queue,
     uint32_t                                    submitCount,
@@ -1273,10 +1316,17 @@ VkResult anv_QueueSubmit(
    ANV_FROM_HANDLE(anv_queue, queue, _queue);
    ANV_FROM_HANDLE(anv_fence, fence, _fence);
    struct anv_device *device = queue->device;
-   if (unlikely(device->lost))
-      return VK_ERROR_DEVICE_LOST;
 
-   VkResult result = VK_SUCCESS;
+   /* Query for device status prior to submitting.  Technically, we don't need
+    * to do this.  However, if we have a client that's submitting piles of
+    * garbage, we would rather break as early as possible to keep the GPU
+    * hanging contained.  If we don't check here, we'll either be waiting for
+    * the kernel to kick us or we'll have to wait until the client waits on a
+    * fence before we actually know whether or not we've hung.
+    */
+   VkResult result = anv_device_query_status(device);
+   if (result != VK_SUCCESS)
+      return result;
 
    /* We lock around QueueSubmit for three main reasons:
     *
@@ -1802,9 +1852,6 @@ VkResult anv_GetFenceStatus(
    if (unlikely(device->lost))
       return VK_ERROR_DEVICE_LOST;
 
-   int64_t t = 0;
-   int ret;
-
    switch (fence->state) {
    case ANV_FENCE_STATE_RESET:
       /* If it hasn't even been sent off to the GPU yet, it's not ready */
@@ -1814,15 +1861,18 @@ VkResult anv_GetFenceStatus(
       /* It's been
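The wait-then-query pattern from the patch above can be modelled without the driver. The following is a minimal sketch under stated assumptions: every `sketch_*` name is hypothetical, the result codes only stand in for VkResult values, and the kernel's answers are faked through struct fields and parameters instead of real ioctls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes standing in for VkResult values. */
enum sketch_result {
   SKETCH_SUCCESS = 0,
   SKETCH_TIMEOUT = 1,
   SKETCH_DEVICE_LOST = -1,
};

/* Hypothetical device: `lost` is sticky, the other fields fake what the
 * kernel would report for a reset-status query. */
struct sketch_device {
   bool lost;
   int query_ret;     /* -1 simulates a failed status query */
   uint32_t active;   /* hang attributed to our own work */
   uint32_t pending;  /* hang while our work was in flight */
};

static enum sketch_result
sketch_query_status(struct sketch_device *dev)
{
   assert(dev != NULL);

   /* Once lost, always lost; no need to ask again. */
   if (dev->lost)
      return SKETCH_DEVICE_LOST;

   if (dev->query_ret == -1 || dev->active || dev->pending) {
      dev->lost = true;
      return SKETCH_DEVICE_LOST;
   }

   return SKETCH_SUCCESS;
}

/* wait_ret and timed_out fake the outcome of the kernel wait. */
static enum sketch_result
sketch_wait(struct sketch_device *dev, int wait_ret, bool timed_out)
{
   if (wait_ret == -1 && timed_out)
      return SKETCH_TIMEOUT;

   if (wait_ret == -1) {
      dev->lost = true;
      return SKETCH_DEVICE_LOST;
   }

   /* The wait itself succeeded, but the work may have been caught in a
    * hang, so success is only reported after a status query. */
   return sketch_query_status(dev);
}
```

A caller that submits work would compare the result against the success code explicitly (`if (result != SKETCH_SUCCESS) return result;`), since the success value is zero and a bare truthiness test is easy to get backwards.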
Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish
I would be very grateful if someone could help with testing performance
impact of this change.

On Wed, Mar 29, 2017 at 7:31 PM, Bartosz Tomczyk <bartosz.tomczy...@gmail.com> wrote:

> Call it directly when batch queue is empty. This avoids costly thread
> synchronisation. With this fix games that previously regressed
> with mesa_glthread=true like xonotic or grid autosport.
> ---
>  src/mesa/main/glthread.c | 47 ++-
>  1 file changed, 34 insertions(+), 13 deletions(-)
>
> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
> index 06115b916d..faf42c2b89 100644
> --- a/src/mesa/main/glthread.c
> +++ b/src/mesa/main/glthread.c
> @@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
>     }
>  }
>
> -void
> -_mesa_glthread_flush_batch(struct gl_context *ctx)
> +static void
> +_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
>  {
>     struct glthread_state *glthread = ctx->GLThread;
> -   struct glthread_batch *batch;
> -
> -   if (!glthread)
> -      return;
> -
> -   batch = glthread->batch;
> +   struct glthread_batch *batch = glthread->batch;
> +
>     if (!batch->used)
>        return;
>
> @@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>        return;
>     }
>
> -   pthread_mutex_lock(&glthread->mutex);
>     *glthread->batch_queue_tail = batch;
>     glthread->batch_queue_tail = &batch->next;
>     pthread_cond_broadcast(&glthread->new_work);
> +
> +}
> +void
> +_mesa_glthread_flush_batch(struct gl_context *ctx)
> +{
> +   struct glthread_state *glthread = ctx->GLThread;
> +   struct glthread_batch *batch;
> +
> +   if (!glthread)
> +      return;
> +
> +   batch = glthread->batch;
> +   if (!batch->used)
> +      return;
> +
> +   pthread_mutex_lock(&glthread->mutex);
> +   _mesa_glthread_flush_batch_locked(ctx);
>     pthread_mutex_unlock(&glthread->mutex);
>  }
>
> @@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
>     if (pthread_self() == glthread->thread)
>        return;
>
> -   _mesa_glthread_flush_batch(ctx);
> -
>     pthread_mutex_lock(&glthread->mutex);
>
> -   while (glthread->batch_queue || glthread->busy)
> -      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   if (!(glthread->batch_queue || glthread->busy)) {
> +      if (glthread->batch && glthread->batch->used) {
> +         struct _glapi_table *dispatch = _glapi_get_dispatch();
> +         glthread_unmarshal_batch(ctx, glthread->batch);
> +         _glapi_set_dispatch(dispatch);
> +         glthread_allocate_batch(ctx);
> +      }
> +   }
> +   else {
> +      _mesa_glthread_flush_batch_locked(ctx);
> +      while (glthread->batch_queue || glthread->busy)
> +         pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   }
>
>     pthread_mutex_unlock(&glthread->mutex);
>  }
> --
> 2.12.2
[Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish
Call it directly when batch queue is empty. This avoids costly thread
synchronisation. With this fix games that previously regressed
with mesa_glthread=true like xonotic or grid autosport.
---
 src/mesa/main/glthread.c | 47 ++-
 1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 06115b916d..faf42c2b89 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
    }
 }
 
-void
-_mesa_glthread_flush_batch(struct gl_context *ctx)
+static void
+_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
 {
    struct glthread_state *glthread = ctx->GLThread;
-   struct glthread_batch *batch;
-
-   if (!glthread)
-      return;
-
-   batch = glthread->batch;
+   struct glthread_batch *batch = glthread->batch;
+
    if (!batch->used)
       return;
 
@@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
       return;
    }
 
-   pthread_mutex_lock(&glthread->mutex);
    *glthread->batch_queue_tail = batch;
    glthread->batch_queue_tail = &batch->next;
    pthread_cond_broadcast(&glthread->new_work);
+
+}
+void
+_mesa_glthread_flush_batch(struct gl_context *ctx)
+{
+   struct glthread_state *glthread = ctx->GLThread;
+   struct glthread_batch *batch;
+
+   if (!glthread)
+      return;
+
+   batch = glthread->batch;
+   if (!batch->used)
+      return;
+
+   pthread_mutex_lock(&glthread->mutex);
+   _mesa_glthread_flush_batch_locked(ctx);
    pthread_mutex_unlock(&glthread->mutex);
 }
 
@@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
    if (pthread_self() == glthread->thread)
       return;
 
-   _mesa_glthread_flush_batch(ctx);
-
    pthread_mutex_lock(&glthread->mutex);
 
-   while (glthread->batch_queue || glthread->busy)
-      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   if (!(glthread->batch_queue || glthread->busy)) {
+      if (glthread->batch && glthread->batch->used) {
+         struct _glapi_table *dispatch = _glapi_get_dispatch();
+         glthread_unmarshal_batch(ctx, glthread->batch);
+         _glapi_set_dispatch(dispatch);
+         glthread_allocate_batch(ctx);
+      }
+   }
+   else {
+      _mesa_glthread_flush_batch_locked(ctx);
+      while (glthread->batch_queue || glthread->busy)
+         pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   }
 
    pthread_mutex_unlock(&glthread->mutex);
 }
-- 
2.12.2
Re: [Mesa-dev] [PATCH v2 05/25] gallium: add sparse buffer interface and capability
On 29.03.2017 16:27, Marek Olšák wrote:
On Wed, Mar 29, 2017 at 12:26 PM, Nicolai Hähnle wrote:
On 28.03.2017 21:46, Marek Olšák wrote:
On Tue, Mar 28, 2017 at 11:11 AM, Nicolai Hähnle wrote:

From: Nicolai Hähnle

TODO fill out caps in all drivers

v2:
- explain the resource_commit interface in more detail
---
 src/gallium/docs/source/context.rst  | 25 +
 src/gallium/docs/source/screen.rst   | 3 +++
 src/gallium/include/pipe/p_context.h | 13 +
 src/gallium/include/pipe/p_defines.h | 2 ++
 4 files changed, 43 insertions(+)

diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
index a053193..5949ff2 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -611,20 +611,45 @@ for both regular textures as well as for framebuffers read via FBFETCH.
 
 .. _memory_barrier:
 
 memory_barrier
 %%%
 
 This function flushes caches according to which of the PIPE_BARRIER_* flags
 are set.
 
+.. _resource_commit:
+
+resource_commit
+%%%
+
+This function changes the commit state of a part of a sparse resource. Sparse
+resources are created by setting the ``PIPE_RESOURCE_FLAG_SPARSE`` flag when
+calling ``resource_create``. Initially, sparse resources only reserve a virtual
+memory region that is not backed by memory (i.e., it is uncommitted). The
+``resource_commit`` function can be called to commit or uncommit parts (or all)
+of a resource. The driver manages the underlying backing memory.
+
+The contents of newly committed memory regions are undefined. Calling this
+function to commit an already committed memory region is allowed and leaves its
+content unchanged. Similarly, calling this function to uncommit an already
+uncommitted memory region is allowed.
+
+For buffers, the given box must be aligned to multiples of
+``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``. As an exception to this rule, if the size
+of the buffer is not a multiple of the page size, changing the commit state of
+the last (partial) page requires a box that ends at the end of the buffer
+(i.e., box->x + box->width == buffer->width0).
+
+
+
 .. _pipe_transfer:
 
 PIPE_TRANSFER
 ^
 
 These flags control the behavior of a transfer object.
 
 ``PIPE_TRANSFER_READ``
   Resource contents read back (or accessed directly) at transfer create time.
 
diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 00c9503..8759639 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -369,20 +369,23 @@ The integer capabilities:
   opcode to retrieve the current value in the framebuffer.
 * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the
   ``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property.
 * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point
   operations are supported.
 * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported.
 * ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo
   operations are supported.
 * ``PIPE_CAP_TGSI_TEX_TXF_LZ``: Whether TEX_LZ and TXF_LZ opcodes are
   supported.
+* ``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``: The page size of sparse buffers in
+  bytes, or 0 if sparse buffers are not supported. The page size must be at
+  most 64KB.
 
 
 .. _pipe_capf:
 
 PIPE_CAPF_*
 
 The floating-point capabilities are:
 
 * ``PIPE_CAPF_MAX_LINE_WIDTH``: The maximum width of a regular line.
diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
index a29fff5..4d5535b 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -578,20 +578,33 @@ struct pipe_context {
     * Flush any pending framebuffer writes and invalidate texture caches.
     */
    void (*texture_barrier)(struct pipe_context *, unsigned flags);
 
    /**
     * Flush caches according to flags.
     */
    void (*memory_barrier)(struct pipe_context *, unsigned flags);
 
    /**
+    * Change the commitment status of a part of the given resource, which must
+    * have been created with the PIPE_RESOURCE_FLAG_SPARSE bit.
+    *
+    * \param level The texture level whose commitment should be changed.
+    * \param box The region of the resource whose commitment should be changed.
+    * \param commit Whether memory should be committed or un-committed.
+    *
+    * \return false if out of memory, true on success.
+    */
+   bool (*resource_commit)(struct pipe_context *, struct pipe_resource *,
+                           unsigned level, struct pipe_box *box, bool commit);

I wonder what the behavior for threaded gallium should be. Possibilities:
1) Sync the context thread and execute directly.
2) Ignore the return value, always return true, and execute it asynchronously.

If the "false" return value is very unlikely, I may use the second approach.

"false" here means
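The buffer alignment rule in the quoted documentation (page-aligned box, with an exception for the last partial page) can be written down as a small check. This is a sketch of one reading of that rule, not code from the series: `sketch_commit_range_ok` and its parameters are hypothetical, with `x` and `width` playing the role of box->x and box->width and `buffer_size` the role of buffer->width0. Requiring the start of the box to be page aligned in all cases is an assumption drawn from the wording.

```c
#include <stdbool.h>

/* Returns true if [x, x + width) is an acceptable commit range for a sparse
 * buffer of buffer_size bytes with the given page size. */
static bool
sketch_commit_range_ok(unsigned x, unsigned width,
                       unsigned buffer_size, unsigned page_size)
{
   /* A page size of 0 means sparse buffers are not supported. */
   if (page_size == 0)
      return false;

   /* The range must lie inside the buffer (written to avoid overflow). */
   if (x > buffer_size || width > buffer_size - x)
      return false;

   /* The start always has to sit on a page boundary. */
   if (x % page_size != 0)
      return false;

   /* The end is either page aligned, or it is the end of the buffer, which
    * covers the last (partial) page. */
   unsigned end = x + width;
   return end % page_size == 0 || end == buffer_size;
}
```

For a 100000-byte buffer with 64KB pages, the ranges [0, 65536) and [65536, 100000) pass, while [65536, 66536) is rejected because it ends inside the partial page.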
Re: [Mesa-dev] [PATCH] [RFC v2] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.
On Wed, Mar 29, 2017 at 9:11 AM, Bartosz Tomczyk wrote:
> This avoids costly thread synchronisation. With this fix games that
> previously regressed with mesa_glthread=true like xonotic or grid autosport.
> Could someone test if games that benefit from glthread didn't regress?
> ---
>  src/mesa/main/glthread.c | 49 +---
>  1 file changed, 34 insertions(+), 15 deletions(-)
>
> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
> index 06115b916d..eef7202f01 100644
> --- a/src/mesa/main/glthread.c
> +++ b/src/mesa/main/glthread.c
> @@ -194,18 +194,11 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
>     }
>  }
>
> -void
> -_mesa_glthread_flush_batch(struct gl_context *ctx)
> +static void
> +_mesa_glthread_flush_batch_no_lock(struct gl_context *ctx)
>  {
>     struct glthread_state *glthread = ctx->GLThread;
> -   struct glthread_batch *batch;
> -
> -   if (!glthread)
> -      return;
> -
> -   batch = glthread->batch;
> -   if (!batch->used)
> -      return;
> +   struct glthread_batch *batch = glthread->batch;
>
>     /* Immediately reallocate a new batch, since the next marshalled call would
>      * just do it.
> @@ -223,10 +216,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>        return;
>     }
>
> -   pthread_mutex_lock(&glthread->mutex);
>     *glthread->batch_queue_tail = batch;
>     glthread->batch_queue_tail = &batch->next;
>     pthread_cond_broadcast(&glthread->new_work);
> +
> +}
> +void
> +_mesa_glthread_flush_batch(struct gl_context *ctx)
> +{
> +   struct glthread_state *glthread = ctx->GLThread;
> +   struct glthread_batch *batch;
> +
> +   if (!glthread)
> +      return;
> +
> +   batch = glthread->batch;
> +   if (!batch->used)
> +      return;
> +
> +   pthread_mutex_lock(&glthread->mutex);
> +   _mesa_glthread_flush_batch_no_lock(ctx);
>     pthread_mutex_unlock(&glthread->mutex);
>  }
>
> @@ -252,12 +261,22 @@ _mesa_glthread_finish(struct gl_context *ctx)
>     if (pthread_self() == glthread->thread)
>        return;
>
> -   _mesa_glthread_flush_batch(ctx);
> -
>     pthread_mutex_lock(&glthread->mutex);
>
> -   while (glthread->batch_queue || glthread->busy)
> -      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   if (!(glthread->batch_queue || glthread->busy))
> +   {
> +      if (glthread->batch && glthread->batch->used)
> +      {
> +         glthread_unmarshal_batch(ctx, glthread->batch);
> +      }

Please follow the existing style of putting the braces on the same line
as the if and else.
Re: [Mesa-dev] [PATCH] gallium: remove support for predicates from TGSI
On 29 March 2017 at 16:51, Marek Olšák wrote:
> From: Marek Olšák
>
> Never used.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_limits.h | 4 -
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h | 2 -
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c | 46 ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c | 6 +-
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 137 ++-
>  src/gallium/auxiliary/tgsi/tgsi_build.c | 66 -
>  src/gallium/auxiliary/tgsi/tgsi_build.h | 3 -
>  src/gallium/auxiliary/tgsi/tgsi_dump.c | 24
>  src/gallium/auxiliary/tgsi/tgsi_exec.c | 59
>  src/gallium/auxiliary/tgsi/tgsi_exec.h | 7 -
>  src/gallium/auxiliary/tgsi/tgsi_parse.c | 4 -
>  src/gallium/auxiliary/tgsi/tgsi_parse.h | 1 -
>  src/gallium/auxiliary/tgsi/tgsi_sanity.c | 1 -
>  src/gallium/auxiliary/tgsi/tgsi_strings.c | 1 -
>  src/gallium/auxiliary/tgsi/tgsi_text.c | 37 -
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 84 +---
>  src/gallium/auxiliary/tgsi/tgsi_ureg.h | 149 +
>  src/gallium/docs/source/screen.rst | 1 -
>  src/gallium/drivers/freedreno/freedreno_screen.c | 2 -
>  src/gallium/drivers/i915/i915_fpc.h | 1 -
>  src/gallium/drivers/i915/i915_screen.c | 2 -
>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 10 +-
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c | 2 -
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 -
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 -
>  src/gallium/drivers/r300/r300_screen.c | 4 -
>  src/gallium/drivers/r600/r600_pipe.c | 2 -
>  src/gallium/drivers/r600/r600_shader.c | 4 -
>  src/gallium/drivers/radeonsi/si_pipe.c | 1 -
>  src/gallium/drivers/svga/svga_screen.c | 6 -
>  src/gallium/drivers/vc4/vc4_screen.c | 2 -
>  src/gallium/drivers/virgl/virgl_screen.c | 2 -
>  src/gallium/include/pipe/p_defines.h | 1 -
>  src/gallium/include/pipe/p_shader_tokens.h | 19 ---
>  src/gallium/state_trackers/nine/nine_shader.c | 18 +--
>  35 files changed, 23 insertions(+), 689 deletions(-)
>
Quick grep for PIPE_SHADER_CAP_MAX_PREDS shows one instance in the etnaviv driver.
Jose, Brian - you might want to check that nothing is using it on your end.

-Emil
Re: [Mesa-dev] [PATCH] [RFC v2] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.
On 29.03.2017 18:11, Bartosz Tomczyk wrote:

This avoids costly thread synchronisation. With this fix games that
previously regressed with mesa_glthread=true like xonotic or grid autosport.
Could someone test if games that benefit from glthread didn't regress?

Please make sure the commit message is wrapped to 75 characters.

The approach seems like a good idea: if the current thread is going to
wait anyway, we might as well do any pending work locally to avoid
context switch overhead.

Would be nice to see some benchmarks, but this should mostly be a win --
the only reason I could imagine why it might not be is cache effects,
and those could go either way.

---
 src/mesa/main/glthread.c | 49 +---
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 06115b916d..eef7202f01 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -194,18 +194,11 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
    }
 }

-void
-_mesa_glthread_flush_batch(struct gl_context *ctx)
+static void
+_mesa_glthread_flush_batch_no_lock(struct gl_context *ctx)

A better and more idiomatic name for this function would be
_mesa_glthread_flush_batch_locked.

 {
    struct glthread_state *glthread = ctx->GLThread;
-   struct glthread_batch *batch;
-
-   if (!glthread)
-      return;
-
-   batch = glthread->batch;
-   if (!batch->used)
-      return;
+   struct glthread_batch *batch = glthread->batch;

    /* Immediately reallocate a new batch, since the next marshalled call would
     * just do it.
@@ -223,10 +216,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
       return;
    }

-   pthread_mutex_lock(&glthread->mutex);
    *glthread->batch_queue_tail = batch;
    glthread->batch_queue_tail = &batch->next;
    pthread_cond_broadcast(&glthread->new_work);
+
+}
+void
+_mesa_glthread_flush_batch(struct gl_context *ctx)
+{
+   struct glthread_state *glthread = ctx->GLThread;
+   struct glthread_batch *batch;
+
+   if (!glthread)
+      return;
+
+   batch = glthread->batch;
+   if (!batch->used)
+      return;
+
+   pthread_mutex_lock(&glthread->mutex);
+   _mesa_glthread_flush_batch_no_lock(ctx);
    pthread_mutex_unlock(&glthread->mutex);
 }

@@ -252,12 +261,22 @@ _mesa_glthread_finish(struct gl_context *ctx)
    if (pthread_self() == glthread->thread)
       return;

-   _mesa_glthread_flush_batch(ctx);
-
    pthread_mutex_lock(&glthread->mutex);

-   while (glthread->batch_queue || glthread->busy)
-      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   if (!(glthread->batch_queue || glthread->busy))
+   {
+      if (glthread->batch && glthread->batch->used)
+      {
+         glthread_unmarshal_batch(ctx, glthread->batch);

You _must_ reset the api dispatch afterwards; otherwise, your change
here effectively disables glthread forever. To be on the safe side, I
think you need to save the current dispatch in a temp variable and then
reset after unmarshalling.

Cheers,
Nicolai

+      }
+      glthread_allocate_batch(ctx);
+   }
+   else
+   {
+      _mesa_glthread_flush_batch_no_lock(ctx);
+      while (glthread->batch_queue || glthread->busy)
+         pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   }

    pthread_mutex_unlock(&glthread->mutex);
 }

-- 
Learn how the world really is,
But never forget how it ought to be.
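The save-and-restore step requested in the review can be shown with a tiny single-threaded model. This is a sketch under stated assumptions: all `sketch_*` names are hypothetical, the "dispatch table" is reduced to a global pointer, and the stand-in for unmarshalling is assumed to leave the direct table installed as a side effect, which is what the review comment implies for the real function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dispatch tables: one that records calls into a batch and one
 * that executes them directly. */
struct sketch_table { const char *name; };

static const struct sketch_table sketch_marshal = { "marshal" };
static const struct sketch_table sketch_direct = { "direct" };

/* The table the application thread currently dispatches through. */
static const struct sketch_table *sketch_current = &sketch_marshal;

struct sketch_batch { size_t used; };

/* Stand-in for unmarshalling: runs the batch and, as a side effect, leaves
 * the direct table installed on the calling thread. */
static void
sketch_unmarshal_batch(struct sketch_batch *batch)
{
   sketch_current = &sketch_direct;
   batch->used = 0;
}

/* Run the pending batch on the calling thread, then put back whatever table
 * was active before, so later calls are still marshalled. */
static void
sketch_finish_inline(struct sketch_batch *batch)
{
   assert(batch != NULL);

   if (!batch->used)
      return;

   const struct sketch_table *saved = sketch_current;
   sketch_unmarshal_batch(batch);
   sketch_current = saved;
}
```

Without the restore, the application thread would keep dispatching through the direct table after the first inline finish, which is the "disables glthread forever" effect described above.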
[Mesa-dev] [PATCH] [RFC v2] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.
This avoids costly thread synchronisation. With this fix, games that
previously regressed with mesa_glthread=true, such as Xonotic or GRID
Autosport, should no longer regress.

Could someone test whether games that benefit from glthread have not
regressed?
---
 src/mesa/main/glthread.c | 49 +---
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 06115b916d..eef7202f01 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -194,18 +194,11 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
    }
 }
 
-void
-_mesa_glthread_flush_batch(struct gl_context *ctx)
+static void
+_mesa_glthread_flush_batch_no_lock(struct gl_context *ctx)
 {
    struct glthread_state *glthread = ctx->GLThread;
-   struct glthread_batch *batch;
-
-   if (!glthread)
-      return;
-
-   batch = glthread->batch;
-   if (!batch->used)
-      return;
+   struct glthread_batch *batch = glthread->batch;
 
    /* Immediately reallocate a new batch, since the next marshalled call would
     * just do it.
@@ -223,10 +216,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
       return;
    }
 
-   pthread_mutex_lock(&glthread->mutex);
    *glthread->batch_queue_tail = batch;
    glthread->batch_queue_tail = &batch->next;
    pthread_cond_broadcast(&glthread->new_work);
+
+}
+void
+_mesa_glthread_flush_batch(struct gl_context *ctx)
+{
+   struct glthread_state *glthread = ctx->GLThread;
+   struct glthread_batch *batch;
+
+   if (!glthread)
+      return;
+
+   batch = glthread->batch;
+   if (!batch->used)
+      return;
+
+   pthread_mutex_lock(&glthread->mutex);
+   _mesa_glthread_flush_batch_no_lock(ctx);
    pthread_mutex_unlock(&glthread->mutex);
 }
 
@@ -252,12 +261,22 @@ _mesa_glthread_finish(struct gl_context *ctx)
    if (pthread_self() == glthread->thread)
       return;
 
-   _mesa_glthread_flush_batch(ctx);
-
    pthread_mutex_lock(&glthread->mutex);
 
-   while (glthread->batch_queue || glthread->busy)
-      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   if (!(glthread->batch_queue || glthread->busy))
+   {
+      if (glthread->batch && glthread->batch->used)
+      {
+         glthread_unmarshal_batch(ctx, glthread->batch);
+      }
+      glthread_allocate_batch(ctx);
+   }
+   else
+   {
+      _mesa_glthread_flush_batch_no_lock(ctx);
+      while (glthread->batch_queue || glthread->busy)
+         pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   }
 
    pthread_mutex_unlock(&glthread->mutex);
 }
-- 
2.12.2
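For readers following the idea rather than the exact diff, here is a minimal sketch of the fast path the patch describes: if the worker is idle and nothing is queued, the caller runs its pending work itself instead of handing it over and waiting. Everything named toy_* is a hypothetical stand-in, not Mesa's actual glthread code, and the worker thread itself is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Mesa's glthread state. */
struct toy_glthread {
   pthread_mutex_t mutex;
   pthread_cond_t work_done;   /* signalled by the worker when it goes idle */
   int queued;                 /* batches handed over but not yet executed */
   bool busy;                  /* worker is executing a batch right now */
   int pending;                /* commands recorded in the current batch */
   int executed_inline;        /* commands run on the calling thread */
};

enum toy_finish_path {
   TOY_FINISH_NOTHING,         /* nothing pending, nothing to wait for */
   TOY_FINISH_INLINE,          /* pending batch executed by the caller */
   TOY_FINISH_WAITED           /* batch handed to the worker, caller waited */
};

static void
toy_glthread_init(struct toy_glthread *t)
{
   pthread_mutex_init(&t->mutex, NULL);
   pthread_cond_init(&t->work_done, NULL);
   t->queued = 0;
   t->busy = false;
   t->pending = 0;
   t->executed_inline = 0;
}

static enum toy_finish_path
toy_glthread_finish(struct toy_glthread *t)
{
   enum toy_finish_path path = TOY_FINISH_NOTHING;

   pthread_mutex_lock(&t->mutex);
   assert(t->pending >= 0);

   if (!(t->queued || t->busy)) {
      /* Fast path: the worker has nothing to do, so executing the pending
       * batch here avoids a wake-up and a wait on the condition variable.
       */
      if (t->pending) {
         t->executed_inline += t->pending;
         t->pending = 0;
         path = TOY_FINISH_INLINE;
      }
   } else {
      /* Slow path: queue the pending batch behind the work already in
       * flight (a real implementation would also signal the worker), then
       * wait until the worker reports that it is idle.
       */
      if (t->pending) {
         t->queued++;
         t->pending = 0;
      }
      while (t->queued || t->busy)
         pthread_cond_wait(&t->work_done, &t->mutex);
      path = TOY_FINISH_WAITED;
   }

   pthread_mutex_unlock(&t->mutex);
   return path;
}
```

As in the patch, the inline execution happens with the mutex held; whether that is acceptable in practice depends on how long a batch takes to run.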
Re: [Mesa-dev] [PATCH] gbm/dri: Flush after unmap
Hi, Emil,

On 03/29/2017 02:34 PM, Emil Velikov wrote:
> On 29 March 2017 at 13:02, Thomas Hellstrom wrote:
>> Hi, Emil,
>>
>> On 03/29/2017 01:30 PM, Emil Velikov wrote:
>>> Hi Thomas,
>>>
>>> On 28 March 2017 at 20:39, Thomas Hellstrom wrote:
>>>> Drivers may queue dma operations on the context at unmap time so we
>>>> need to flush to make sure the data gets to the bo. Ideally the
>>>> application would take care of this, but since there appears to be no
>>>> exported gbm flush functionality we need to explicitly flush at unmap
>>>> time.
>>>>
>>>> This fixes a problem where kmscube on vmwgfx in rgba textured mode
>>>> would render using an uninitialized texture rather than the intended
>>>> rgba pattern.
>>> I haven't checked but the issue should not be restricted to vmwgfx,
>>> right?
>>>
>>> Perhaps we should add the following
>>> Fixes: 8aeb6d768b4 ("gbm: Add map/unmap functions")
>>> CC:
>> Unfortunately I've, perhaps a bit prematurely, already pushed the fix.
>> Is there a way to get it into stable after push?
>>
> Adding mesa-stable@ to the CC list should do it. Check out the
> instructions for more examples.
>
> https://www.mesa3d.org/submittingpatches.html#nominations
>

Ok. I'll try the option of forwarding the commit id to mesa-stable...

>>>> Signed-off-by: Thomas Hellstrom
>>>> ---
>>>>  src/gbm/backends/dri/gbm_dri.c | 9 -
>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
>>>> index ac7ede8..6c2244c 100644
>>>> --- a/src/gbm/backends/dri/gbm_dri.c
>>>> +++ b/src/gbm/backends/dri/gbm_dri.c
>>>> @@ -243,7 +243,7 @@ struct dri_extension_match {
>>>>  };
>>>>
>>>>  static struct dri_extension_match dri_core_extensions[] = {
>>>> -   { __DRI2_FLUSH, 1, offsetof(struct gbm_dri_device, flush) },
>>>> +   { __DRI2_FLUSH, 4, offsetof(struct gbm_dri_device, flush) },
>>> Currently the classic nouveau, radeon/r200 and i915 drivers do not
>>> support v4 of the extension.
>>> As-is this will 'break' them... if they ever worked to begin with.
>>>
>>> One solution is to bail out (return -ENOSYS or similar) in the
>>> map/unmap API when the DRI module is too old.
>>> Just some ^^ food for thought.
>> Hmm. Is there even a use-case for gbm with those drivers? If so we
>> should perhaps make them up-to-date with the flush extension.
>>
> Of the above:
>
> - nouveau: Does not support DRI_IMAGE, thus it doesn't work even
>   before the patch.
> - i915: I have some untested ancient patches. Will see if I can rebase
>   + send out.
> - radeons: ??
>
> If someone reports an issue we can ask them to write/test some code, I
> guess ;-)

Indeed. It looks like gbm is mostly used together with KMS anyway...

/Thomas

>
> -Emil
Re: [Mesa-dev] [PATCH v2 2/2] st/mesa: EGLImageTarget* error handling
On 29.03.2017 14:28, Philipp Zabel wrote:
> On Wed, 2017-03-29 at 13:01 +0200, Nicolai Hähnle wrote:
>> On 29.03.2017 09:44, Philipp Zabel wrote:
>>> Stop trying to specify texture or renderbuffer objects for unsupported
>>> EGL images. Generate the error codes specified in the OES_EGL_image
>>> extension.
>>>
>>> EGLImageTargetTexture2D and EGLImageTargetRenderbuffer would call
>>> the pipe driver's create_surface callback without ever checking that
>>> the given EGL image is actually compatible with the chosen target
>>> texture or renderbuffer.
>>>
>>> This patch adds a call to the pipe driver's is_format_supported
>>> callback and generates an INVALID_OPERATION error for unsupported EGL
>>> images. If the EGL image handle does not describe a valid EGL image,
>>> an INVALID_VALUE error is generated.
>>>
>>> Signed-off-by: Philipp Zabel
>>> Reviewed-by: Nicolai Hähnle
>>> ---
>>> v2: fixed get_surface to actually use the usage and error parameters
>>
>> The v2 usually goes above :)
>
> Ok, I'll remember that next time.
>
>> Do you need someone to commit this for you?
>
> Yes, please.

Done.

> regards
> Philipp

-- 
Learn how the world really is,
but never forget how it ought to be.
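The error handling described in the quoted commit message can be sketched roughly as follows. This is only an illustration of the order of checks under stated assumptions: the toy_* names and the callback type are hypothetical, not the state tracker's real interface; only the two GL error values are the standard ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Standard GL error values, redefined here so the sketch is self-contained. */
#define TOY_GL_NO_ERROR          0
#define TOY_GL_INVALID_VALUE     0x0501
#define TOY_GL_INVALID_OPERATION 0x0502

/* Hypothetical stand-in for a resolved EGL image. */
struct toy_egl_image {
   int format;
};

/* Hypothetical stand-in for the pipe driver's is_format_supported callback. */
typedef bool (*toy_format_supported_fn)(int format);

static unsigned
toy_check_egl_image(const struct toy_egl_image *image,
                    toy_format_supported_fn is_format_supported)
{
   assert(is_format_supported != NULL);

   /* The handle did not resolve to a valid EGL image. */
   if (image == NULL)
      return TOY_GL_INVALID_VALUE;

   /* The image is valid, but the driver cannot use its format for the
    * requested target.
    */
   if (!is_format_supported(image->format))
      return TOY_GL_INVALID_OPERATION;

   return TOY_GL_NO_ERROR;
}

/* Example callback: pretend the driver only supports format 1. */
static bool
toy_only_format_1(int format)
{
   return format == 1;
}
```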
Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses
Jason Ekstrandwrites: > This commit adds support for using the full 48-bit address space on > Broadwell and newer hardware. Thanks to certain limitations, not all > objects can be placed above the 32-bit boundary. In particular, general > and state base address need to live within 32 bits. (See also > Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.) In order > to handle this, we add a supports_48bit_address field to anv_bo and only > set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set. We set the bit > for all client-allocated memory objects but leave it false for > driver-allocated objects. While this is more conservative than needed, > all driver allocations should easily fit in the first 32 bits of address > space and keeps things simple because we don't have to think about > whether or not any given one of our allocation data structures will be > used in a 48-bit-unsafe way. > --- > src/intel/vulkan/anv_allocator.c | 10 -- > src/intel/vulkan/anv_batch_chain.c | 14 ++ > src/intel/vulkan/anv_device.c | 4 +++- > src/intel/vulkan/anv_gem.c | 18 ++ > src/intel/vulkan/anv_intel.c | 2 +- > src/intel/vulkan/anv_private.h | 29 +++-- > 6 files changed, 67 insertions(+), 10 deletions(-) > > diff --git a/src/intel/vulkan/anv_allocator.c > b/src/intel/vulkan/anv_allocator.c > index 45c663b..88c9c13 100644 > --- a/src/intel/vulkan/anv_allocator.c > +++ b/src/intel/vulkan/anv_allocator.c > @@ -255,7 +255,7 @@ anv_block_pool_init(struct anv_block_pool *pool, > assert(util_is_power_of_two(block_size)); > > pool->device = device; > - anv_bo_init(>bo, 0, 0); > + anv_bo_init(>bo, 0, 0, false); > pool->block_size = block_size; > pool->free_list = ANV_FREE_LIST_EMPTY; > pool->back_free_list = ANV_FREE_LIST_EMPTY; > @@ -475,7 +475,13 @@ anv_block_pool_grow(struct anv_block_pool *pool, struct > anv_block_state *state) > * values back into pool. 
*/ > pool->map = map + center_bo_offset; > pool->center_bo_offset = center_bo_offset; > - anv_bo_init(>bo, gem_handle, size); > + > + /* Block pool BOs are marked as not supporting 48-bit addresses because > +* they are used to back STATE_BASE_ADDRESS. > +* > +* See also anv_bo::supports_48bit_address. > +*/ > + anv_bo_init(>bo, gem_handle, size, false); > pool->bo.map = map; > > done: > diff --git a/src/intel/vulkan/anv_batch_chain.c > b/src/intel/vulkan/anv_batch_chain.c > index 5d7abc6..b098e4b 100644 > --- a/src/intel/vulkan/anv_batch_chain.c > +++ b/src/intel/vulkan/anv_batch_chain.c > @@ -979,7 +979,8 @@ anv_execbuf_finish(struct anv_execbuf *exec, > } > > static VkResult > -anv_execbuf_add_bo(struct anv_execbuf *exec, > +anv_execbuf_add_bo(struct anv_device *device, > + struct anv_execbuf *exec, > struct anv_bo *bo, > struct anv_reloc_list *relocs, > const VkAllocationCallbacks *alloc) > @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec, >obj->flags = bo->is_winsys_bo ? 
EXEC_OBJECT_WRITE : 0; >obj->rsvd1 = 0; >obj->rsvd2 = 0; > + > + if (device->instance->physicalDevice.supports_48bit_addresses && > + bo->supports_48bit_address) > + obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS; > } > > if (relocs != NULL && obj->relocation_count == 0) { > @@ -1052,7 +1057,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec, >for (size_t i = 0; i < relocs->num_relocs; i++) { > /* A quick sanity check on relocations */ > assert(relocs->relocs[i].offset < bo->size); > - anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL, alloc); > + anv_execbuf_add_bo(device, exec, relocs->reloc_bos[i], NULL, alloc); >} > } > > @@ -1264,7 +1269,8 @@ anv_cmd_buffer_execbuf(struct anv_device *device, > adjust_relocations_from_state_pool(ss_pool, _buffer->surface_relocs, >cmd_buffer->last_ss_pool_center); > VkResult result = > - anv_execbuf_add_bo(, _pool->bo, _buffer->surface_relocs, > + anv_execbuf_add_bo(device, , _pool->bo, > + _buffer->surface_relocs, > _buffer->pool->alloc); > if (result != VK_SUCCESS) >return result; > @@ -1277,7 +1283,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device, >adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs, > cmd_buffer->last_ss_pool_center); > > - anv_execbuf_add_bo(, &(*bbo)->bo, &(*bbo)->relocs, > + anv_execbuf_add_bo(device, , &(*bbo)->bo, &(*bbo)->relocs, > _buffer->pool->alloc); > } > > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c > index 4e4fa19..f9d04ee 100644 > ---
Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses
On Wed, Mar 29, 2017 at 08:36:36AM -0700, Jason Ekstrand wrote:
> On Wed, Mar 29, 2017 at 1:51 AM, Chris Wilson
> <ch...@chris-wilson.co.uk> wrote:
>> On Tue, Mar 28, 2017 at 05:41:12PM -0700, Jason Ekstrand wrote:
>>> This commit adds support for using the full 48-bit address space on
>>> Broadwell and newer hardware. Thanks to certain limitations, not all
>>> objects can be placed above the 32-bit boundary. In particular, general
>>> and state base address need to live within 32 bits. (See also
>>> Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.) In order
>>> to handle this, we add a supports_48bit_address field to anv_bo and only
>>> set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set. We set the bit
>>> for all client-allocated memory objects but leave it false for
>>> driver-allocated objects. While this is more conservative than needed,
>>> all driver allocations should easily fit in the first 32 bits of address
>>> space and keeps things simple because we don't have to think about
>>> whether or not any given one of our allocation data structures will be
>>> used in a 48-bit-unsafe way.
>>> ---
>>>  static VkResult
>>> -anv_execbuf_add_bo(struct anv_execbuf *exec,
>>> +anv_execbuf_add_bo(struct anv_device *device,
>>> +                   struct anv_execbuf *exec,
>>>                     struct anv_bo *bo,
>>>                     struct anv_reloc_list *relocs,
>>>                     const VkAllocationCallbacks *alloc)
>>> @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
>>>        obj->flags = bo->is_winsys_bo ? EXEC_OBJECT_WRITE : 0;
>>>        obj->rsvd1 = 0;
>>>        obj->rsvd2 = 0;
>>> +
>>> +      if (device->instance->physicalDevice.supports_48bit_addresses &&
>>> +          bo->supports_48bit_address)
>>> +         obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>
>> Don't set bo->supports_48bit_address when
>> !device->instance->physicalDevice.supports_48bit_addresses? My guess is
>> that flagging bo is a rarer task than add_bo(), and it looks like you
>> already have device available in the callers of bo_init(true).
>
> I thought about making that change right before I sent it but decided I
> marginally liked this better. I'm happy to change it if you'd like.

You're also the maintainer, you have to live with it so pick whichever
you find easier to read and less likely to get in the way of future
changes :)

>>> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
>>> index 27c887c..425e376 100644
>>> --- a/src/intel/vulkan/anv_private.h
>>> +++ b/src/intel/vulkan/anv_private.h
>>> @@ -299,11 +299,34 @@ struct anv_bo {
>>>      * writing to them and synchronize uses on other rings (eg if the display
>>>      * server uses the blitter ring).
>>>      */
>>> -   bool is_winsys_bo;
>>> +   bool is_winsys_bo:1;
>>> +
>>> +   /* Whether or not this BO supports having a 48-bit address. Not all
>>> +    * buffers support arbitrary 48-bit addresses. In particular, we need to
>>> +    * be careful with general and instruction state buffers because we set the
>>> +    * size in STATE_BASE_ADDRESS to 0xf (the maximum) even though the BO
>>> +    * is most likely significantly smaller. If we let the kernel place it
>>> +    * anywhere it wants, it will default to placing it as high up the address
>>> +    * space as possible, the range specified by STATE_BASE_ADDRESS will
>>> +    * over-flow the 48-bit address range, and the GPU will hang. In order to
>>> +    * avoid this problem, we tell the kernel that the buffer does not support
>>> +    * 48-bit addresses, and it places the buffer at a 32-bit address. While
>>> +    * this solution is probably overkill, it is effective.
>>
>> How about just setting the field to the bo->size? You must know the bo
>> already at that point so that you can set the relocation target.
>
> Actually, we don't. We have a pointer to a thing that claims to be a BO
> but the actual GEM handle and size aren't known until execbuf time. (Yes,
> that's a bit weird but there are good reasons for it and it's not likely
> to change. When we stop doing relocations, there's a separate plan for
> how to handle that.)

Hmm. I honestly didn't expect that. Another thing you can do is to use
execobject.size = 4GiB for those buffers. The kernel will then allocate
it 4GiB of space in the GTT; it feels like overkill though. Just limiting
them to the low 4GiB shouldn't be restrictive. I may have to check that
we do allocate those from the bottom -- iirc, we don't require any
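As a rough illustration of the flag logic under discussion, the sketch below computes per-object flags from a device capability and a per-BO opt-in, which is the shape of the check in the quoted hunk. The TOY_* constants and struct are hypothetical stand-ins with made-up bit values; they are not the i915 uAPI definitions or the driver's actual types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, standing in for EXEC_OBJECT_WRITE and
 * EXEC_OBJECT_SUPPORTS_48B_ADDRESS; the values are illustrative only.
 */
#define TOY_EXEC_OBJECT_WRITE        (1u << 0)
#define TOY_EXEC_OBJECT_SUPPORTS_48B (1u << 1)

/* Hypothetical, reduced version of a buffer object. */
struct toy_bo {
   bool is_winsys_bo;
   bool supports_48bit_address;   /* true only for client allocations */
};

static uint32_t
toy_exec_object_flags(bool device_supports_48bit, const struct toy_bo *bo)
{
   assert(bo != NULL);

   uint32_t flags = bo->is_winsys_bo ? TOY_EXEC_OBJECT_WRITE : 0;

   /* Only let the kernel place the buffer above 4GiB when both the device
    * and this particular buffer allow it. Driver-internal buffers keep the
    * bit clear and therefore stay at 32-bit addresses.
    */
   if (device_supports_48bit && bo->supports_48bit_address)
      flags |= TOY_EXEC_OBJECT_SUPPORTS_48B;

   return flags;
}
```

The alternative raised in the review would move the device check to the point where the BO is flagged, so that this function only needs to look at the BO.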
Re: [Mesa-dev] [PATCH] [RFC] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.
Please ignore the above patch.

On Wed, Mar 29, 2017 at 5:48 PM, Bartosz Tomczyk <bartosz.tomczy...@gmail.com> wrote:
> This avoids costly thread synchronisation. With this fix, games that
> previously regressed with mesa_glthread=true, such as Xonotic or GRID
> Autosport, should no longer regress.
>
> Could someone test whether games that benefit from glthread have not
> regressed?
> ---
>  src/mesa/main/glthread.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
> index 06115b916d..d46288c242 100644
> --- a/src/mesa/main/glthread.c
> +++ b/src/mesa/main/glthread.c
> @@ -252,12 +252,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
>     if (pthread_self() == glthread->thread)
>        return;
>
> -   _mesa_glthread_flush_batch(ctx);
> -
>     pthread_mutex_lock(&glthread->mutex);
>
> -   while (glthread->batch_queue || glthread->busy)
> -      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   if (!(glthread->batch_queue || glthread->busy))
> +   {
> +      if (glthread->batch && glthread->batch->used)
> +      {
> +         glthread_unmarshal_batch(ctx, glthread->batch);
> +      }
> +      glthread_allocate_batch(ctx);
> +   }
> +   else
> +   {
> +      while (glthread->batch_queue || glthread->busy)
> +         pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   }
>
>     pthread_mutex_unlock(&glthread->mutex);
>  }
> --
> 2.12.2
[Mesa-dev] [PATCH] [RFC] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.
This avoids costly thread synchronisation. With this fix, games that
previously regressed with mesa_glthread=true, such as Xonotic or GRID
Autosport, should no longer regress.

Could someone test whether games that benefit from glthread have not
regressed?
---
 src/mesa/main/glthread.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 06115b916d..d46288c242 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -252,12 +252,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
    if (pthread_self() == glthread->thread)
       return;
 
-   _mesa_glthread_flush_batch(ctx);
-
    pthread_mutex_lock(&glthread->mutex);
 
-   while (glthread->batch_queue || glthread->busy)
-      pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   if (!(glthread->batch_queue || glthread->busy))
+   {
+      if (glthread->batch && glthread->batch->used)
+      {
+         glthread_unmarshal_batch(ctx, glthread->batch);
+      }
+      glthread_allocate_batch(ctx);
+   }
+   else
+   {
+      while (glthread->batch_queue || glthread->busy)
+         pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   }
 
    pthread_mutex_unlock(&glthread->mutex);
 }
-- 
2.12.2
Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses
On Wed, Mar 29, 2017 at 1:51 AM, Chris Wilsonwrote: > On Tue, Mar 28, 2017 at 05:41:12PM -0700, Jason Ekstrand wrote: > > This commit adds support for using the full 48-bit address space on > > Broadwell and newer hardware. Thanks to certain limitations, not all > > objects can be placed above the 32-bit boundary. In particular, general > > and state base address need to live within 32 bits. (See also > > Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.) In order > > to handle this, we add a supports_48bit_address field to anv_bo and only > > set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set. We set the bit > > for all client-allocated memory objects but leave it false for > > driver-allocated objects. While this is more conservative than needed, > > all driver allocations should easily fit in the first 32 bits of address > > space and keeps things simple because we don't have to think about > > whether or not any given one of our allocation data structures will be > > used in a 48-bit-unsafe way. > > --- > > static VkResult > > -anv_execbuf_add_bo(struct anv_execbuf *exec, > > +anv_execbuf_add_bo(struct anv_device *device, > > + struct anv_execbuf *exec, > > struct anv_bo *bo, > > struct anv_reloc_list *relocs, > > const VkAllocationCallbacks *alloc) > > @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec, > >obj->flags = bo->is_winsys_bo ? EXEC_OBJECT_WRITE : 0; > >obj->rsvd1 = 0; > >obj->rsvd2 = 0; > > + > > + if (device->instance->physicalDevice.supports_48bit_addresses && > > + bo->supports_48bit_address) > > + obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS; > > Don't set bo->supports_48bit_address when > !device->instance->physicalDevice.supports_48bit_addresses? My guess is > that flagging bo is a rarer task than add_bo(), and it looks like you > already have device available in the callers of bo_init(true). > I thought a bout making that change right before I sent it but decided I marginally liked this better. 
I'm happy to change it if you'd like. > > } > > > > if (relocs != NULL && obj->relocation_count == 0) { > > @@ -1052,7 +1057,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec, > >for (size_t i = 0; i < relocs->num_relocs; i++) { > > /* A quick sanity check on relocations */ > > assert(relocs->relocs[i].offset < bo->size); > > - anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL, alloc); > > + anv_execbuf_add_bo(device, exec, relocs->reloc_bos[i], NULL, > alloc); > >} > > } > > > > @@ -1264,7 +1269,8 @@ anv_cmd_buffer_execbuf(struct anv_device *device, > > adjust_relocations_from_state_pool(ss_pool, > _buffer->surface_relocs, > >cmd_buffer->last_ss_pool_center); > > VkResult result = > > - anv_execbuf_add_bo(, _pool->bo, > _buffer->surface_relocs, > > + anv_execbuf_add_bo(device, , _pool->bo, > > + _buffer->surface_relocs, > > _buffer->pool->alloc); > > if (result != VK_SUCCESS) > >return result; > > @@ -1277,7 +1283,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device, > >adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, > &(*bbo)->relocs, > > cmd_buffer->last_ss_pool_ > center); > > > > - anv_execbuf_add_bo(, &(*bbo)->bo, &(*bbo)->relocs, > > + anv_execbuf_add_bo(device, , &(*bbo)->bo, &(*bbo)->relocs, > > _buffer->pool->alloc); > > } > > > > diff --git a/src/intel/vulkan/anv_device.c > b/src/intel/vulkan/anv_device.c > > index 4e4fa19..f9d04ee 100644 > > --- a/src/intel/vulkan/anv_device.c > > +++ b/src/intel/vulkan/anv_device.c > > @@ -149,6 +149,8 @@ anv_physical_device_init(struct anv_physical_device > *device, > >goto fail; > > } > > > > + device->supports_48bit_addresses = anv_gem_supports_48b_ > addresses(fd); > > + > > if (!anv_device_get_cache_uuid(device->uuid)) { > >result = vk_errorf(VK_ERROR_INITIALIZATION_FAILED, > > "cannot generate UUID"); > > @@ -1396,7 +1398,7 @@ anv_bo_init_new(struct anv_bo *bo, struct > anv_device *device, uint64_t size) > > if (!gem_handle) > >return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY); > > > > - anv_bo_init(bo, 
gem_handle, size); > > + anv_bo_init(bo, gem_handle, size, true); > > > > return VK_SUCCESS; > > } > > diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c > > index 0dde6d9..3d45243 100644 > > --- a/src/intel/vulkan/anv_gem.c > > +++ b/src/intel/vulkan/anv_gem.c > > @@ -301,6 +301,24 @@ anv_gem_get_aperture(int fd, uint64_t *size) > > return 0; > > } > > > > +bool > > +anv_gem_supports_48b_addresses(int fd) > > +{ > > + struct drm_i915_gem_exec_object2 obj = { > > + .flags =
Re: [Mesa-dev] [Request for Comments] - Port documentation to Markdown
Hi Jean,

On 8 March 2017 at 16:12, Brian Paul wrote:
>>> One thing that I would prefer not to see is heavy things like
>>> Bootstrap. We definitely don't need it, I think writing our own few
>>> lines of CSS (which can be inspired by anything you want) is better.
>>> We have more than enough people who know how to do it (myself
>>> included), it will be cleaner (we won't need to include the whole
>>> forest to get our tree) and much easier to fix when there's a bug.
>>
>> I would tend to agree but I don't care too much about those details so
>> long as it's maintainable. My primary concern is that while a lot of
>> random developers in the community are liable to have brushed into CSS
>> a time or two, most probably won't know bootstrap.
>
> Yeah, I can't stress that too much. The site has to be easily
> maintainable by the developers. I, for one, don't know much about
> websites beyond html and a little CSS. If you create a new website
> infrastructure and then disappear after a few months we need to be able
> to take over. Also, we can't funnel documentation updates through a
> handful of people that know a complex system.
>

Have you had some time to look into this? It would be great if we can get
things rolling, even if not perfect.

Thanks
Emil
Re: [Mesa-dev] [PATCH 0/2] nir: Add support for 8 and 16-bit types
On Wed, Mar 29, 2017 at 12:41 AM, Eduardo Lima Mitev wrote:
> Both patches need rebase, but look fine otherwise.
>

The first has already landed (I think). The second definitely needs
rebasing. Yesterday, I rebased it on top of the other two
constant_expressions fixup patches I sent out:

https://patchwork.freedesktop.org/series/21244/

It would be nice if that series landed first as it cleans things up
substantially.

> Series is:
>
> Reviewed-by: Eduardo Lima Mitev
>

Thanks!

> On 03/09/2017 11:05 PM, Jason Ekstrand wrote:
>> This tiny series adds support in NIR for 8 and 16-bit types. In
>> particular, it now supports int8_t, uint8_t, int16_t, uint16_t, and
>> float16_t. No 8-bit floating-point type is supported because 8-bit
>> float would be stupid.
>>
>> These patches have been tested in Jenkins but no 8 or 16-bit code has
>> been run through it yet. Even if people don't want to land the second
>> patch (due to not having a vertical slice), I'd like to land the first
>> refactor patch.
>>
>> Jason Ekstrand (2):
>>   nir/constant_expressions: Refactor helper functions
>>   nir: Add support for 8 and 16-bit types
>>
>>  src/compiler/nir/nir.h                       |  4 ++
>>  src/compiler/nir/nir_constant_expressions.py | 67 +---
>>  src/compiler/nir/nir_opcodes.py              |  6 ++-
>>  3 files changed, 51 insertions(+), 26 deletions(-)
>>
>
Re: [Mesa-dev] [PATCH v3] i965: expose BRW_OPCODE_[F32TO16/F16TO32] name on gen8+
Thanks. That looks good.

Reviewed-by: Matt Turner
[Mesa-dev] [PATCH v3] i965: expose BRW_OPCODE_[F32TO16/F16TO32] name on gen8+
Technically those hw operations are only available on gen7, as gen8+
supports the conversion on the MOV. But when using the builder to
implement nir operations (example: nir_op_fquantize2f16), there is no
need to do the gen check. This check is done later, on the final
emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
specific operation accordingly.

So in the middle, during optimization phases, those hw operations can
be around for gen8+ too.

Without this patch, several (at least 95) vulkan-cts quantize tests
crash when using INTEL_DEBUG=optimizer. For example:
dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert

v2: simplify the code using GEN_GE (Ilia Mirkin)
v3: tweak brw_instruction_name instead of changing the opcode_descs
    table, which is used for validation (Matt Turner)
---

I'm not really proud of the comment, but I hope it explains well why it
is needed. Comments are welcome.

 src/intel/compiler/brw_shader.cpp | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
index bfaa5e7..73bbc93 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -157,6 +157,15 @@ brw_instruction_name(const struct gen_device_info *devinfo, enum opcode op)
       if (devinfo->gen >= 6 && op == BRW_OPCODE_DO)
          return "do";
 
+      /* The following conversion opcodes don't exist on Gen8+, but we use
+       * them to mark that we want to do the conversion.
+       */
+      if (devinfo->gen > 7 && op == BRW_OPCODE_F32TO16)
+         return "f32to16";
+
+      if (devinfo->gen > 7 && op == BRW_OPCODE_F16TO32)
+         return "f16to32";
+
       assert(brw_opcode_desc(devinfo, op)->name);
       return brw_opcode_desc(devinfo, op)->name;
    case FS_OPCODE_FB_WRITE:
-- 
2.9.3
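To make the distinction between the validation table and the name lookup concrete, here is a small self-contained sketch. It assumes a hypothetical per-opcode table keyed by a generation range; the toy_* names and the generation numbers in the table are illustrative and are not the real opcode_descs contents.

```c
#include <assert.h>
#include <stddef.h>

enum toy_opcode {
   TOY_OP_MOV,
   TOY_OP_F32TO16,
   TOY_OP_F16TO32,
   TOY_OP_COUNT
};

/* Hypothetical hardware description: which generations really have the
 * opcode. The ranges below are made up for the example.
 */
struct toy_opcode_desc {
   const char *name;
   int first_gen;
   int last_gen;
};

static const struct toy_opcode_desc toy_descs[TOY_OP_COUNT] = {
   [TOY_OP_MOV]     = { "mov",     4, 9 },
   [TOY_OP_F32TO16] = { "f32to16", 7, 7 },
   [TOY_OP_F16TO32] = { "f16to32", 7, 7 },
};

/* Returns the hardware description, or NULL if the opcode does not exist
 * on this generation. A validator would rely on this staying truthful.
 */
static const struct toy_opcode_desc *
toy_opcode_desc(int gen, enum toy_opcode op)
{
   assert(op < TOY_OP_COUNT);
   const struct toy_opcode_desc *desc = &toy_descs[op];
   return (gen >= desc->first_gen && gen <= desc->last_gen) ? desc : NULL;
}

/* Name lookup used for debug dumps of the intermediate representation.
 * Opcodes that only live in the IR on newer generations get a name here
 * without being added to the hardware table.
 */
static const char *
toy_instruction_name(int gen, enum toy_opcode op)
{
   if (gen > 7 && op == TOY_OP_F32TO16)
      return "f32to16";
   if (gen > 7 && op == TOY_OP_F16TO32)
      return "f16to32";

   const struct toy_opcode_desc *desc = toy_opcode_desc(gen, op);
   return desc ? desc->name : NULL;
}
```

With this split, a dump of the IR on a newer generation can still print a name for the conversion, while the table keeps saying that the hardware opcode is absent there.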
Re: [Mesa-dev] [PATCH] i965: expose BRW_OPCODE_[F32TO16/F16TO32] opcode_descs on gen8+
On 29/03/17 16:15, Matt Turner wrote:
> On Wed, Mar 29, 2017 at 4:47 AM, Alejandro Piñeiro wrote:
>> Technically those hw operations are only available on gen7, as gen8+
>> support the conversion on the MOV. But, when using the builder to
>> implement nir operations (example: nir_op_fquantize2f16), it is not
>> needed to do the gen check. This check is done later, on the final
>> emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
>> specific operation accordingly.
>>
>> So in the middle, during optimization phases those hw operations can
>> be around for gen8+ too.
>>
>> Without this patch, several (at least 95) vulkan-cts quantize tests
>> crashes when using INTEL_DEBUG=optimizer. For example:
>> dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert
>> ---
>>  src/intel/compiler/brw_eu.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
>> index 77400c1..bff37d7 100644
>> --- a/src/intel/compiler/brw_eu.c
>> +++ b/src/intel/compiler/brw_eu.c
>> @@ -499,10 +499,10 @@ static const struct opcode_desc opcode_descs[128] = {
>>        .name = "csel",    .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
>>     },
>>     [BRW_OPCODE_F32TO16] = {
>> -      .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
>> +      .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 | GEN9,
>>     },
>>     [BRW_OPCODE_F16TO32] = {
>> -      .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
>> +      .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 | GEN9,
>>     },
> This table is for hardware information, used by brw_eu_validate.c.
> Since these opcodes do not exist on Gen8+, we should not add that to
> the table.
>
> I assume that the crashes you are referring to are assertion failures
> in brw_instruction_name() -- assert(brw_opcode_desc(devinfo,
> op)->name)
>
> If that's the case, there's an identical case immediately above. We
> use BRW_OPCODE_DO in the backend IRs, but that opcode is not used on
> Gen6+. I would add two more cases for f32to16 and f16to32 there.

Ok, thanks for the hints. I will work on a v3 of the patch.

> Perhaps we should not use BRW_OPCODE_* for operations used in the
> backend IR that may not actually exist as a real opcode in hardware.
> Not sure.

Yes, at first I found it somewhat counter-intuitive, so I checked just
in case, and the same thing (or something very similar) happens with
several other hw opcodes. The alternative would be to create a new kind
of opcode, having hw_opcode and _opcode. But I don't think it is worth
the effort, and it is okay-ish to just remember that a lot still happens
after calling bld.emit(opcode, ...).

BR
Re: [Mesa-dev] [PATCH v2 05/25] gallium: add sparse buffer interface and capability
On Wed, Mar 29, 2017 at 12:26 PM, Nicolai Hähnlewrote: > On 28.03.2017 21:46, Marek Olšák wrote: >> >> On Tue, Mar 28, 2017 at 11:11 AM, Nicolai Hähnle >> wrote: >>> >>> From: Nicolai Hähnle >>> >>> TODO fill out caps in all drivers >>> >>> v2: >>> - explain the resource_commit interface in more detail >>> --- >>> src/gallium/docs/source/context.rst | 25 + >>> src/gallium/docs/source/screen.rst | 3 +++ >>> src/gallium/include/pipe/p_context.h | 13 + >>> src/gallium/include/pipe/p_defines.h | 2 ++ >>> 4 files changed, 43 insertions(+) >>> >>> diff --git a/src/gallium/docs/source/context.rst >>> b/src/gallium/docs/source/context.rst >>> index a053193..5949ff2 100644 >>> --- a/src/gallium/docs/source/context.rst >>> +++ b/src/gallium/docs/source/context.rst >>> @@ -611,20 +611,45 @@ for both regular textures as well as for >>> framebuffers read via FBFETCH. >>> .. _memory_barrier: >>> >>> memory_barrier >>> %%% >>> >>> This function flushes caches according to which of the PIPE_BARRIER_* >>> flags >>> are set. >>> >>> >>> >>> +.. _resource_commit: >>> + >>> +resource_commit >>> +%%% >>> + >>> +This function changes the commit state of a part of a sparse resource. >>> Sparse >>> +resources are created by setting the ``PIPE_RESOURCE_FLAG_SPARSE`` flag >>> when >>> +calling ``resource_create``. Initially, sparse resources only reserve a >>> virtual >>> +memory region that is not backed by memory (i.e., it is uncommitted). >>> The >>> +``resource_commit`` function can be called to commit or uncommit parts >>> (or all) >>> +of a resource. The driver manages the underlying backing memory. >>> + >>> +The contents of newly committed memory regions are undefined. Calling >>> this >>> +function to commit an already committed memory region is allowed and >>> leaves its >>> +content unchanged. Similarly, calling this function to uncommit an >>> already >>> +uncommitted memory region is allowed. 
>>> + >>> +For buffers, the given box must be aligned to multiples of >>> +``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``. As an exception to this rule, if >>> the size >>> +of the buffer is not a multiple of the page size, changing the commit >>> state of >>> +the last (partial) page requires a box that ends at the end of the >>> buffer >>> +(i.e., box->x + box->width == buffer->width0). >>> + >>> + >>> + >>> .. _pipe_transfer: >>> >>> PIPE_TRANSFER >>> ^ >>> >>> These flags control the behavior of a transfer object. >>> >>> ``PIPE_TRANSFER_READ`` >>>Resource contents read back (or accessed directly) at transfer create >>> time. >>> >>> diff --git a/src/gallium/docs/source/screen.rst >>> b/src/gallium/docs/source/screen.rst >>> index 00c9503..8759639 100644 >>> --- a/src/gallium/docs/source/screen.rst >>> +++ b/src/gallium/docs/source/screen.rst >>> @@ -369,20 +369,23 @@ The integer capabilities: >>>opcode to retrieve the current value in the framebuffer. >>> * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the >>>``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property. >>> * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point >>> operations >>>are supported. >>> * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported. >>> * ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo >>>operations are supported. >>> * ``PIPE_CAP_TGSI_TEX_TXF_LZ``: Whether TEX_LZ and TXF_LZ opcodes are >>>supported. >>> +* ``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``: The page size of sparse buffers >>> in >>> + bytes, or 0 if sparse buffers are not supported. The page size must be >>> at >>> + most 64KB. >>> >>> >>> .. _pipe_capf: >>> >>> PIPE_CAPF_* >>> >>> >>> The floating-point capabilities are: >>> >>> * ``PIPE_CAPF_MAX_LINE_WIDTH``: The maximum width of a regular line. 
>>> diff --git a/src/gallium/include/pipe/p_context.h >>> b/src/gallium/include/pipe/p_context.h >>> index a29fff5..4d5535b 100644 >>> --- a/src/gallium/include/pipe/p_context.h >>> +++ b/src/gallium/include/pipe/p_context.h >>> @@ -578,20 +578,33 @@ struct pipe_context { >>> * Flush any pending framebuffer writes and invalidate texture >>> caches. >>> */ >>> void (*texture_barrier)(struct pipe_context *, unsigned flags); >>> >>> /** >>> * Flush caches according to flags. >>> */ >>> void (*memory_barrier)(struct pipe_context *, unsigned flags); >>> >>> /** >>> +* Change the commitment status of a part of the given resource, >>> which must >>> +* have been created with the PIPE_RESOURCE_FLAG_SPARSE bit. >>> +* >>> +* \param level The texture level whose commitment should be changed. >>> +* \param box The region of the resource whose commitment should be >>> changed. >>> +* \param commit Whether memory should be committed or un-committed. >>> +* >>> +* \return false if out of
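For reference, the buffer alignment rule described in the quoted context.rst text could be checked along these lines. This is only a sketch of how I read the rule; sparse_buffer_box_is_valid() is a made-up helper and not part of the patch, and the page size is assumed to be whatever PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: checks whether the buffer
 * range [x, x + width) would be acceptable for resource_commit, given
 * the page size reported by PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE. */
static bool
sparse_buffer_box_is_valid(uint32_t page_size, uint32_t buffer_size,
                           uint32_t x, uint32_t width)
{
   assert(page_size > 0);

   /* The range must lie inside the buffer. */
   if (x > buffer_size || width > buffer_size - x)
      return false;

   /* The start of the range must always be page aligned. */
   if (x % page_size != 0)
      return false;

   /* The end must be page aligned as well, except when it coincides
    * with the end of the buffer (the last, partial page). */
   uint32_t end = x + width;
   return end % page_size == 0 || end == buffer_size;
}
```

With a 64KB page and a 200000 byte buffer, the last page starts at 196608 and can only be committed with a box of width 3392, i.e. one that ends exactly at the end of the buffer.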
Re: [Mesa-dev] [PATCH] radv: move to using nir clip/cull merge pass.
Reviewed-by: Edward O'CallaghanOn 03/29/2017 04:14 PM, Dave Airlie wrote: > From: Dave Airlie > > Doing this before tessellation makes doing some bits of > tessellation a bit cleaner. It also cleans up a bit of the > llvm generator code. > > Signed-off-by: Dave Airlie > --- > src/amd/common/ac_nir_to_llvm.c | 144 > ++-- > src/amd/vulkan/radv_pipeline.c | 1 + > 2 files changed, 36 insertions(+), 109 deletions(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c > index f164d8f..78602fd 100644 > --- a/src/amd/common/ac_nir_to_llvm.c > +++ b/src/amd/common/ac_nir_to_llvm.c > @@ -144,8 +144,6 @@ struct nir_to_llvm_context { > int num_locals; > LLVMValueRef *locals; > bool has_ddxy; > - uint8_t num_input_clips; > - uint8_t num_input_culls; > uint8_t num_output_clips; > uint8_t num_output_culls; > > @@ -170,12 +168,9 @@ static unsigned > shader_io_get_unique_index(gl_varying_slot slot) > return 0; > if (slot == VARYING_SLOT_PSIZ) > return 1; > - if (slot == VARYING_SLOT_CLIP_DIST0 || > - slot == VARYING_SLOT_CULL_DIST0) > + if (slot == VARYING_SLOT_CLIP_DIST0) > return 2; > - if (slot == VARYING_SLOT_CLIP_DIST1 || > - slot == VARYING_SLOT_CULL_DIST1) > - return 3; > + /* 3 is reserved for clip dist as well */ > if (slot >= VARYING_SLOT_VAR0 && slot <= VARYING_SLOT_VAR31) > return 4 + (slot - VARYING_SLOT_VAR0); > unreachable("illegal slot in get unique index\n"); > @@ -2195,7 +2190,6 @@ load_gs_input(struct nir_to_llvm_context *ctx, > unsigned param, vtx_offset_param; > LLVMValueRef value[4], result; > unsigned vertex_index; > - unsigned cull_offset = 0; > radv_get_deref_offset(ctx, >variables[0]->deref, > false, _index, > _index, _index); > @@ -2205,13 +2199,11 @@ load_gs_input(struct nir_to_llvm_context *ctx, > LLVMConstInt(ctx->i32, 4, false), ""); > > param = > shader_io_get_unique_index(instr->variables[0]->var->data.location); > - if (instr->variables[0]->var->data.location == VARYING_SLOT_CULL_DIST0) > - cull_offset += 
ctx->num_input_clips; > for (unsigned i = 0; i < instr->num_components; i++) { > > args[0] = ctx->esgs_ring; > args[1] = vtx_offset; > - args[2] = LLVMConstInt(ctx->i32, (param * 4 + i + const_index + > cull_offset) * 256, false); > + args[2] = LLVMConstInt(ctx->i32, (param * 4 + i + const_index) > * 256, false); > args[3] = ctx->i32zero; > args[4] = ctx->i32one; /* OFFEN */ > args[5] = ctx->i32zero; /* IDXEN */ > @@ -2366,8 +2358,7 @@ visit_store_var(struct nir_to_llvm_context *ctx, > > value = llvm_extract_elem(ctx, src, chan); > > - if (instr->variables[0]->var->data.location == > VARYING_SLOT_CLIP_DIST0 || > - instr->variables[0]->var->data.location == > VARYING_SLOT_CULL_DIST0) > + if (instr->variables[0]->var->data.compact) > stride = 1; > if (indir_index) { > unsigned count = glsl_count_attribute_slots( > @@ -3143,7 +3134,7 @@ visit_emit_vertex(struct nir_to_llvm_context *ctx, > LLVMValueRef gs_next_vertex; > LLVMValueRef can_emit, kill; > int idx; > - int clip_cull_slot = -1; > + > assert(instr->const_index[0] == 0); > /* Write vertex attribute values to GSVS ring */ > gs_next_vertex = LLVMBuildLoad(ctx->builder, > @@ -3175,27 +3166,11 @@ visit_emit_vertex(struct nir_to_llvm_context *ctx, > if (!(ctx->output_mask & (1ull << i))) > continue; > > - if (i == VARYING_SLOT_CLIP_DIST1 || > - i == VARYING_SLOT_CULL_DIST1) > - continue; > - > - if (i == VARYING_SLOT_CLIP_DIST0 || > - i == VARYING_SLOT_CULL_DIST0) { > + if (i == VARYING_SLOT_CLIP_DIST0) { > /* pack clip and cull into a single set of slots */ > - if (clip_cull_slot == -1) { > - clip_cull_slot = idx; > - if (ctx->num_output_clips + > ctx->num_output_culls > 4) > - slot_inc = 2; > - } else { > - slot = clip_cull_slot; > - slot_inc = 0; > - } > - if (i == VARYING_SLOT_CLIP_DIST0) > - length = ctx->num_output_clips; > - if (i == VARYING_SLOT_CULL_DIST0) {
Re: [Mesa-dev] [PATCH] i965: expose BRW_OPCODE_[F32TO16/F16TO32] opcode_descs on gen8+
On Wed, Mar 29, 2017 at 4:47 AM, Alejandro Piñeiro wrote:
> Technically those hw operations are only available on gen7, as gen8+
> support the conversion on the MOV. But, when using the builder to
> implement nir operations (example: nir_op_fquantize2f16), it is not
> needed to do the gen check. This check is done later, on the final
> emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
> specific operation accordingly.
>
> So in the middle, during optimization phases those hw operations can
> be around for gen8+ too.
>
> Without this patch, several (at least 95) vulkan-cts quantize tests
> crashes when using INTEL_DEBUG=optimizer. For example:
> dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert
> ---
>  src/intel/compiler/brw_eu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
> index 77400c1..bff37d7 100644
> --- a/src/intel/compiler/brw_eu.c
> +++ b/src/intel/compiler/brw_eu.c
> @@ -499,10 +499,10 @@ static const struct opcode_desc opcode_descs[128] = {
>        .name = "csel",    .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
>     },
>     [BRW_OPCODE_F32TO16] = {
> -      .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
> +      .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 | GEN9,
>     },
>     [BRW_OPCODE_F16TO32] = {
> -      .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
> +      .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 | GEN9,
>     },

This table is for hardware information, used by brw_eu_validate.c.
Since these opcodes do not exist on Gen8+, we should not add that to
the table.

I assume that the crashes you are referring to are assertion failures
in brw_instruction_name() -- assert(brw_opcode_desc(devinfo, op)->name)

If that's the case, there's an identical case immediately above. We
use BRW_OPCODE_DO in the backend IRs, but that opcode is not used on
Gen6+. I would add two more cases for f32to16 and f16to32 there.

Perhaps we should not use BRW_OPCODE_* for operations used in the
backend IR that may not actually exist as a real opcode in hardware.
Not sure.
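To make the suggestion concrete, here is a rough sketch of what special-casing those opcodes in the name lookup might look like. Everything in it (the enum, descs[], lookup_desc(), instruction_name()) is a simplified stand-in invented for illustration, not the actual i965 code; the point is only that the hardware table stays untouched while the name lookup handles IR-only opcodes before consulting it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are simplified stand-ins, not the real i965 ones. */
enum opcode { OP_MOV, OP_DO, OP_F32TO16, OP_F16TO32, OP_COUNT };

enum { GEN7 = 1 << 0, GEN75 = 1 << 1, GEN8 = 1 << 2, GEN9 = 1 << 3 };

struct opcode_desc {
   const char *name;
   unsigned gens;   /* bitmask of generations with a real hw opcode */
};

/* Hardware table: only lists what the hardware really implements. */
static const struct opcode_desc descs[OP_COUNT] = {
   [OP_MOV]     = { "mov",     GEN7 | GEN75 | GEN8 | GEN9 },
   [OP_F32TO16] = { "f32to16", GEN7 | GEN75 },
   [OP_F16TO32] = { "f16to32", GEN7 | GEN75 },
   /* OP_DO deliberately has no hardware entry in this sketch. */
};

static const struct opcode_desc *
lookup_desc(unsigned gen, enum opcode op)
{
   if (descs[op].name && (descs[op].gens & gen))
      return &descs[op];
   return NULL;
}

/* Name lookup used when dumping the IR: opcodes that may exist only in
 * the backend IR are named here, before the hardware table is asked. */
static const char *
instruction_name(unsigned gen, enum opcode op)
{
   switch (op) {
   case OP_DO:
      return "do";
   case OP_F32TO16:
      return "f32to16";
   case OP_F16TO32:
      return "f16to32";
   default: {
      const struct opcode_desc *desc = lookup_desc(gen, op);
      assert(desc && desc->name);
      return desc->name;
   }
   }
}
```

In this sketch lookup_desc(GEN8, OP_F32TO16) still returns NULL, so a validator built on the table would keep rejecting the opcode on gen8+, while instruction_name() no longer trips the assertion.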
[Mesa-dev] [PATCH v2] i965: expose BRW_OPCODE_[F32TO16/F16TO32] opcode_descs on gen8+
Technically those hw operations are only available on gen7, as gen8+
supports the conversion on the MOV. But when using the builder to
implement nir operations (example: nir_op_fquantize2f16), there is no
need to do the gen check. That check is done later, on the final
emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
specific operation accordingly.

So in the middle, during the optimization phases, those hw operations
can be around for gen8+ too.

Without this patch, several (at least 95) vulkan-cts quantize tests
crash when using INTEL_DEBUG=optimizer. For example:
dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert

v2: simplify the code using GEN_GE (Ilia Mirkin)
---
 src/intel/compiler/brw_eu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
index 77400c1..e7dd325 100644
--- a/src/intel/compiler/brw_eu.c
+++ b/src/intel/compiler/brw_eu.c
@@ -499,10 +499,10 @@ static const struct opcode_desc opcode_descs[128] = {
       .name = "csel",    .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
    },
    [BRW_OPCODE_F32TO16] = {
-      .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
+      .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN_GE(GEN7),
    },
    [BRW_OPCODE_F16TO32] = {
-      .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
+      .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN_GE(GEN7),
    },
    /* Reserved - 21-22 */
    [BRW_OPCODE_BFREV] = {
-- 
2.9.3
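As a side note on why GEN_GE(GEN7) can replace the explicit GEN7 | GEN75 | GEN8 | GEN9 list: if each generation is a single bit in increasing order, "this generation and everything newer" is just the complement of the lower bits. The definitions below are an assumption made for illustration (one bit per generation, GEN_LT as a helper); they are not copied from brw_eu.c.

```c
#include <assert.h>

/* Assumed layout: one bit per generation, oldest generation lowest. */
enum gen {
   GEN4  = 1 << 0,
   GEN45 = 1 << 1,
   GEN5  = 1 << 2,
   GEN6  = 1 << 3,
   GEN7  = 1 << 4,
   GEN75 = 1 << 5,
   GEN8  = 1 << 6,
   GEN9  = 1 << 7,
};

/* All generations strictly older than gen: every bit below gen's bit. */
#define GEN_LT(gen) ((gen) - 1)
/* gen and all newer generations: everything that is not older. */
#define GEN_GE(gen) (~GEN_LT(gen))

/* Returns nonzero if generation g is included in mask. */
static int
gen_supported(unsigned mask, enum gen g)
{
   return (mask & g) != 0;
}
```

Under these assumptions GEN_GE(GEN7) covers GEN7, GEN75, GEN8 and GEN9 (plus any future higher bit) and excludes GEN6 and older.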
Re: [Mesa-dev] [PATCH] i965: expose BRW_OPCODE_[F32TO16/F16TO32] opcode_descs on gen8+
I guess you want GEN_GE(GEN7), no?

On Mar 29, 2017 7:48 AM, "Alejandro Piñeiro" wrote:

> Technically those hw operations are only available on gen7, as gen8+
> support the conversion on the MOV. But, when using the builder to
> implement nir operations (example: nir_op_fquantize2f16), it is not
> needed to do the gen check. This check is done later, on the final
> emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
> specific operation accordingly.
>
> So in the middle, during optimization phases those hw operations can
> be around for gen8+ too.
>
> Without this patch, several (at least 95) vulkan-cts quantize tests
> crashes when using INTEL_DEBUG=optimizer. For example:
> dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert
> ---
>  src/intel/compiler/brw_eu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
> index 77400c1..bff37d7 100644
> --- a/src/intel/compiler/brw_eu.c
> +++ b/src/intel/compiler/brw_eu.c
> @@ -499,10 +499,10 @@ static const struct opcode_desc opcode_descs[128] = {
>        .name = "csel",    .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
>     },
>     [BRW_OPCODE_F32TO16] = {
> -      .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
> +      .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 | GEN9,
>     },
>     [BRW_OPCODE_F16TO32] = {
> -      .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
> +      .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 | GEN9,
>     },
>     /* Reserved - 21-22 */
>     [BRW_OPCODE_BFREV] = {
> --
> 2.9.3
Re: [Mesa-dev] [PATCH] gbm/dri: Flush after unmap
On 29 March 2017 at 13:02, Thomas Hellstrom wrote:
> Hi, Emil,
>
> On 03/29/2017 01:30 PM, Emil Velikov wrote:
>> Hi Thomas,
>>
>> On 28 March 2017 at 20:39, Thomas Hellstrom wrote:
>>> Drivers may queue dma operations on the context at unmap time so we need
>>> to flush to make sure the data gets to the bo. Ideally the application
>>> would take care of this, but since there appears to be no exported gbm
>>> flush functionality we need to explicitly flush at unmap time.
>>>
>>> This fixes a problem where kmscube on vmwgfx in rgba textured mode would
>>> render using an uninitialized texture rather than the intended
>>> rgba pattern.
>>>
>> I haven't checked but the issue should not be restricted to vmwgfx, right ?
>>
>> Perhaps we should add the following
>> Fixes: 8aeb6d768b4 ("gbm: Add map/unmap functions")
>> CC:
>
> Unfortunately I've, perhaps a bit prematurely, already pushed the fix.
> Is there a way to get it into stable after push?
>
Adding mesa-stable@ to the CC list should do it. Check out the
instructions for more examples.

https://www.mesa3d.org/submittingpatches.html#nominations

>
>>
>>> Signed-off-by: Thomas Hellstrom
>>> ---
>>>  src/gbm/backends/dri/gbm_dri.c | 9 -
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
>>> index ac7ede8..6c2244c 100644
>>> --- a/src/gbm/backends/dri/gbm_dri.c
>>> +++ b/src/gbm/backends/dri/gbm_dri.c
>>> @@ -243,7 +243,7 @@ struct dri_extension_match {
>>>  };
>>>
>>>  static struct dri_extension_match dri_core_extensions[] = {
>>> -   { __DRI2_FLUSH, 1, offsetof(struct gbm_dri_device, flush) },
>>> +   { __DRI2_FLUSH, 4, offsetof(struct gbm_dri_device, flush) },
>> Currently the classic nouveau, radeon/r200 and i915 drivers do not
>> support v4 of the extension.
>> As-is this will 'break' them... if they ever worked to begin with.
>>
>> One solution is to bail out (return -ENOSYS or similar) in map/unmap
>> API of the when the DRI module is too old.
>> Just some ^^ food for thought.
>
> Hmm. Is there even a use-case for gbm with those drivers? If so we
> should perhaps make them up-to-date with the flush extension.
>
Of the above:
 - nouveau: Does not support DRI_IMAGE, thus it doesn't work even
   before the patch.
 - i915: I have some untested ancient patches. Will see if I can rebase
   + send out.
 - radeons: ?? If someone reports an issue we can ask them to
   write/test some code, I guess ;-)

-Emil
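A rough sketch of the "bail out when the DRI module is too old" idea mentioned above, as opposed to requiring v4 at load time. The structs and names (flush_ext, device, device_unmap, MIN_FLUSH_VERSION) are hypothetical simplifications and do not match the real gbm_dri code; only the shape of the check is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the gbm/DRI structures. */
struct flush_ext {
   int version;              /* version of the flush extension */
   void (*flush)(void);      /* flush entry point used at unmap time */
};

struct device {
   const struct flush_ext *flush;   /* NULL if the driver has no extension */
};

/* Assumed minimum version that provides the flush call needed at unmap. */
#define MIN_FLUSH_VERSION 4

/* Stub flush implementation that only counts how often it was called. */
static int flush_calls;
static void count_flush(void) { flush_calls++; }

/* Unmap path: refuse to operate on drivers whose flush extension is too
 * old, instead of failing to load them altogether. */
static int
device_unmap(struct device *dev)
{
   if (!dev->flush || dev->flush->version < MIN_FLUSH_VERSION)
      return -ENOSYS;

   /* ... the actual unmap would happen here ... */
   dev->flush->flush();   /* make sure queued dma reaches the bo */
   return 0;
}
```

The trade-off is that old drivers keep working for everything except map/unmap, at the cost of a version check on each call.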
Re: [Mesa-dev] [PATCH v2 2/2] st/mesa: EGLImageTarget* error handling
On Wed, 2017-03-29 at 13:01 +0200, Nicolai Hähnle wrote:
> On 29.03.2017 09:44, Philipp Zabel wrote:
> > Stop trying to specify texture or renderbuffer objects for unsupported
> > EGL images. Generate the error codes specified in the OES_EGL_image
> > extension.
> >
> > EGLImageTargetTexture2D and EGLImageTargetRenderbuffer would call
> > the pipe driver's create_surface callback without ever checking that
> > the given EGL image is actually compatible with the chosen target
> > texture or renderbuffer. This patch adds a call to the pipe driver's
> > is_format_supported callback and generates an INVALID_OPERATION error
> > for unsupported EGL images. If the EGL image handle does not describe
> > a valid EGL image, an INVALID_VALUE error is generated.
> >
> > Signed-off-by: Philipp Zabel
>
> Reviewed-by: Nicolai Hähnle
>
> > ---
> > v2: fixed get_surface to actually use the usage and error parameters
>
> The v2 usually goes above :)

Ok, I'll remember that next time.

> Do you need someone to commit this for you?

Yes, please.

regards
Philipp
Re: [Mesa-dev] [PATCH] radv: Invalidate L2 for TRANSFER_WRITE barriers
On 28 March 2017 at 19:11, Bas Nieuwenhuizen wrote:
> On Tue, Mar 28, 2017 at 6:31 PM, Alex Smith wrote:
>> On 28 March 2017 at 17:09, Emil Velikov wrote:
>>>
>>> On 22 March 2017 at 10:06, Bas Nieuwenhuizen wrote:
>>> > On Tue, Mar 21, 2017 at 1:02 PM, Alex Smith wrote:
>>> >> CP DMA and PKT3_WRITE_DATA (in CmdUpdateBuffer) don't (currently) write
>>> >> through L2. Therefore, to make these writes visible to later accesses
>>> >> we must invalidate L2 rather than just writing it back, to avoid the
>>> >> possibility that stale data is read through L2.
>>> >>
>>> >> Cc: "17.0"
>>> >> Signed-off-by: Alex Smith
>>> >> ---
>>> >> It's possible for both CP DMA and PKT3_WRITE_DATA to write through L2
>>> >> as far as I can see, and changing things so that they do also solves
>>> >> the problems that this patch fixes.
>>> >>
>>> >> However, I don't know what the exact consequences of doing so are, or
>>> >> whether there are any situations where that shouldn't be done, so I've
>>> >> gone with this fix instead as it seems like a safer option for now.
>>> >
>>> > Yeah we should be able to. I'm more comfortable sending this patch to
>>> > stable though, so this patch is
>>> >
>>> Bas, others,
>>>
>>> Patch addresses radv_{src,dst}_access_flush() which landed with commit
>>> 6dbb0eaccc3, after the 17.0 branchpoint.
>>
>>
>> Oops, my mistake.
>>
>> I think radv_CmdPipelineBarrier on the 17.0 branch still needs
>> RADV_CMD_FLAG_INV_GLOBAL_L2 added for TRANSFER_WRITE barriers at least. Bas,
>> do you think that should be added in a separate patch just for stable, or
>> would you prefer to push those later changes to stable as well? Looks like
>> there's some fixes in those as well.
>
> I'd prefer to backport this patch. The other patches IMO contain too
> much risk for regression and are actually mostly for optimizations.

Amazing, thank you. Please add a note like below, so that we get some
nice and clear references

[Bas: patch is a backport for 17.0 of the cherry-pick below]
(cherry picked from commit bc5d587a80b64fb3e0a5ea8067e6317fbca2bbc5)

-Emil
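To illustrate the point of the patch under discussion: a write that bypasses L2 needs an L2 invalidate at the barrier, whereas a write that went through L2 only needs a writeback. The sketch below uses made-up flag names and values (ACCESS_*, FLUSH_*) and a made-up src_access_flush(); the shader-write case in particular is an assumption added for contrast, not something taken from the radv code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical access and flush flags, for illustration only. */
enum {
   ACCESS_SHADER_WRITE   = 1 << 0,   /* assumed to write through L2 */
   ACCESS_TRANSFER_WRITE = 1 << 1,   /* CP DMA / WRITE_DATA, bypasses L2 */
};

enum {
   FLUSH_WRITEBACK_L2 = 1 << 0,      /* write dirty L2 lines to memory */
   FLUSH_INV_L2       = 1 << 1,      /* also drop possibly stale L2 lines */
};

/* Translate the source access mask of a barrier into cache flush bits. */
static uint32_t
src_access_flush(uint32_t src_access)
{
   uint32_t flush_bits = 0;

   /* Data written through L2 is already coherent inside L2; writing it
    * back is enough. */
   if (src_access & ACCESS_SHADER_WRITE)
      flush_bits |= FLUSH_WRITEBACK_L2;

   /* Data written around L2 may leave stale lines in L2, so later reads
    * through L2 need those lines invalidated. */
   if (src_access & ACCESS_TRANSFER_WRITE)
      flush_bits |= FLUSH_INV_L2;

   return flush_bits;
}
```

If the transfer writes were changed to go through L2, as mentioned in the notes of the original patch, the second case could presumably be relaxed to a writeback as well.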