[Mesa-dev] [PATCH] radeonsi: add a workaround for weird s_buffer_load_dword behavior on SI

2017-10-22 Thread Marek Olšák
From: Marek Olšák See my LLVM patch which fixes the root cause. Users have to apply this patch and then they have 2 choices: - Downgrade to LLVM 5.0 - Update to LLVM git after my LLVM patch is pushed. It won't be possible to use current and earlier development version of

[Mesa-dev] [PATCH 3/7] i965: enable varying component packing for BDW+

2017-10-22 Thread Timothy Arceri
shader-db results BDW: total instructions in shared programs: 13192895 -> 13182437 (-0.08%) instructions in affected programs: 827145 -> 816687 (-1.26%) helped: 5199 HURT: 116 total cycles in shared programs: 539249342 -> 539156566 (-0.02%) cycles in affected programs: 21894552 -> 21801776
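
The percentages in the shader-db summary follow directly from the before/after counts. A minimal sketch of that arithmetic, assuming a hypothetical helper name (`sketch_percent_change` is not part of shader-db):

```c
#include <stdint.h>

/* Hypothetical helper: relative change from 'before' to 'after', in percent.
 * A negative result means the count went down (an improvement here). */
static double sketch_percent_change(uint64_t before, uint64_t after)
{
    return ((double)after - (double)before) / (double)before * 100.0;
}
```

Applied to the instruction counts quoted above, `sketch_percent_change(13192895, 13182437)` rounds to -0.08 and `sketch_percent_change(827145, 816687)` rounds to -1.26.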

[Mesa-dev] [PATCH 6/7] radv: clone meta shaders before linking

2017-10-22 Thread Timothy Arceri
The IR is reused in different pipeline combinations so we need to clone it to avoid link time optimisations messing up the original copy. --- src/amd/vulkan/radv_pipeline.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/amd/vulkan/radv_pipeline.c

[Mesa-dev] [PATCH 2/7] nir: add varying component packing helpers

2017-10-22 Thread Timothy Arceri
--- src/compiler/nir/nir.h | 2 + src/compiler/nir/nir_linking_helpers.c | 235 + 2 files changed, 237 insertions(+) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index dd833cf183..6a761ab655 100644 --- a/src/compiler/nir/nir.h +++

[Mesa-dev] [PATCH 4/7] ac: add support for explicit component packing

2017-10-22 Thread Timothy Arceri
This is needed for RADV to support explicit component packing. This is also required to use the new NIR component splitting / packing passes. --- src/amd/common/ac_nir_to_llvm.c | 57 + 1 file changed, 46 insertions(+), 11 deletions(-) diff --git

[Mesa-dev] [PATCH 7/7] radv: enable nir component packing

2017-10-22 Thread Timothy Arceri
SaschaWillems Vulkan demo tessellation: ~4300fps -> ~4800fps --- src/amd/vulkan/radv_pipeline.c | 13 + 1 file changed, 13 insertions(+) diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index d0e47383d7..69bda152e2 100644 ---

[Mesa-dev] [PATCH 1/7] i965: call nir_lower_io_to_scalar() at link time for BDW and above

2017-10-22 Thread Timothy Arceri
This will allow dead components of varyings to be removed. BDW shader-db results: total instructions in shared programs: 13190730 -> 13108459 (-0.62%) instructions in affected programs: 2110903 -> 2028632 (-3.90%) helped: 14043 HURT: 486 total cycles in shared programs: 541148990 -> 540544072

[Mesa-dev] [PATCH 5/7] radv: enable lower to scalar nir pass

2017-10-22 Thread Timothy Arceri
This will allow dead components of varyings to be removed. --- src/amd/vulkan/radv_pipeline.c | 24 1 file changed, 24 insertions(+) diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index 669d9a4858..2a25a423a2 100644 ---

[Mesa-dev] [PATCH v2] clover: Fix compilation after clang r315871

2017-10-22 Thread Jan Vesely
From: Jan Vesely v2: use a more generic compat function Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103388 Signed-off-by: Jan Vesely --- src/gallium/state_trackers/clover/llvm/codegen/common.cpp | 5 ++---

Re: [Mesa-dev] [PATCH v2] i965 : optimized bucket index calculation

2017-10-22 Thread Marathe, Yogesh
Ian, All other review comments are noted; we'll get back to you. Thanks. >-Original Message- >From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of >Ian Romanick >Sent: Friday, October 20, 2017 9:51 AM >To: Muthukumar, Aravindan ; mesa- >>

[Mesa-dev] [PATCH 2/6] util: move pipe_barrier into src/util and rename to util_barrier

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle The #if guard is probably not 100% equivalent to the previous PIPE_OS check, but if anything it should be an over-approximation (are there pthread implementations without barriers?), so people will get either a good implementation or compile errors

[Mesa-dev] [PATCH 4/6] radeonsi: move pipe debug callback to si_context

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/radeon/r600_pipe_common.c | 12 src/gallium/drivers/radeon/r600_pipe_common.h | 1 - src/gallium/drivers/radeonsi/si_compute.c | 6 +++--- src/gallium/drivers/radeonsi/si_pipe.c | 12

[Mesa-dev] [PATCH 6/6] radeonsi: always use async compiles when creating shader/compute states

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle With Gallium threaded contexts, creating shader/compute states is effectively a screen operation, so we should not use context state. In particular, this allows us to avoid using the context's LLVM TargetMachine. This isn't an issue yet because

[Mesa-dev] [PATCH 5/5] mesa: flush and wait after creating a fallback texture

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Fixes non-deterministic failures in dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_sync.images.texture_source.teximage2d_render and others in dEQP-EGL.functional.sharing.gles2.multithread.* --- src/mesa/main/texobj.c | 5 + 1 file

[Mesa-dev] [PATCH 3/5] st/mesa: use asynchronous flushes

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/mesa/state_tracker/st_cb_flush.c | 4 ++-- src/mesa/state_tracker/st_cb_syncobj.c | 26 -- 2 files changed, 26 insertions(+), 4 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_flush.c

[Mesa-dev] [PATCH 0/5] st/dri,mesa: various flushing patches

2017-10-22 Thread Nicolai Hähnle
Hi all, Two bugs that I noticed are fixed in patches #1 and #5. Patches #2 and #4 are mostly cleanups. Patch #3 enables asynchronous Gallium flushes in st/mesa. Section 5.3.1 (Determining Completion of Changes to an object) states that: "The contents of an object T are considered to have been

[Mesa-dev] [PATCH 2/5] st/mesa: remove redundant flushes from st_flush

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle st_flush should flush state tracker-internal state and the pipe, but not mesa/main state. Of the four callers: - glFlush/glFinish already call FLUSH_{VERTICES,STATE}. - st_vdpau doesn't need to call them. - st_manager will now call them explicitly.

[Mesa-dev] [PATCH 1/5] st/dri: use stapi flush instead of pipe flush when creating fences

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle There may be pending operations (e.g. vertices) that need to be flushed by the state tracker. Found by inspection. --- src/gallium/state_trackers/dri/dri_helpers.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git

[Mesa-dev] [PATCH 4/5] mesa: increase MaxServerWaitTimeout

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle The current value was introduced in commit a27180d0d8666, which claims that it represents ~1.11 years. However, it is interpreted in nanoseconds, so it actually only represents ~9.8 hours. That seems a bit short. Use the largest value consistent
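
The unit mix-up described here can be checked with a little arithmetic. The sketch below assumes the old constant was 0x1fff7fffffff (the excerpt does not show the value, so treat it as an assumption) and uses a 365.25-day year; the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed old value of MaxServerWaitTimeout; not shown in the excerpt. */
#define SKETCH_OLD_TIMEOUT UINT64_C(0x1fff7fffffff)

/* Duration in hours if the value is counted in nanoseconds. */
static double sketch_ns_to_hours(uint64_t t)
{
    return (double)t / 1e9 / 3600.0;
}

/* Duration in years if the same value were counted in microseconds. */
static double sketch_us_to_years(uint64_t t)
{
    return (double)t / 1e6 / (365.25 * 24.0 * 3600.0);
}
```

Under that assumption the value is a little under 9.8 hours when read as nanoseconds and a little over 1.11 years when read as microseconds, which matches both figures in the message: the two readings differ by a factor of 1000.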

[Mesa-dev] [PATCH 1/2] ac/nir: Fix nir_texop_lod on GFX for 1D arrays.

2017-10-22 Thread Bas Nieuwenhuizen
Fixes: 1bcb953e166 'radv: handle GFX9 1D textures' --- src/amd/common/ac_nir_to_llvm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 83b49b535c6..473dd67355b 100644 ---

[Mesa-dev] [PATCH 2/2] radv: Disallow indirect outputs for GS on GFX9 as well.

2017-10-22 Thread Bas Nieuwenhuizen
Since it also uses the output vector before writing to memory. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' --- src/amd/vulkan/radv_shader.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c

[Mesa-dev] [PATCH 4/7] u_queue: add util_queue_fence_reset

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/util/u_queue.c | 4 +--- src/util/u_queue.h | 13 + 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/src/util/u_queue.c b/src/util/u_queue.c index 33436e0749a..2272006042f 100644 --- a/src/util/u_queue.c +++

[Mesa-dev] [PATCH 5/7] u_queue: add a futex-based implementation of fences

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Fences are now 4 bytes instead of 96 bytes (on my 64-bit system). Signaling a fence is a single atomic operation in the fast case plus a syscall in the slow case. Testing if a fence is signaled is the same as before (a simple comparison), but
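
The fast path and slow path described above can be illustrated with a small state machine. This is only a sketch under stated assumptions, not the patch's code: the three-state encoding, the `sketch_` names, and the wake counter (which stands in for a futex wake syscall) are all hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 4-byte fence.
 * 0 = signaled, 1 = unsignaled, 2 = unsignaled and someone is waiting. */
struct sketch_fence {
    _Atomic uint32_t val;
};

/* Counts slow-path wakeups; a real implementation would issue a futex
 * wake syscall instead of bumping a counter. */
static int sketch_wake_calls;

static void sketch_fence_init(struct sketch_fence *f)
{
    atomic_init(&f->val, 1);
}

/* A waiter announces itself before going to sleep. */
static void sketch_fence_mark_waiting(struct sketch_fence *f)
{
    uint32_t expected = 1;
    atomic_compare_exchange_strong(&f->val, &expected, 2);
}

/* Signaling is one atomic exchange; the wake is only needed if a waiter
 * had announced itself. */
static void sketch_fence_signal(struct sketch_fence *f)
{
    uint32_t old = atomic_exchange(&f->val, 0);
    if (old == 2)
        sketch_wake_calls++;
}

/* Testing for completion is a plain comparison. */
static int sketch_fence_is_signaled(struct sketch_fence *f)
{
    return atomic_load(&f->val) == 0;
}
```

With no waiter, `sketch_fence_signal` never reaches the wake branch, which is the single-atomic-operation fast case the message refers to.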

[Mesa-dev] [PATCH 2/7] u_queue: group fence functions together

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/util/u_queue.h | 19 ++- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/src/util/u_queue.h b/src/util/u_queue.h index ff713ae54d6..7a028ef0847 100644 --- a/src/util/u_queue.h +++ b/src/util/u_queue.h @@ -47,20

[Mesa-dev] [PATCH 6/7] radeonsi: use ready fences on all shaders, not just optimized ones

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle There's a race condition between si_shader_select_with_key and si_bind_XX_shader: Thread 1 Thread 2 si_shader_select_with_key begin compiling the first variant

[Mesa-dev] [PATCH 7/7] radeonsi: reduce the scope of sel->mutex in si_shader_select_with_key

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle We only need the lock to guard changes in the variant linked list. The actual compilation can happen outside the lock, since we use the ready fence as a guard. --- src/gallium/drivers/radeonsi/si_state_shaders.c | 6 -- 1 file changed, 4

[Mesa-dev] [PATCH 1/7] util: move futex helpers into futex.h

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/util/Makefile.sources | 1 + src/util/futex.h | 51 +++ src/util/meson.build | 1 + src/util/simple_mtx.h | 20 +-- 4 files changed, 54 insertions(+), 19

[Mesa-dev] [PATCH 3/7] u_queue: export util_queue_fence_signal

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/util/u_queue.c | 2 +- src/util/u_queue.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/src/util/u_queue.c b/src/util/u_queue.c index 3b05110e9f8..33436e0749a 100644 --- a/src/util/u_queue.c +++ b/src/util/u_queue.c @@

[Mesa-dev] [PATCH 0/7] u_queue fence patches and a radeonsi fix

2017-10-22 Thread Nicolai Hähnle
Hi all, Another multi-threading bug I ran into, this time with radeonsi shader fences. Patches 1-5 make util_queue_fence more widely usable, and include an optional futex-based implementation. The patches themselves are on top of the simple-mutex patches, but could easily be removed to drop the

[Mesa-dev] [PATCH 23/25] ddebug: optionally handle transfer commands like draws

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Transfer commands can have associated GPU operations. Enabled by passing GALLIUM_DDEBUG=transfers. --- src/gallium/drivers/ddebug/dd_context.c | 65 - src/gallium/drivers/ddebug/dd_draw.c| 234

[Mesa-dev] [PATCH 09/25] gallium/u_threaded: implement asynchronous flushes

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle This requires out-of-band creation of fences, and will be signaled to the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag. --- src/gallium/auxiliary/util/u_threaded_context.c| 96 +-

[Mesa-dev] [PATCH 07/25] radeonsi: move fence functions to si_fence.c

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/radeon/r600_pipe_common.c | 267 -- src/gallium/drivers/radeonsi/Makefile.sources | 1 + src/gallium/drivers/radeonsi/meson.build | 1 + src/gallium/drivers/radeonsi/si_fence.c | 304

[Mesa-dev] [PATCH v3 17/34] intel/compiler: Remove final_program_size from brw_compile_*

2017-10-22 Thread Jordan Justen
The caller can now use brw_stage_prog_data::program_size which is set by the brw_compile_* functions. Cc: Jason Ekstrand Signed-off-by: Jordan Justen --- src/intel/blorp/blorp.c | 10 -- src/intel/blorp/blorp_blit.c|

[Mesa-dev] [PATCH v3 29/34] i965: Don't link when the program was found in the disk cache

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen Cc: Timothy Arceri --- src/mesa/drivers/dri/i965/brw_link.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp index

[Mesa-dev] [PATCH v3 18/34] blob: Don't set overrun if reading 0 bytes at end of data

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen --- src/compiler/blob.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/blob.c b/src/compiler/blob.c index 8dd254fefc6..5e8671b7b44 100644 --- a/src/compiler/blob.c +++ b/src/compiler/blob.c @@ -256,7 +256,7
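
The subject line describes an edge case in a bounds check: a zero-byte read with the cursor exactly at the end of the data should succeed. The sketch below shows one way such a check can be written; the struct and function names are hypothetical and this is not the code from `blob.c`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state, loosely modelled on a cursor/end pair. */
struct sketch_reader {
    const uint8_t *current;
    const uint8_t *end;
    bool overrun;
};

/* Returns true if 'size' bytes can be read. Using <= rather than < for the
 * cursor comparison lets a 0-byte read at the very end succeed without
 * setting the overrun flag. */
static bool sketch_can_read(struct sketch_reader *r, size_t size)
{
    if (r->overrun)
        return false;
    if (r->current <= r->end && (size_t)(r->end - r->current) >= size)
        return true;
    r->overrun = true;
    return false;
}
```

A reader positioned at `end` accepts a read of 0 bytes and only flags an overrun once a caller asks for at least one byte.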

[Mesa-dev] [PATCH v3 09/34] glsl_to_nir: Zero nir_constant in constant_copy for valgrind & nir_serialize

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen Reviewed-by: Timothy Arceri Reviewed-by: Kenneth Graunke --- src/compiler/glsl/glsl_to_nir.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[Mesa-dev] [PATCH v3 01/34] glsl: move shader_cache type handling to glsl_types

2017-10-22 Thread Jordan Justen
From: Connor Abbott Not sure if this is the best place to put it, but we're going to need this for NIR too. Reviewed-by: Timothy Arceri Reviewed-by: Jordan Justen --- src/compiler/glsl/shader_cache.cpp | 171

[Mesa-dev] [PATCH v3 11/34] nir: Add hooks for testing serialization

2017-10-22 Thread Jordan Justen
From: Jason Ekstrand Reviewed-by: Timothy Arceri Reviewed-by: Jordan Justen --- src/compiler/nir/nir.h | 17 + src/compiler/nir/nir_serialize.c | 19 +++ 2 files changed, 36

[Mesa-dev] [PATCH v3 26/34] mesa/glsl: add api_enabled flag to gl_transform_feedback_info

2017-10-22 Thread Jordan Justen
From: Timothy Arceri This will be used to disable the shader cache when xfb is enabled via the api as we don't currently allow for it when generating the sha for the shader. --- src/compiler/glsl/link_varyings.cpp | 5 - src/mesa/main/mtypes.h | 3

[Mesa-dev] [PATCH v3 08/34] glsl_to_nir: Zero nir_variable struct for valgrind & nir_serialize

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen Reviewed-by: Timothy Arceri Reviewed-by: Kenneth Graunke --- src/compiler/glsl/glsl_to_nir.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[Mesa-dev] [PATCH v3 23/34] i965: add shader cache support for geometry shaders

2017-10-22 Thread Jordan Justen
From: Timothy Arceri v2: * Use MAYBE_UNUSED. (Matt) [jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program] Signed-off-by: Jordan Justen --- src/mesa/drivers/dri/i965/brw_disk_cache.c | 23 +++

[Mesa-dev] [PATCH v3 22/34] i965: Add shader cache support for vertex and fragment stages

2017-10-22 Thread Jordan Justen
From: Timothy Arceri This enables the cache on vertex and fragment shaders only. v2: * Use MAYBE_UNUSED. (Matt) [jordan.l.jus...@intel.com: reword subject] [jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program] Signed-off-by: Jordan Justen

[Mesa-dev] [PATCH v3 20/34] intel/compiler: Add functions to get prog_data and prog_key sizes for a stage

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen --- src/intel/compiler/brw_compiler.c | 36 src/intel/compiler/brw_compiler.h | 6 ++ 2 files changed, 42 insertions(+) diff --git a/src/intel/compiler/brw_compiler.c

[Mesa-dev] [PATCH v3 24/34] i965: add shader cache support for tess stages

2017-10-22 Thread Jordan Justen
From: Timothy Arceri v2: * Use MAYBE_UNUSED. (Matt) [jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program] Signed-off-by: Jordan Justen --- src/mesa/drivers/dri/i965/brw_disk_cache.c | 45

[Mesa-dev] [PATCH v3 07/34] nir: Zero nir_load_const_instr::value for valgrind & nir_serialize

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen Reviewed-by: Timothy Arceri Reviewed-by: Kenneth Graunke --- src/compiler/nir/nir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/nir/nir.c

[Mesa-dev] [PATCH v3 06/34] intel/nir: Zero local index const struct for valgrind & nir_serialize

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen Reviewed-by: Timothy Arceri Reviewed-by: Kenneth Graunke --- src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 1 + 1 file changed, 1 insertion(+) diff --git

[Mesa-dev] [PATCH v3 15/34] i965: Don't rely on nir for uses_texture_gather

2017-10-22 Thread Jordan Justen
When a program is restored from the shader cache, prog->nir will be NULL, but prog->info will be restored. Signed-off-by: Jordan Justen --- src/mesa/drivers/dri/i965/brw_wm.c | 4 ++-- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++--

[Mesa-dev] [PATCH v3 03/34] compiler/types: Support [de]serializing void types

2017-10-22 Thread Jordan Justen
From: Jason Ekstrand Reviewed-by: Timothy Arceri Reviewed-by: Jordan Justen --- src/compiler/glsl_types.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/compiler/glsl_types.cpp

[Mesa-dev] [PATCH v3 02/34] nir/intrinsics: Set the correct num_indices for load_output

2017-10-22 Thread Jordan Justen
From: Jason Ekstrand Cc: mesa-sta...@lists.freedesktop.org Reviewed-by: Timothy Arceri Reviewed-by: Kenneth Graunke Reviewed-by: Jordan Justen --- src/compiler/nir/nir_intrinsics.h | 2 +- 1

[Mesa-dev] [PATCH v2 0/3] st/mesa: per-context sampler views array locking

2017-10-22 Thread Nicolai Hähnle
Hi all, I sent similar patches around a while ago. After drawoverhead benchmarks I went back to the drawing board. In the end, the current approach is quite similar to what I did before, except with an added "poor man's RCU" that allows the fast path to work without locking. Please review!

[Mesa-dev] [PATCH 08/25] gallium/u_threaded: mark queries flushed only for non-deferred flushes

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle The driver uses (and must use) the flushed flag of queries as a hint that it does not have to check for synchronization with currently queued up commands. Deferred flushes do not actually flush queued up commands, so we must not set the flushed flag

[Mesa-dev] [PATCH 02/25] gallium: remove unused and deprecated u_time.h

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Cc: Jose Fonseca --- src/gallium/auxiliary/Makefile.sources | 1 - src/gallium/auxiliary/meson.build | 1 - src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c | 1 -

[Mesa-dev] [PATCH 01/25] util: move os_time.[ch] to src/util

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/auxiliary/Makefile.sources | 2 -- src/gallium/auxiliary/gallivm/lp_bld_init.c| 2 +- src/gallium/auxiliary/hud/hud_cpu.c| 2 +- src/gallium/auxiliary/hud/hud_cpufreq.c| 2 +-

[Mesa-dev] [PATCH 14/25] radeonsi: implement PIPE_FLUSH_{TOP, BOTTOM}_OF_PIPE

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_fence.c | 83 - 1 file changed, 82 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_fence.c b/src/gallium/drivers/radeonsi/si_fence.c index

[Mesa-dev] [PATCH 00/25] Asynchronous flushes and ddebug core rewrite

2017-10-22 Thread Nicolai Hähnle
Hi all, I was chasing an elusive bug that went away with GALLIUM_THREAD=0, so I wanted to use ddebug with Gallium threads. That required some fixes to how radeonsi compiles shaders. However, with that fixed, ddebug *also* made the bug go away. This series does a lot of things, but the

[Mesa-dev] [PATCH 13/25] radeonsi: document some subtle details of fence_finish & fence_server_sync

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_fence.c | 31 ++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_fence.c b/src/gallium/drivers/radeonsi/si_fence.c index

[Mesa-dev] [PATCH 05/25] gallium: add PIPE_FLUSH_ASYNC and PIPE_FLUSH_HINT_FINISH

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Also document some subtleties of pipe_context::flush. --- src/gallium/docs/source/context.rst | 9 + src/gallium/include/pipe/p_context.h | 8 +++- src/gallium/include/pipe/p_defines.h | 2 ++ 3 files changed, 18 insertions(+), 1

[Mesa-dev] [PATCH 19/25] ddebug: use an atomic increment when numbering files

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/ddebug/dd_util.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/ddebug/dd_util.h b/src/gallium/drivers/ddebug/dd_util.h index cfc0fb0ccce..bdfb7cc9163 100644 ---
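
An atomic counter guarantees that two threads never pick the same file number. A minimal sketch using C11 atomics; the function name, counter name, and file name pattern are hypothetical and not taken from `dd_util.h`.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical process-wide file counter, zero-initialized. */
static atomic_uint sketch_file_index;

/* Writes "<dir>/<8-digit index>" into buf and returns the index used.
 * atomic_fetch_add returns the previous value, so each caller gets a
 * distinct number even when several threads call this concurrently. */
static unsigned sketch_next_filename(char *buf, size_t size, const char *dir)
{
    unsigned idx = atomic_fetch_add(&sketch_file_index, 1);
    snprintf(buf, size, "%s/%08u", dir, idx);
    return idx;
}
```

A plain `index++` on a shared variable would be a read-modify-write race; the atomic version makes the increment and the read of the old value one indivisible step.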

[Mesa-dev] [PATCH 06/25] gallium: add PIPE_FLUSH_{TOP, BOTTOM}_OF_PIPE bits

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle These bits are intended to be used by the ddebug hang detection and are named in analogy to the Vulkan stage bits (and the corresponding Radeon pipeline event). Hang detection needs fences on the granularity of individual commands, which nothing

[Mesa-dev] [PATCH 12/25] gallium: add pipe_context::callback

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle For running post-draw operations inside the driver thread. ddebug will use it. --- src/gallium/auxiliary/util/u_threaded_context.c| 46 ++ .../auxiliary/util/u_threaded_context_calls.h | 1 +

[Mesa-dev] [PATCH 11/25] gallium/u_threaded: implement pipe_context::set_log_context

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/auxiliary/util/u_threaded_context.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/gallium/auxiliary/util/u_threaded_context.c b/src/gallium/auxiliary/util/u_threaded_context.c index fb4864bcbaa..43f983a1e5a

[Mesa-dev] [PATCH 21/25] ddebug: generalize print_named_xxx via a PRINT_NAMED macro

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/ddebug/dd_draw.c | 25 ++--- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_draw.c b/src/gallium/drivers/ddebug/dd_draw.c index 99c9c929b2e..a856d0142a1

[Mesa-dev] [PATCH 22/25] ddebug: dump context and before/after times of draws

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/ddebug/dd_draw.c | 8 src/gallium/drivers/ddebug/dd_pipe.h | 2 ++ 2 files changed, 10 insertions(+) diff --git a/src/gallium/drivers/ddebug/dd_draw.c b/src/gallium/drivers/ddebug/dd_draw.c index

[Mesa-dev] [PATCH v3 10/34] nir: add serialization and deserialization

2017-10-22 Thread Jordan Justen
From: Connor Abbott v2 (Jason Ekstrand): - Various whitespace cleanups - Add helpers for reading/writing objects - Rework derefs - [de]serialize nir_shader::num_* - Fix uses of blob_reserve_bytes - Use a bitfield struct for packing tex_instr data v3: - Zero

[Mesa-dev] [PATCH v3 19/34] intel/compiler: Add union types for prog_data and prog_key stages

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen --- src/intel/compiler/brw_compiler.h | 18 ++ 1 file changed, 18 insertions(+) diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_compiler.h index 701b4a80bf1..9359b767e35 100644 ---

[Mesa-dev] [PATCH v3 05/34] nir: Zero local_size const struct for valgrind & nir_serialize

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen Reviewed-by: Timothy Arceri Reviewed-by: Kenneth Graunke --- src/compiler/nir/nir_lower_system_values.c | 1 + 1 file changed, 1 insertion(+) diff --git

[Mesa-dev] [PATCH v3 28/34] i965: add cache fallback support using serialized nir

2017-10-22 Thread Jordan Justen
If the i965 gen program cannot be loaded from the cache, then we fall back to using a serialized nir program. This is based on "i965: add cache fallback support" by Timothy Arceri. Tim's version was written to fall back to compiling from source, and therefore had to

[Mesa-dev] [PATCH v3 32/34] disk_cache: Fix issue reading GLSL metadata

2017-10-22 Thread Jordan Justen
This would cause the read of the metadata content to fail, which would prevent the linking from being skipped. Seen on Rocket League with i965 shader cache. Cc: Timothy Arceri Signed-off-by: Jordan Justen Reviewed-by: Timothy Arceri

[Mesa-dev] [PATCH v3 00/34] i965 disk shader cache

2017-10-22 Thread Jordan Justen
git://people.freedesktop.org/~jljusten/mesa i965-shader-cache-v3 The series adds support for a disk shader cache for i965, but it does not enable it by default. To enable the i965 shader cache you need to set the environment variable MESA_GLSL_CACHE_DISABLE=0. v3: * Reworks suggested by Jason:

Re: [Mesa-dev] [PATCH] meson: fix egl build for meson version < 0.43

2017-10-22 Thread Rhys Kidd
On 20 October 2017 at 20:34, Dylan Baker wrote: > Meson 0.43 added the ability to pass nested lists to > include_directories, so the code that we have works for 0.43, but not > for 0.42. This patch changes the include_directories list to be flat so > it works with 0.42 > >

[Mesa-dev] [PATCH 1/6] gallium: add async debug message forwarding helper

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/auxiliary/Makefile.sources | 2 + src/gallium/auxiliary/meson.build | 2 + src/gallium/auxiliary/util/u_async_debug.c | 113 + src/gallium/auxiliary/util/u_async_debug.h | 74

[Mesa-dev] [PATCH 3/6] u_queue: add util_queue_finish for waiting for previously added jobs

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Schedule one job for every thread, and wait on a barrier inside the job execution function. --- src/util/u_queue.c | 35 +++ src/util/u_queue.h | 2 ++ 2 files changed, 37 insertions(+) diff --git

[Mesa-dev] [PATCH 5/6] radeonsi: fix potential use-after-free of debug callbacks

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Found by inspection. --- src/gallium/drivers/radeonsi/si_pipe.c | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c index 0d2132efb54..34ca2a56be5 100644 ---

[Mesa-dev] [PATCH 0/6] radeonsi: always compile asynchronously from shader (selector) creation

2017-10-22 Thread Nicolai Hähnle
Hi all, We compile the main shader part when creating a shader CSO. This happens asynchronously by default, except for debug contexts due to the way shader printing works. This is one reason why ddebug doesn't work with threaded Gallium (because threaded Gallium wants to create shader CSOs from

[Mesa-dev] [PATCH 15/25] gallium/u_dump: export util_dump_ptr

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Change format to %p while we're at it. --- src/gallium/auxiliary/util/u_dump.h | 3 +++ src/gallium/auxiliary/util/u_dump_state.c | 4 ++-- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/util/u_dump.h

[Mesa-dev] [PATCH 03/25] threads: update for late C11 changes

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle C11 threads were changed to use struct timespec instead of xtime, and thrd_sleep got a second argument. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1554.htm and http://en.cppreference.com/w/c/thread/{thrd_sleep,cnd_timedwait,mtx_timedlock}

[Mesa-dev] [PATCH 04/25] util/u_queue: add util_queue_fence_wait_timeout

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/util/futex.h | 9 -- src/util/simple_mtx.h | 2 +- src/util/u_queue.c| 77 ++- src/util/u_queue.h| 51 ++ 4 files changed, 116 insertions(+),

[Mesa-dev] [PATCH 10/25] gallium/u_threaded: avoid syncs for get_query_result

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Queries should still get marked as flushed when flushes are executed asynchronously in the driver thread. To this end, the management of the unflushed_queries list is moved into the driver thread. --- src/gallium/auxiliary/util/u_threaded_context.c

[Mesa-dev] [PATCH 20/25] ddebug: rewrite to always use a threaded approach

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle This patch has multiple goals: 1. Off-load the writing of records in 'always' mode to another thread for performance. 2. Allow using ddebug with threaded contexts. This really forces us to move some of the "after_draw" handling into another

[Mesa-dev] [PATCH 18/25] dd/util: extract dd_get_debug_filename_and_mkdir

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/ddebug/dd_util.h | 30 ++ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_util.h b/src/gallium/drivers/ddebug/dd_util.h index 4e1a945c57d..cfc0fb0ccce

[Mesa-dev] [PATCH 17/25] gallium/u_dump: add and use util_dump_transfer_usage

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/auxiliary/util/u_debug.c| 19 +++ src/gallium/auxiliary/util/u_dump.h | 3 ++ src/gallium/auxiliary/util/u_dump_defines.c | 53 + src/gallium/auxiliary/util/u_dump_state.c | 2

[Mesa-dev] [PATCH 25/25] radeonsi: use a threaded context even for debug contexts

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_pipe.c | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c index 19428d8b4e7..c5742cf9883 100644 ---

[Mesa-dev] [PATCH 24/25] radeonsi: record and dump time of flush

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_debug.c | 5 - src/gallium/drivers/radeonsi/si_hw_context.c | 3 +++ src/gallium/drivers/radeonsi/si_pipe.h | 1 + 3 files changed, 8 insertions(+), 1 deletion(-) diff --git

[Mesa-dev] [PATCH 16/25] gallium/u_dump: add util_dump_ns

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle --- src/gallium/auxiliary/util/u_dump.h | 3 +++ src/gallium/auxiliary/util/u_dump_state.c | 10 ++ 2 files changed, 13 insertions(+) diff --git a/src/gallium/auxiliary/util/u_dump.h b/src/gallium/auxiliary/util/u_dump.h index

[Mesa-dev] [PATCH 1/3] gallium: clarify the constraints on sampler_view_destroy

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle r600 expects the context that created the sampler view to still be alive (there is a per-context list of sampler views). svga currently bails when the context of destruction is not the same as creation. The GL state tracker, which is the only one

[Mesa-dev] [PATCH 3/3] st/mesa: guard sampler views changes with a mutex

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Some locking is unfortunately required, because well-formed GL programs can have multiple threads racing to access the same texture, e.g.: two threads/contexts rendering from the same texture, or one thread destroying a context while the other is

[Mesa-dev] [PATCH 2/3] st/mesa: re-arrange st_finalize_texture

2017-10-22 Thread Nicolai Hähnle
From: Nicolai Hähnle Move the early-out for surface-based textures earlier. This narrows the scope of the locking added in a follow-up commit. Fix one remaining case of initializing a surface-based texture without properly finalizing it. ---

[Mesa-dev] [PATCH v3 21/34] i965: add initial implementation of on disk shader cache

2017-10-22 Thread Jordan Justen
From: Timothy Arceri This uses the recently-added disk_cache.c to write out the final linked binary for vertex and fragment shader programs. This is based off the initial implementation done by Carl Worth. v2: * Squash 'i965: add image param shader cache support'

[Mesa-dev] [PATCH v3 04/34] glsl: Add field initializers for glsl_struct_field default constructor

2017-10-22 Thread Jordan Justen
This helps valgrind when encode_type_to_blob is used. Signed-off-by: Jordan Justen Reviewed-by: Kenneth Graunke --- src/compiler/glsl_types.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/compiler/glsl_types.h

Re: [Mesa-dev] [PATCH] meson: fix egl build for meson version < 0.43

2017-10-22 Thread Vinson Lee
On Fri, Oct 20, 2017 at 5:34 PM, Dylan Baker wrote: > Meson 0.43 added the ability to pass nested lists to > include_directories, so the code that we have works for 0.43, but not > for 0.42. This patch changes the include_directories list to be flat so > it works with 0.42 >

Re: [Mesa-dev] [PATCH v3 38/43] i965/fs: Optimize 16-bit SSBO stores by packing two into a 32-bit reg

2017-10-22 Thread Eduardo Lima Mitev
On 10/12/2017 08:38 PM, Jose Maria Casanova Crespo wrote: > From: Eduardo Lima Mitev > > Currently, we use byte-scattered write messages for storing 16-bit > into an SSBO. This is because untyped surface messages have a fixed > 32-bit size. > > This patch optimizes these
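
The packing idea can be shown on the CPU side: two 16-bit values share one 32-bit word, so a pair can go out in a single 32-bit write. This is only an illustration with a hypothetical helper name; placing the first value in the low half is an assumption, and the real patch operates on i965 registers rather than C integers.

```c
#include <stdint.h>

/* Hypothetical helper: pack two 16-bit values into one 32-bit word,
 * with the first value in the low half (an assumption for this sketch). */
static uint32_t sketch_pack_2x16(uint16_t lo, uint16_t hi)
{
    return (uint32_t)lo | ((uint32_t)hi << 16);
}
```

For example, `sketch_pack_2x16(0x1234, 0xabcd)` yields `0xabcd1234`.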

[Mesa-dev] [PATCH v2] glsl: fix derived cs variables

2017-10-22 Thread Ilia Mirkin
There are two issues with the current implementation. First, it relies on the layout(local_size_*) happening in the same shader as the main function, and secondly it doesn't work for variable group sizes. In both cases, the simplest fix is to move the setup of these derived values to a later
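
The excerpt does not list the derived values. In GLSL, `gl_GlobalInvocationID` and `gl_LocalInvocationIndex` are defined in terms of the work group size, which is why their setup depends on that size being known. A sketch of those two definitions in C, with hypothetical type and function names:

```c
/* Hypothetical 3-component unsigned vector. */
struct sketch_uvec3 {
    unsigned x, y, z;
};

/* gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID */
static struct sketch_uvec3
sketch_global_invocation_id(struct sketch_uvec3 wg_id,
                            struct sketch_uvec3 wg_size,
                            struct sketch_uvec3 local_id)
{
    struct sketch_uvec3 g = {
        wg_id.x * wg_size.x + local_id.x,
        wg_id.y * wg_size.y + local_id.y,
        wg_id.z * wg_size.z + local_id.z,
    };
    return g;
}

/* gl_LocalInvocationIndex = z * size.x * size.y + y * size.x + x */
static unsigned
sketch_local_invocation_index(struct sketch_uvec3 local_id,
                              struct sketch_uvec3 wg_size)
{
    return local_id.z * wg_size.x * wg_size.y +
           local_id.y * wg_size.x +
           local_id.x;
}
```

Both formulas need the work group size, so they cannot be evaluated where the size is not yet known, for instance with a variable group size.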

Re: [Mesa-dev] [PATCH 4/5] mesa: increase MaxServerWaitTimeout

2017-10-22 Thread Kenneth Graunke
On Sunday, October 22, 2017 12:18:11 PM PDT Nicolai Hähnle wrote: > From: Nicolai Hähnle > > The current value was introduced in commit a27180d0d8666, which claims > that it represents ~1.11 years. However, it is interpreted in nanoseconds, > so it actually only

[Mesa-dev] [PATCH] ac/nir: Only clamp shadow reference on radeonsi.

2017-10-22 Thread Bas Nieuwenhuizen
Vulkan CTS does not expect the value to be clamped (at least for D32), and it makes a difference even though depth is in [0,1], due to strict inequalities. I couldn't find anything in the Vulkan spec about this, but the test seemed to be copied from GL tests and the GL spec only specifies
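
Why clamping matters even when stored depth stays in [0,1] can be seen with a strict comparison and a reference value outside that range. A minimal sketch with hypothetical helper names; the comparison direction (reference < stored depth) is chosen for illustration only.

```c
/* Clamp x to [0, 1]. */
static float sketch_saturate(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Strict "less" comparison: passes when the reference is below the
 * stored depth value. */
static int sketch_ref_less(float ref, float depth)
{
    return ref < depth;
}
```

With a reference of -0.5 and a stored depth of 0.0, the unclamped comparison passes, but after clamping the reference to 0.0 the strict test `0.0 < 0.0` fails, so the two behaviours give different results.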

Re: [Mesa-dev] [PATCH v2] clover: Fix compilation after clang r315871

2017-10-22 Thread Francisco Jerez
Jan Vesely writes: > From: Jan Vesely > > v2: use a more generic compat function > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103388 > Signed-off-by: Jan Vesely > --- >

[Mesa-dev] [PATCH] mesa/bufferobj: don't double negate the range

2017-10-22 Thread Dave Airlie
From: Dave Airlie This fixes a regression I introduced when refactoring this code: I managed to invert range twice. I moved the inversion into the common code, but forgot to stop doing it in the callee. Fixes: GL45-CTS.multi_bind.dispatch_bind_buffers_base Fixes: 35ac13ed3

Re: [Mesa-dev] [PATCH 2/2] radv: Disallow indirect outputs for GS on GFX9 as well.

2017-10-22 Thread Timothy Arceri
Reviewed-by: Timothy Arceri On 23/10/17 03:43, Bas Nieuwenhuizen wrote: Since it also uses the output vector before writing to memory. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' --- src/amd/vulkan/radv_shader.c | 4 +--- 1 file changed, 1

[Mesa-dev] [PATCH v3 30/34] i965: Initialize sha1 hash of dri config options

2017-10-22 Thread Jordan Justen
Signed-off-by: Jordan Justen Reviewed-by: Timothy Arceri --- src/mesa/drivers/dri/i965/brw_context.c | 4 src/mesa/drivers/dri/i965/brw_context.h | 1 + 2 files changed, 5 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_context.c

[Mesa-dev] [PATCH v3 16/34] intel/compiler: add new field for storing program size

2017-10-22 Thread Jordan Justen
From: Carl Worth This will be used by the on disk shader cache. v2: * Set in brw_compile_* rather than brw_codegen_*. (Jason) Signed-off-by: Timothy Arceri [jordan.l.jus...@intel.com: Only add to brw_stage_prog_data] Signed-off-by: Jordan

[Mesa-dev] [PATCH v3 31/34] glsl/shader_cache: Save fs (BlendSupport) metadata

2017-10-22 Thread Jordan Justen
Fixes many GL 4.5 CTS blend tests, such as: * GL45-CTS.blend_equation_advanced.extension_directive_enable * GL45-CTS.blend_equation_advanced.extension_directive_warn * GL45-CTS.blend_equation_advanced.blend_all.GL_MULTIPLY_KHR_all_qualifier *
