Re: [Mesa-dev] [PATCH v2 01/26] util: move os_time.[ch] to src/util

2017-11-10 Thread Nicolai Hähnle
On 10.11.2017 13:35, Jon Turney wrote: On 06/11/2017 10:23, Nicolai Hähnle wrote: diff --git a/src/gallium/auxiliary/os/os_time.h b/src/util/os_time.h similarity index 89% rename from src/gallium/auxiliary/os/os_time.h rename to src/util/os_time.h index ca0bdd5a0c4..049ab118db2 100644 --- a/src

[Mesa-dev] [PATCH] util/u_thread: fix compilation on Mac OS

2017-11-10 Thread Nicolai Hähnle
From: Nicolai Hähnle Apparently, it doesn't have pthread barriers. p_config.h (which was originally used to guard this code) uses the __APPLE__ macro to detect Mac OS. Fixes: f0d3a4de75 ("util: move pipe_barrier into src/util and rename to util_barrier") Cc: Roland Scheidegger

[Mesa-dev] [PATCH] util/u_queue: handle OS_TIMEOUT_INFINITE in util_queue_fence_wait_timeout

2017-11-10 Thread Nicolai Hähnle
From: Nicolai Hähnle Fixes e.g. piglit/bin/bufferstorage-persistent read -auto Fixes: e6dbc804a87a ("winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences") --- src/util/u_queue.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/util/u_queue.h
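A minimal sketch of why an infinite timeout tends to need its own branch. This is not the code from the patch. It assumes the "infinite" sentinel is the all-ones 64-bit value, which turns into -1 once it is passed around as a signed absolute timeout in nanoseconds and so cannot be converted into a meaningful deadline. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Assumption: the "wait forever" sentinel is the all-ones 64-bit value. */
#define SKETCH_TIMEOUT_INFINITE UINT64_MAX

/* Hypothetical helper: turn an absolute timeout in nanoseconds into a
 * timespec deadline. Returns false when the caller asked for an infinite
 * wait; the caller should then fall back to an untimed wait instead of
 * computing a deadline from a bogus (negative) value. */
static bool
sketch_deadline_from_abs_timeout(int64_t abs_timeout_ns, struct timespec *ts)
{
   if (abs_timeout_ns == (int64_t)SKETCH_TIMEOUT_INFINITE)
      return false;

   assert(abs_timeout_ns >= 0);
   ts->tv_sec = abs_timeout_ns / 1000000000;
   ts->tv_nsec = abs_timeout_ns % 1000000000;
   return true;
}
```

A timed wait would call the helper first and only take the timespec-based path when it returns true.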

[Mesa-dev] [PATCH] gallium/u_threaded: fix end_query regression

2017-11-10 Thread Nicolai Hähnle
From: Nicolai Hähnle Ouch... Fixes: 244536d3d6b4 ("gallium/u_threaded: avoid syncs for get_query_result") --- src/gallium/auxiliary/util/u_threaded_context.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/gallium/auxiliary/util/u_threaded_context.c b/src/gallium/auxi

Re: [Mesa-dev] [PATCH] threads: fix MinGW build breakage

2017-11-09 Thread Nicolai Hähnle
Sorry for the mess. Reviewed-by: Nicolai Hähnle On 09.11.2017 17:46, Brian Paul wrote: Fixes: f1a364878431c8 ("threads: update for late C11 changes") --- include/c11/threads_win32.h | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/include/c11/threads

Re: [Mesa-dev] [PATCH 1/4] r600: use min_dx10/max_dx10 instead of min/max

2017-11-09 Thread Nicolai Hähnle
On 09.11.2017 18:26, Roland Scheidegger wrote: Am 09.11.2017 um 18:19 schrieb Jan Vesely: On Thu, 2017-11-09 at 03:58 +0100, srol...@vmware.com wrote: From: Roland Scheidegger I believe this is the safe thing to do, especially ever since the driver actually generates NaNs for muls too. Albeit

Re: [Mesa-dev] [PATCH 2/4] r600: use mysterious DX10_CLAMP bit in pixel shader setup

2017-11-09 Thread Nicolai Hähnle
On 09.11.2017 18:37, Jan Vesely wrote: On Thu, 2017-11-09 at 09:13 +0100, Nicolai Hähnle wrote: The internal docs are pretty much the same (i.e. confusing and non-explicit), but my layman's reading of the RTL is that DX10_CLAMP only affects clamping. So if you have a v_mul_f32 0

[Mesa-dev] [PATCH 3/4] st/mesa: use asynchronous flushes in st_finish

2017-11-09 Thread Nicolai Hähnle
From: Nicolai Hähnle With threaded gallium, the driver may currently be running in another thread. In that case, we will execute all remaining commands in that thread instead of syncing, which should be better for cache locality. --- src/mesa/state_tracker/st_cb_flush.c | 2 +- 1 file changed

[Mesa-dev] [PATCH 1/4] u_threaded_gallium: remove synchronization in fence_server_sync

2017-11-09 Thread Nicolai Hähnle
From: Nicolai Hähnle The whole point of fence_server_sync is that it can be used to avoid waiting in the application thread. --- src/gallium/auxiliary/util/u_threaded_context.c | 14 +++--- src/gallium/auxiliary/util/u_threaded_context.h | 1 + src/gallium/auxiliary/util

[Mesa-dev] [PATCH 0/4] st/mesa: use asynchronous flushes

2017-11-09 Thread Nicolai Hähnle
Hi all, I've previously sent some of this series, but I'm splitting it up further for bisectability, plus the first patch is new. The idea here is to further reduce the amount of synchronization required with threaded gallium. Eventually, we should be able to eliminate synchronizations entirely

[Mesa-dev] [PATCH 2/4] st/mesa: implement st_server_wait_sync properly

2017-11-09 Thread Nicolai Hähnle
From: Nicolai Hähnle Asynchronous flushes require a proper implementation of st_server_wait_sync, because we could have the following with threaded Gallium: Context 1 app Context 1 driver Context 2 - - f = glFenceSync glFlush

[Mesa-dev] [PATCH 4/4] st/mesa: use asynchronous flushes for glFlush

2017-11-09 Thread Nicolai Hähnle
From: Nicolai Hähnle Having the gallium driver thread flush in the background should be sufficient for glFlush semantics. Various end-of-frame flushes (from st_context_flush and st/dri) still use a synchronous flush. We should eventually be able to transition those to asynchronous flushes as

Re: [Mesa-dev] [PATCH v2 07/26] winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences

2017-11-09 Thread Nicolai Hähnle
On 06.11.2017 19:21, Marek Olšák wrote: On Mon, Nov 6, 2017 at 11:23 AM, Nicolai Hähnle wrote: From: Nicolai Hähnle The idea is to fix the following interleaving of operations that can arise from deferred fences: Thread 1 / Context 1 Thread 2 / Context 2

Re: [Mesa-dev] [PATCH] r600/query: drop rest of vi workaround code.

2017-11-09 Thread Nicolai Hähnle
On 09.11.2017 06:54, Dave Airlie wrote: From: Dave Airlie This isn't needed in r600 anymore. Signed-off-by: Dave Airlie Reviewed-by: Nicolai Hähnle --- src/gallium/drivers/r600/r600_query.c | 46 ++- src/gallium/drivers/r600/r600_query.h | 4 --

Re: [Mesa-dev] [PATCH 7/6] radeonsi: don't call r600_can_dma_copy_buffer for DISCARD_RANGE

2017-11-09 Thread Nicolai Hähnle
On 09.11.2017 04:21, Marek Olšák wrote: From: Marek Olšák we don't use dma_data in this codepath. Reviewed-by: Nicolai Hähnle --- src/gallium/drivers/radeon/r600_buffer_common.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/gallium/drivers/r

Re: [Mesa-dev] [PATCH 6/6] radeonsi: remove has_cp_dma, has_streamout flags

2017-11-09 Thread Nicolai Hähnle
On 09.11.2017 04:15, Marek Olšák wrote: From: Marek Olšák --- src/gallium/drivers/radeon/r600_buffer_common.c | 5 + src/gallium/drivers/radeon/r600_pipe_common.h | 2 -- src/gallium/drivers/radeonsi/si_pipe.c | 3 --- 3 files changed, 1 insertion(+), 9 deletions(-) diff --

Re: [Mesa-dev] [PATCH 4/6] radeonsi: pack r600_texture better

2017-11-09 Thread Nicolai Hähnle
e could be removed, but it's not important. Patches 1-5: Reviewed-by: Nicolai Hähnle +*/ + unsigned framebuffers_bound; /* Whether the texture is a displayable back buffer and needs DCC * decompression, which is expensive. There

Re: [Mesa-dev] [PATCH 16/17] util: Add Mesa ARB_get_program_binary helper functions

2017-11-09 Thread Nicolai Hähnle
On 09.11.2017 07:42, Jordan Justen wrote: Signed-off-by: Jordan Justen --- src/util/Makefile.sources | 2 + src/util/meson.build | 2 + src/util/program_binary.c | 322 ++ src/util/program_binary.h | 91 + 4 files changed, 4

Re: [Mesa-dev] [PATCH 15/17] main: Clear shader program data whenever ProgramBinary is called

2017-11-09 Thread Nicolai Hähnle
Patches 10-15: Reviewed-by: Nicolai Hähnle On 09.11.2017 07:42, Jordan Justen wrote: The GL_ARB_get_program_binary extension spec says: "If ProgramBinary fails to load a binary, no error is generated, but any information about a previous link or load of that program object is

Re: [Mesa-dev] [PATCH 03/17] compiler: Fold shader_cache in with libglsl sources

2017-11-09 Thread Nicolai Hähnle
On 09.11.2017 07:42, Jordan Justen wrote: It appears that we include the shader cache sources into libglsl regardless. The Meson build already does this. Signed-off-by: Jordan Justen Patches 1-3: Reviewed-by: Nicolai Hähnle --- src/compiler/Android.glsl.mk | 3 +-- src/compiler

Re: [Mesa-dev] [PATCH] amd/addrlib: update to latest version

2017-11-09 Thread Nicolai Hähnle
On 08.11.2017 23:54, Ilia Mirkin wrote: On Wed, Nov 8, 2017 at 4:13 AM, Nicolai Hähnle wrote: On 08.11.2017 09:53, Michel Dänzer wrote: On 07/11/17 10:58 PM, Marek Olšák wrote: On Tue, Nov 7, 2017 at 9:01 PM, Nicolai Hähnle wrote: On 07.11.2017 18:35, Michel Dänzer wrote: On 07/11/17

Re: [Mesa-dev] [PATCH 2/4] r600: use mysterious DX10_CLAMP bit in pixel shader setup

2017-11-09 Thread Nicolai Hähnle
The internal docs are pretty much the same (i.e. confusing and non-explicit), but my layman's reading of the RTL is that DX10_CLAMP only affects clamping. So if you have a v_mul_f32 0, inf that will generate a NaN just fine and is simply unaffected by DX10_CLAMP. However, if the clamp bit i
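For comparison, ordinary IEEE 754 arithmetic on the CPU gives the same result for that multiply. The sketch below only shows that 0 * inf is NaN in plain C; it makes no claim about how DX10_CLAMP behaves in hardware, and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Plain float multiply, standing in for the shader instruction. */
static float
sketch_mul_f32(float a, float b)
{
   return a * b;
}

/* Returns true when 0 * inf produced a NaN, as IEEE 754 requires. */
static bool
sketch_zero_times_inf_is_nan(void)
{
   float r = sketch_mul_f32(0.0f, INFINITY);

   /* NaN is the only value that compares unequal to itself. */
   assert(r != r);
   return isnan(r);
}
```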

Re: [Mesa-dev] [PATCH 4/4] util/tgsi: use ASSERT_BITFIELD_SIZE() to check opcode field size

2017-11-08 Thread Nicolai Hähnle
For the series: Reviewed-by: Nicolai Hähnle On 08.11.2017 01:07, Brian Paul wrote: I've noticed at least two places where we store the TGSI opcode in an unsigned:8 bitfield. We're at 249 opcodes now. If we hit 256 we'll need to grow those bitfields. Use the new ASSERT_BITFIE

Re: [Mesa-dev] [PATCH] amd/addrlib: update to latest version

2017-11-08 Thread Nicolai Hähnle
On 08.11.2017 09:53, Michel Dänzer wrote: On 07/11/17 10:58 PM, Marek Olšák wrote: On Tue, Nov 7, 2017 at 9:01 PM, Nicolai Hähnle wrote: On 07.11.2017 18:35, Michel Dänzer wrote: On 07/11/17 06:28 PM, Marek Olšák wrote: Hi, This patch is too large for the mailing list: https

Re: [Mesa-dev] [PATCH 8/8] r600: add support for hw atomic counters. (v3)

2017-11-07 Thread Nicolai Hähnle
On 07.11.2017 19:38, Dave Airlie wrote: On 8 November 2017 at 03:26, Nicolai Hähnle wrote: On 07.11.2017 07:31, Dave Airlie wrote: From: Dave Airlie This adds support for the evergreen/cayman atomic counters. These are implemented using GDS append/consume counters. The values for each

Re: [Mesa-dev] [PATCH] amd/addrlib: update to latest version

2017-11-07 Thread Nicolai Hähnle
that, but the commit discipline on the internal addrlib repository is pretty crappy, so we'd end up having to massage commits anyway. Maybe we can find a sweet spot somewhere by updating slightly more regularly, perhaps once a month. With Dylan's comment addressed, Acked-by: Nicolai H

Re: [Mesa-dev] [PATCH 2/8] gallium/tgsi: start adding hw atomics (v3)

2017-11-07 Thread Nicolai Hähnle
On 07.11.2017 18:26, Nicolai Hähnle wrote: On 07.11.2017 17:57, Marek Olšák wrote: With HW atomic counters, MaxAtomicBufferSize is a pretty small number (counters * 4). TGSI has maximum index = 32K. Ah, you're right. I forgot: the other comments (about the assertion in patch 2, and

Re: [Mesa-dev] [PATCH 5/8] st/mesa: start adding support for hw atomics atom.

2017-11-07 Thread Nicolai Hähnle
On 07.11.2017 07:31, Dave Airlie wrote: From: Dave Airlie This adds a new atom that calls the new driver API to bind buffers containing hw atomics. Signed-off-by: Dave Airlie --- src/mesa/state_tracker/st_atom_atomicbuf.c | 37 src/mesa/state_tracker/st_atom_

Re: [Mesa-dev] [PATCH 8/8] r600: add support for hw atomic counters. (v3)

2017-11-07 Thread Nicolai Hähnle
? I suppose it might require more stuff to manage GDS allocations in the kernel, and if it works with this approach... Acked-by: Nicolai Hähnle v2: move hw atomic assignment into driver. v3: fix messing up caps (Gert Wollny), only store ranges in driver, drop buffers. Signed-off-by: Da

Re: [Mesa-dev] [PATCH 2/8] gallium/tgsi: start adding hw atomics (v3)

2017-11-07 Thread Nicolai Hähnle
On 07.11.2017 17:57, Marek Olšák wrote: With HW atomic counters, MaxAtomicBufferSize is a pretty small number (counters * 4). TGSI has maximum index = 32K. Ah, you're right. Patches 1-7: Reviewed-by: Nicolai Hähnle Marek On Tue, Nov 7, 2017 at 5:43 PM, Nicolai Hähnle wrote

Re: [Mesa-dev] [PATCH] glsl: Transform fb buffers are only active if a variable uses them

2017-11-07 Thread Nicolai Hähnle
require a binding even if nothing was declared to use the default buffer. Affects: KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api Reviewed-by: Nicolai Hähnle --- src/compiler/glsl/link_varyings.cpp | 24

Re: [Mesa-dev] [PATCH] radeonsi: remove unused field in the PCI ID table

2017-11-07 Thread Nicolai Hähnle
Reviewed-by: Nicolai Hähnle On 07.11.2017 15:28, Marek Olšák wrote: From: Marek Olšák --- include/pci_ids/radeonsi_pci_ids.h| 458 +++--- src/amd/common/ac_gpu_info.c | 2 +- src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2

Re: [Mesa-dev] [PATCH 4/4] radeonsi: add si_screen::has_ls_vgpr_init_bug

2017-11-07 Thread Nicolai Hähnle
For the series: Reviewed-by: Nicolai Hähnle On 07.11.2017 04:12, Marek Olšák wrote: From: Marek Olšák --- src/gallium/drivers/radeonsi/si_pipe.c | 2 ++ src/gallium/drivers/radeonsi/si_pipe.h | 1 + src/gallium/drivers/radeonsi/si_shader.c | 3 +-- src/gallium/drivers

Re: [Mesa-dev] [PATCH 2/2] r600g: use SIMPLE_FLOAT for blending to avoid NaNs in RTs

2017-11-07 Thread Nicolai Hähnle
On 06.11.2017 15:40, Ilia Mirkin wrote: On Mon, Nov 6, 2017 at 8:48 AM, Ilia Mirkin wrote: On Mon, Nov 6, 2017 at 6:21 AM, Nicolai Hähnle wrote: On 06.11.2017 05:22, Ilia Mirkin wrote: Radeonsi also sets this flag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103544 Signed-off

Re: [Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety

2017-11-07 Thread Nicolai Hähnle
On 06.11.2017 12:53, Thomas Hellstrom wrote: On 11/06/2017 12:14 PM, Nicolai Hähnle wrote: On 03.11.2017 12:02, Thomas Hellstrom wrote: It turned out that with recent changes that call into dri3 from glFinish(), it appears like different threads end up waiting for X events simultaneously

Re: [Mesa-dev] [PATCH v2] glsl: add varying resources for arrays of complex types

2017-11-07 Thread Nicolai Hähnle
Looks plausible. Reviewed-by: Nicolai Hähnle On 02.11.2017 18:49, Juan A. Suarez Romero wrote: This patch is mostly a patch done by Ilia Mirkin. It fixes KHR-GL45.enhanced_layouts.varying_structure_locations. v2: fix locations for TCS/TES/GS inputs and outputs (Ilia) CC: Ilia Mirkin

Re: [Mesa-dev] [PATCH 2/8] gallium/tgsi: start adding hw atomics (v3)

2017-11-07 Thread Nicolai Hähnle
On 07.11.2017 17:25, Nicolai Hähnle wrote: On 07.11.2017 07:31, Dave Airlie wrote: diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst index 1a51fe9..0c331f2 100644 --- a/src/gallium/docs/source/tgsi.rst +++ b/src/gallium/docs/source/tgsi.rst @@ -2638,9 +2638,11

Re: [Mesa-dev] [PATCH 7/8] st/mesa: add support for hw atomics to glsl->tgsi. (v3)

2017-11-07 Thread Nicolai Hähnle
On 07.11.2017 07:31, Dave Airlie wrote: From: Dave Airlie This adds support for creating the hw atomic tgsi from the glsl codepaths. v2: drop the atomic index and move to backend. v3: drop buffer decls. (Marek) Signed-off-by: Dave Airlie --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 101

Re: [Mesa-dev] [PATCH 2/8] gallium/tgsi: start adding hw atomics (v3)

2017-11-07 Thread Nicolai Hähnle
On 07.11.2017 07:31, Dave Airlie wrote: From: Dave Airlie This adds support for hw atomic counters to TGSI. A new register file for storing atomic counters is added, along with a new atomic counter semantic, along with docs for both. v2: drop semantic, move hw counter to backend, Ilia point

Re: [Mesa-dev] [PATCH] util/tgsi: add static assertion to catch opcode overflowing bitfield

2017-11-07 Thread Nicolai Hähnle
Reviewed-by: Nicolai Hähnle On 07.11.2017 17:09, Brian Paul wrote: I've noticed at least two places where we store the TGSI opcode in an unsigned:8 bitfield. We're at 249 opcodes now. If we hit 256 we'll need to grow those bitfields. Add a static assertion to detect that. --
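The idea can be sketched in standalone C11 as follows. The enum, struct and macro names are hypothetical and are not the TGSI definitions; only the numbers (an 8-bit field, 249 opcodes) come from the message above.

```c
#include <assert.h>

/* Hypothetical opcode list; SKETCH_OPCODE_LAST is the number of opcodes. */
enum sketch_opcode {
   SKETCH_OPCODE_FIRST = 0,
   SKETCH_OPCODE_LAST = 249
};

#define SKETCH_OPCODE_BITS 8

struct sketch_instruction {
   unsigned opcode : SKETCH_OPCODE_BITS;
   unsigned num_operands : 3;
};

/* Fails the build as soon as the opcode count outgrows the bitfield. */
static_assert(SKETCH_OPCODE_LAST <= (1u << SKETCH_OPCODE_BITS),
              "opcode no longer fits in its bitfield");

/* Stores an opcode in the bitfield and reads it back. Values that do not
 * fit are silently truncated, which is the bug the assertion guards against. */
static unsigned
sketch_roundtrip_opcode(unsigned opcode)
{
   struct sketch_instruction insn = {0};

   insn.opcode = opcode;
   return insn.opcode;
}
```

Raising SKETCH_OPCODE_LAST above 256 without widening the field turns the silent truncation into a compile error.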

Re: [Mesa-dev] [PATCH] radeonsi/gfx9: limit the scissor bug workaround to Vega10 and Raven only

2017-11-07 Thread Nicolai Hähnle
Reviewed-by: Nicolai Hähnle On 07.11.2017 16:16, Marek Olšák wrote: From: Marek Olšák --- src/gallium/drivers/radeonsi/si_state_draw.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi

Re: [Mesa-dev] [PATCH 2/2] r600g: use SIMPLE_FLOAT for blending to avoid NaNs in RTs

2017-11-06 Thread Nicolai Hähnle
. Assuming that the test passes: Reviewed-by: Nicolai Hähnle src/gallium/drivers/r600/evergreen_state.c | 1 + src/gallium/drivers/r600/r600_state.c | 1 + 2 files changed, 2 insertions(+) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600

Re: [Mesa-dev] [PATCH 3/3] radeonsi: don't map big VRAM buffers for the first upload directly

2017-11-06 Thread Nicolai Hähnle
For the series: Reviewed-by: Nicolai Hähnle On 04.11.2017 14:03, Marek Olšák wrote: From: Marek Olšák --- src/gallium/drivers/radeon/r600_buffer_common.c | 20 src/gallium/drivers/radeon/r600_pipe_common.h | 1 + 2 files changed, 21 insertions(+) diff --git a

Re: [Mesa-dev] [PATCH] loader/dri3: Improve dri3 thread-safety

2017-11-06 Thread Nicolai Hähnle
On 03.11.2017 12:02, Thomas Hellstrom wrote: It turned out that with recent changes that call into dri3 from glFinish(), it appears like different threads end up waiting for X events simultaneously, causing deadlocks since they steal events from each other and update the dri3 counters behind eachoth

[Mesa-dev] [PATCH v2 20/26] ddebug: use an atomic increment when numbering files

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/drivers/ddebug/dd_util.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/ddebug/dd_util.h b/src/gallium/drivers/ddebug/dd_util.h index cfc0fb0ccce..bdfb7cc9163 100644 --- a/src/gallium
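A rough sketch of the general technique, using C11 atomics rather than whatever helper the patch actually uses. The function names and the file name pattern are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Shared counter; statics are zero-initialized. */
static atomic_uint sketch_file_index;

/* Each caller gets a unique index, even when several threads race here,
 * because the read-modify-write is a single atomic operation. */
static unsigned
sketch_next_file_index(void)
{
   return atomic_fetch_add(&sketch_file_index, 1);
}

/* Hypothetical file naming: <dir>/dump_<8-digit index>. */
static int
sketch_format_dump_name(char *buf, size_t size, const char *dir)
{
   assert(buf && size > 0);
   return snprintf(buf, size, "%s/dump_%08u", dir, sketch_next_file_index());
}
```

With a plain `index++` two threads could read the same value and overwrite each other's file; the atomic fetch-and-add rules that out.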

[Mesa-dev] [PATCH v2 18/26] gallium/u_dump: add and use util_dump_transfer_usage

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/auxiliary/util/u_debug.c| 19 +++ src/gallium/auxiliary/util/u_dump.h | 3 ++ src/gallium/auxiliary/util/u_dump_defines.c | 53 + src/gallium/auxiliary/util/u_dump_state.c | 2

[Mesa-dev] [PATCH v2 23/26] ddebug: dump context and before/after times of draws

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/drivers/ddebug/dd_draw.c | 8 src/gallium/drivers/ddebug/dd_pipe.h | 2 ++ 2 files changed, 10 insertions(+) diff --git a/src/gallium/drivers/ddebug/dd_draw.c b/src/gallium/drivers/ddebug/dd_draw.c index a856d0142a1

[Mesa-dev] [PATCH v2 24/26] ddebug: optionally handle transfer commands like draws

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Transfer commands can have associated GPU operations. Enabled by passing GALLIUM_DDEBUG=transfers. Reviewed-by: Marek Olšák --- src/gallium/drivers/ddebug/dd_context.c | 65 - src/gallium/drivers/ddebug/dd_draw.c| 234 src

[Mesa-dev] [PATCH v2 26/26] radeonsi: use a threaded context even for debug contexts

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/drivers/radeonsi/si_pipe.c | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c index 10225353907..b193a0b4f21 100644 --- a

[Mesa-dev] [PATCH v2 21/26] ddebug: rewrite to always use a threaded approach

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle This patch has multiple goals: 1. Off-load the writing of records in 'always' mode to another thread for performance. 2. Allow using ddebug with threaded contexts. This really forces us to move some of the "after_draw" handling into another thre

[Mesa-dev] [PATCH v2 25/26] radeonsi: record and dump time of flush

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/drivers/radeonsi/si_debug.c | 5 - src/gallium/drivers/radeonsi/si_hw_context.c | 3 +++ src/gallium/drivers/radeonsi/si_pipe.h | 1 + 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers

[Mesa-dev] [PATCH v2 16/26] gallium/u_dump: export util_dump_ptr

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Change format to %p while we're at it. Reviewed-by: Marek Olšák --- src/gallium/auxiliary/util/u_dump.h | 3 +++ src/gallium/auxiliary/util/u_dump_state.c | 4 ++-- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/util/u_dump

[Mesa-dev] [PATCH v2 22/26] ddebug: generalize print_named_xxx via a PRINT_NAMED macro

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/drivers/ddebug/dd_draw.c | 25 ++--- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_draw.c b/src/gallium/drivers/ddebug/dd_draw.c index 99c9c929b2e..a856d0142a1

[Mesa-dev] [PATCH v2 15/26] radeonsi: implement PIPE_FLUSH_{TOP, BOTTOM}_OF_PIPE

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle v2: use uncached system memory for the fence, and use the CPU to clear it so we never read garbage when checking the fence --- src/gallium/drivers/radeonsi/si_fence.c | 89 - 1 file changed, 88 insertions(+), 1 deletion(-) diff --git a

[Mesa-dev] [PATCH v2 11/26] gallium/u_threaded: avoid syncs for get_query_result

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Queries should still get marked as flushed when flushes are executed asynchronously in the driver thread. To this end, the management of the unflushed_queries list is moved into the driver thread. Reviewed-by: Marek Olšák --- src/gallium/auxiliary/util

[Mesa-dev] [PATCH v2 08/26] radeonsi: move fence functions to si_fence.c

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/drivers/radeon/r600_pipe_common.c | 267 -- src/gallium/drivers/radeonsi/Makefile.sources | 1 + src/gallium/drivers/radeonsi/meson.build | 1 + src/gallium/drivers/radeonsi/si_fence.c | 304

[Mesa-dev] [PATCH v2 17/26] gallium/u_dump: add util_dump_ns

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/auxiliary/util/u_dump.h | 3 +++ src/gallium/auxiliary/util/u_dump_state.c | 10 ++ 2 files changed, 13 insertions(+) diff --git a/src/gallium/auxiliary/util/u_dump.h b/src/gallium/auxiliary/util/u_dump.h index

[Mesa-dev] [PATCH v2 12/26] gallium/u_threaded: implement pipe_context::set_log_context

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/auxiliary/util/u_threaded_context.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/gallium/auxiliary/util/u_threaded_context.c b/src/gallium/auxiliary/util/u_threaded_context.c index 4908ea8a7ba..1f8a9d5088b

[Mesa-dev] [PATCH v2 13/26] gallium: add pipe_context::callback

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle For running post-draw operations inside the driver thread. ddebug will use it. Reviewed-by: Marek Olšák --- src/gallium/auxiliary/util/u_threaded_context.c| 46 ++ .../auxiliary/util/u_threaded_context_calls.h | 1 + src/gallium/include/pipe

[Mesa-dev] [PATCH v2 09/26] gallium/u_threaded: mark queries flushed only for non-deferred flushes

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle The driver uses (and must use) the flushed flag of queries as a hint that it does not have to check for synchronization with currently queued up commands. Deferred flushes do not actually flush queued up commands, so we must not set the flushed flag for them. Found by

[Mesa-dev] [PATCH v2 19/26] dd/util: extract dd_get_debug_filename_and_mkdir

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/drivers/ddebug/dd_util.h | 30 ++ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_util.h b/src/gallium/drivers/ddebug/dd_util.h index 4e1a945c57d

[Mesa-dev] [PATCH v2 10/26] gallium/u_threaded: implement asynchronous flushes

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle This requires out-of-band creation of fences, and will be signaled to the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag. v2: - remove an incorrect assertion - handle fence_server_sync for unsubmitted fences by relying on the improved

[Mesa-dev] [PATCH v2 14/26] radeonsi: document some subtle details of fence_finish & fence_server_sync

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle v2: remove the change to si_fence_server_sync, we'll handle that more robustly Reviewed-by: Marek Olšák (v1) --- src/gallium/drivers/radeonsi/si_fence.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_fe

[Mesa-dev] [PATCH v2 00/26] Asynchronous flushes and ddebug core rewrite

2017-11-06 Thread Nicolai Hähnle
Hi all, here's a re-spin of the series, v1 was here: https://patchwork.freedesktop.org/series/32427, and the updated patches in a larger context are here: https://cgit.freedesktop.org/~nh/mesa/log/?h=fences-threads-ddebug Changes in v2: - patch 3: Windows build issues should be fixed now (tested

[Mesa-dev] [PATCH v2 07/26] winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle The idea is to fix the following interleaving of operations that can arise from deferred fences: Thread 1 / Context 1 Thread 2 / Context 2 f = deferred flush <--- application-side synchronization --->

[Mesa-dev] [PATCH v2 01/26] util: move os_time.[ch] to src/util

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/gallium/auxiliary/Makefile.sources | 2 -- src/gallium/auxiliary/gallivm/lp_bld_init.c| 2 +- src/gallium/auxiliary/hud/hud_cpu.c| 2 +- src/gallium/auxiliary/hud/hud_cpufreq.c| 2 +- src

[Mesa-dev] [PATCH v2 06/26] gallium: add PIPE_FLUSH_{TOP, BOTTOM}_OF_PIPE bits

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle These bits are intended to be used by the ddebug hang detection and are named in analogy to the Vulkan stage bits (and the corresponding Radeon pipeline event). Hang detection needs fences on the granularity of individual commands, which nothing else really covers. The

[Mesa-dev] [PATCH v2 04/26] util/u_queue: add util_queue_fence_wait_timeout

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle v2: - style fixes - fix missing timeout handling in futex path Reviewed-by: Marek Olšák (v1) --- src/util/futex.h | 9 -- src/util/simple_mtx.h | 2 +- src/util/u_queue.c| 82 ++- src/util/u_queue.h| 54

[Mesa-dev] [PATCH v2 02/26] gallium: remove unused and deprecated u_time.h

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Cc: Jose Fonseca Reviewed-by: Marek Olšák --- src/gallium/auxiliary/Makefile.sources | 1 - src/gallium/auxiliary/meson.build | 1 - src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c | 1 - src/gallium/auxiliary/pipebuffer

[Mesa-dev] [PATCH v2 03/26] threads: update for late C11 changes

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle C11 threads were changed to use struct timespec instead of xtime, and thrd_sleep got a second argument. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1554.htm and http://en.cppreference.com/w/c/thread/{thrd_sleep,cnd_timedwait,mtx_timedlock} Note that cnd_timedwait

[Mesa-dev] [PATCH v2 05/26] gallium: add PIPE_FLUSH_ASYNC and PIPE_FLUSH_HINT_FINISH

2017-11-06 Thread Nicolai Hähnle
From: Nicolai Hähnle Also document some subtleties of pipe_context::flush. Reviewed-by: Marek Olšák --- src/gallium/docs/source/context.rst | 9 + src/gallium/include/pipe/p_context.h | 8 +++- src/gallium/include/pipe/p_defines.h | 2 ++ 3 files changed, 18 insertions(+), 1

[Mesa-dev] [PATCH v2 8/8] radeonsi: reduce the scope of sel->mutex in si_shader_select_with_key

2017-11-03 Thread Nicolai Hähnle
From: Nicolai Hähnle We only need the lock to guard changes in the variant linked list. The actual compilation can happen outside the lock, since we use the ready fence as a guard. v2: fix double-unlock Reviewed-by: Marek Olšák --- src/gallium/drivers/radeonsi/si_state_shaders.c | 8

[Mesa-dev] [PATCH v2 7/8] radeonsi: use ready fences on all shaders, not just optimized ones

2017-11-03 Thread Nicolai Hähnle
From: Nicolai Hähnle There's a race condition between si_shader_select_with_key and si_bind_XX_shader: Thread 1 Thread 2 si_shader_select_with_key begin compiling the first variant (guarded by sel-&

[Mesa-dev] [PATCH v2 6/8] u_queue: add a futex-based implementation of fences

2017-11-03 Thread Nicolai Hähnle
From: Nicolai Hähnle Fences are now 4 bytes instead of 96 bytes (on my 64-bit system). Signaling a fence is a single atomic operation in the fast case plus a syscall in the slow case. Testing if a fence is signaled is the same as before (a simple comparison), but waiting on a fence is now no
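To make the description concrete, here is a Linux-only sketch of a 4-byte fence built on futexes. It is not the code from the patch: the state encoding (0 = signalled, 1 = unsignalled, 2 = unsignalled with sleeping waiters) and all sketch_* names are assumptions chosen for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = signalled, 1 = unsignalled, 2 = unsignalled and someone sleeps. */
struct sketch_fence {
   _Atomic uint32_t val;
};

static long
sketch_futex(struct sketch_fence *f, int op, uint32_t v)
{
   return syscall(SYS_futex, (uint32_t *)&f->val, op, v, NULL, NULL, 0);
}

static void
sketch_fence_init(struct sketch_fence *f)
{
   atomic_init(&f->val, 0);
}

static void
sketch_fence_reset(struct sketch_fence *f)
{
   assert(atomic_load(&f->val) == 0); /* only reset a signalled fence */
   atomic_store(&f->val, 1);
}

static bool
sketch_fence_is_signalled(struct sketch_fence *f)
{
   return atomic_load(&f->val) == 0;
}

static void
sketch_fence_signal(struct sketch_fence *f)
{
   /* Fast case: one atomic exchange. Slow case: wake the sleepers. */
   if (atomic_exchange(&f->val, 0) == 2)
      sketch_futex(f, FUTEX_WAKE, INT_MAX);
}

static void
sketch_fence_wait(struct sketch_fence *f)
{
   uint32_t v = atomic_load(&f->val);

   while (v != 0) {
      if (v == 1) {
         /* Announce that a waiter is about to sleep. */
         uint32_t expected = 1;
         if (!atomic_compare_exchange_strong(&f->val, &expected, 2)) {
            v = expected; /* changed under us: re-evaluate */
            continue;
         }
      }
      /* Sleeps only if the value is still 2 when the kernel checks it. */
      sketch_futex(f, FUTEX_WAIT, 2);
      v = atomic_load(&f->val);
   }
}
```

The separate "has waiters" state is what keeps signalling cheap: the syscall is only made when some thread has actually gone to sleep.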

[Mesa-dev] [PATCH v2 5/8] u_queue: add util_queue_fence_reset

2017-11-03 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/util/u_queue.c | 4 +--- src/util/u_queue.h | 13 + 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/src/util/u_queue.c b/src/util/u_queue.c index 33436e0749a..2272006042f 100644 --- a/src/util/u_queue.c +++ b

[Mesa-dev] [PATCH v2 3/8] u_queue: group fence functions together

2017-11-03 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/util/u_queue.h | 19 ++- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/src/util/u_queue.h b/src/util/u_queue.h index ff713ae54d6..7a028ef0847 100644 --- a/src/util/u_queue.h +++ b/src/util/u_queue.h

[Mesa-dev] [PATCH v2 1/8] util: move futex helpers into futex.h

2017-11-03 Thread Nicolai Hähnle
From: Nicolai Hähnle v2: style fixes Reviewed-by: Marek Olšák (v1) --- src/util/Makefile.sources | 1 + src/util/futex.h | 53 +++ src/util/meson.build | 1 + src/util/simple_mtx.h | 20 +- 4 files changed, 56

[Mesa-dev] [PATCH v2 2/8] util/u_atomic: add p_atomic_xchg

2017-11-03 Thread Nicolai Hähnle
From: Nicolai Hähnle The closest to it in the old-style gcc builtins is __sync_lock_test_and_set, however, that is only guaranteed to work with values 0 and 1 and only provides an acquire barrier. I also don't know about other OSes, so we provide a simple & stupid emulation via p_atomi
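One common way to emulate an atomic exchange where no native primitive is available is a compare-and-swap loop. Whether that matches the fallback in this patch is an assumption, since the message is cut off here. The sketch uses the old-style GCC/Clang `__sync` builtin, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Atomically store new_value in *ptr and return the previous contents.
 * The compare-and-swap is retried until no other thread has changed the
 * value between our read and our write. */
static uint32_t
sketch_xchg_via_cmpxchg(volatile uint32_t *ptr, uint32_t new_value)
{
   uint32_t old;

   assert(ptr != NULL);
   old = *ptr;
   for (;;) {
      uint32_t seen = __sync_val_compare_and_swap(ptr, old, new_value);
      if (seen == old)
         return old;   /* swap succeeded */
      old = seen;      /* lost a race: retry with the fresh value */
   }
}
```

Unlike `__sync_lock_test_and_set`, the `__sync_val_compare_and_swap` builtin is documented as a full barrier and works for arbitrary values, at the cost of a retry loop.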

[Mesa-dev] [PATCH v2 4/8] u_queue: export util_queue_fence_signal

2017-11-03 Thread Nicolai Hähnle
From: Nicolai Hähnle Reviewed-by: Marek Olšák --- src/util/u_queue.c | 2 +- src/util/u_queue.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/src/util/u_queue.c b/src/util/u_queue.c index 3b05110e9f8..33436e0749a 100644 --- a/src/util/u_queue.c +++ b/src/util/u_queue.c

[Mesa-dev] [PATCH v2 0/8] u_queue fence patches and a radeonsi fix

2017-11-03 Thread Nicolai Hähnle
Hi all, Some small style changes relative to v1, and a fix to the memory barrier semantics in the futex-based fence implementation. Patches 2 (which is new) and 6 still need a review. Please take a look! Thanks, Nicolai -- src/gallium/drivers/radeonsi/si_shader.c | 3 + src/gallium/driver

Re: [Mesa-dev] [PATCH 09/17] mesa/st: add support for waiting for semaphore objects

2017-11-03 Thread Nicolai Hähnle
Hi Andres, On 03.11.2017 18:36, Andres Rodriguez wrote: On 2017-11-03 05:17 AM, Nicolai Hähnle wrote: On 02.11.2017 04:57, Andres Rodriguez wrote: Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject Signed-off-by: Andres Rodriguez --- src/mesa/state_tracker

Re: [Mesa-dev] [PATCH 14/25] radeonsi: implement PIPE_FLUSH_{TOP, BOTTOM}_OF_PIPE

2017-11-03 Thread Nicolai Hähnle
On 03.11.2017 19:46, Marek Olšák wrote: On Fri, Nov 3, 2017 at 3:48 PM, Nicolai Hähnle wrote: On 31.10.2017 17:21, Marek Olšák wrote: On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle wrote: From: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_fence.c | 83

Re: [Mesa-dev] [PATCH 3/5] st/mesa: use asynchronous flushes

2017-11-03 Thread Nicolai Hähnle
On 31.10.2017 18:59, Marek Olšák wrote: On Sun, Oct 22, 2017 at 9:18 PM, Nicolai Hähnle wrote: From: Nicolai Hähnle --- src/mesa/state_tracker/st_cb_flush.c | 4 ++-- src/mesa/state_tracker/st_cb_syncobj.c | 26 -- 2 files changed, 26 insertions(+), 4 deletions

Re: [Mesa-dev] [PATCH] mesa: prevent deleting the dummy ATI_fs

2017-11-03 Thread Nicolai Hähnle
On 03.11.2017 14:56, Miklós Máté wrote: On 03/11/17 11:04, Nicolai Hähnle wrote: On 03.11.2017 00:06, Miklós Máté wrote: On 02/11/17 17:16, Nicolai Hähnle wrote: On 01.11.2017 00:34, Miklós Máté wrote: This fixes a crash upon context destruction when glGenFragmentShadersATI() was used

Re: [Mesa-dev] [PATCH 08/25] gallium/u_threaded: mark queries flushed only for non-deferred flushes

2017-11-03 Thread Nicolai Hähnle
On 30.10.2017 13:31, Marek Olšák wrote: On Mon, Oct 30, 2017 at 2:57 AM, Marek Olšák wrote: On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle wrote: From: Nicolai Hähnle The driver uses (and must use) the flushed flag of queries as a hint that it does not have to check for synchronization

Re: [Mesa-dev] [PATCH 09/25] gallium/u_threaded: implement asynchronous flushes

2017-11-03 Thread Nicolai Hähnle
On 31.10.2017 03:15, Marek Olšák wrote: On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle wrote: @@ -107,20 +138,46 @@ static boolean si_fence_finish(struct pipe_screen *screen, uint64_t timeout) { struct radeon_winsys *rws = ((struct r600_common_screen

Re: [Mesa-dev] [PATCH 14/25] radeonsi: implement PIPE_FLUSH_{TOP, BOTTOM}_OF_PIPE

2017-11-03 Thread Nicolai Hähnle
On 31.10.2017 17:21, Marek Olšák wrote: On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle wrote: From: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_fence.c | 83 - 1 file changed, 82 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers

Re: [Mesa-dev] [PATCH v3] mesa: Add new fast mtx_t mutex type for basic use cases

2017-11-03 Thread Nicolai Hähnle
On 03.11.2017 14:23, Nicolai Hähnle wrote: What's the status of this? On 16.10.2017 14:36, Emil Velikov wrote: On 16 October 2017 at 08:06, Timothy Arceri wrote: While modern pthread mutexes are very fast, they still incur a call to an external DSO and overhead of the generality and fea

Re: [Mesa-dev] [PATCH v3] mesa: Add new fast mtx_t mutex type for basic use cases

2017-11-03 Thread Nicolai Hähnle
What's the status of this? On 16.10.2017 14:36, Emil Velikov wrote: On 16 October 2017 at 08:06, Timothy Arceri wrote: While modern pthread mutexes are very fast, they still incur a call to an external DSO and overhead of the generality and features of pthread mutexes. Most mutexes in mesa onl

Re: [Mesa-dev] gallium/r600 hw atomic support v2

2017-11-03 Thread Nicolai Hähnle
On 03.11.2017 08:24, Dave Airlie wrote: Ilia pointed out a bad assumption I made, so I've decided to move to allocating the hw indices in the backend, a bit ugly but seems to work. Thanks for doing this. I want to get GDS atomics into radeonsi as well. I've already sent some minor comments, wi

Re: [Mesa-dev] [PATCH 8/9] r600: add support for hw atomic counters. (v2)

2017-11-03 Thread Nicolai Hähnle
On 03.11.2017 08:24, Dave Airlie wrote: From: Dave Airlie This adds support for the evergreen/cayman atomic counters. These are implemented using GDS append/consume counters. The values for each counter are loaded before drawing and saved after each draw using special CP packets. v2: move hw

Re: [Mesa-dev] [PATCH 6/9] st/mesa: setup hw atomic limits.

2017-11-03 Thread Nicolai Hähnle
c->Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers; It's not true for GCN GDS-based atomics, because we can just use normal GDS instructions there (rather than the limited ordered append counters). But we can deal with that when we get there. With the indentation fixed, this pa

Re: [Mesa-dev] [PATCH 12/17] radeonsi: implement semaphore operations

2017-11-03 Thread Nicolai Hähnle
On 02.11.2017 04:57, Andres Rodriguez wrote: Allow importing, waiting and signaling of semaphore objects. Semaphore objects are backed by syncobj based fences. Signed-off-by: Andres Rodriguez --- src/gallium/drivers/radeon/r600_pipe_common.c | 52 +++ src/gallium/dri

Re: [Mesa-dev] [PATCH] mesa: prevent deleting the dummy ATI_fs

2017-11-03 Thread Nicolai Hähnle
On 03.11.2017 00:06, Miklós Máté wrote: On 02/11/17 17:16, Nicolai Hähnle wrote: On 01.11.2017 00:34, Miklós Máté wrote: This fixes a crash upon context destruction when glGenFragmentShadersATI() was used. Backtrace: ==15060== Invalid free() / delete / delete[] / realloc() ==15060==    at

Re: [Mesa-dev] [PATCH 2/2] st/glsl_to_nir: delay adding built-in uniforms to Parameters list

2017-11-03 Thread Nicolai Hähnle
On 02.11.2017 20:45, Timothy Arceri wrote: On 03/11/17 03:25, Nicolai Hähnle wrote: On 01.11.2017 06:20, Timothy Arceri wrote: Delaying adding built-in uniforms until after we convert to NIR gives us a better chance to optimise them away. Also NIR allows us to iterate over the uniforms

Re: [Mesa-dev] [PATCH 8/9] drisw: Enable flush control for llvmpipe and softpipe

2017-11-03 Thread Nicolai Hähnle
Same concerns about testing as Emil, but the logic of it all is sound, so patches 3-8 are Reviewed-by: Nicolai Hähnle On 02.11.2017 20:01, Adam Jackson wrote: Hilariously this is a fairly big win. Neil's multi-context-test improves from ~24 to ~36 fps with llvmpipe on a Core i5-

Re: [Mesa-dev] [PATCH 3/9] dri: Change __DriverApiRec::CreateContext to take a struct for attribs

2017-11-03 Thread Nicolai Hähnle
ion, - uint32_t flags, - bool notify_reset, - unsigned priority, - unsigned *error, + const struct __DriverContextConfig *ctx_config, + unsigned *error, Also here. Apart from these, patches

Re: [Mesa-dev] [PATCH 03/12] glsl: Remove program_resource_visitor::visit_field(const glsl_struct_field *)

2017-11-03 Thread Nicolai Hähnle
Patches 2 & 3: Reviewed-by: Nicolai Hähnle On 02.11.2017 21:25, Ian Romanick wrote: From: Ian Romanick I could not find any remaining users. Signed-off-by: Ian Romanick --- src/compiler/glsl/link_uniforms.cpp | 8 src/compiler/glsl/linker.h | 10 -- 2 f

Re: [Mesa-dev] [PATCH 13/17] mesa: implement buffer/texture barriers for semaphore wait/signal

2017-11-03 Thread Nicolai Hähnle
On 02.11.2017 04:57, Andres Rodriguez wrote: Make sure memory is accessible to the external client, for the specified memory object, before the signal/after the wait. Signed-off-by: Andres Rodriguez --- src/mesa/main/dd.h | 14 ++- src/mesa/main/externalobjec

Re: [Mesa-dev] [PATCH 09/17] mesa/st: add support for waiting for semaphore objects

2017-11-03 Thread Nicolai Hähnle
On 02.11.2017 04:57, Andres Rodriguez wrote: Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject Signed-off-by: Andres Rodriguez --- src/mesa/state_tracker/st_cb_semaphoreobjects.c | 28 + 1 file changed, 28 insertions(+) diff --git a/src/mesa/sta
