On 10.11.2017 13:35, Jon Turney wrote:
On 06/11/2017 10:23, Nicolai Hähnle wrote:
diff --git a/src/gallium/auxiliary/os/os_time.h b/src/util/os_time.h
similarity index 89%
rename from src/gallium/auxiliary/os/os_time.h
rename to src/util/os_time.h
index ca0bdd5a0c4..049ab118db2 100644
--- a/src
From: Nicolai Hähnle
Apparently, it doesn't have pthread barriers.
p_config.h (which was originally used to guard this code) uses the
__APPLE__ macro to detect Mac OS.
Fixes: f0d3a4de75 ("util: move pipe_barrier into src/util and rename to
util_barrier")
Cc: Roland Scheidegger
From: Nicolai Hähnle
Fixes e.g. piglit/bin/bufferstorage-persistent read -auto
Fixes: e6dbc804a87a ("winsys/amdgpu: handle cs_add_fence_dependency for
deferred/unsubmitted fences")
---
src/util/u_queue.h | 6 ++
1 file changed, 6 insertions(+)
diff --git a/src/util/u_queue.h
From: Nicolai Hähnle
Ouch...
Fixes: 244536d3d6b4 ("gallium/u_threaded: avoid syncs for get_query_result")
---
src/gallium/auxiliary/util/u_threaded_context.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/src/gallium/auxiliary/util/u_threaded_context.c
b/src/gallium/auxi
Sorry for the mess.
Reviewed-by: Nicolai Hähnle
On 09.11.2017 17:46, Brian Paul wrote:
Fixes: f1a364878431c8 ("threads: update for late C11 changes")
---
include/c11/threads_win32.h | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/c11/threads
On 09.11.2017 18:26, Roland Scheidegger wrote:
On 09.11.2017 at 18:19, Jan Vesely wrote:
On Thu, 2017-11-09 at 03:58 +0100, srol...@vmware.com wrote:
From: Roland Scheidegger
I believe this is the safe thing to do, especially ever since the driver
actually generates NaNs for muls too.
Albeit
On 09.11.2017 18:37, Jan Vesely wrote:
On Thu, 2017-11-09 at 09:13 +0100, Nicolai Hähnle wrote:
The internal docs are pretty much the same (i.e. confusing and
non-explicit), but my layman's reading of the RTL is that DX10_CLAMP
only affects clamping. So if you have a
v_mul_f32 0
From: Nicolai Hähnle
With threaded gallium, the driver may currently be running in another
thread. In that case, we will execute all remaining commands in that
thread instead of syncing, which should be better for cache locality.
---
src/mesa/state_tracker/st_cb_flush.c | 2 +-
1 file changed
From: Nicolai Hähnle
The whole point of fence_server_sync is that it can be used to
avoid waiting in the application thread.
---
src/gallium/auxiliary/util/u_threaded_context.c | 14 +++---
src/gallium/auxiliary/util/u_threaded_context.h | 1 +
src/gallium/auxiliary/util
Hi all,
I've previously sent some of this series, but I'm splitting it up
further for bisectability, plus the first patch is new.
The idea here is to further reduce the amount of synchronization
required with threaded gallium.
Eventually, we should be able to eliminate synchronizations entirely
From: Nicolai Hähnle
Asynchronous flushes require a proper implementation of
st_server_wait_sync, because we could have the following with
threaded Gallium:
Context 1 app Context 1 driver Context 2
- -
f = glFenceSync
glFlush
From: Nicolai Hähnle
Having the gallium driver thread flush in the background should be
sufficient for glFlush semantics.
Various end-of-frame flushes (from st_context_flush and st/dri) still
use a synchronous flush. We should eventually be able to transition
those to asynchronous flushes as
On 06.11.2017 19:21, Marek Olšák wrote:
On Mon, Nov 6, 2017 at 11:23 AM, Nicolai Hähnle wrote:
From: Nicolai Hähnle
The idea is to fix the following interleaving of operations
that can arise from deferred fences:
Thread 1 / Context 1 Thread 2 / Context 2
On 09.11.2017 06:54, Dave Airlie wrote:
From: Dave Airlie
This isn't needed in r600 anymore.
Signed-off-by: Dave Airlie
Reviewed-by: Nicolai Hähnle
---
src/gallium/drivers/r600/r600_query.c | 46 ++-
src/gallium/drivers/r600/r600_query.h | 4 --
On 09.11.2017 04:21, Marek Olšák wrote:
From: Marek Olšák
we don't use dma_data in this codepath.
Reviewed-by: Nicolai Hähnle
---
src/gallium/drivers/radeon/r600_buffer_common.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/gallium/drivers/r
On 09.11.2017 04:15, Marek Olšák wrote:
From: Marek Olšák
---
src/gallium/drivers/radeon/r600_buffer_common.c | 5 +
src/gallium/drivers/radeon/r600_pipe_common.h | 2 --
src/gallium/drivers/radeonsi/si_pipe.c | 3 ---
3 files changed, 1 insertion(+), 9 deletions(-)
diff --
e could be removed, but it's not important.
Patches 1-5:
Reviewed-by: Nicolai Hähnle
+*/
+ unsigned framebuffers_bound;
/* Whether the texture is a displayable back buffer and needs DCC
* decompression, which is expensive. There
On 09.11.2017 07:42, Jordan Justen wrote:
Signed-off-by: Jordan Justen
---
src/util/Makefile.sources | 2 +
src/util/meson.build | 2 +
src/util/program_binary.c | 322 ++
src/util/program_binary.h | 91 +
4 files changed, 4
Patches 10-15:
Reviewed-by: Nicolai Hähnle
On 09.11.2017 07:42, Jordan Justen wrote:
The GL_ARB_get_program_binary extension spec says:
"If ProgramBinary fails to load a binary, no error is generated, but
any information about a previous link or load of that program object
is
On 09.11.2017 07:42, Jordan Justen wrote:
It appears that we include the shader cache sources into libglsl
regardless.
The Meson build already does this.
Signed-off-by: Jordan Justen
Patches 1-3:
Reviewed-by: Nicolai Hähnle
---
src/compiler/Android.glsl.mk | 3 +--
src/compiler
On 08.11.2017 23:54, Ilia Mirkin wrote:
On Wed, Nov 8, 2017 at 4:13 AM, Nicolai Hähnle wrote:
On 08.11.2017 09:53, Michel Dänzer wrote:
On 07/11/17 10:58 PM, Marek Olšák wrote:
On Tue, Nov 7, 2017 at 9:01 PM, Nicolai Hähnle
wrote:
On 07.11.2017 18:35, Michel Dänzer wrote:
On 07/11/17
The internal docs are pretty much the same (i.e. confusing and
non-explicit), but my layman's reading of the RTL is that DX10_CLAMP
only affects clamping. So if you have a
v_mul_f32 0, inf
that will generate a NaN just fine and is simply unaffected by
DX10_CLAMP. However, if the clamp bit i
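The distinction drawn above (the multiply produces a NaN, and only a later clamp may treat that NaN specially) can be modelled on the CPU. This is a minimal sketch, not the hardware behaviour: `sketch_mul` and `sketch_clamp01` are hypothetical names, and the NaN-to-zero result of the clamp is an assumption about DX10-style clamping that the truncated text does not confirm.

```c
#include <assert.h>
#include <math.h>

/* IEEE 754: zero times infinity is an invalid operation and yields NaN.
 * Nothing in this multiply depends on any clamp mode. */
float sketch_mul(float a, float b)
{
   return a * b;
}

/* Hypothetical model of a [0, 1] output clamp.
 * nan_to_zero != 0 models the ASSUMED DX10-style rule (NaN becomes 0);
 * nan_to_zero == 0 lets the NaN pass through the clamp unchanged. */
float sketch_clamp01(float x, int nan_to_zero)
{
   if (isnan(x))
      return nan_to_zero ? 0.0f : x;
   if (x < 0.0f)
      return 0.0f;
   if (x > 1.0f)
      return 1.0f;
   return x;
}
```

In this model the mode only matters for an instruction that actually clamps its result; an unclamped multiply returns the NaN either way.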
For the series:
Reviewed-by: Nicolai Hähnle
On 08.11.2017 01:07, Brian Paul wrote:
I've noticed at least two places where we store the TGSI opcode in
an unsigned:8 bitfield. We're at 249 opcodes now. If we hit 256 we'll
need to grow those bitfields. Use the new ASSERT_BITFIE
On 08.11.2017 09:53, Michel Dänzer wrote:
On 07/11/17 10:58 PM, Marek Olšák wrote:
On Tue, Nov 7, 2017 at 9:01 PM, Nicolai Hähnle wrote:
On 07.11.2017 18:35, Michel Dänzer wrote:
On 07/11/17 06:28 PM, Marek Olšák wrote:
Hi,
This patch is too large for the mailing list:
https
On 07.11.2017 19:38, Dave Airlie wrote:
On 8 November 2017 at 03:26, Nicolai Hähnle wrote:
On 07.11.2017 07:31, Dave Airlie wrote:
From: Dave Airlie
This adds support for the evergreen/cayman atomic counters.
These are implemented using GDS append/consume counters. The values
for each
that, but the
commit discipline on the internal addrlib repository is pretty crappy,
so we'd end up having to massage commits anyway. Maybe we can find a
sweet spot somewhere by updating slightly more regularly, perhaps once a
month.
With Dylan's comment addressed,
Acked-by: Nicolai H
On 07.11.2017 18:26, Nicolai Hähnle wrote:
On 07.11.2017 17:57, Marek Olšák wrote:
With HW atomic counters, MaxAtomicBufferSize is a pretty small number
(counters * 4). TGSI has maximum index = 32K.
Ah, you're right.
I forgot: the other comments (about the assertion in patch 2, and
On 07.11.2017 07:31, Dave Airlie wrote:
From: Dave Airlie
This adds a new atom that calls the new driver API to
bind buffers containing hw atomics.
Signed-off-by: Dave Airlie
---
src/mesa/state_tracker/st_atom_atomicbuf.c | 37
src/mesa/state_tracker/st_atom_
? I suppose it might require more stuff to manage GDS
allocations in the kernel, and if it works with this approach...
Acked-by: Nicolai Hähnle
v2: move hw atomic assignment into driver.
v3: fix messing up caps (Gert Wollny), only store ranges in driver,
drop buffers.
Signed-off-by: Da
On 07.11.2017 17:57, Marek Olšák wrote:
With HW atomic counters, MaxAtomicBufferSize is a pretty small number
(counters * 4). TGSI has maximum index = 32K.
Ah, you're right.
Patches 1-7:
Reviewed-by: Nicolai Hähnle
Marek
On Tue, Nov 7, 2017 at 5:43 PM, Nicolai Hähnle wrote
require a binding even if
nothing was declared to use the default buffer.
Affects:
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api
Reviewed-by: Nicolai Hähnle
---
src/compiler/glsl/link_varyings.cpp | 24
Reviewed-by: Nicolai Hähnle
On 07.11.2017 15:28, Marek Olšák wrote:
From: Marek Olšák
---
include/pci_ids/radeonsi_pci_ids.h| 458 +++---
src/amd/common/ac_gpu_info.c | 2 +-
src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2
For the series:
Reviewed-by: Nicolai Hähnle
On 07.11.2017 04:12, Marek Olšák wrote:
From: Marek Olšák
---
src/gallium/drivers/radeonsi/si_pipe.c | 2 ++
src/gallium/drivers/radeonsi/si_pipe.h | 1 +
src/gallium/drivers/radeonsi/si_shader.c | 3 +--
src/gallium/drivers
On 06.11.2017 15:40, Ilia Mirkin wrote:
On Mon, Nov 6, 2017 at 8:48 AM, Ilia Mirkin wrote:
On Mon, Nov 6, 2017 at 6:21 AM, Nicolai Hähnle wrote:
On 06.11.2017 05:22, Ilia Mirkin wrote:
Radeonsi also sets this flag.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103544
Signed-off
On 06.11.2017 12:53, Thomas Hellstrom wrote:
On 11/06/2017 12:14 PM, Nicolai Hähnle wrote:
On 03.11.2017 12:02, Thomas Hellstrom wrote:
It turned out that with recent changes that call into dri3 from glFinish(),
it appears like different threads end up waiting for X events simultaneously
Looks plausible.
Reviewed-by: Nicolai Hähnle
On 02.11.2017 18:49, Juan A. Suarez Romero wrote:
This patch is mostly a patch done by Ilia Mirkin.
It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.
v2: fix locations for TCS/TES/GS inputs and outputs (Ilia)
CC: Ilia Mirkin
On 07.11.2017 17:25, Nicolai Hähnle wrote:
On 07.11.2017 07:31, Dave Airlie wrote:
diff --git a/src/gallium/docs/source/tgsi.rst
b/src/gallium/docs/source/tgsi.rst
index 1a51fe9..0c331f2 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2638,9 +2638,11
On 07.11.2017 07:31, Dave Airlie wrote:
From: Dave Airlie
This adds support for creating the hw atomic tgsi from
the glsl codepaths.
v2: drop the atomic index and move to backend.
v3: drop buffer decls. (Marek)
Signed-off-by: Dave Airlie
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 101
On 07.11.2017 07:31, Dave Airlie wrote:
From: Dave Airlie
This adds support for hw atomic counters to TGSI.
A new register file for storing atomic counters is added,
along with a new atomic counter semantic, along with docs
for both.
v2: drop semantic, move hw counter to backend,
Ilia point
Reviewed-by: Nicolai Hähnle
On 07.11.2017 17:09, Brian Paul wrote:
I've noticed at least two places where we store the TGSI opcode in
an unsigned:8 bitfield. We're at 249 opcodes now. If we hit 256 we'll
need to grow those bitfields. Add a static assertion to detect that.
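A compile-time check of this kind could look roughly like the following. It is only a sketch: `SKETCH_OPCODE_COUNT`, `SKETCH_OPCODE_BITS` and `struct sketch_instruction` are hypothetical stand-ins, since the real enum and structs are not shown here, and the patch itself may phrase the assertion differently.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real opcode enum and instruction struct. */
enum { SKETCH_OPCODE_COUNT = 249 };   /* "We're at 249 opcodes now" */
#define SKETCH_OPCODE_BITS 8

struct sketch_instruction {
   unsigned opcode : SKETCH_OPCODE_BITS;
   unsigned saturate : 1;
};

/* The build breaks as soon as the largest opcode value no longer fits
 * into the bitfield, instead of silently truncating at run time. */
_Static_assert(SKETCH_OPCODE_COUNT - 1 < (1u << SKETCH_OPCODE_BITS),
               "opcode bitfield is too small for the opcode enum");

/* Round-trips a value through the bitfield; values that do not fit are
 * reduced modulo 2^SKETCH_OPCODE_BITS, which is the bug being guarded. */
unsigned sketch_store_opcode(unsigned opcode)
{
   struct sketch_instruction insn = { 0 };
   insn.opcode = opcode;
   return insn.opcode;
}
```

Storing 248 reads back as 248, while storing 256 reads back as 0; the static assertion turns that silent wrap-around into a build failure once the enum grows too large.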
--
Reviewed-by: Nicolai Hähnle
On 07.11.2017 16:16, Marek Olšák wrote:
From: Marek Olšák
---
src/gallium/drivers/radeonsi/si_state_draw.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
b/src/gallium/drivers/radeonsi
.
Assuming that the test passes:
Reviewed-by: Nicolai Hähnle
src/gallium/drivers/r600/evergreen_state.c | 1 +
src/gallium/drivers/r600/r600_state.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/src/gallium/drivers/r600/evergreen_state.c
b/src/gallium/drivers/r600
For the series:
Reviewed-by: Nicolai Hähnle
On 04.11.2017 14:03, Marek Olšák wrote:
From: Marek Olšák
---
src/gallium/drivers/radeon/r600_buffer_common.c | 20
src/gallium/drivers/radeon/r600_pipe_common.h | 1 +
2 files changed, 21 insertions(+)
diff --git a
On 03.11.2017 12:02, Thomas Hellstrom wrote:
It turned out that with recent changes that call into dri3 from glFinish(),
it appears like different threads end up waiting for X events simultaneously,
causing deadlocks since they steal events from each other and update the dri3
counters behind eachoth
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/ddebug/dd_util.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/ddebug/dd_util.h
b/src/gallium/drivers/ddebug/dd_util.h
index cfc0fb0ccce..bdfb7cc9163 100644
--- a/src/gallium
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/auxiliary/util/u_debug.c| 19 +++
src/gallium/auxiliary/util/u_dump.h | 3 ++
src/gallium/auxiliary/util/u_dump_defines.c | 53 +
src/gallium/auxiliary/util/u_dump_state.c | 2
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/ddebug/dd_draw.c | 8
src/gallium/drivers/ddebug/dd_pipe.h | 2 ++
2 files changed, 10 insertions(+)
diff --git a/src/gallium/drivers/ddebug/dd_draw.c
b/src/gallium/drivers/ddebug/dd_draw.c
index a856d0142a1
From: Nicolai Hähnle
Transfer commands can have associated GPU operations.
Enabled by passing GALLIUM_DDEBUG=transfers.
Reviewed-by: Marek Olšák
---
src/gallium/drivers/ddebug/dd_context.c | 65 -
src/gallium/drivers/ddebug/dd_draw.c| 234
src
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_pipe.c | 11 ++-
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
b/src/gallium/drivers/radeonsi/si_pipe.c
index 10225353907..b193a0b4f21 100644
--- a
From: Nicolai Hähnle
This patch has multiple goals:
1. Off-load the writing of records in 'always' mode to another thread
for performance.
2. Allow using ddebug with threaded contexts. This really forces us to
move some of the "after_draw" handling into another thre
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_debug.c | 5 -
src/gallium/drivers/radeonsi/si_hw_context.c | 3 +++
src/gallium/drivers/radeonsi/si_pipe.h | 1 +
3 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers
From: Nicolai Hähnle
Change format to %p while we're at it.
Reviewed-by: Marek Olšák
---
src/gallium/auxiliary/util/u_dump.h | 3 +++
src/gallium/auxiliary/util/u_dump_state.c | 4 ++--
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/gallium/auxiliary/util/u_dump
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/ddebug/dd_draw.c | 25 ++---
1 file changed, 10 insertions(+), 15 deletions(-)
diff --git a/src/gallium/drivers/ddebug/dd_draw.c
b/src/gallium/drivers/ddebug/dd_draw.c
index 99c9c929b2e..a856d0142a1
From: Nicolai Hähnle
v2: use uncached system memory for the fence, and use the CPU to
clear it so we never read garbage when checking the fence
---
src/gallium/drivers/radeonsi/si_fence.c | 89 -
1 file changed, 88 insertions(+), 1 deletion(-)
diff --git a
From: Nicolai Hähnle
Queries should still get marked as flushed when flushes are executed
asynchronously in the driver thread.
To this end, the management of the unflushed_queries list is moved into
the driver thread.
Reviewed-by: Marek Olšák
---
src/gallium/auxiliary/util
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeon/r600_pipe_common.c | 267 --
src/gallium/drivers/radeonsi/Makefile.sources | 1 +
src/gallium/drivers/radeonsi/meson.build | 1 +
src/gallium/drivers/radeonsi/si_fence.c | 304
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/auxiliary/util/u_dump.h | 3 +++
src/gallium/auxiliary/util/u_dump_state.c | 10 ++
2 files changed, 13 insertions(+)
diff --git a/src/gallium/auxiliary/util/u_dump.h
b/src/gallium/auxiliary/util/u_dump.h
index
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/auxiliary/util/u_threaded_context.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/src/gallium/auxiliary/util/u_threaded_context.c
b/src/gallium/auxiliary/util/u_threaded_context.c
index 4908ea8a7ba..1f8a9d5088b
From: Nicolai Hähnle
For running post-draw operations inside the driver thread. ddebug will
use it.
Reviewed-by: Marek Olšák
---
src/gallium/auxiliary/util/u_threaded_context.c| 46 ++
.../auxiliary/util/u_threaded_context_calls.h | 1 +
src/gallium/include/pipe
From: Nicolai Hähnle
The driver uses (and must use) the flushed flag of queries as a hint that
it does not have to check for synchronization with currently queued up
commands. Deferred flushes do not actually flush queued up commands, so
we must not set the flushed flag for them.
Found by
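The rule described above can be sketched as follows. All names here (`SKETCH_FLUSH_DEFERRED`, `struct sketch_query`, `sketch_flush`) are hypothetical; the real driver tracks its unflushed queries differently, and this only illustrates when the flag may be set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag standing in for a "deferred flush" request. */
#define SKETCH_FLUSH_DEFERRED (1u << 0)

struct sketch_query {
   bool flushed; /* hint: no queued-up commands still affect this query */
};

/* Marks queries as flushed only when the queued commands were really
 * submitted. A deferred flush submits nothing, so the hint stays clear. */
void sketch_flush(struct sketch_query *queries, unsigned count, unsigned flags)
{
   if (flags & SKETCH_FLUSH_DEFERRED)
      return;

   for (unsigned i = 0; i < count; i++)
      queries[i].flushed = true;
}
```

If the flag were set on a deferred flush, a later result check could skip the synchronization it still needs.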
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/ddebug/dd_util.h | 30 ++
1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/src/gallium/drivers/ddebug/dd_util.h
b/src/gallium/drivers/ddebug/dd_util.h
index 4e1a945c57d
From: Nicolai Hähnle
This requires out-of-band creation of fences, and will be signaled to
the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag.
v2:
- remove an incorrect assertion
- handle fence_server_sync for unsubmitted fences by
relying on the improved
From: Nicolai Hähnle
v2: remove the change to si_fence_server_sync, we'll handle that more
robustly
Reviewed-by: Marek Olšák (v1)
---
src/gallium/drivers/radeonsi/si_fence.c | 22 ++
1 file changed, 22 insertions(+)
diff --git a/src/gallium/drivers/radeonsi/si_fe
Hi all,
here's a re-spin of the series, v1 was here:
https://patchwork.freedesktop.org/series/32427, and the updated
patches in a larger context are here:
https://cgit.freedesktop.org/~nh/mesa/log/?h=fences-threads-ddebug
Changes in v2:
- patch 3: Windows build issues should be fixed now (tested
From: Nicolai Hähnle
The idea is to fix the following interleaving of operations
that can arise from deferred fences:
Thread 1 / Context 1 Thread 2 / Context 2
f = deferred flush
<--- application-side synchronization --->
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/auxiliary/Makefile.sources | 2 --
src/gallium/auxiliary/gallivm/lp_bld_init.c| 2 +-
src/gallium/auxiliary/hud/hud_cpu.c| 2 +-
src/gallium/auxiliary/hud/hud_cpufreq.c| 2 +-
src
From: Nicolai Hähnle
These bits are intended to be used by the ddebug hang detection and are
named in analogy to the Vulkan stage bits (and the corresponding Radeon
pipeline event).
Hang detection needs fences on the granularity of individual commands,
which nothing else really covers. The
From: Nicolai Hähnle
v2:
- style fixes
- fix missing timeout handling in futex path
Reviewed-by: Marek Olšák (v1)
---
src/util/futex.h | 9 --
src/util/simple_mtx.h | 2 +-
src/util/u_queue.c| 82 ++-
src/util/u_queue.h| 54
From: Nicolai Hähnle
Cc: Jose Fonseca
Reviewed-by: Marek Olšák
---
src/gallium/auxiliary/Makefile.sources | 1 -
src/gallium/auxiliary/meson.build | 1 -
src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c | 1 -
src/gallium/auxiliary/pipebuffer
From: Nicolai Hähnle
C11 threads were changed to use struct timespec instead of xtime, and
thrd_sleep got a second argument.
See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1554.htm and
http://en.cppreference.com/w/c/thread/{thrd_sleep,cnd_timedwait,mtx_timedlock}
Note that cnd_timedwait
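For callers, the practical consequence of the struct timespec change is that timeouts have to be expressed as timespec values. In standard C11, thrd_sleep takes a relative duration plus an optional "remaining" output, while cnd_timedwait and mtx_timedlock take an absolute TIME_UTC-based time point. A minimal sketch of the conversion, with hypothetical helper names and assuming a non-negative timeout:

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000LL

/* Adds a non-negative timeout in nanoseconds to a base time and keeps
 * tv_nsec normalized to the range [0, 1e9). */
struct timespec sketch_timespec_add_ns(struct timespec base, int64_t timeout_ns)
{
   struct timespec result;
   int64_t nsec = (int64_t)base.tv_nsec + timeout_ns % SKETCH_NSEC_PER_SEC;

   result.tv_sec = base.tv_sec + (time_t)(timeout_ns / SKETCH_NSEC_PER_SEC);
   if (nsec >= SKETCH_NSEC_PER_SEC) {
      nsec -= SKETCH_NSEC_PER_SEC;
      result.tv_sec++;
   }
   result.tv_nsec = (long)nsec;
   return result;
}

/* Absolute deadline of the kind cnd_timedwait/mtx_timedlock expect. */
struct timespec sketch_deadline_from_now(int64_t timeout_ns)
{
   struct timespec now;
   timespec_get(&now, TIME_UTC);
   return sketch_timespec_add_ns(now, timeout_ns);
}
```

A relative sleep needs no deadline at all: the duration is passed to thrd_sleep directly, with NULL as the second argument when the remaining time is not needed.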
From: Nicolai Hähnle
Also document some subtleties of pipe_context::flush.
Reviewed-by: Marek Olšák
---
src/gallium/docs/source/context.rst | 9 +
src/gallium/include/pipe/p_context.h | 8 +++-
src/gallium/include/pipe/p_defines.h | 2 ++
3 files changed, 18 insertions(+), 1
From: Nicolai Hähnle
We only need the lock to guard changes in the variant linked list. The
actual compilation can happen outside the lock, since we use the ready
fence as a guard.
v2: fix double-unlock
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_state_shaders.c | 8
From: Nicolai Hähnle
There's a race condition between si_shader_select_with_key and
si_bind_XX_shader:
Thread 1 Thread 2
si_shader_select_with_key
begin compiling the first
variant
(guarded by sel-&
From: Nicolai Hähnle
Fences are now 4 bytes instead of 96 bytes (on my 64-bit system).
Signaling a fence is a single atomic operation in the fast case plus a
syscall in the slow case.
Testing if a fence is signaled is the same as before (a simple comparison),
but waiting on a fence is now no
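The fast/slow split described above can be sketched with C11 atomics. This is an illustration under assumptions, not the patch's code: the state encoding (0 signaled, 1 unsignaled, 2 unsignaled with waiters) and every name are hypothetical, and the wake-up syscall is replaced by a counter so the sketch stays portable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: 0 = signaled, 1 = unsignaled, 2 = unsignaled + waiters. */
struct sketch_fence {
   _Atomic uint32_t val;
};

/* Stand-in for the wake-up syscall; it only counts calls in this sketch. */
unsigned sketch_wake_calls;

void sketch_wake_all(struct sketch_fence *fence)
{
   (void)fence;
   sketch_wake_calls++;
}

void sketch_fence_reset(struct sketch_fence *fence)
{
   atomic_store(&fence->val, 1);
}

/* Testing a fence is a single load and comparison. */
bool sketch_fence_is_signaled(struct sketch_fence *fence)
{
   return atomic_load(&fence->val) == 0;
}

/* A waiter announces itself (1 -> 2) before it would go to sleep. */
void sketch_fence_announce_waiter(struct sketch_fence *fence)
{
   uint32_t expected = 1;
   atomic_compare_exchange_strong(&fence->val, &expected, 2);
}

/* Fast case: one atomic exchange. Slow case: additionally wake waiters. */
void sketch_fence_signal(struct sketch_fence *fence)
{
   uint32_t old = atomic_exchange(&fence->val, 0);
   if (old == 2)
      sketch_wake_all(fence);
}
```

Because a waiter has to mark the fence before sleeping, the signaling side can tell from the value it exchanged out whether the wake-up call is needed.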
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/util/u_queue.c | 4 +---
src/util/u_queue.h | 13 +
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/src/util/u_queue.c b/src/util/u_queue.c
index 33436e0749a..2272006042f 100644
--- a/src/util/u_queue.c
+++ b
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/util/u_queue.h | 19 ++-
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/src/util/u_queue.h b/src/util/u_queue.h
index ff713ae54d6..7a028ef0847 100644
--- a/src/util/u_queue.h
+++ b/src/util/u_queue.h
From: Nicolai Hähnle
v2: style fixes
Reviewed-by: Marek Olšák (v1)
---
src/util/Makefile.sources | 1 +
src/util/futex.h | 53 +++
src/util/meson.build | 1 +
src/util/simple_mtx.h | 20 +-
4 files changed, 56
From: Nicolai Hähnle
The closest to it in the old-style gcc builtins is __sync_lock_test_and_set,
however, that is only guaranteed to work with values 0 and 1 and only
provides an acquire barrier. I also don't know about other OSes, so we
provide a simple & stupid emulation via p_atomi
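One common way to emulate an atomic exchange when only a compare-and-swap primitive is available is a retry loop. The sketch below uses the old-style GCC builtin __sync_val_compare_and_swap, which acts as a full barrier; the function name is hypothetical, and whether the patch's emulation looks like this is not visible in the truncated text.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates an atomic exchange for arbitrary values with a
 * compare-and-swap loop. Returns the value that was replaced. */
uint32_t sketch_atomic_xchg(volatile uint32_t *ptr, uint32_t new_value)
{
   uint32_t old = *ptr;

   for (;;) {
      /* Store new_value only if *ptr still holds the value we last saw. */
      uint32_t seen = __sync_val_compare_and_swap(ptr, old, new_value);
      if (seen == old)
         return old;
      /* Another thread changed the value in between; retry with it. */
      old = seen;
   }
}
```

Unlike __sync_lock_test_and_set, this works for any 32-bit value, at the cost of a possible retry under contention.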
From: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/util/u_queue.c | 2 +-
src/util/u_queue.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/u_queue.c b/src/util/u_queue.c
index 3b05110e9f8..33436e0749a 100644
--- a/src/util/u_queue.c
+++ b/src/util/u_queue.c
Hi all,
Some small style changes relative to v1, and a fix to the memory
barrier semantics in the futex-based fence implementation.
Patches 2 (which is new) and 6 still need a review. Please take
a look!
Thanks,
Nicolai
--
src/gallium/drivers/radeonsi/si_shader.c | 3 +
src/gallium/driver
Hi Andres,
On 03.11.2017 18:36, Andres Rodriguez wrote:
On 2017-11-03 05:17 AM, Nicolai Hähnle wrote:
On 02.11.2017 04:57, Andres Rodriguez wrote:
Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject
Signed-off-by: Andres Rodriguez
---
src/mesa/state_tracker
On 03.11.2017 19:46, Marek Olšák wrote:
On Fri, Nov 3, 2017 at 3:48 PM, Nicolai Hähnle wrote:
On 31.10.2017 17:21, Marek Olšák wrote:
On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle
wrote:
From: Nicolai Hähnle
---
src/gallium/drivers/radeonsi/si_fence.c | 83
On 31.10.2017 18:59, Marek Olšák wrote:
On Sun, Oct 22, 2017 at 9:18 PM, Nicolai Hähnle wrote:
From: Nicolai Hähnle
---
src/mesa/state_tracker/st_cb_flush.c | 4 ++--
src/mesa/state_tracker/st_cb_syncobj.c | 26 --
2 files changed, 26 insertions(+), 4 deletions
On 03.11.2017 14:56, Miklós Máté wrote:
On 03/11/17 11:04, Nicolai Hähnle wrote:
On 03.11.2017 00:06, Miklós Máté wrote:
On 02/11/17 17:16, Nicolai Hähnle wrote:
On 01.11.2017 00:34, Miklós Máté wrote:
This fixes a crash upon context destruction when
glGenFragmentShadersATI() was used
On 30.10.2017 13:31, Marek Olšák wrote:
On Mon, Oct 30, 2017 at 2:57 AM, Marek Olšák wrote:
On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle wrote:
From: Nicolai Hähnle
The driver uses (and must use) the flushed flag of queries as a hint that
it does not have to check for synchronization
On 31.10.2017 03:15, Marek Olšák wrote:
On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle wrote:
@@ -107,20 +138,46 @@ static boolean si_fence_finish(struct pipe_screen *screen,
uint64_t timeout)
{
struct radeon_winsys *rws = ((struct r600_common_screen
On 31.10.2017 17:21, Marek Olšák wrote:
On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle wrote:
From: Nicolai Hähnle
---
src/gallium/drivers/radeonsi/si_fence.c | 83 -
1 file changed, 82 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers
On 03.11.2017 14:23, Nicolai Hähnle wrote:
What's the status of this?
On 16.10.2017 14:36, Emil Velikov wrote:
On 16 October 2017 at 08:06, Timothy Arceri
wrote:
While modern pthread mutexes are very fast, they still incur a call
to an
external DSO and overhead of the generality and fea
What's the status of this?
On 16.10.2017 14:36, Emil Velikov wrote:
On 16 October 2017 at 08:06, Timothy Arceri wrote:
While modern pthread mutexes are very fast, they still incur a call to an
external DSO and overhead of the generality and features of pthread mutexes.
Most mutexes in mesa onl
On 03.11.2017 08:24, Dave Airlie wrote:
Ilia pointed out a bad assumption I made, so I've decided to
move to allocating the hw indices in the backend, a bit ugly but
seems to work.
Thanks for doing this. I want to get GDS atomics into radeonsi as well.
I've already sent some minor comments, wi
On 03.11.2017 08:24, Dave Airlie wrote:
From: Dave Airlie
This adds support for the evergreen/cayman atomic counters.
These are implemented using GDS append/consume counters. The values
for each counter are loaded before drawing and saved after each draw
using special CP packets.
v2: move hw
c->Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers;
It's not true for GCN GDS-based atomics, because we can just use normal
GDS instruction there (rather than the limited ordered append counters).
But we can deal with that when we get there.
With the indentation fixed, this pa
On 02.11.2017 04:57, Andres Rodriguez wrote:
Allow importing, waiting and signaling of semaphore objects.
Semaphore objects are backed by syncobj based fences.
Signed-off-by: Andres Rodriguez
---
src/gallium/drivers/radeon/r600_pipe_common.c | 52 +++
src/gallium/dri
On 03.11.2017 00:06, Miklós Máté wrote:
On 02/11/17 17:16, Nicolai Hähnle wrote:
On 01.11.2017 00:34, Miklós Máté wrote:
This fixes a crash upon context destruction when
glGenFragmentShadersATI() was used. Backtrace:
==15060== Invalid free() / delete / delete[] / realloc()
==15060== at
On 02.11.2017 20:45, Timothy Arceri wrote:
On 03/11/17 03:25, Nicolai Hähnle wrote:
On 01.11.2017 06:20, Timothy Arceri wrote:
Delaying adding built-in uniforms until after we convert to NIR
gives us a better chance to optimise them away. Also NIR allows
us to iterate over the uniforms
Same concerns about testing as Emil, but the logic of it all is sound,
so patches 3-8 are
Reviewed-by: Nicolai Hähnle
On 02.11.2017 20:01, Adam Jackson wrote:
Hilariously this is a fairly big win. Neil's multi-context-test
improves from ~24 to ~36 fps with llvmpipe on a Core i5-
ion,
- uint32_t flags,
- bool notify_reset,
- unsigned priority,
- unsigned *error,
+ const struct __DriverContextConfig *ctx_config,
+ unsigned *error,
Also here.
Apart from these, patches
Patches 2 & 3:
Reviewed-by: Nicolai Hähnle
On 02.11.2017 21:25, Ian Romanick wrote:
From: Ian Romanick
I could not find any remaining users.
Signed-off-by: Ian Romanick
---
src/compiler/glsl/link_uniforms.cpp | 8
src/compiler/glsl/linker.h | 10 --
2 f
On 02.11.2017 04:57, Andres Rodriguez wrote:
Make sure memory is accessible to the external client, for the specified
memory object, before the signal/after the wait.
Signed-off-by: Andres Rodriguez
---
src/mesa/main/dd.h | 14 ++-
src/mesa/main/externalobjec
On 02.11.2017 04:57, Andres Rodriguez wrote:
Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject
Signed-off-by: Andres Rodriguez
---
src/mesa/state_tracker/st_cb_semaphoreobjects.c | 28 +
1 file changed, 28 insertions(+)
diff --git a/src/mesa/sta