On 2016-05-07 03:14:43, Kenneth Graunke wrote:
> My old implementation accumulated pairs in a buffer,
> and eventually processed that data on the CPU. This meant flushing
> the batchbuffer and waiting for it to completely execute before we
> could map it, resulting in really long
On Thu, May 05, 2016 at 05:04:02PM -0700, Kristian Høgsberg wrote:
> From: Kristian Høgsberg Kristensen
>
> ---
> src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ++
> src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 5 ++
>
On 09/05/16 07:21, Pohjolainen, Topi wrote:
> On Fri, May 06, 2016 at 08:56:08AM +0200, Samuel Iglesias Gonsálvez wrote:
>> When there is a mix of definitions of uniforms with 32-bit or 64-bit
>> data type sizes, the driver ends up doing misaligned access to double
>> based variables in the push
On Fri, May 06, 2016 at 08:56:08AM +0200, Samuel Iglesias Gonsálvez wrote:
> When there is a mix of definitions of uniforms with 32-bit or 64-bit
> data type sizes, the driver ends up doing misaligned access to double
> based variables in the push constant buffer.
>
> To fix this, this patch
On 07/05/16 09:22, Jordan Justen wrote:
> On 2016-05-05 23:56:08, Samuel Iglesias Gonsálvez wrote:
>> When there is a mix of definitions of uniforms with 32-bit or 64-bit
>> data type sizes, the driver ends up doing misaligned access to double
>> based variables in the push constant buffer.
>>
IMO you will need to check for the ES version too, something like this:
https://patchwork.freedesktop.org/patch/36417/
On 05/06/2016 11:22 PM, Lars Hamre wrote:
The conditions for which certain built-in special variables
can be declared invariant were not being checked.
OpenGL ES 1.00
On 08.05.2016 21:21, Marek Olšák wrote:
> From: Marek Olšák
>
> ---
> src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 1 +
> src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
Signed-off-by: Ilia Mirkin
---
This is pretty academic since no hw supports these formats, but since core
support for these has landed, might as well extend the view logic.
src/mesa/main/textureview.c | 28 ++--
1 file changed, 26 insertions(+), 2
Reviewed-by: Ilia Mirkin
On Sun, May 8, 2016 at 6:13 PM, Samuel Pitoiset
wrote:
> We don't need them for compute shaders.
>
> Signed-off-by: Samuel Pitoiset
> ---
>
We don't need them for compute shaders.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 5 +
1 file changed, 5 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
Class "ir_constant" had a bunch of constructors where the pointer member
"array_elements" had not been initialized. This could have led to unsafe
code if something had tried to write anything to it. This patch fixes
this issue by initializing the pointer to NULL in all the constructors.
This
On 08.05.2016 22:50, Ilia Mirkin wrote:
What exactly gets fed into the CLIPDIST and CULLDIST semantics? e.g.
is CULLDIST[0].x the first cull distance, or is it the first entity in
the combined cull/clip distance array? If the former, then this won't
work as implemented on nouveau. If the
What exactly gets fed into the CLIPDIST and CULLDIST semantics? e.g.
is CULLDIST[0].x the first cull distance, or is it the first entity in
the combined cull/clip distance array? If the former, then this won't
work as implemented on nouveau. If the latter, then why bother with
the separate
Signed-off-by: Tobias Klausmann
---
src/gallium/drivers/llvmpipe/lp_screen.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c
b/src/gallium/drivers/llvmpipe/lp_screen.c
index c6c18ee..6aef5c9
Signed-off-by: Tobias Klausmann
---
src/mesa/state_tracker/st_extensions.c | 1 +
src/mesa/state_tracker/st_program.c| 40 ++
2 files changed, 41 insertions(+)
diff --git a/src/mesa/state_tracker/st_extensions.c
This lets us safely enable or disable the extension as needed
Signed-off-by: Tobias Klausmann
Reviewed-by: Edward O'Callaghan
---
src/gallium/docs/source/screen.rst | 2 ++
From: Dave Airlie
This just renames the file in anticipation of adding cull lowering.
Signed-off-by: Tobias Klausmann
Signed-off-by: Dave Airlie
Reviewed-by: Edward O'Callaghan
---
This enables ARB_cull_distance.
Signed-off-by: Tobias Klausmann
---
docs/GL3.txt| 2 +-
docs/relnotes/11.3.0.html | 1 +
src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 2 ++
airlied:
v2: rename LowerClipDistance to LowerCombinedClipCullDistance.
I don't think we want any other behaviour with any current hw.
Signed-off-by: Tobias Klausmann
Reviewed-by: Edward O'Callaghan
---
This will come in handy when we want to lower gl_CullDistance into
gl_CullDistanceMESA.
[airlied: drop separate APIs for clip/cull - just use single API
to call both passes.]
Signed-off-by: Tobias Klausmann
---
src/compiler/glsl/ir_optimization.h | 3 +-
Signed-off-by: Tobias Klausmann
---
src/compiler/glsl/ast_to_hir.cpp | 14
src/compiler/glsl/builtin_variables.cpp | 11 ++-
src/compiler/glsl/glcpp/glcpp-parse.y| 3 +
src/compiler/glsl/glsl_parser_extras.cpp | 1 +
Signed-off-by: Tobias Klausmann
Reviewed-by: Edward O'Callaghan
---
src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 3 +++
1 file changed, 3 insertions(+)
diff --git
Signed-off-by: Tobias Klausmann
Reviewed-by: Edward O'Callaghan
---
src/mapi/glapi/gen/gl_API.xml | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/mapi/glapi/gen/gl_API.xml
After the cleanup of my patches in v2, this is another take on finishing this
extension.
v2: cleanup, reordering of patches, split lowering pass adaptation (Dave Airlie)
v3:
- drop wrong code section for array size check (suggested by Timothy Arceri) and
with it the now useless helper to see
Signed-off-by: Tobias Klausmann
Reviewed-by: Edward O'Callaghan
---
src/mesa/program/prog_print.c | 4
1 file changed, 4 insertions(+)
diff --git a/src/mesa/program/prog_print.c b/src/mesa/program/prog_print.c
index
On Sun, May 8, 2016 at 4:09 PM, Rob Clark wrote:
> On Sun, May 8, 2016 at 3:10 PM, Jason Ekstrand wrote:
>> Before you push this, I'd like to register my official skepticism as to
>> whether or not this is the right fix. Given that the end block is
On Sun, May 8, 2016 at 3:10 PM, Jason Ekstrand wrote:
> Before you push this, I'd like to register my official skepticism as to
> whether or not this is the right fix. Given that the end block is special,
> anyone who depends on iterating over it should know that and handle
Before you push this, I'd like to register my official skepticism as to
whether or not this is the right fix. Given that the end block is special,
anyone who depends on iterating over it should know that and handle it
specially anyway. I'm pretty sure that block numbering is the only thing
that
On Sun, May 08, 2016 at 12:15:00PM +0100, Daniel Stone wrote:
> Hi,
> I'd already suggested the same, but it never got pushed:
> https://lists.freedesktop.org/archives/mesa-dev/2016-May/115501.html
>
> So I guess we can add the Tested-by from the other, for whichever gets
> pushed.
>
Was there
For the history it'd probably be nicer to squash patch 15 in the
appropriate place. Either way, patches 5 - 15 are
Reviewed-by: Nicolai Hähnle
On 08.05.2016 07:10, Marek Olšák wrote:
From: Marek Olšák
Same algorithm, just applied to T2L.
(and
LGTM:
Reviewed-by: Antia Puentes
On vie, 2016-05-06 at 10:22 +1000, Dave Airlie wrote:
> From: Dave Airlie
>
> ARRAY_SIZE and LOCATION should accept the SUBROUTINE_UNIFORM types.
>
> Fixes:
> GL43-CTS.program_interface_query.subroutines-vertex
>
Hi Rob,
On jue, 2016-05-05 at 14:40 -0400, Rob Clark wrote:
> From: Rob Clark
>
> With the switch to new block iterator macro, we silently stopped
> iterating over the end-block. Which caused nir_index_blocks() to not
> index the end-block. Resulting in funny looking
On 04.05.2016 18:43, Marek Olšák wrote:
From: Marek Olšák
---
src/gallium/drivers/radeon/r600_texture.c | 46 ++-
1 file changed, 39 insertions(+), 7 deletions(-)
diff --git a/src/gallium/drivers/radeon/r600_texture.c
On 04.05.2016 18:43, Marek Olšák wrote:
From: Marek Olšák
this is more robust and probably fixes some bugs already
---
src/gallium/drivers/r600/evergreen_state.c| 10 ++---
src/gallium/drivers/r600/r600_state.c | 5 ++-
Reviewed-by: Marek Olšák
BTW, Matt suggested that instead of declaring a local LDS array, the
shader should receive an LDS pointer via function parameters of "main"
(or something along those lines). This would avoid the need to specify
the array size at compile time.
Marek
The series is
Reviewed-by: Bas Nieuwenhuizen
On Sun, May 8, 2016 at 2:21 PM, Marek Olšák wrote:
> From: Marek Olšák
>
> ---
> src/gallium/drivers/radeon/radeon_winsys.h| 1 +
> src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
On Sat, May 7, 2016 at 1:06 PM, Rob Clark wrote:
> From: Rob Clark
>
> It was kinda sad that we couldn't optimize imul/idiv by power-of-two.
> So I bashed my head against python for a while and this is what I came
> up with. In the search
From: Roland Scheidegger
We don't target this yet, and some llvm versions incorrectly enable it based
on cpu string, causing crashes.
(Albeit this is a losing battle, it is pretty much guaranteed when the next
new feature comes along llvm will mistakenly enable it on some
Reviewed-by: Marek Olšák
Marek
On Sun, May 8, 2016 at 12:06 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> This is useful for shader-related counters, since they tend to quickly
> exceed 32 bits.
> ---
>
Reviewed-by: Marek Olšák
Marek
On Sun, May 8, 2016 at 12:07 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> Experiments with framebuffer-no-attachments type draw calls have shown that
> NULL exports stall terribly unless
From: Marek Olšák
---
src/gallium/drivers/radeon/radeon_winsys.h| 1 +
src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 16
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 2 +-
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.h | 1 -
4 files changed, 10
From: Marek Olšák
---
src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 1 +
src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index
From: Marek Olšák
---
src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 20 ++--
src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2 +-
src/gallium/winsys/radeon/drm/radeon_drm_winsys.h | 3 ---
3 files changed, 11 insertions(+), 14 deletions(-)
diff
From: Marek Olšák
---
src/gallium/drivers/r300/r300_query.c | 4 +++-
src/gallium/drivers/radeon/r600_pipe_common.c | 10 +++---
src/gallium/drivers/radeon/r600_query.c | 3 ++-
src/gallium/drivers/radeonsi/si_debug.c | 2 +-
From: Marek Olšák
Same algorithm, just applied to T2L.
(and using a 0-based address and surface.bo_size instead of buf->size)
---
src/gallium/drivers/radeonsi/cik_sdma.c | 108
1 file changed, 55 insertions(+), 53 deletions(-)
diff --git
Hi,
I'd already suggested the same, but it never got pushed:
https://lists.freedesktop.org/archives/mesa-dev/2016-May/115501.html
So I guess we can add the Tested-by from the other, for whichever gets
pushed.
Cheers,
Daniel
On Sunday, 8 May 2016, Hans de Goede wrote:
>
The calculated limit gave problems on SI as it was > 32 KiB
and the hardware LDS size on SI is only 32 KiB. It isn't
correct anyway when processing multiple patches in a threadgroup.
As we potentially have any number of patches such that the
used LDS is at most the hardware LDS size, and exact
Make pipe_loader_sw_probe_kms take ownership of the passed in fd,
like pipe_loader_drm_probe_fd does.
The only caller is dri_kms_init_screen which passes in a dupped fd,
just like dri2_init_screen passes in a dupped fd to
pipe_loader_drm_probe_fd.
Signed-off-by: Hans de Goede
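
The ownership convention described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the Mesa code: every name starting with sketch_ is hypothetical, and the "supported" flag stands in for whatever the real probe checks. The point it shows is that a callee which takes ownership of an fd either keeps it or closes it, so the caller passes a dupped fd and never closes that duplicate itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical device record; not a Mesa type. */
struct sketch_device {
    int fd; /* owned by the device once probing succeeds, -1 otherwise */
};

/*
 * Hypothetical probe that takes ownership of fd. On success the fd is
 * stored in dev; on failure it is closed here. Either way the caller
 * must not close it afterwards.
 */
static bool sketch_probe_take_fd(struct sketch_device *dev, int fd,
                                 bool supported)
{
    assert(dev != NULL);
    if (fd < 0)
        return false;
    if (!supported) {
        close(fd);
        return false;
    }
    dev->fd = fd;
    return true;
}

/* Releases the fd that the device owns, if any. */
static void sketch_device_release(struct sketch_device *dev)
{
    if (dev->fd >= 0) {
        close(dev->fd);
        dev->fd = -1;
    }
}

/* Caller side: hand over a duplicate and keep using the original fd. */
static bool sketch_init_screen(struct sketch_device *dev, int screen_fd,
                               bool supported)
{
    int dupped = dup(screen_fd);
    if (dupped < 0)
        return false;
    return sketch_probe_take_fd(dev, dupped, supported);
}
```

With this convention the failure path needs no cleanup in the caller, which is the usual reason for making both probe entry points behave the same way.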
My old implementation accumulated pairs in a buffer,
and eventually processed that data on the CPU. This meant flushing
the batchbuffer and waiting for it to completely execute before we
could map it, resulting in really long stalls. We could also run out
of space in the buffer, and
From: Nicolai Hähnle
---
src/compiler/glsl/list.h | 17 +
src/compiler/glsl/lower_jumps.cpp | 7 +--
2 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/src/compiler/glsl/list.h b/src/compiler/glsl/list.h
index f05d437..3ecd3e4
https://bugs.freedesktop.org/show_bug.cgi?id=93551
--- Comment #23 from Karol Herbst ---
(In reply to Jamey Sharp from comment #17)
> (In reply to Ernst Sjöstrand from comment #16)
> > Yes I know because when I merged in
> >
https://bugs.freedesktop.org/show_bug.cgi?id=93551
--- Comment #25 from kilobug ---
Perhaps it's due to the LLVM version, for Intel vs radeonsi? I tried with llvm
3.8, and some of the OpenGL 4.2 extensions require llvm 3.9... but I've to admit it
scares me a bit to recompile
https://bugs.freedesktop.org/show_bug.cgi?id=93551
--- Comment #24 from Jamey Sharp ---
(In reply to Karol Herbst from comment #23)
> Anyhow, their engine requires ARB_shading_language_include, GL4.2 and
> 595d56cc866638f371626cc1d0137a6a54a7d0f8
I'm still super confused
From: Nicolai Hähnle
We request more than 32KB of LDS here, which SI doesn't have. Since LLVM
recently started checking the size of declared LDS allocations, all shaders
involved in tessellation fail to compile on SI.
Note that the entire calculation here seems wrong,
From: Nicolai Hähnle
The old iteration casts sentinel nodes (which are mere exec_nodes) into
whatever type we're looping over, which leads to badness (in fact, gcc's
undefined behaviour sanitizer crashes while trying to verify that we have
the correct type at hand).
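
A minimal sketch of the idea, in C and under simplifying assumptions: the list below uses one circular sentinel rather than the head and tail sentinels of exec_list, and all sketch_ names are hypothetical. The iteration checks whether a node is the sentinel before converting it to the item type, so the sentinel is never treated as an item.

```c
#include <assert.h>
#include <stddef.h>

/* Bare list node; the sentinel is one of these and nothing more. */
struct sketch_node {
    struct sketch_node *next, *prev;
};

/* Circular list with a single sentinel (a simplification). */
struct sketch_list {
    struct sketch_node sentinel;
};

/* An item embeds the node as its first member. */
struct sketch_item {
    struct sketch_node link;
    int value;
};

static void sketch_list_init(struct sketch_list *l)
{
    l->sentinel.next = &l->sentinel;
    l->sentinel.prev = &l->sentinel;
}

static void sketch_list_push_tail(struct sketch_list *l,
                                  struct sketch_item *it)
{
    struct sketch_node *n = &it->link;
    n->prev = l->sentinel.prev;
    n->next = &l->sentinel;
    l->sentinel.prev->next = n;
    l->sentinel.prev = n;
}

/* Converts a node to an item only after ruling out the sentinel. */
static struct sketch_item *sketch_item_or_null(const struct sketch_list *l,
                                               struct sketch_node *n)
{
    return n == &l->sentinel ? NULL : (struct sketch_item *)n;
}

/* Iteration macro whose loop variable is never a cast sentinel. */
#define sketch_foreach(it, list)                                          \
    for (struct sketch_item *it =                                         \
             sketch_item_or_null((list), (list)->sentinel.next);          \
         it != NULL;                                                      \
         it = sketch_item_or_null((list), it->link.next))

static int sketch_sum(const struct sketch_list *l)
{
    int sum = 0;
    sketch_foreach(it, l)
        sum += it->value;
    return sum;
}
```

The cost is one extra comparison per step; in exchange the loop variable always points at a real item or is NULL.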
This reverts commit 6a0d036483caf87d43ebe2edd1905873446c9589.
This commit breaks both gdm on wayland as well as gnome-shell on Xorg
(with the modesetting driver) on my skylake desktop with 2 hdmi attached
1920x1080 dvi-monitors.
The symptoms in both cases are both monitors showing whatever was
Signed-off-by: Jeremy Huddleston Sequoia
CC: Nicolai Hähnle
CC: Matt Turner
CC: Ian Romanick
---
src/mesa/main/shaderapi.c | 26 +-
1 file changed, 25 insertions(+), 1 deletion(-)
diff
https://bugs.freedesktop.org/show_bug.cgi?id=95215
--- Comment #11 from Karol Herbst ---
(In reply to Kenneth Graunke from comment #8)
> FWIW, I would prefer not to implement ARB_shading_language_include.
well the game requires it, because even GL vendor spoofing to
Hi,
this is a re-send of two patches that didn't get anybody's attention, when I
sent them out last week, plus some additional fixes for rarer instances of
the same problem that I've encountered since then.
The problem that these patches fix is simple: the exec_list iterations often
cast
https://bugs.freedesktop.org/show_bug.cgi?id=95215
--- Comment #10 from Karol Herbst ---
see this comment:
https://bugs.freedesktop.org/show_bug.cgi?id=93551#c22
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for
From: Nicolai Hähnle
---
src/compiler/glsl/ast_function.cpp | 4 ++--
src/compiler/glsl/link_uniform_initializers.cpp | 8 +++-
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/src/compiler/glsl/ast_function.cpp
From: Nicolai Hähnle
Experiments with framebuffer-no-attachments type draw calls have shown that
NULL exports stall terribly unless we ensure that export memory is allocated
by the SPI.
---
src/gallium/drivers/radeonsi/si_state_shaders.c | 15 ++-
1 file
https://bugs.freedesktop.org/show_bug.cgi?id=93551
--- Comment #22 from Karol Herbst ---
this is due to the game requiring a 4.2 context. They asked for it and then it
crashes because they don't error check.
Then the game crashes because there is no
fs_visitor::emit_urb_writes skips writing the VUE header for shaders
that don't write gl_PointSize, gl_Layer, or gl_ViewportIndex. This
leaves their values uninitialized. Kristian's nearby comment says:
"But often none of the special varyings that live there are written
and in that case we can
From: Roland Scheidegger
At least with MCJIT the disassembler will crash otherwise when trying to
disassemble such functions.
---
src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
From: Nicolai Hähnle
This macro avoids undefined behaviour that crashes gcc's ubsan.
---
src/compiler/glsl/list.h | 13 +
src/compiler/glsl/opt_dead_code_local.cpp | 7 +--
src/compiler/glsl/opt_tree_grafting.cpp | 5 +
3 files
The SAMPLEMASK semantic should only return the bits set covered by the
current invocation. However we were always retrieving the covmask, which
returns the covered samples of the whole pixel.
When not doing per-sample invocation, this is precisely what we want.
However when doing per-sample
Looks good to me, just two remarks below...
On 06.05.2016 13:31, Marek Olšák wrote:
From: Marek Olšák
Ported from the initial amdgpu winsys from the private AMD branch.
The thread creates the buffer list, submits IBs, and cleans up
the submission context, which can also
Signed-off-by: Ilia Mirkin
---
.../drivers/nouveau/codegen/nv50_ir_driver.h | 6 ++--
.../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 7 ++--
.../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 7 ++--
.../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 5
From: Nicolai Hähnle
This is useful for shader-related counters, since they tend to quickly
exceed 32 bits.
---
src/gallium/drivers/radeon/r600_perfcounter.c | 22 +++---
src/gallium/drivers/radeonsi/si_perfcounter.c | 13 -
2 files changed,
From: Rob Clark
It was kinda sad that we couldn't optimize imul/idiv by power-of-two.
So I bashed my head against python for a while and this is what I came
up with. In the search expression, you can use "#a^2" to only match
constants which are a power of two. The
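
For readers unfamiliar with the optimization, here is a rough C sketch of what a power-of-two match enables. It is not the nir algebraic pass or its Python search syntax; the sketch_ helpers are hypothetical and only the unsigned multiply case is shown, since signed division by a power of two needs extra care with rounding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when v has exactly one bit set. */
static bool sketch_is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Shift amount for a power-of-two constant. */
static unsigned sketch_log2_pow2(uint32_t v)
{
    unsigned shift = 0;
    assert(sketch_is_pow2(v));
    while (v > 1) {
        v >>= 1;
        shift++;
    }
    return shift;
}

/* Multiply by a constant, using a shift when the constant allows it. */
static uint32_t sketch_imul_const(uint32_t a, uint32_t c)
{
    if (sketch_is_pow2(c))
        return a << sketch_log2_pow2(c);
    return a * c;
}
```

Both branches return the same value for every input; the shift is simply the cheaper instruction when the constant qualifies.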
From: Nicolai Hähnle
This macro avoids undefined downcasting of list sentinels that crashes gcc's
ubsan.
---
src/compiler/glsl/list.h| 8
src/compiler/glsl/opt_tree_grafting.cpp | 5 +
2 files changed, 9 insertions(+), 4 deletions(-)
diff