Dave Airlie writes:
> This is hopefully the final posting for this series, I've gotten
> the lowering pass to look like I wanted, which is to say it lowers
> to vec4[2].
>
> TGSI then uses the CLIPDIST semantic and the two properties to
> work out what is what. This means the
Kenneth Graunke writes:
> As far as I can tell, this was just entirely missing...honestly, I'm
> not sure how anything worked at all.
>
> Caught by noticing GPU hangs in image load store tests with scalar TCS,
> but probably has broader implications.
Yeah, presumably not
Kenneth Graunke writes:
> fs_visitor::emit_urb_writes skips writing the VUE header for shaders
> that don't write gl_PointSize, gl_Layer, or gl_ViewportIndex. This
> leaves their values uninitialized. Kristian's nearby comment says:
>
> "But often none of the special
"Pohjolainen, Topi" <topi.pohjolai...@intel.com> writes:
> On Thu, May 05, 2016 at 05:04:02PM -0700, Kristian Høgsberg wrote:
>> From: Kristian Høgsberg Kristensen <kristian.h.kristen...@intel.com>
>>
>> ---
>> src/mesa/drivers/dri/i965/intel_mi
This needs to be able to find the generated nir_opcodes.h header.
---
src/compiler/Makefile.am | 5 +
1 file changed, 5 insertions(+)
diff --git a/src/compiler/Makefile.am b/src/compiler/Makefile.am
index fe96cb3..8f37448 100644
--- a/src/compiler/Makefile.am
+++ b/src/compiler/Makefile.am
\
...
which looks correct for all SKUs, except GT1 and GT4, which both
override it to the correct value. As for GT4 urb size, it won't matter
for a while, but 1088 / 3 is the safe choice.
With the .urb.size assignments removed, this patch is
Reviewed-by: Kristian Høgsberg Kristensen <kr
Matt Turner <matts...@gmail.com> writes:
> On Fri, Jan 8, 2016 at 2:36 PM, Kristian Høgsberg <k...@bitplanet.net> wrote:
>> From: Kristian Høgsberg Kristensen <k...@owl.jf.intel.com>
>>
>> These are used by code that doesn't necessarily link to libgl
Ian Romanick <i...@freedesktop.org> writes:
> On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
>> ---
>> src/glsl/builtin_variables.cpp | 5 +
>> src/glsl/glsl_parser_extras.cpp | 1 +
>> src/glsl/glsl_parser_extras.h
Ian Romanick <i...@freedesktop.org> writes:
> On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
>> fs_visitor::emit_vs_system_value() looks like it's trying to handle
>> SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the
>> backend.
>>
next patch. I'll send out
a v2 with the rebasing fixed.
Kristian
> On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
>> We already have gl_BaseVertexARB in the .x component of the SGVS vec4
>> and plug gl_BaseInstanceARB into the last free component (.y).
>> ---
>>
Ian Romanick <i...@freedesktop.org> writes:
> On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
>> We can inspect VS prog_data for iterations i > 0, and only flag
>> BRW_NEW_VERTICES when one of our system values changes.
>>
>> This change also flags
Ian Romanick <i...@freedesktop.org> writes:
> On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
>> We have to break open a new vec4 for gl_DrawIDARB. We've used up all
>> space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its
>> own se
---
src/glsl/builtin_variables.cpp | 5 +
src/glsl/glsl_parser_extras.cpp | 1 +
src/glsl/glsl_parser_extras.h | 2 ++
src/glsl/nir/nir.c | 8
src/glsl/nir/nir_intrinsics.h | 2 ++
src/glsl/nir/shader_enums.h | 20
We have to break open a new vec4 for gl_DrawIDARB. We've used up all
space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its
own separate vertex buffer anyway. This is because we point the vb for
base vertex and base instance into the draw parameter BO for indirect
draw calls, but
We can inspect VS prog_data for iterations i > 0, and only flag
BRW_NEW_VERTICES when one of our system values changes.
This change also flags BRW_NEW_VERTICES in one case we were missing
before: if we're doing an indirect draw, prims[i].basevertex is always 0
and the real base vertex value is in
We already have gl_BaseVertexARB in the .x component of the SGVS vec4
and plug gl_BaseInstanceARB into the last free component (.y).
---
src/mesa/drivers/dri/i965/brw_compiler.h | 2 ++
src/mesa/drivers/dri/i965/brw_context.h | 9 --
src/mesa/drivers/dri/i965/brw_draw.c
fs_visitor::emit_vs_system_value() looks like it's trying to handle
SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the
backend.
---
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
Hi,
Here are 7 patches to implement GL_ARB_shader_draw_parameters:
https://www.opengl.org/registry/specs/ARB/shader_draw_parameters.txt
and I have a few new piglit tests for the extension as well.
Kristian
Kristian Høgsberg Kristensen (7):
mesa/vbo: Add draw_id field to struct _mesa_prim
The drivers will need this for passing in gl_DrawIDARB. For indirect
multidraw calls, we get the prim array with prim[i].draw_id == i, so the
field is redundant. But for non-indirect calls, we get one primitive at a
time and need the draw_id field.
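The two cases described above can be sketched roughly as follows. This is only an illustration: struct prim_sketch, fill_multidraw_draw_ids() and make_single_prim() are hypothetical names, and only the draw_id field itself comes from the patch description; the real struct _mesa_prim has other members not shown here.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for struct _mesa_prim.
 * Only draw_id is taken from the patch description. */
struct prim_sketch {
   unsigned start;
   unsigned count;
   unsigned draw_id;   /* value a driver would expose as gl_DrawIDARB */
};

/* Multidraw case: the whole prim array is available at once, so
 * draw_id simply mirrors the array index. */
static void fill_multidraw_draw_ids(struct prim_sketch *prims, unsigned n)
{
   for (unsigned i = 0; i < n; i++)
      prims[i].draw_id = i;
}

/* One-primitive-at-a-time case: the index within the multidraw is not
 * recoverable from array position, so it has to travel in the struct. */
static struct prim_sketch make_single_prim(unsigned start, unsigned count,
                                           unsigned draw_index)
{
   struct prim_sketch p;

   assert(count > 0);
   p.start = start;
   p.count = count;
   p.draw_id = draw_index;
   return p;
}
```

In the first helper the field carries no extra information; in the second it is the only place the draw index can live, which is the motivation the commit message gives.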
---
src/mesa/vbo/vbo.h| 1 +
This optimizes a + b - b to just a. Modest shader-db results (BDW):
total instructions in shared programs: 7842452 -> 7841862 (-0.01%)
instructions in affected programs: 61938 -> 61348 (-0.95%)
total loops in shared programs:2131 -> 2131 (0.00%)
helped:
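As a rough illustration of the a + b - b rewrite mentioned above, here is a minimal sketch on a toy expression tree. The struct expr type and both helpers are hypothetical and are not Mesa's IR or its algebraic optimization machinery; the sketch also ignores floating-point exactness concerns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature expression IR, for illustration only. */
enum op { OP_VAR, OP_ADD, OP_SUB };

struct expr {
   enum op op;
   int var_id;                /* meaningful for OP_VAR only */
   const struct expr *lhs;    /* meaningful for OP_ADD / OP_SUB */
   const struct expr *rhs;
};

/* Structural equality of two expression trees. */
static int expr_equal(const struct expr *x, const struct expr *y)
{
   if (x->op != y->op)
      return 0;
   if (x->op == OP_VAR)
      return x->var_id == y->var_id;
   return expr_equal(x->lhs, y->lhs) && expr_equal(x->rhs, y->rhs);
}

/* Rewrite (a + b) - b to a, and (b + a) - b to a, since addition
 * commutes. Any other expression is returned unchanged. */
static const struct expr *simplify_add_sub(const struct expr *e)
{
   if (e->op != OP_SUB || e->lhs->op != OP_ADD)
      return e;

   const struct expr *add = e->lhs;
   if (expr_equal(add->rhs, e->rhs))
      return add->lhs;
   if (expr_equal(add->lhs, e->rhs))
      return add->rhs;
   return e;
}
```

Building the tree for a + b - b and passing it to simplify_add_sub() returns the node for a; an unrelated subtraction such as a - b comes back untouched.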
This is a helper function for setting up the local invocation ID
payload according to the cs_prog_data generated by the compiler. It's
intended to be available to users of libi965_compiler so move it there.
---
src/mesa/drivers/dri/i965/brw_compiler.h | 7 +++
We always pass in shader->ir and we already pass in the shader, so just
drop the exec_list. Most passes either take just an exec_list or a
shader, so this seems more consistent.
Reviewed-by: Timothy Arceri <timothy.arc...@collabora.com>
Signed-off-by: Kristian Høgsberg Kris
, the pass now failed to eliminate some cases of dead code.
This v2 series now has no shader-db impact and no jenkins regressions.
Kristian Høgsberg Kristensen (3):
glsl: Drop exec_list argument to lower_ubo_reference
glsl: Lower UBO and SSBO access in glsl linker
glsl: Use array deref
ars in the IR as
an array deref. This lets us run lowering passes that lower the vector
access to I/O (eg for SSBO load/store) before we lower the per-component
access to full vector writes.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/glsl/Makefile.sources
All GLSL IR consumers run this lowering pass so we can move it to the
linker. This moves the pass up quite a bit, but that's the point: it
needs to run before we throw away information about per-component vector
access.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
--
We always pass in shader->ir and we already pass in the shader, so just
drop the exec_list. Most passes either take just an exec_list or a
shader, so this seems more consistent.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/glsl/ir_optimization.h
is racy
and we have to handle this differently in case the vector is in globally
visible storage.
Kristian Høgsberg Kristensen (3):
glsl: Drop exec_list argument to lower_ubo_reference
glsl: Lower UBO and SSBO access in glsl linker
glsl: Use array deref for access to vector components
src/glsl
The scalar destination registers break copy propagation. Instead compute
the results to a regular register and then reference a component when we
later use the result as a source.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_fs_builder
The emit_untyped_read and emit_untyped_write helpers already uniformize
the surface index argument. No need to do it before calling them.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 --
1 file changed, 2 deletions(-)
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 15 +++
1 file changed, 15 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
b/src/mesa/drivers/dr
Now that we don't read each component one-by-one, we don't need the
temporary vgrf for the offset. More importantly, this register was type
UD while the nir source was type D. This broke copy propagation and left
a redundant MOV in the generated code.
Signed-off-by: Kristian Høgsberg Kristensen
The destination for SHADER_OPCODE_FIND_LIVE_CHANNEL is always a UD
register. When we replace the opcode with a MOV, make sure we use a UD
immediate 0 so copy propagation doesn't bail because of non-matching
types.
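The type-matching concern above can be sketched with a toy model. Everything here is hypothetical (the enum, the structs, and both functions); a real copy propagation pass checks many more conditions than the single type comparison shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register types: unsigned and signed 32-bit integers. */
enum reg_type { TYPE_UD, TYPE_D };

struct imm_src {
   enum reg_type type;
   unsigned bits;
};

struct mov_inst {
   enum reg_type dst_type;
   struct imm_src src;
};

/* Toy stand-in for the check copy propagation performs: it gives up
 * when the MOV's source and destination types differ. */
static bool can_copy_propagate(const struct mov_inst *mov)
{
   return mov->dst_type == mov->src.type;
}

/* Replace a find-live-channel style opcode with MOV dst:UD, 0:UD.
 * Using a UD immediate keeps the types matched with the UD destination. */
static struct mov_inst lower_to_mov_zero(void)
{
   struct mov_inst mov = { TYPE_UD, { TYPE_UD, 0u } };
   return mov;
}
```

With a D-typed immediate 0 in the same MOV, can_copy_propagate() would return false even though the bit pattern is identical, which is the situation the patch avoids.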
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
sr
Instead of looping through single-component reads, read all components
in one go.
Reviewed-by: Iago Toral Quiroga <ito...@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.jus...@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mes
of
the payload and can accept misc strides and modifiers.
I also took a look at SSBO stores and made them write out contiguous
channels in the writemask together; in particular, the common case of
writing a vec4 goes from 4 write instructions to 1.
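Grouping contiguous writemask channels can be sketched as below. This is an assumption-laden illustration, not the code from the series: next_run() and count_writes() are hypothetical helpers operating on a plain 4-bit mask, with each run standing in for one write instruction.

```c
#include <assert.h>

/* Find the next run of contiguous enabled channels in a 4-bit
 * writemask, scanning from *pos. Returns the run length (0 if no
 * channels remain), stores the first channel of the run in *first,
 * and advances *pos past the run. */
static unsigned next_run(unsigned writemask, unsigned *pos, unsigned *first)
{
   unsigned i = *pos;
   unsigned len = 0;

   assert(i <= 4);

   /* Skip disabled channels. */
   while (i < 4 && !(writemask & (1u << i)))
      i++;
   if (i == 4) {
      *pos = 4;
      return 0;
   }

   *first = i;
   /* Extend the run over consecutive enabled channels. */
   while (i < 4 && (writemask & (1u << i))) {
      i++;
      len++;
   }
   *pos = i;
   return len;
}

/* Number of writes needed when each contiguous run becomes one write. */
static unsigned count_writes(unsigned writemask)
{
   unsigned pos = 0, first = 0, n = 0;

   while (next_run(writemask, &pos, &first) > 0)
      n++;
   return n;
}
```

A full vec4 mask (0xf) is a single run, so it needs one write, while a mask such as 0x5 (x and z only) still needs two.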
Kristian Høgsberg Kristensen (8):
i965: Don't use
dan.l.jus...@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_eu_emit.c | 3 +--
src/mesa/drivers/dri/i965/brw_fs.cpp| 2 +-
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_eu_
Write groups of enabled components together.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 56 +++-
1 file changed, 26 insertions(+), 30 deletions(-)
diff --git a/src/mesa/drivers/dr
;8,8,1>D { align1 1Q };
send(8) g124<1>UD g9<8,8,1>UD
dp data 1 ( untyped surface read, Surface = 0,
SIMD8, Mask = 0x0) mlen 1 rlen 4 { align1 1Q };
send(8) g7<1>UD g8<8,8,1>UD
-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_fs_builder.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h
b/src/mesa/drivers/dri/i965/brw_fs_builder.h
index df10a9d..98ce71e 100644
--- a/sr
We always set the mask to 0x, which is what it defaults to when no
header is present. Let's drop the header instead.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_eu_emit.c | 3 +--
src/mesa/drivers/dri/i965/brw_fs.cpp| 4 ++--
2
Instead of looping through single-component reads, read all components
in one go.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 25 +++--
1 file changed, 7 insertions(+), 18 deletions(-)
diff --git a/sr
the compiler.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/Makefile.am | 8 ++--
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/Makefile.am
b/src/mesa/drivers/dri/i965/Makefile.am
index 82e58a6..2
We need the debug flag parsing and INTEL_DEBUG in the compiler, but we
don't want the dependency on bufmgr (libdrm_intel) in there. Move to
intel_screen.c.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/intel_debug.c | 14 +-
sr
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 65c3628..b79b4a4 100644
---
We want to use the rest of brw_shader.cpp with the rest of the compiler
without pulling in the GLSL linking code.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 +
src/mesa/drivers/dri/i965/brw_link.cpp
This introduces a new libtool helper library, libi965_compiler.la. This
library is moderately self-contained, but still needs to link to all of
libmesa.la among other things.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/Makefile.am
We call this from the compiler so move it to brw_shader.cpp.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_shader.cpp | 26 ++
src/mesa/drivers/dri/i965/brw_vs.c | 25 -
2 files chang
to the compiler code that uses the
payload layout and makes it available to other users of the compiler.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_context.h | 1 +
src/mesa/drivers/dri/i965/brw_cs.h| 5 +-
src/mesa/drive
We want to use intel_debug.c in code that doesn't link to dri common.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/intel_debug.c | 5 ++--
src/util/Makefile.sources | 2 ++
src/util/debug.c
brw_program.c won't be part of the compiler library, but we need
brw_mark_surface_used() in the compiler. Move to brw_shader.cpp.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_program.c | 10 --
src/mesa/drivers/dr
We move these calls one level up into the codegen functions.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_cs.c| 3 +++
src/mesa/drivers/dri/i965/brw_fs.cpp | 13 -
src/mesa/drivers/dri/i965/br
of brw_context references in there, INTEL_DEBUG is defined in the
compiler library and other oddities. However, the split lets us link
the compiler unit tests to just libi965_compiler.la and drop a few
dependencies there and of course, lets us use the compiler in other
projects more easily.
Kristian Høgsberg
function instead.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_context.h | 7 ++-
src/mesa/drivers/dri/i965/brw_program.c | 12
2 files changed, 6 insertions(+), 13 deletions(-)
diff --git a/src/mesa/drivers/dr
brw_get_shader_time_index() is all tangled up in brw_context state and
we can't call it from the compiler. Thanks to Jason's recent
refactoring, we can just get the index and pass to the emit functions
instead.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/d
/show_bug.cgi?id=91970
Cc: "11.0" <mesa-sta...@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.c
dependencies out of the compiler.
Kristian Høgsberg Kristensen (3):
i965: Move compute shader code around
i965: Move brw_fs_precompile() to brw_wm.c
i965: Move perf_debug code to brw_codegen_*_prog()
src/mesa/drivers/dri/i965/Makefile.sources | 3 +-
src/mesa/drivers/dri/i965/brw_cs.c
All other precompile functions live in the brw_.c files, make fs
follow the convention.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 58 ---
src/mesa/drivers/dri/i965/brw_wm.c
We're trying to avoid a libdrm dependency in the core compiler, so let's
move the perf_debug code one level up from the brw_*_emit() helpers to
the brw_codegen_*_prog() helpers.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drivers/dri/i965/brw_cs.c
this convention, we move the brw_cs_emit()
function into brw_fs.cpp. We can then rename brw_cs.cpp to brw_cs.c and
do this in C like the other similar files. Finally, move state setup
and atoms to gen7_cs_state.c.
Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net>
---
src/mesa/drive