Cool. 1-3 are
Reviewed-by: Jason Ekstrand
On Fri, Nov 9, 2018 at 7:09 AM Alejandro Piñeiro
wrote:
> The offset compute was working fine for the case of attrib_slots=1,
> and updating the offset for the following varying.
>
> But in the case of attrib_slots=2 (so dvec3/4), we a
On November 9, 2018 06:39:25 Alejandro Piñeiro wrote:
On 08/11/18 23:14, Jason Ekstrand wrote:
On Thu, Nov 8, 2018 at 7:22 AM Alejandro Piñeiro wrote:
On OpenGL, an array of a simple type adds just one varying. So
gl_transform_feedback_varying_info struct defined at mtypes.h includes
Suffixes are dropped from a bunch of conversion opcodes when it makes
sense to do so. Others are kept if we really do want the bit-size
restriction.
---
src/compiler/nir/nir_opt_algebraic.py | 58 +--
1 file changed, 29 insertions(+), 29 deletions(-)
diff --git
---
src/amd/common/ac_nir_to_llvm.c | 12
src/compiler/glsl/glsl_to_nir.cpp | 2 +-
src/compiler/nir/nir_builder.h| 12
src/compiler/spirv/vtn_glsl450.c | 4 ++--
.../drivers/freedreno/ir3/ir3_compiler_nir.c | 11
Instead of using an OrderedDict, just have a (necessarily sorted) array
of transforms and a set of opcodes.
---
src/compiler/nir/nir_algebraic.py | 21 +++--
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/src/compiler/nir/nir_algebraic.py
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one of 8, 16, 32, or 64. This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.
---
src/compiler/nir/nir.h | 4 ++--
All conversion opcodes require a destination size but this makes
constructing certain algebraic expressions rather cumbersome. This
commit adds support to nir_search and nir_algebraic for writing
conversion opcodes without a size. These meta-opcodes match any
conversion of that type regardless
Many of the x2b optimizations in nir_opt_algebraic can be handled by the
generic untyped conversion opcodes we just added. However, there are a
few that still need an explicit size for some reason.
---
src/compiler/nir/nir_opt_algebraic.py | 14 +++---
1 file changed, 7 insertions(+), 7
---
src/compiler/nir/nir_opt_algebraic.py | 9 ++---
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/src/compiler/nir/nir_opt_algebraic.py
b/src/compiler/nir/nir_opt_algebraic.py
index 8b24daddfdc..cda0aaf17f5 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++
---
src/compiler/nir/nir_opt_algebraic.py | 112 +-
1 file changed, 56 insertions(+), 56 deletions(-)
diff --git a/src/compiler/nir/nir_opt_algebraic.py
b/src/compiler/nir/nir_opt_algebraic.py
index 75a34e4a673..aeed5a8e4da 100644
---
Many of the x2b optimizations in nir_opt_algebraic can be handled by the
generic untyped conversion opcodes we just added. However, there are a
few that still need an explicit size for some reason.
---
src/compiler/nir/nir_opt_algebraic.py | 14 +++---
1 file changed, 7 insertions(+), 7
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one of 8, 16, 32, or 64. This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.
---
src/compiler/nir/nir.h | 4 ++--
---
src/compiler/nir/nir_opt_algebraic.py | 112 +-
1 file changed, 56 insertions(+), 56 deletions(-)
diff --git a/src/compiler/nir/nir_opt_algebraic.py
b/src/compiler/nir/nir_opt_algebraic.py
index 75a34e4a673..aeed5a8e4da 100644
---
Suffixes are dropped from a bunch of conversion opcodes when it makes
sense to do so. Others are kept if we really do want the bit-size
restriction.
---
src/compiler/nir/nir_opt_algebraic.py | 58 +--
1 file changed, 29 insertions(+), 29 deletions(-)
diff --git
---
src/amd/common/ac_nir_to_llvm.c | 12
src/compiler/glsl/glsl_to_nir.cpp | 2 +-
src/compiler/nir/nir_builder.h| 12
src/compiler/spirv/vtn_glsl450.c | 4 ++--
.../drivers/freedreno/ir3/ir3_compiler_nir.c | 11
---
src/compiler/glsl/glsl_to_nir.cpp | 2 +-
src/compiler/nir/nir_builder.h | 12
src/compiler/nir/nir_lower_idiv.c | 2 +-
src/compiler/nir/nir_lower_int64.c | 2 +-
src/compiler/nir/nir_opcodes.py | 11 +++
src/compiler/spirv/vtn_glsl450.c| 4 ++--
---
src/compiler/nir/nir_opt_algebraic.py | 116 +-
1 file changed, 58 insertions(+), 58 deletions(-)
diff --git a/src/compiler/nir/nir_opt_algebraic.py
b/src/compiler/nir/nir_opt_algebraic.py
index 6ce65c4ad10..42dd1e2f980 100644
---
Instead of using an OrderedDict, just have a (necessarily sorted) array
of transforms and a set of opcodes.
---
src/compiler/nir/nir_algebraic.py | 21 +++--
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/src/compiler/nir/nir_algebraic.py
---
src/compiler/nir/nir.c| 67 ++
src/compiler/nir/nir_opcodes.py | 38 +++
src/compiler/nir/nir_opcodes_c.py | 79 ---
3 files changed, 83 insertions(+), 101 deletions(-)
diff --git a/src/compiler/nir/nir.c
---
src/compiler/nir/nir_opt_algebraic.py | 9 ++---
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/src/compiler/nir/nir_opt_algebraic.py
b/src/compiler/nir/nir_opt_algebraic.py
index 8b24daddfdc..cda0aaf17f5 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++
Unsized conversion opcodes require special handling in opt_algebraic
because they follow different bit size rules from regular opcodes. In
particular, we now have a new case where we have an opcode with multiple
variable-size inputs and outputs but no common size.
---
Both of these things are already handled in the Value base class so we
don't need to handle them explicitly in Constant.
---
src/compiler/nir/nir_algebraic.py | 4
1 file changed, 4 deletions(-)
diff --git a/src/compiler/nir/nir_algebraic.py
b/src/compiler/nir/nir_algebraic.py
index
---
src/amd/common/ac_nir_to_llvm.c | 32 ++--
src/gallium/auxiliary/nir/tgsi_to_nir.c | 8 +-
.../drivers/freedreno/ir3/ir3_compiler_nir.c | 148 --
src/gallium/drivers/vc4/vc4_program.c | 8 +-
src/intel/compiler/brw_fs_nir.cpp | 78
---
src/compiler/nir/nir.h| 7
src/compiler/nir/nir_opcodes.py | 61 +--
src/compiler/nir/nir_opcodes_c.py | 1 +
src/compiler/nir/nir_validate.c | 4 ++
4 files changed, 45 insertions(+), 28 deletions(-)
diff --git a/src/compiler/nir/nir.h
Some suffixes are straight-up dropped when it makes sense while others
are converted to the @bit-size form because we really do require an
exact size in order for the expression to be well-formed.
---
src/compiler/nir/nir_opt_algebraic.py | 79 +--
1 file changed, 37
All conversion opcodes require a destination size but this makes
constructing certain algebraic expressions rather cumbersome. This
commit adds support to nir_search and nir_algebraic for writing
conversion opcodes without a size. These meta-opcodes match any
conversion of that type regardless
Because we need to know the size and we can't infer it from the source,
we add a suffixed builder helper for each possible destination size the
opcode supports.
---
src/compiler/nir/nir_builder.h| 18 +++---
src/compiler/nir/nir_builder_opcodes_h.py | 16 ++--
In bf441d22a7917f38c, I wrote a bunch of descriptive asserts for various
bit size checks in the validation of search expressions. This commit
improves a few of those and adds descriptive asserts for replace
expressions as well. We also rework _validate_bit_class_down so that it
can properly
---
src/compiler/nir/nir_opcodes.py | 22 +++---
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index d69d09d30ce..00720708305 100644
--- a/src/compiler/nir/nir_opcodes.py
+++
, validation, and elsewhere of having this
opcode special-case.
What is the fate of NIR? That's up to the people to decide!
Cc: Ian Romanick
Cc: Connor Abbott
Jason Ekstrand COMMON (6):
nir/opcodes: Pull in the type helpers from constant_expressions
nir/opcodes: Rename tbool to tbool32
---
src/compiler/nir/nir_constant_expressions.h | 3 +-
src/compiler/nir/nir_constant_expressions.py | 34
src/compiler/nir/nir_loop_analyze.c | 7 ++--
src/compiler/nir/nir_opt_constant_folding.c | 5 +--
src/compiler/spirv/spirv_to_nir.c| 3 +-
5
While we're at it, we rework them a bit to all use regular expressions
and assert more.
---
src/compiler/nir/nir_constant_expressions.py | 25 ++
src/compiler/nir/nir_opcodes.py | 34 +---
src/compiler/nir/nir_opcodes_c.py| 11 ++-
3 files
There is a possible functional change here because we're now using
canonical bit classes in the check in validate(). If anything, it
should be more precise than the old check. The other changes just make
us print out the canonical classes in the error messages. We do this
because, if we have an
Previously, we were checking for a matching bit size late when
propagating bit sizes back down the tree by checking if the variable had
an explicit bit size and if it matched the requested bit class.
However, this check wasn't quite as accurate because it didn't handle
the case where an explicitly
I left a nit on 3. Otherwise, 1-6 are
Reviewed-by: Jason Ekstrand
On Mon, Nov 5, 2018 at 9:36 AM Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:
> It's only used in anv_image.c
>
> Signed-off-by: Lionel Landwerlin
> ---
> src/intel/vulk
On Mon, Nov 5, 2018 at 9:36 AM Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:
> To play around with debugging, we might want to disable one or the
> other component. Having 0s as default values makes this work.
> Otherwise we might have NULL components, leading to crashes.
>
>
On Thu, Nov 8, 2018 at 7:22 AM Alejandro Piñeiro
wrote:
> On OpenGL, an array of a simple type adds just one varying. So
> gl_transform_feedback_varying_info struct defined at mtypes.h includes
> the parameters Type (base_type) and Size (number of elements).
>
> This commit checks this when the
This is fine. For Intel hardware, the component mask is actually what we
need and I figured ffs(component_mask) - 1 and bitcount(component_mask)
weren't all that onerous. I don't care all that much though. If we're
going this direction, maybe just do size+offset and we can compute the mask
in
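
To make the trade-off above concrete, here is a minimal sketch (not Mesa code; the helper names are hypothetical and it assumes the components in the mask are contiguous) of recovering an offset and size from a component mask with ffs() and a bit count, and of rebuilding the mask from size+offset:

```python
def ffs(mask: int) -> int:
    """1-based index of the lowest set bit, or 0 if no bit is set (like C ffs())."""
    return (mask & -mask).bit_length()


def bitcount(mask: int) -> int:
    """Number of set bits in mask."""
    return bin(mask).count("1")


def mask_to_offset_size(component_mask: int):
    """Hypothetical helper: first component and component count of a mask."""
    return ffs(component_mask) - 1, bitcount(component_mask)


def offset_size_to_mask(offset: int, size: int) -> int:
    """Hypothetical helper: rebuild a contiguous component mask."""
    return ((1 << size) - 1) << offset
```

Either representation can be derived from the other in a line or two, which is why the choice is mostly a matter of which side of the interface does the conversion.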
On Thu, Nov 8, 2018 at 7:22 AM Alejandro Piñeiro
wrote:
> The offset compute was working fine for the case of attrib_slots=1,
> and updating the offset for the following varying.
>
> But in the case of attrib_slots=2 (so dvec3/4), we are basically
> splitting the comp_slots needed in two
On Thu, Nov 8, 2018 at 7:22 AM Alejandro Piñeiro
wrote:
> In order to allow nir_gather_xfb_info to be used on OpenGL,
> specifically ARB_gl_spirv.
>
> So, from OpenGL 4.6 spec, section 11.1.2.1, "Output Variables":
>
> "outputs specifying both an *XfbBuffer* and an *Offset* are
>
On Thu, Nov 8, 2018 at 7:22 AM Alejandro Piñeiro
wrote:
> Although it is true that Vulkan doesn't support transform feedback
> yet, spirv to nir is handling it due to ARB_gl_spirv support. Having said
> so, those decorations are handled elsewhere.
>
Actually, the RADV guys shipped their patches
On Thu, Nov 8, 2018 at 9:32 AM Emil Velikov
wrote:
> Hi Jason,
>
> On Sat, 27 Oct 2018 at 22:35, Jason Ekstrand wrote:
> >
> > Instead of setting it based on the number of layers in the framebuffer,
> > disable it whenever the shader does not explicitly wri
On Thu, Nov 8, 2018 at 9:34 AM Emil Velikov
wrote:
> On Sat, 27 Oct 2018 at 22:34, Jason Ekstrand wrote:
> >
> > Instead of hard-coding it to look at the VS stage, look at whatever the
> > last geometry stage is.
> >
> > Cc: mesa-sta...@lists.freedesktop.org
On Wed, Nov 7, 2018 at 4:06 PM Kenneth Graunke
wrote:
> On Wednesday, November 7, 2018 1:45:59 PM PST Jason Ekstrand wrote:
> > On Wed, Nov 7, 2018 at 12:20 PM Kenneth Graunke
> > wrote:
> >
> > > On Saturday, October 20, 2018 10:55:44 AM PST Jason Ekstrand wro
On Wed, Nov 7, 2018 at 12:20 PM Kenneth Graunke
wrote:
> On Saturday, October 20, 2018 10:55:44 AM PST Jason Ekstrand wrote:
> > @@ -553,14 +552,18 @@ fs_visitor::optimize_frontfacing_ternary(nir_alu_instr *instr,
> > if (src0->intrinsic != nir_intrinsic_load_front_fa
On Wed, Nov 7, 2018 at 10:38 AM Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:
> On Fri, Oct 12, 2018 at 01:46:42PM -0500, Jason Ekstrand wrote:
> > ---
> > src/intel/isl/isl_gen7.c | 9 +
> > 1 file changed, 9 insertions(+)
> >
> > diff
On Tue, Nov 6, 2018 at 9:43 PM Roland Scheidegger
wrote:
> Am 06.11.18 um 22:48 schrieb Jason Ekstrand:
> > This came to the top of my list recently due to a difference between
> > OpenGL and Vulkan discard operations and D3D's discard operation. The
> > OpenGL and Vulk
The first patch of this series applies against
the public master branch of the Vulkan spec and contains no NDA'd material.
Unfortunately, due to the way the Khronos NDA and process works, I will not
be able to provide updates between now and when the final version ships.
--Jason
Jason Ekstrand (3)
This came to the top of my list recently due to a difference between
OpenGL and Vulkan discard operations and D3D's discard operation. The
OpenGL and Vulkan discard is defined to be control flow and derivatives
are undefined after discard. With D3D, derivatives are considered
well-defined after
---
include/vulkan/vulkan_core.h | 13 +
src/vulkan/registry/vk.xml | 13 ++---
2 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/include/vulkan/vulkan_core.h b/include/vulkan/vulkan_core.h
index 4cd8ed51dcd..e14aaf8c184 100644
---
---
src/intel/vulkan/anv_device.c | 8
src/intel/vulkan/anv_extensions.py | 1 +
2 files changed, 9 insertions(+)
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index ee35e013329..89827ccacbe 100644
--- a/src/intel/vulkan/anv_device.c
+++
The biggest change here is the rename of VK_NVX_ray_tracing to
VK_NV_ray_tracing and the total removal of VK_KHR_mir_surface.
---
include/vulkan/vulkan.h | 6 -
include/vulkan/vulkan_core.h | 480 ---
include/vulkan/vulkan_mir.h | 65 -
On Wed, Oct 17, 2018 at 6:59 AM Danylo Piliaiev
wrote:
> Conditional rendering affects the following functions:
> - vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect
> - vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR
> - vkCmdDispatch, vkCmdDispatchIndirect,
On Wed, Oct 17, 2018 at 6:59 AM Danylo Piliaiev
wrote:
> Signed-off-by: Danylo Piliaiev
> ---
> src/intel/vulkan/anv_extensions.py | 1 +
> src/intel/vulkan/genX_cmd_buffer.c | 155 +
> 2 files changed, 156 insertions(+)
>
> diff --git
On Mon, Nov 5, 2018 at 10:39 AM Eero Tamminen
wrote:
> Hi,
>
> On 3.11.2018 2.06, Jason Ekstrand wrote:
> > This patch series is something we've talked about doing for a while and
> > haven't gotten around to yet. It implements a generic SEND opcode and then
> >
eek.
>
> On 11/02/2018 06:42 AM, Jason Ekstrand wrote:
> > Bump
> >
> > On Mon, Oct 22, 2018 at 5:14 PM Jason Ekstrand wrote:
> >
> > This is something that Connor and I have talked about quite a bit
> >
---
src/intel/compiler/brw_eu_defines.h | 1 -
src/intel/compiler/brw_fs.cpp | 31 ++--
src/intel/compiler/brw_fs.h | 4 -
src/intel/compiler/brw_fs_cse.cpp | 1 -
src/intel/compiler/brw_fs_generator.cpp | 73 ---
Previously, we were marking constant surface used in the generator and
non-constant ones in brw_fs_nir. We should pick one and go with it.
---
src/intel/compiler/brw_fs_generator.cpp | 2 --
src/intel/compiler/brw_fs_nir.cpp | 16
2 files changed, 8 insertions(+), 10
This commit pulls the surface descriptor helpers out into brw_eu.h and
makes them no longer depend on the codegen infrastructure. This should
allow us to use them directly from the IR code instead of the generator.
This change is unfortunately less mechanical than perhaps one would like
but it
---
src/intel/compiler/brw_eu_emit.c | 14 --
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index d22c5743038..e8235ce05f7 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++
---
src/intel/compiler/brw_fs.cpp | 138 ++-
src/intel/compiler/brw_fs.h | 2 +-
src/intel/compiler/brw_fs_generator.cpp | 162 +++---
.../compiler/brw_schedule_instructions.cpp| 17 ++
4 files changed, 177 insertions(+), 142
---
src/intel/compiler/brw_eu_defines.h | 1 +
src/intel/compiler/brw_fs.cpp | 10 ++
src/intel/compiler/brw_fs_nir.cpp | 14 --
src/intel/compiler/brw_shader.cpp | 2 ++
4 files changed, 21 insertions(+), 6 deletions(-)
diff --git
Like all the other sends, it's just mlen * REG_SIZE.
Fixes: 3cbc02e4693 "intel: Use TXS for image_size when we have..."
---
src/intel/compiler/brw_fs.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index
---
src/intel/compiler/brw_eu.h | 27 ---
src/intel/compiler/brw_eu_emit.c | 72 ---
src/intel/compiler/brw_fs.cpp | 181 +-
src/intel/compiler/brw_fs_generator.cpp | 62 --
.../compiler/brw_schedule_instructions.cpp
---
src/intel/compiler/brw_eu_defines.h | 7
src/intel/compiler/brw_fs.cpp | 9 +
src/intel/compiler/brw_fs.h | 6
src/intel/compiler/brw_fs_cse.cpp | 5 +++
src/intel/compiler/brw_fs_generator.cpp | 35
Instead of magically falling back to SIMD8 for atomics and typed
messages on Ivy Bridge, explicitly figure out the exec size and pass
that into brw_surface_payload_size.
---
src/intel/compiler/brw_eu_emit.c | 59 +---
1 file changed, 39 insertions(+), 20 deletions(-)
substantially better scheduling of instructions around indirect sends.
They currently generate up to 5 ALU instructions in the generator to build
the indirect descriptor and this is reduced to 1 with the other 4 happening
in the IR level where we can schedule.
Jason Ekstrand (11):
intel/defines
If you pass a bool in as the value to set, the C standard says that it
gets converted to an int prior to shifting. If you try to set a bool to
bit 31, this lands you in undefined behavior. It's better just to add
the explicit cast and let the compiler delete it for us.
---
On Fri, Nov 2, 2018 at 6:05 AM Toni Lönnberg
wrote:
> On Fri, Nov 02, 2018 at 12:09:54AM -0500, Jason Ekstrand wrote:
> > On Thu, Nov 1, 2018 at 5:51 AM Toni Lönnberg
> > wrote:
> >
> > > On Wed, Oct 31, 2018 at 01:18:11PM -0500, Jason Ekstrand wrote:
> >
LGTM
On Fri, Nov 2, 2018 at 8:37 AM Timothy Arceri wrote:
> We cannot use nir_build_alu() to create the new alu as it has no
> way to know how many components of the src we will use. This
> results in it guessing the max number of components from one of
> its inputs.
>
> Fixes the following CTS
Bump
On Mon, Oct 22, 2018 at 5:14 PM Jason Ekstrand wrote:
> This is something that Connor and I have talked about quite a bit over the
> last couple of months. The core idea is to replace NIR's current 32-bit
> 0/-1 D3D10-style booleans with a 1-bit data type. All in all, I think it
On November 2, 2018 08:20:34 Timothy Arceri wrote:
On 2/11/18 11:52 pm, Jason Ekstrand wrote:
On November 2, 2018 06:25:59 Timothy Arceri wrote:
We cannot use nir_build_alu() to create the new alu as it has no
way to know how many components of the src we will use. This
results
On November 2, 2018 06:25:59 Timothy Arceri wrote:
We cannot use nir_build_alu() to create the new alu as it has no
way to know how many components of the src we will use. This
results in it guessing the max number of components from one of
its inputs.
Fixes the following CTS tests:
On Thu, Nov 1, 2018 at 5:51 AM Toni Lönnberg
wrote:
> On Wed, Oct 31, 2018 at 01:18:11PM -0500, Jason Ekstrand wrote:
> > On Wed, Oct 31, 2018 at 11:10 AM Toni Lönnberg
> > wrote:
> >
> > > When we debug media or 3d+media workloads, we'd like to be able to see
On Thu, Nov 1, 2018 at 4:53 PM Timothy Arceri wrote:
> Cc: Jason Ekstrand
> ---
> src/compiler/nir/nir_opt_if.c | 26 +-
> 1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opt_if.c b/src/compiler/nir/nir_opt_if.c
On Wed, Oct 31, 2018 at 5:21 PM Timothy Arceri
wrote:
> On 1/11/18 1:28 am, Jason Ekstrand wrote:
> > On Tue, Oct 30, 2018 at 9:17 PM Timothy Arceri wrote:
> >
> > With the simplifications to this pass in a3b4cb34589e2f
On Wed, Oct 31, 2018 at 9:34 AM Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:
> On 31/10/2018 14:20, Jason Ekstrand wrote:
>
> Toni,
>
> I'm a bit curious where you're going with this. I started on a similar
> project a couple of years ago:
>
>
suggesting we fork the tools and the XML. I was just wondering
whether we wanted to do separate sections or an attribute. I think it
should land in mesa either way.
--Jason
> On Wed, Oct 31, 2018 at 09:20:39AM -0500, Jason Ekstrand wrote:
> > Toni,
> >
> > I'm
On Tue, Oct 30, 2018 at 9:17 PM Timothy Arceri
wrote:
> With the simplifications to this pass in a3b4cb34589e2f1a68 we
> can allow any alu instruction to be processed. For one this can
> potentially help with bcsels.
>
Do we want to? I believe that this patch is correct and I agree that we
now
Reviewed-by: Jason Ekstrand
Thanks for figuring this out. This probably explains some of the hurt I
was seeing with my series as well.
--Jason
On Tue, Oct 30, 2018 at 9:17 PM Timothy Arceri
wrote:
> From: Timothy Arceri
>
> We need to update the cursor before we check if th
Sorry I missed that one.
Reviewed-by: Jason Ekstrand
On Wed, Oct 31, 2018 at 5:57 AM Alejandro Piñeiro
wrote:
> Since commit "intel/compiler: Stop assuming the entrypoint is called
> "main"" there is no need to force the entrypoint name to be "main".
Toni,
I'm a bit curious where you're going with this. I started on a similar
project a couple of years ago:
https://gitlab.freedesktop.org/jekstrand/mesa/commits/wip/genxml-engines
Mine took a different (not necessarily better) approach of surrounding the
instructions in an tag. I'm not sure
This isn't true for Vulkan so we have to whack it to "main" in anv which
is silly. Instead of walking the list of functions and asserting that
everything is named "main" and hoping there's only one function named
"main", just use the nir_shader_get_entrypoint() helper which has better
assertions
NAK. There's really no reason to do this. If you're in the lost device
case, you've just done an ioctl (expensive) and got a GPU hang (5 second
watchdog timer). Also, it adds complexity to something that very badly
needs to "just work".
--Jason
On Tue, Oct 30, 2018 at 11:09 AM Eric Engestrom
nicalization,
> but since neither SPIR-V nor GLSL define multiple bit-sizes for
> booleans that would always have been something that drivers might need
> to address.
>
> Iago
>
>
> On Mon, 2018-10-22 at 17:13 -0500, Jason Ekstrand wrote:
> > This is something that Conno
On Tue, Oct 30, 2018 at 10:04 AM Emil Velikov
wrote:
> On Thu, 25 Oct 2018 at 17:47, Jason Ekstrand wrote:
> >
> > ---
> > src/intel/vulkan/anv_device.c | 11 +++
> > src/intel/vulkan/anv_util.c | 4
> > 2 files changed, 11 insertions(+), 4 dele
Acked-by: Jason Ekstrand
On Tue, Oct 30, 2018 at 7:56 AM Daniel Stone wrote:
> If the client has requested that AcquireNextImage not block at all, with
> a timeout of 0, then don't make any non-blocking calls.
>
> This will still potentially block infinitely given a non-infin
counts that the
default 32-bit pass and I have tracked it down to this patch. Reverting
this makes the instruction count much better for some tests, I'll check
why this happens tomorrow.
Iago
On Mon, 2018-10-22 at 17:13 -0500, Jason Ekstrand wrote:
Instead of doing our own constant folding, we
Monday, 2018-10-01 11:04:09 +0200, Juan A. Suarez Romero wrote:
>> > On Tue, 2018-09-11 at 11:15 -0500, Jason Ekstrand wrote:
>> > > Cc: mesa-sta...@lists.freedesktop.org
>> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107892
On Mon, Oct 29, 2018 at 12:00 PM Daniel Schürmann <
daniel.schuerm...@campus.tu-berlin.de> wrote:
> Hi Jason,
>
> thx for doing this pass, I was about to do the same (then we'd have 3 :P).
> I'm not completely sure, but it looks like your implementation is
> based on "Division by Invariant
This has thrown a few people off recently and it's good to have the
process and all the rationale for it documented somewhere. A comment at
the top of nir_inline_functions seems as good a place as any.
Cc: Matt Turner
Cc: Karol Herbst
---
src/compiler/nir/nir_inline_functions.c | 68
Thanks! Pushed.
On Mon, Oct 29, 2018 at 10:07 AM Samuel Pitoiset
wrote:
> Acked-by: Samuel Pitoiset
>
> On 10/29/18 3:44 PM, Jason Ekstrand wrote:
> > This doesn't include any new features but it does include an XML and
> > header typo fix for modifiers.
The pass should work for all bit sizes but it's less clear that the
extra instructions are worth it on small integers. Also, the hardware
doesn't do mul_high on anything other than 32-bit integers and, absent
any decent mechanism for testing the pass on 8 and 16-bit types, it's
probably best to
It's a reasonably well-known fact in the world of compilers that integer
divisions by constants can be replaced by a multiply, an add, and some
shifts. This commit adds such an optimization to NIR for the easiest case
of udiv. Other division operations will be added in following commits.
In order to
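
As a rough illustration of the multiply/add/shift idea (a sketch only, not the code from this patch), the following uses the unsigned formulation from Granlund and Montgomery's "Division by Invariant Integers using Multiplication". The function names and the default of 32 bits are assumptions made for the example:

```python
def udiv_magic(d: int, bits: int = 32):
    """Precompute (m, sh1, sh2) for dividing bits-wide unsigned values by d."""
    assert 1 <= d < (1 << bits)
    l = (d - 1).bit_length()                      # ceil(log2(d))
    m = ((1 << bits) * ((1 << l) - d)) // d + 1   # magic multiplier, fits in bits
    return m, min(l, 1), max(l - 1, 0)


def udiv_by_const(n: int, d: int, bits: int = 32) -> int:
    """Compute n // d with one high multiply, one add/sub pair and two shifts."""
    assert 0 <= n < (1 << bits)
    m, sh1, sh2 = udiv_magic(d, bits)
    t = (m * n) >> bits                           # high half of the product
    return (t + ((n - t) >> sh1)) >> sh2
```

For d = 3 the multiplier works out to 0x55555556 with both shifts equal to 1. The constants depend only on the divisor, so a compiler can compute them once when the divisor is a known constant and emit just the multiply, add and shifts.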
---
src/intel/compiler/brw_vec4_nir.cpp | 6 ++
1 file changed, 6 insertions(+)
diff --git a/src/intel/compiler/brw_vec4_nir.cpp
b/src/intel/compiler/brw_vec4_nir.cpp
index 5ccfd1f8940..b0ee0f9720d 100644
--- a/src/intel/compiler/brw_vec4_nir.cpp
+++ b/src/intel/compiler/brw_vec4_nir.cpp
@@
From: Ian Romanick
Reviewed-by: Jason Ekstrand
---
src/intel/compiler/brw_fs_nir.cpp | 5 +
1 file changed, 5 insertions(+)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
b/src/intel/compiler/brw_fs_nir.cpp
index 7930205d659..5ff12787fb6 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
From: Ian Romanick
Reviewed-by: Jason Ekstrand
---
src/compiler/nir/nir_opcodes.py | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index 209f0c5509b..1a52b3b4228 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src
---
src/compiler/nir/nir.h | 1 +
src/compiler/nir/nir_lower_int64.c | 65 ++
2 files changed, 66 insertions(+)
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index a0ae9a4430e..96b437e7c82 100644
--- a/src/compiler/nir/nir.h
+++
---
src/compiler/nir/nir_constant_expressions.py | 1 +
src/compiler/nir/nir_opcodes.py | 43 ++--
2 files changed, 40 insertions(+), 4 deletions(-)
diff --git a/src/compiler/nir/nir_constant_expressions.py
b/src/compiler/nir/nir_constant_expressions.py
index
---
src/intel/vulkan/anv_extensions.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/intel/vulkan/anv_extensions.py
b/src/intel/vulkan/anv_extensions.py
index ab9240f9fd8..e9afe06bb13 100644
--- a/src/intel/vulkan/anv_extensions.py
+++