from:"Karol Herbst"

Re: [EXTERNAL] Re: Zink MR signoff tags

2022-10-18 Thread Karol Herbst

and for Nouveau while I am at it.

Unless somebody screams and still wants them.

On Wed, Oct 19, 2022 at 12:12 AM Karol Herbst  wrote:
>
> Same for Rusticl
>
> On Mon, Oct 17, 2022 at 10:28 PM Jesse Natalie  wrote:
> >
> > Jumping on the bandwagon, I'm going to adopt this for Microsoft-owned code 
> > as well (src/gallium/d3d12, src/microsoft/*).
> >
> > -Jesse
> >
> > -Original Message-
> > From: mesa-dev  On Behalf Of Gert 
> > Wollny
> > Sent: Friday, October 7, 2022 2:37 AM
> > To: erik.faye-lund ; Alyssa Rosenzweig 
> > ; Mike Blumenkrantz 
> > Cc: ML mesa-dev 
> > Subject: [EXTERNAL] Re: Zink MR signoff tags
> >
> > On Wed, 2022-10-05 at 17:21 +0200, Erik Faye-Lund wrote:
> > > On Wed, 2022-10-05 at 08:20 -0400, Alyssa Rosenzweig wrote:
> > > > + for not requiring rb/ab tags ...
> > >
> > > I think it's time to think about making this change all over Mesa as
> > > well. We're deeply in bed with GitLab by now, so I don't think there's
> > > a realistic chance that this isn't going to just be duplicate info any
> > > time soon...
> >
> > Agreed, I'll certainly do this for r600 from now on.
> >
> > - Gert

Re: [EXTERNAL] Re: Zink MR signoff tags

2022-10-18 Thread Karol Herbst

Same for Rusticl

On Mon, Oct 17, 2022 at 10:28 PM Jesse Natalie  wrote:
>
> Jumping on the bandwagon, I'm going to adopt this for Microsoft-owned code as 
> well (src/gallium/d3d12, src/microsoft/*).
>
> -Jesse
>
> -Original Message-
> From: mesa-dev  On Behalf Of Gert 
> Wollny
> Sent: Friday, October 7, 2022 2:37 AM
> To: erik.faye-lund ; Alyssa Rosenzweig 
> ; Mike Blumenkrantz 
> Cc: ML mesa-dev 
> Subject: [EXTERNAL] Re: Zink MR signoff tags
>
> On Wed, 2022-10-05 at 17:21 +0200, Erik Faye-Lund wrote:
> > On Wed, 2022-10-05 at 08:20 -0400, Alyssa Rosenzweig wrote:
> > > + for not requiring rb/ab tags ...
> >
> > I think it's time to think about making this change all over Mesa as
> > well. We're deeply in bed with GitLab by now, so I don't think there's
> > a realistic chance that this isn't going to just be duplicate info any
> > time soon...
>
> Agreed, I'll certainly do this for r600 from now on.
>
> - Gert

Re: Rust in our code base

2022-09-08 Thread Karol Herbst

will merge Rusticl tomorrow or so unless somebody complains.

On Wed, Aug 24, 2022 at 5:34 PM Karol Herbst  wrote:
>
> On Wed, Aug 24, 2022 at 5:18 PM Jason Ekstrand
>  wrote:
> >
> > +mesa-dev and my jlekstrand.net e-mail
> >
> > On Sun, 2022-08-21 at 20:44 +0200, Karol Herbst wrote:
> > > On Sun, Aug 21, 2022 at 8:34 PM Rob Clark 
> > > wrote:
> > > >
> > > > On Sun, Aug 21, 2022 at 10:45 AM Karol Herbst 
> > > > wrote:
> > > > >
> > > > > On Sun, Aug 21, 2022 at 7:43 PM Karol Herbst 
> > > > > wrote:
> > > > > >
> > > > > > On Sun, Aug 21, 2022 at 5:46 PM Rob Clark 
> > > > > > wrote:
> > > > > > >
> > > > > > > On Sat, Aug 20, 2022 at 5:23 AM Karol Herbst
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > Hey everybody,
> > > > > > > >
> > > > > > > > so I think it's time to have this discussion for real.
> > > > > > > >
> > > > > > > > I am working on Rusticl
> > > > > > > > (
> > > > > > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15
> > > > > > > > 439)
> > > > > > > > which I would like to merge quite soon.
> > > > > > > >
> > > > > > > > Others might also plan on starting kernel drivers written
> > > > > > > > in Rust (and
> > > > > > > > if people feel comfortable to discuss this as well, they
> > > > > > > > might reply
> > > > > > > > here)
> > > > > > > >
> > > > > > > > The overall implication of that is: if we are doing this,
> > > > > > > > people (that
> > > > > > > > is we) have to accept that touching Rust code will be part
> > > > > > > > of our
> > > > > > > > development process. There is no other sane way of doing
> > > > > > > > it.
> > > > > > > >
> > > > > > > > I am not willing to wrap things in Rusticl so changing
> > > > > > > > gallium APIs
> > > > > > > > won't involve touching Rust code, and we also can't expect
> > > > > > > > people to
> > > > > > > > design their kernel drivers in weird ways "just because
> > > > > > > > somebody
> > > > > > > > doesn't want to deal with Rust"
> > > > > > > >
> > > > > > > > If we are going to do this, we have to do it for real,
> > > > > > > > which means,
> > > > > > > > Rust code will call C APIs directly and a change in those
> > > > > > > > APIs will
> > > > > > > > also require changes in Rust code making use of those APIs.
> > > > > > > >
> > > > > > > > I am so explicit on this very point, because we had some
> > > > > > > > discussion on
> > > > > > > > IRC where this was seen as a no-go at least from some
> > > > > > > > people, which
> > > > > > > > makes me think we have to find a mutual agreement on how it
> > > > > > > > should be
> > > > > > > > going forward.
> > > > > > > >
> > > > > > > > And I want to be very explicit here about the future of
> > > > > > > > Rusticl as
> > > > > > > > well: if the agreement is that people don't want to have to
> > > > > > > > deal with
> > > > > > > > Rust changing e.g. gallium, Rusticl is a dead project. I am
> > > > > > > > not
> > > > > > > > willing to come up with some trashy external-internal API
> > > > > > > > just to
> > > > > > > > maintain Rusticl outside of the mesa git repo.
> > > > > > > > And doing it on a kernel level is even more of a no-go.
> > > > > > > >
> > > > > > > > So what are we all thinking about Rust in our core repos?
> > > > > > >
> > > > > > > I think there has

Re: Rust in our code base

2022-08-24 Thread Karol Herbst

On Wed, Aug 24, 2022 at 5:18 PM Jason Ekstrand
 wrote:
>
> +mesa-dev and my jlekstrand.net e-mail
>
> On Sun, 2022-08-21 at 20:44 +0200, Karol Herbst wrote:
> > On Sun, Aug 21, 2022 at 8:34 PM Rob Clark 
> > wrote:
> > >
> > > On Sun, Aug 21, 2022 at 10:45 AM Karol Herbst 
> > > wrote:
> > > >
> > > > On Sun, Aug 21, 2022 at 7:43 PM Karol Herbst 
> > > > wrote:
> > > > >
> > > > > On Sun, Aug 21, 2022 at 5:46 PM Rob Clark 
> > > > > wrote:
> > > > > >
> > > > > > On Sat, Aug 20, 2022 at 5:23 AM Karol Herbst
> > > > > >  wrote:
> > > > > > >
> > > > > > > Hey everybody,
> > > > > > >
> > > > > > > so I think it's time to have this discussion for real.
> > > > > > >
> > > > > > > I am working on Rusticl
> > > > > > > (
> > > > > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15
> > > > > > > 439)
> > > > > > > which I would like to merge quite soon.
> > > > > > >
> > > > > > > Others might also plan on starting kernel drivers written
> > > > > > > in Rust (and
> > > > > > > if people feel comfortable to discuss this as well, they
> > > > > > > might reply
> > > > > > > here)
> > > > > > >
> > > > > > > The overall implication of that is: if we are doing this,
> > > > > > > people (that
> > > > > > > is we) have to accept that touching Rust code will be part
> > > > > > > of our
> > > > > > > development process. There is no other sane way of doing
> > > > > > > it.
> > > > > > >
> > > > > > > I am not willing to wrap things in Rusticl so changing
> > > > > > > gallium APIs
> > > > > > > won't involve touching Rust code, and we also can't expect
> > > > > > > people to
> > > > > > > design their kernel drivers in weird ways "just because
> > > > > > > somebody
> > > > > > > doesn't want to deal with Rust"
> > > > > > >
> > > > > > > If we are going to do this, we have to do it for real,
> > > > > > > which means,
> > > > > > > Rust code will call C APIs directly and a change in those
> > > > > > > APIs will
> > > > > > > also require changes in Rust code making use of those APIs.
> > > > > > >
> > > > > > > I am so explicit on this very point, because we had some
> > > > > > > discussion on
> > > > > > > IRC where this was seen as a no-go at least from some
> > > > > > > people, which
> > > > > > > makes me think we have to find a mutual agreement on how it
> > > > > > > should be
> > > > > > > going forward.
> > > > > > >
> > > > > > > And I want to be very explicit here about the future of
> > > > > > > Rusticl as
> > > > > > > well: if the agreement is that people don't want to have to
> > > > > > > deal with
> > > > > > > Rust changing e.g. gallium, Rusticl is a dead project. I am
> > > > > > > not
> > > > > > > willing to come up with some trashy external-internal API
> > > > > > > just to
> > > > > > > maintain Rusticl outside of the mesa git repo.
> > > > > > > And doing it on a kernel level is even more of a no-go.
> > > > > > >
> > > > > > > So what are we all thinking about Rust in our core repos?
> > > > > >
> > > > > > I think there has to be willingness on the part of rust folks
> > > > > > to help
> > > > > > others who aren't so familiar with rust with these sorts of
> > > > > > API
> > > > > > changes.  You can't completely impose the burden on others
> > > > > > who have
> > > > > > never touched rust before.  That said, I expect a lot of API
> > > > > > changes
> > > > > > over time are simple

[PATCH] test email smtp google whatever

2022-05-04 Thread Karol Herbst

uhm.. does this still work?
---
 README.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.rst b/README.rst
index b35246e034c..6a7112c0473 100644
--- a/README.rst
+++ b/README.rst
@@ -1,3 +1,4 @@
+
 `Mesa `_ - The 3D Graphics Library
 ==
 
-- 
2.35.1

Re: [Mesa-dev] OpenGL and OpenCL on top of D3D12

2020-03-24 Thread Karol Herbst

On Tue, Mar 24, 2020 at 5:39 PM Gert Wollny  wrote:
>
> Dear Mesa developers,
>
> Today, we at Collabora together with Microsoft have announced a new
> project based on Mesa: OpenGL and OpenCL on top of Microsoft's D3D12.
> You can find the full  announcements here:
>
> https://www.collabora.com/news-and-blog/news-and-events/introducing-opencl-and-opengl-on-directx.html
>
> https://devblogs.microsoft.com/directx/in-the-works-opencl-and-opengl-mapping-layers-to-directx
>
> How does this affect Mesa?
>
> First of all, we intend to contribute this work into upstream Mesa.
>
> The OpenGL work is similar to what Zink does with Vulkan, and will use
> some comparable approaches for the emulation of features, hence there
> will be some obvious opportunities for code-sharing with Zink.
>
> The OpenCL support is not using the Clover runtime, but instead is a
> standalone runtime that shares the NIR-to-DXIL compiler that we
> contribute to Mesa and that is also used by above OpenGL layer.
>

I understand that people are unhappy with clover, but this way is not
ideal either. Why create a runtime which only benefits Microsoft
when you could instead create a replacement useful for all of
gallium?

> As we are using spirv-to-nir in our OpenCL compiler, we have
> implemented some missing OpenCL-specific features there as well. In
> addition, we are also carrying some out-of-tree changes from other mesa
> contributors, where we also contribute reviews of in order to help them
> land.
>
> Our work also includes contributing, improving, and maintaining the CI
> for Windows, to be run on a variety of supported Windows targets.
> Currently, Collabora is providing a Windows GitLab CI runner in order
> to run our builds. We are looking into integrating this into fd.o's
> general fleet of shared runners.
>
> A high-performance DXGI libgl-target/winsys is also in the works, so we
> can render directly into Windows' compositor surfaces. In theory, and
> as a benefit to the wider Mesa community, other hardware drivers could
> be ported to support rendering into those surfaces as well.
>
> As part of this work and thanks to Microsoft's support, the WGL header
> files are being re-licensed as MIT, so we can reuse these original
> headers rather than a reverse-engineered copy. Patches for this will
> follow soon.
>
> A dump of the code in its current state can be found here:
> https://gitlab.freedesktop.org/kusma/mesa/-/tree/msclc-d3d12
> and we intend to upstream this code by breaking it into independent
> MRs shortly.
>
> We hope you're all as excited about this as we are!
>
> Gert Wollny, on behalf of the Microsoft development team (Bill
> Kristiansen and Jesse Natalie) and the Collabora development team
> (Boris Brezillon, Daniel Stone, Elie Tournier, Erik Faye-Lund, Louis-
> Francis Ratté-Boulianne)
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH v2 1/6] nv50/ir: add nv50_ir_prog_info_out

2020-03-20 Thread Karol Herbst

On Fri, Mar 20, 2020 at 10:20 AM Juan A. Suarez Romero
 wrote:
>
> On Thu, 2020-03-19 at 21:57 +0100, Mark Menzynski wrote:
> > From: Karol Herbst 
> >
> > Split out the output relevant fields from the nv50_ir_prog_info struct
> > in order to have a cleaner separation between the input and output of
> > the compilation.
> >
>
>
> Please, submit the series through GitLab (
> https://www.mesa3d.org/submittingpatches.html#submit)
>
> Thanks!
>
> J.A.
>

it's fine for nouveau patches. I know it makes sense to do that on
gitlab, but if not everybody in a community feels comfortable doing
stuff on gitlab, I don't want to enforce it either, and that's the
case for nouveau right now.

>
> > Signed-off-by: Karol Herbst 
> > ---
> >  .../drivers/nouveau/codegen/nv50_ir.cpp   |  49 ++--
> >  src/gallium/drivers/nouveau/codegen/nv50_ir.h |   9 +-
> >  .../drivers/nouveau/codegen/nv50_ir_driver.h  | 117 +---
> >  .../nouveau/codegen/nv50_ir_from_common.cpp   |  14 +-
> >  .../nouveau/codegen/nv50_ir_from_common.h |   3 +-
> >  .../nouveau/codegen/nv50_ir_from_nir.cpp  | 204 +++---
> >  .../nouveau/codegen/nv50_ir_from_tgsi.cpp | 256 +-
> >  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp |   6 +-
> >  .../nouveau/codegen/nv50_ir_target.cpp|   2 +-
> >  .../drivers/nouveau/codegen/nv50_ir_target.h  |   5 +-
> >  .../nouveau/codegen/nv50_ir_target_nv50.cpp   |  17 +-
> >  .../nouveau/codegen/nv50_ir_target_nv50.h |   3 +-
> >  .../drivers/nouveau/nouveau_compiler.c|   9 +-
> >  .../drivers/nouveau/nv50/nv50_program.c   |  62 +++--
> >  .../drivers/nouveau/nvc0/nvc0_program.c   |  87 +++---
> >  15 files changed, 449 insertions(+), 394 deletions(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> > index c65853578f6..c2c5956874a 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> > @@ -1241,15 +1241,18 @@ void Program::releaseValue(Value *value)
> >  extern "C" {
> >
> >  static void
> > -nv50_ir_init_prog_info(struct nv50_ir_prog_info *info)
> > +nv50_ir_init_prog_info(struct nv50_ir_prog_info *info,
> > +   struct nv50_ir_prog_info_out *info_out)
> >  {
> > +   info_out->target = info->target;
> > +   info_out->type = info->type;
> > if (info->type == PIPE_SHADER_TESS_CTRL || info->type == 
> > PIPE_SHADER_TESS_EVAL) {
> > -  info->prop.tp.domain = PIPE_PRIM_MAX;
> > -  info->prop.tp.outputPrim = PIPE_PRIM_MAX;
> > +  info_out->prop.tp.domain = PIPE_PRIM_MAX;
> > +  info_out->prop.tp.outputPrim = PIPE_PRIM_MAX;
> > }
> > if (info->type == PIPE_SHADER_GEOMETRY) {
> > -  info->prop.gp.instanceCount = 1;
> > -  info->prop.gp.maxVertices = 1;
> > +  info_out->prop.gp.instanceCount = 1;
> > +  info_out->prop.gp.maxVertices = 1;
> > }
> > if (info->type == PIPE_SHADER_COMPUTE) {
> >info->prop.cp.numThreads[0] =
> > @@ -1257,23 +1260,26 @@ nv50_ir_init_prog_info(struct nv50_ir_prog_info 
> > *info)
> >info->prop.cp.numThreads[2] = 1;
> > }
> > info->io.pointSize = 0xff;
> > -   info->io.instanceId = 0xff;
> > -   info->io.vertexId = 0xff;
> > -   info->io.edgeFlagIn = 0xff;
> > -   info->io.edgeFlagOut = 0xff;
> > -   info->io.fragDepth = 0xff;
> > -   info->io.sampleMask = 0xff;
> > +   info_out->bin.smemSize = info->bin.smemSize;
> > +   info_out->io.genUserClip = info->io.genUserClip;
> > +   info_out->io.instanceId = 0xff;
> > +   info_out->io.vertexId = 0xff;
> > +   info_out->io.edgeFlagIn = 0xff;
> > +   info_out->io.edgeFlagOut = 0xff;
> > +   info_out->io.fragDepth = 0xff;
> > +   info_out->io.sampleMask = 0xff;
> > info->io.backFaceColor[0] = info->io.backFaceColor[1] = 0xff;
> >  }
> >
> >  int
> > -nv50_ir_generate_code(struct nv50_ir_prog_info *info)
> > +nv50_ir_generate_code(struct nv50_ir_prog_info *info,
> > +  struct nv50_ir_prog_info_out *info_out)
> >  {
> > int ret = 0;
> >
> > nv50_ir::Program::Type type;
> >
> > -   nv50_ir_init_prog_info(info);
> > +   nv50_ir_init_prog_info(info, info_out);
> >
> >  #define PROG_TYPE_CASE(a, b)

Re: [Mesa-dev] [PATCH] nv50/ir: get rid of smemSize

2020-03-06 Thread Karol Herbst

please ignore, there is actually a use of that, but not through TGSI.

On Fri, Mar 6, 2020 at 3:07 PM Karol Herbst  wrote:
>
> we can rely on the value we get through the cso
>
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h | 1 -
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 1 -
>  src/gallium/drivers/nouveau/nv50/nv50_program.c  | 4 +---
>  src/gallium/drivers/nouveau/nvc0/nvc0_program.c  | 4 +---
>  4 files changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> index 322bdd02557..1bd9bb36bf9 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> @@ -89,7 +89,6 @@ struct nv50_ir_prog_info
>int16_t maxGPR; /* may be -1 if none used */
>int16_t maxOutput;
>uint32_t tlsSpace;  /* required local memory per thread */
> -  uint32_t smemSize;  /* required shared memory per block */
>uint32_t *code;
>uint32_t codeSize;
>uint32_t instructions;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> index bd78b76f384..89d515804bc 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> @@ -1281,7 +1281,6 @@ Converter::parseNIR()
>info->prop.cp.numThreads[0] = nir->info.cs.local_size[0];
>info->prop.cp.numThreads[1] = nir->info.cs.local_size[1];
>info->prop.cp.numThreads[2] = nir->info.cs.local_size[2];
> -  info->bin.smemSize = nir->info.cs.shared_size;
>break;
> case Program::TYPE_FRAGMENT:
>info->prop.fp.earlyFragTests = nir->info.fs.early_fragment_tests;
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> index c9d01e8cee7..31edce2ea3d 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> @@ -350,7 +350,6 @@ nv50_program_translate(struct nv50_program *prog, 
> uint16_t chipset,
>return false;
> }
>
> -   info->bin.smemSize = prog->cp.smem_size;
> info->io.auxCBSlot = 15;
> info->io.ucpBase = NV50_CB_AUX_UCP_OFFSET;
> info->io.genUserClip = prog->vp.clpd_nr;
> @@ -398,7 +397,6 @@ nv50_program_translate(struct nv50_program *prog, 
> uint16_t chipset,
> prog->interps = info->bin.fixupData;
> prog->max_gpr = MAX2(4, (info->bin.maxGPR >> 1) + 1);
> prog->tls_space = info->bin.tlsSpace;
> -   prog->cp.smem_size = info->bin.smemSize;
> prog->mul_zero_wins = info->io.mul_zero_wins;
> prog->vp.need_vertex_id = info->io.vertexId < PIPE_MAX_SHADER_INPUTS;
>
> @@ -447,7 +445,7 @@ nv50_program_translate(struct nv50_program *prog, 
> uint16_t chipset,
>
> pipe_debug_message(debug, SHADER_INFO,
>"type: %d, local: %d, shared: %d, gpr: %d, inst: %d, 
> bytes: %d",
> -  prog->type, info->bin.tlsSpace, info->bin.smemSize,
> +  prog->type, info->bin.tlsSpace, prog->cp.smem_size,
>prog->max_gpr, info->bin.instructions,
>info->bin.codeSize);
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> index 128b94e1da5..5a9e0311101 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> @@ -603,7 +603,6 @@ nvc0_program_translate(struct nvc0_program *prog, 
> uint16_t chipset,
> info->optLevel = 3;
>  #endif
>
> -   info->bin.smemSize = prog->cp.smem_size;
> info->io.genUserClip = prog->vp.num_ucps;
> info->io.auxCBSlot = 15;
> info->io.msInfoCBSlot = 15;
> @@ -644,7 +643,6 @@ nvc0_program_translate(struct nvc0_program *prog, 
> uint16_t chipset,
> prog->relocs = info->bin.relocData;
> prog->fixups = info->bin.fixupData;
> prog->num_gprs = MAX2(4, (info->bin.maxGPR + 1));
> -   prog->cp.smem_size = info->bin.smemSize;
> prog->num_barriers = info->numBarriers;
>
> prog->vp.need_vertex_id = info->io.vertexId < PIPE_MAX_SHADER_INPUTS;
> @@ -710,7 +708,7 @@ nvc0_program_translate(struct nvc0_program *prog, 
> uint16_t chipset,
>
> pipe_debug_message(debug, SHADER_INFO,

[Mesa-dev] [PATCH] nv50/ir: get rid of smemSize

2020-03-06 Thread Karol Herbst

we can rely on the value we get through the cso

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h | 1 -
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 1 -
 src/gallium/drivers/nouveau/nv50/nv50_program.c  | 4 +---
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c  | 4 +---
 4 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
index 322bdd02557..1bd9bb36bf9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
@@ -89,7 +89,6 @@ struct nv50_ir_prog_info
   int16_t maxGPR; /* may be -1 if none used */
   int16_t maxOutput;
   uint32_t tlsSpace;  /* required local memory per thread */
-  uint32_t smemSize;  /* required shared memory per block */
   uint32_t *code;
   uint32_t codeSize;
   uint32_t instructions;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index bd78b76f384..89d515804bc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1281,7 +1281,6 @@ Converter::parseNIR()
   info->prop.cp.numThreads[0] = nir->info.cs.local_size[0];
   info->prop.cp.numThreads[1] = nir->info.cs.local_size[1];
   info->prop.cp.numThreads[2] = nir->info.cs.local_size[2];
-  info->bin.smemSize = nir->info.cs.shared_size;
   break;
case Program::TYPE_FRAGMENT:
   info->prop.fp.earlyFragTests = nir->info.fs.early_fragment_tests;
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c 
b/src/gallium/drivers/nouveau/nv50/nv50_program.c
index c9d01e8cee7..31edce2ea3d 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
@@ -350,7 +350,6 @@ nv50_program_translate(struct nv50_program *prog, uint16_t 
chipset,
   return false;
}
 
-   info->bin.smemSize = prog->cp.smem_size;
info->io.auxCBSlot = 15;
info->io.ucpBase = NV50_CB_AUX_UCP_OFFSET;
info->io.genUserClip = prog->vp.clpd_nr;
@@ -398,7 +397,6 @@ nv50_program_translate(struct nv50_program *prog, uint16_t 
chipset,
prog->interps = info->bin.fixupData;
prog->max_gpr = MAX2(4, (info->bin.maxGPR >> 1) + 1);
prog->tls_space = info->bin.tlsSpace;
-   prog->cp.smem_size = info->bin.smemSize;
prog->mul_zero_wins = info->io.mul_zero_wins;
prog->vp.need_vertex_id = info->io.vertexId < PIPE_MAX_SHADER_INPUTS;
 
@@ -447,7 +445,7 @@ nv50_program_translate(struct nv50_program *prog, uint16_t 
chipset,
 
pipe_debug_message(debug, SHADER_INFO,
   "type: %d, local: %d, shared: %d, gpr: %d, inst: %d, 
bytes: %d",
-  prog->type, info->bin.tlsSpace, info->bin.smemSize,
+  prog->type, info->bin.tlsSpace, prog->cp.smem_size,
   prog->max_gpr, info->bin.instructions,
   info->bin.codeSize);
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index 128b94e1da5..5a9e0311101 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -603,7 +603,6 @@ nvc0_program_translate(struct nvc0_program *prog, uint16_t 
chipset,
info->optLevel = 3;
 #endif
 
-   info->bin.smemSize = prog->cp.smem_size;
info->io.genUserClip = prog->vp.num_ucps;
info->io.auxCBSlot = 15;
info->io.msInfoCBSlot = 15;
@@ -644,7 +643,6 @@ nvc0_program_translate(struct nvc0_program *prog, uint16_t 
chipset,
prog->relocs = info->bin.relocData;
prog->fixups = info->bin.fixupData;
prog->num_gprs = MAX2(4, (info->bin.maxGPR + 1));
-   prog->cp.smem_size = info->bin.smemSize;
prog->num_barriers = info->numBarriers;
 
prog->vp.need_vertex_id = info->io.vertexId < PIPE_MAX_SHADER_INPUTS;
@@ -710,7 +708,7 @@ nvc0_program_translate(struct nvc0_program *prog, uint16_t 
chipset,
 
pipe_debug_message(debug, SHADER_INFO,
   "type: %d, local: %d, shared: %d, gpr: %d, inst: %d, 
bytes: %d",
-  prog->type, info->bin.tlsSpace, info->bin.smemSize,
+  prog->type, info->bin.tlsSpace, prog->cp.smem_size,
   prog->num_gprs, info->bin.instructions,
   info->bin.codeSize);
 
-- 
2.24.1


Re: [Mesa-dev] [PATCH v2 1/7] nv50/ir: add nv50_ir_prog_info_out

2020-03-05 Thread Karol Herbst

On Wed, Mar 4, 2020 at 6:37 PM Emil Velikov  wrote:
>
> Hi Mark,
>
> On Fri, 21 Feb 2020 at 12:20, Mark Menzynski  wrote:
>
> > -   ret = nv50_ir_generate_code(info);
> > +   /* these fields might be overwritten by the compiler */
> > +   info_out.bin.smemSize = prog->cp.smem_size;
> > +   info_out.io.genUserClip = prog->vp.num_ucps;
> > +
> I suspect that these two should not be the out "version" of the
> variables, but more like in the final patch.
> Especially since nv50_ir_generate_code indiscriminately overrides info_out.
>
> While I haven't looked at the code too closely, it does seem like this
> commit causes an intermittent regression... Or perhaps we're lucky and
> things just work ;-)
>
> Either way, huge thanks for the update. Doubt I'll have the chance to
> do a proper review, despite that the performance numbers look great.
>

seems like Mark fixed that part in the 7th patch, but yeah, it should
be fixed in this patch instead.
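
The point Emil raises can be shown with a minimal C sketch. It assumes, as the quoted review says, that the code generator re-initialises the output struct from the input struct. The struct and function names here (prog_info, prog_info_out, init_prog_info, generate_code and the two smem_when_* helpers) are hypothetical, heavily reduced stand-ins and not the real nv50_ir definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for nv50_ir_prog_info (input) and
 * nv50_ir_prog_info_out (output); the real structs have many more fields. */
struct prog_info     { uint32_t smemSize; uint8_t genUserClip; };
struct prog_info_out { uint32_t smemSize; uint8_t genUserClip; uint8_t vertexId; };

/* Assumed behaviour: output fields are (re)initialised from the input,
 * so anything written to *out before this call is overwritten. */
static void init_prog_info(const struct prog_info *info,
                           struct prog_info_out *out)
{
   assert(info && out);
   out->smemSize = info->smemSize;
   out->genUserClip = info->genUserClip;
   out->vertexId = 0xff;
}

/* Stand-in for the compiler entry point; only the init step matters here. */
static int generate_code(const struct prog_info *info,
                         struct prog_info_out *out)
{
   init_prog_info(info, out);
   return 0;
}

/* Caller seeds the OUTPUT struct: the value does not survive the call. */
uint32_t smem_when_seeding_output(uint32_t smem)
{
   struct prog_info info = {0};
   struct prog_info_out out = {0};
   out.smemSize = smem;
   generate_code(&info, &out);
   return out.smemSize;
}

/* Caller seeds the INPUT struct: the value is carried into the output. */
uint32_t smem_when_seeding_input(uint32_t smem)
{
   struct prog_info info = {0};
   struct prog_info_out out = {0};
   info.smemSize = smem;
   generate_code(&info, &out);
   return out.smemSize;
}
```

Under these assumptions only the second helper returns the value the caller passed in, which is why the two assignments belong on the input struct.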

> > +   ret = nv50_ir_generate_code(info, &info_out);
> > if (ret) {
> >NOUVEAU_ERR("shader translation failed: %i\n", ret);
> >goto out;
> > }
> >
>
> Thanks
> Emil
>


Re: [Mesa-dev] [PATCH 8/8] nvc0: Add shader disk caching

2020-02-17 Thread Karol Herbst

On Mon, Feb 17, 2020 at 6:41 PM Mark Menzynski  wrote:
>
> Adds shader disk caching for nvc0 so that shaders do not have to be
> recompiled every time. Shaders are saved into disk_shader_cache from the
> nvc0_screen structure.
>
> It serializes the input nv50_ir_prog_info to compute the hash key and
> also to do a byte compare between the original nv50_ir_prog_info and the one
> saved in the cache. If the keys match and the byte compare also reports
> that they are equal, the shaders are the same, and the compiled
> nv50_ir_prog_info_out from the cache can be used instead of compiling the
> input info.
>
> Seems to be significantly improving loading times. Piglit tests seem
> to be OK.
>
> Signed-off-by: Mark Menzynski 
> ---
>  .../drivers/nouveau/nvc0/nvc0_context.h   |  1 +
>  .../drivers/nouveau/nvc0/nvc0_program.c   | 49 ---
>  .../drivers/nouveau/nvc0/nvc0_shader_state.c  |  3 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  2 +
>  4 files changed, 46 insertions(+), 9 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
> index 8a2a8f2797e..4b83d1afeb4 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
> @@ -321,6 +321,7 @@ extern struct draw_stage *nvc0_draw_render_stage(struct 
> nvc0_context *);
>
>  /* nvc0_program.c */
>  bool nvc0_program_translate(struct nvc0_program *, uint16_t chipset,
> +struct disk_cache *,
>  struct pipe_debug_callback *);
>  bool nvc0_program_upload(struct nvc0_context *, struct nvc0_program *);
>  void nvc0_program_destroy(struct nvc0_context *, struct nvc0_program *);
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> index 1a5073292e8..06b6f7b4db5 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> @@ -24,6 +24,7 @@
>
>  #include "compiler/nir/nir.h"
>  #include "tgsi/tgsi_ureg.h"
> +#include "util/blob.h"
>
>  #include "nvc0/nvc0_context.h"
>
> @@ -568,11 +569,19 @@ nvc0_program_dump(struct nvc0_program *prog)
>
>  bool
>  nvc0_program_translate(struct nvc0_program *prog, uint16_t chipset,
> +   struct disk_cache *disk_shader_cache,
> struct pipe_debug_callback *debug)
>  {
> +   struct blob blob;
> struct nv50_ir_prog_info *info;
> struct nv50_ir_prog_info_out info_out = {};
> -   int ret;
> +
> +   void *cached_data = NULL;
> +   size_t cached_size;
> +   bool shader_found = false;
> +
> +   int ret = 0;
> +   cache_key key;
>
> info = CALLOC_STRUCT(nv50_ir_prog_info);
> if (!info)
> @@ -631,14 +640,38 @@ nvc0_program_translate(struct nvc0_program *prog, 
> uint16_t chipset,
> info->assignSlots = nvc0_program_assign_varying_slots;
>
> /* these fields might be overwritten by the compiler */
> -   info_out.bin.smemSize = prog->cp.smem_size;
> -   info_out.io.genUserClip = prog->vp.num_ucps;
> -
> -   ret = nv50_ir_generate_code(info, &info_out);
> -   if (ret) {
> -  NOUVEAU_ERR("shader translation failed: %i\n", ret);
> -  goto out;
> +   info->bin.smemSize = prog->cp.smem_size;
> +   info->io.genUserClip = prog->vp.num_ucps;
> +
> +   blob_init(&blob);
> +   nv50_ir_prog_info_serialize(&blob, info);
> +
> +   if (disk_shader_cache) {
> +  disk_cache_compute_key(disk_shader_cache, blob.data, blob.size, key);
> +  cached_data = disk_cache_get(disk_shader_cache, key, &cached_size);
> +
> +  if (cached_data && cached_size >= blob.size) { // blob.size is the 
> size of serialized "info"
> + if (memcmp(cached_data, blob.data, blob.size) == 0) {
> +shader_found = true;
> +/* Blob contains only "info". In disk cache, "info_out" comes 
> right after it */
> +size_t offset = blob.size;
> +nv50_ir_prog_info_out_deserialize(cached_data, cached_size, 
> offset, &info_out);
> + }

I am still a bit unsure if we really really need this check... other
drivers don't seem to do it either, but it's definitely safer to keep
it... let's see what others think about it.
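
For readers following the check being discussed, here is a minimal C sketch of the entry layout the patch comment describes: the serialized input comes first and the serialized output follows directly after it, and a lookup only counts as a hit if the stored input prefix is byte-identical. The type and function names (struct entry, entry_store, entry_lookup) and the fixed 64-byte buffer are assumptions made for illustration; this is not the Mesa disk_cache or blob API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single cache entry: [serialized input][serialized output]. */
struct entry {
   unsigned char data[64];
   size_t size;
};

/* Store the input bytes followed directly by the output bytes. */
bool entry_store(struct entry *e, const void *in, size_t in_size,
                 const void *out, size_t out_size)
{
   assert(e && in && out);
   if (in_size + out_size > sizeof(e->data))
      return false;
   memcpy(e->data, in, in_size);
   memcpy(e->data + in_size, out, out_size);
   e->size = in_size + out_size;
   return true;
}

/* Return a pointer to the cached output only if the stored input prefix
 * matches byte for byte; otherwise report a miss by returning NULL. */
const unsigned char *entry_lookup(const struct entry *e, const void *in,
                                  size_t in_size, size_t *out_size)
{
   assert(e && in && out_size);
   if (e->size < in_size)
      return NULL;                  /* entry too small to hold this input */
   if (memcmp(e->data, in, in_size) != 0)
      return NULL;                  /* same key, different input: miss */
   *out_size = e->size - in_size;
   return e->data + in_size;        /* output starts right after the input */
}
```

The byte compare only matters when two different inputs end up under the same key; without it, the entry found under the key would be trusted as-is.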

> +  }
> +  free(cached_data);
> +   }
> +   if (!shader_found) {
> +  ret = nv50_ir_generate_code(info, &info_out);
> +  if (ret) {
> + NOUVEAU_ERR("shader translation failed: %i\n", ret);
> + goto out;
> +  }
> +  if (disk_shader_cache) {
> + nv50_ir_prog_info_out_serialize(&blob, &info_out);
> + disk_cache_put(disk_shader_cache, key, blob.data, blob.size, NULL);
> +  }
> }
> +   blob_finish(&blob);
>
> prog->code = info_out.bin.code;
> prog->code_size = info_out.bin.codeSize;
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c
> index 774c5648113..4327a89454b 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c
> +++

Re: [Mesa-dev] [PATCH 7/8] nv50/ir: Move separateFragData

2020-02-17 Thread Karol Herbst

On Mon, Feb 17, 2020 at 6:41 PM Mark Menzynski  wrote:
>
> separateFragData was in the wrong place in nv50_ir_prog_info (input),
> moved it to nv50_ir_prog_info_out.
>
> Signed-off-by: Mark Menzynski 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h  | 2 +-
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp  | 2 +-
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> index cdf19eeabcf..30498ceffaf 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> @@ -112,7 +112,6 @@ struct nv50_ir_prog_info
>   uint8_t inputPrim;
>} gp;
>struct {
> - bool separateFragData;
>   bool persampleInvocation;
>} fp;
>struct {
> @@ -200,6 +199,7 @@ struct nv50_ir_prog_info_out
>   bool usesSampleMaskIn;
>   bool readsFramebuffer;
>   bool readsSampleLocations;
> + bool separateFragData;
>} fp;
> } prop;
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> index 3efeaab4569..cf5f3d6d7e7 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> @@ -2100,7 +2100,7 @@ Converter::visit(nir_intrinsic_instr *insn)
>atom->setIndirect(0, 0, address);
>atom->subOp = getSubOp(op);
>
> -  info->io.globalAccess |= 0x2;
> +  info_out->io.globalAccess |= 0x2;
>break;
> }
> case nir_intrinsic_bindless_image_atomic_add:
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index 5850dc18fec..c2322f3856a 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -1176,7 +1176,7 @@ void Source::scanProperty(const struct 
> tgsi_full_property *prop)
>info_out->prop.gp.instanceCount = prop->u[0].Data;
>break;
> case TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS:
> -  info->prop.fp.separateFragData = true;
> +  info_out->prop.fp.separateFragData = true;
>break;
> case TGSI_PROPERTY_FS_COORD_ORIGIN:
> case TGSI_PROPERTY_FS_COORD_PIXEL_CENTER:
> --
> 2.21.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

mind merging those changes into the 1st patch? Just add a "v2 (mark):
..." note or something.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 6/8] tgsi/util: Change boolean for bool

2020-02-17 Thread Karol Herbst

by the way: Mind creating a MR on gitlab with this and the 2nd patch?
This way we can get them reviewed and tested there and merged before
the nouveau related patches.

On Mon, Feb 17, 2020 at 9:09 PM Karol Herbst  wrote:
>
> Reviewed-by: Karol Herbst 
>
> On Mon, Feb 17, 2020 at 6:41 PM Mark Menzynski  wrote:
> >
> > I was getting errors with "boolean" when compiling. This patch changes
> > > boolean to bool from <stdbool.h>.
> >
> > Signed-off-by: Mark Menzynski 
> > ---
> >  src/gallium/auxiliary/tgsi/tgsi_util.c | 2 +-
> >  src/gallium/auxiliary/tgsi/tgsi_util.h | 5 +++--
> >  2 files changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c 
> > b/src/gallium/auxiliary/tgsi/tgsi_util.c
> > index 1e5582ba273..e1b604cff0e 100644
> > --- a/src/gallium/auxiliary/tgsi/tgsi_util.c
> > +++ b/src/gallium/auxiliary/tgsi/tgsi_util.c
> > @@ -537,7 +537,7 @@ tgsi_util_get_shadow_ref_src_index(enum 
> > tgsi_texture_type tgsi_tex)
> >  }
> >
> >
> > -boolean
> > +bool
> >  tgsi_is_shadow_target(enum tgsi_texture_type target)
> >  {
> > switch (target) {
> > diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.h 
> > b/src/gallium/auxiliary/tgsi/tgsi_util.h
> > index 686b90f467e..6dc576b1a00 100644
> > --- a/src/gallium/auxiliary/tgsi/tgsi_util.h
> > +++ b/src/gallium/auxiliary/tgsi/tgsi_util.h
> > @@ -28,6 +28,7 @@
> >  #ifndef TGSI_UTIL_H
> >  #define TGSI_UTIL_H
> >
> > > +#include <stdbool.h>
> >  #include "pipe/p_shader_tokens.h"
> >
> >  #if defined __cplusplus
> > @@ -84,11 +85,11 @@ tgsi_util_get_texture_coord_dim(enum tgsi_texture_type 
> > tgsi_tex);
> >  int
> >  tgsi_util_get_shadow_ref_src_index(enum tgsi_texture_type tgsi_tex);
> >
> > -boolean
> > +bool
> >  tgsi_is_shadow_target(enum tgsi_texture_type target);
> >
> >
> > -static inline boolean
> > +static inline bool
> >  tgsi_is_msaa_target(enum tgsi_texture_type target)
> >  {
> > return (target == TGSI_TEXTURE_2D_MSAA ||
> > --
> > 2.21.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >

Re: [Mesa-dev] [PATCH 6/8] tgsi/util: Change boolean for bool

2020-02-17 Thread Karol Herbst

Reviewed-by: Karol Herbst 

On Mon, Feb 17, 2020 at 6:41 PM Mark Menzynski  wrote:
>
> I was getting errors with "boolean" when compiling. This patch changes
> boolean to bool from <stdbool.h>.
>
> Signed-off-by: Mark Menzynski 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_util.c | 2 +-
>  src/gallium/auxiliary/tgsi/tgsi_util.h | 5 +++--
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c 
> b/src/gallium/auxiliary/tgsi/tgsi_util.c
> index 1e5582ba273..e1b604cff0e 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_util.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_util.c
> @@ -537,7 +537,7 @@ tgsi_util_get_shadow_ref_src_index(enum tgsi_texture_type 
> tgsi_tex)
>  }
>
>
> -boolean
> +bool
>  tgsi_is_shadow_target(enum tgsi_texture_type target)
>  {
> switch (target) {
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.h 
> b/src/gallium/auxiliary/tgsi/tgsi_util.h
> index 686b90f467e..6dc576b1a00 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_util.h
> +++ b/src/gallium/auxiliary/tgsi/tgsi_util.h
> @@ -28,6 +28,7 @@
>  #ifndef TGSI_UTIL_H
>  #define TGSI_UTIL_H
>
> +#include <stdbool.h>
>  #include "pipe/p_shader_tokens.h"
>
>  #if defined __cplusplus
> @@ -84,11 +85,11 @@ tgsi_util_get_texture_coord_dim(enum tgsi_texture_type 
> tgsi_tex);
>  int
>  tgsi_util_get_shadow_ref_src_index(enum tgsi_texture_type tgsi_tex);
>
> -boolean
> +bool
>  tgsi_is_shadow_target(enum tgsi_texture_type target);
>
>
> -static inline boolean
> +static inline bool
>  tgsi_is_msaa_target(enum tgsi_texture_type target)
>  {
> return (target == TGSI_TEXTURE_2D_MSAA ||
> --
> 2.21.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

Re: [Mesa-dev] [PATCH 5/8] nv50/ir: Add nv50_ir_prog_info serialize

2020-02-17 Thread Karol Herbst

On Mon, Feb 17, 2020 at 6:41 PM Mark Menzynski  wrote:
>
> Adds a function for serializing a nv50_ir_prog_info structure, which is
> needed for shader caching.
>
> Signed-off-by: Mark Menzynski 
> ---
>  .../drivers/nouveau/codegen/nv50_ir_driver.h  |  4 +
>  .../nouveau/codegen/nv50_ir_serialize.cpp | 81 +++
>  2 files changed, 85 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> index 9eb8a4c4798..cdf19eeabcf 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> @@ -278,6 +278,10 @@ namespace nv50_ir
>  extern void
>  nv50_ir_prog_info_out_print(struct nv50_ir_prog_info_out *);
>
> +/* Serialize a nv50_ir_prog_info structure and save it into blob */
> +extern bool
> +nv50_ir_prog_info_serialize(struct blob *, struct nv50_ir_prog_info *);
> +
>  /* Serialize a nv50_ir_prog_info_out structure and save it into blob */
>  extern bool
>  nv50_ir_prog_info_out_serialize(struct blob *, struct nv50_ir_prog_info_out 
> *);
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_serialize.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_serialize.cpp
> index 077f3eba6c8..0f47189f10b 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_serialize.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_serialize.cpp
> @@ -17,6 +17,87 @@ enum InterpApply {
> FLIP_GM107 = 7
>  };
>
> +extern bool
> +nv50_ir_prog_info_serialize(struct blob *blob, struct nv50_ir_prog_info 
> *info)
> +{
> +   blob_write_uint16(blob, info->target);
> +   blob_write_uint8(blob, info->type);
> +   blob_write_uint8(blob, info->optLevel);
> +   blob_write_uint8(blob, info->dbgFlags);
> +   blob_write_uint8(blob, info->omitLineNum);
> +   blob_write_uint32(blob, info->bin.smemSize);
> +   blob_write_uint16(blob, info->bin.maxOutput);
> +   blob_write_uint8(blob, info->bin.sourceRep);
> +
> +   switch(info->bin.sourceRep) {
> +  case PIPE_SHADER_IR_TGSI: {
> + struct tgsi_token *tokens = (struct tgsi_token *)info->bin.source;
> + unsigned int num_tokens = tgsi_num_tokens(tokens);
> +
> + blob_write_uint32(blob, num_tokens);
> + blob_write_bytes(blob, tokens, num_tokens * sizeof(struct 
> tgsi_token));
> + break;
> +  }
> +  case PIPE_SHADER_IR_NIR: {
> + struct nir_shader *nir = (struct nir_shader *)info->bin.source;
> + nir_serialize(blob, nir, false);
> + break;
> +  }
> +  default:
> + assert(!"unhandled info->bin.sourceRep");
> + return false;
> +   }
> +
> +   blob_write_uint16(blob, info->immd.bufSize);
> +   blob_write_bytes(blob, info->immd.buf, info->immd.bufSize * 
> sizeof(*info->immd.buf));
> +   blob_write_uint16(blob, info->immd.count);
> +   blob_write_bytes(blob, info->immd.data, info->immd.count * 
> sizeof(*info->immd.data));
> +   blob_write_bytes(blob, info->immd.type, info->immd.count * 16); // for 
> each vec4 (128 bit)
> +
> +   switch (info->type) {
> +  case PIPE_SHADER_VERTEX:
> + blob_write_bytes(blob, info->prop.vp.inputMask,
> +  4 * sizeof(*info->prop.vp.inputMask)); /* array of 
> size 4 */

we have an ARRAY_SIZE macro, but sizeof(info->prop.vp.inputMask)
should give you the full array size already, no?
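
To illustrate the point, a small standalone sketch. The struct is a stand-in, the uint32_t element type is an assumption, and ARRAY_SIZE is defined locally here rather than pulled from Mesa's headers.

```cpp
#include <cstddef>
#include <cstdint>

// Local definition for this sketch only.
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

// Stand-in for the real struct; only the array member matters here.
struct vp_props {
   uint32_t inputMask[4];
};

// What the patch does: hard-coded element count times element size.
static size_t bytes_hardcoded(const vp_props &vp)
{
   return 4 * sizeof(*vp.inputMask);
}

// sizeof on the array itself already yields the size of the whole array.
static size_t bytes_sizeof(const vp_props &vp)
{
   return sizeof(vp.inputMask);
}

// Equivalent spelling with ARRAY_SIZE, useful when the count is needed too.
static size_t bytes_array_size(const vp_props &vp)
{
   return ARRAY_SIZE(vp.inputMask) * sizeof(vp.inputMask[0]);
}
```

All three return the same number as long as inputMask is a real array member. If it ever became a pointer, sizeof would silently return the pointer size instead.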

> + break;
> +  case PIPE_SHADER_TESS_CTRL:
> + blob_write_uint32(blob, info->prop.cp.inputOffset);
> + blob_write_uint32(blob, info->prop.cp.sharedOffset);
> + blob_write_uint32(blob, info->prop.cp.gridInfoBase);
> + blob_write_bytes(blob, info->prop.cp.numThreads,
> +  3 * sizeof(*info->prop.cp.numThreads)); /* array 
> of size 3 */

same here

> +  case PIPE_SHADER_GEOMETRY:
> + blob_write_uint8(blob, info->prop.gp.inputPrim);
> + break;
> +  case PIPE_SHADER_FRAGMENT:
> + blob_write_uint8(blob, info->prop.fp.persampleInvocation);
> + break;
> +  default:
> + break;
> +   }
> +
> +   blob_write_uint8(blob, info->io.auxCBSlot);
> +   blob_write_uint16(blob, info->io.ucpBase);
> +   blob_write_uint16(blob, info->io.drawInfoBase);
> +   blob_write_uint16(blob, info->io.alphaRefBase);
> +   blob_write_uint8(blob, info->io.pointSize);
> +   blob_write_uint8(blob, info->io.viewportId);
> +   blob_write_bytes(blob, info->io.backFaceColor, 2 * 
> sizeof(*info->io.backFaceColor));

and here

> +   blob_write_uint8(blob, info->io.mul_zero_wins);
> +   blob_write_uint8(blob, info->io.nv50styleSurfaces);
> +   blob_write_uint16(blob, info->io.texBindBase);
> +   blob_write_uint16(blob, info->io.fbtexBindBase);
> +   blob_write_uint16(blob, info->io.suInfoBase);
> +   blob_write_uint16(blob, info->io.bindlessBase);
> +   blob_write_uint16(blob, info->io.bufInfoBase);
> +   blob_write_uint16(blob, info->io.sampleInfoBase);
> +   blob_write_uint8(blob,

Re: [Mesa-dev] [PATCH 4/8] nv50/ir: Add prog_info_out print

2020-02-17 Thread Karol Herbst

On Mon, Feb 17, 2020 at 6:41 PM Mark Menzynski  wrote:
>
> Adds a function for printing nv50_ir_prog_info_out structure
> in JSON-like format, which could be used in debugging.
>
> Signed-off-by: Mark Menzynski 
> ---
>  .../drivers/nouveau/codegen/nv50_ir_driver.h  |   3 +
>  .../drivers/nouveau/codegen/nv50_ir_print.cpp | 155 ++
>  2 files changed, 158 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> index bc92a3bc4ee..9eb8a4c4798 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> @@ -275,6 +275,9 @@ namespace nv50_ir
>  }
>  #endif
>
> +extern void
> +nv50_ir_prog_info_out_print(struct nv50_ir_prog_info_out *);
> +
>  /* Serialize a nv50_ir_prog_info_out structure and save it into blob */
>  extern bool
>  nv50_ir_prog_info_out_serialize(struct blob *, struct nv50_ir_prog_info_out 
> *);
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> index 5dcbf3c3e0c..f19d1a7d280 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> @@ -22,6 +22,7 @@
>
>  #include "codegen/nv50_ir.h"
>  #include "codegen/nv50_ir_target.h"
> +#include "codegen/nv50_ir_driver.h"
>
>  #include 
>
> @@ -852,3 +853,157 @@ Function::printLiveIntervals() const
>  }
>
>  } // namespace nv50_ir
> +
> +extern void
> +nv50_ir_prog_info_out_print(struct nv50_ir_prog_info_out *info_out)
> +{
> +   int i;
> +
> +   INFO("{\n");
> +   INFO("   \"target\":\"%d\",\n", info_out->target);
> +   INFO("   \"type\":\"%d\",\n", info_out->type);
> +
> +   // Bin
> +   INFO("   \"bin\":{\n");
> +   INFO("  \"maxGPR\":\"%d\",\n", info_out->bin.maxGPR);
> +   INFO("  \"tlsSpace\":\"%d\",\n", info_out->bin.tlsSpace);
> +   INFO("  \"smemSize\":\"%d\",\n", info_out->bin.smemSize);
> +   INFO("  \"codeSize\":\"%d\",\n", info_out->bin.codeSize);
> +   INFO("  \"instructions\":\"%d\",\n", info_out->bin.instructions);
> +
> +   // RelocInfo
> +   INFO("  \"RelocInfo\":");
> +   if (!info_out->bin.relocData) {
> +  INFO("\"NULL\",\n");
> +   }
> +   else {

please keep it in one line.

> +  nv50_ir::RelocInfo *reloc = (nv50_ir::RelocInfo 
> *)info_out->bin.relocData;
> +  INFO("{\n");
> +  INFO(" \"codePos\":\"%d\",\n", reloc->codePos);
> +  INFO(" \"libPos\":\"%d\",\n", reloc->libPos);
> +  INFO(" \"dataPos\":\"%d\",\n", reloc->dataPos);
> +  INFO(" \"count\":\"%d\",\n", reloc->count);
> +  INFO(" \"RelocEntry\":[\n");
> +  for (unsigned int i = 0; i < reloc->count; i++) {
> + INFO("
> {\"data\":\"%d\",\t\"mask\":\"%d\",\t\"offset\":\"%d\",\t\"bitPos\":\"%d\",\t\"type\":\"%d\"}",
> +   reloc->entry[i].data, reloc->entry[i].mask, 
> reloc->entry[i].offset, reloc->entry[i].bitPos, reloc->entry[i].type
> +   );
> +  }
> +  INFO("\n");
> +  INFO(" ]\n");
> +  INFO("  },\n");
> +   }
> +
> +   // FixupInfo
> +   INFO("  \"FixupInfo\":");
> +   if (!info_out->bin.fixupData) {
> +  INFO("\"NULL\"\n");
> +   }
> +   else {

here as well

> +  nv50_ir::FixupInfo *fixup = (nv50_ir::FixupInfo 
> *)info_out->bin.fixupData;
> +  INFO("{\n");
> +  INFO(" \"count\":\"%d\"\n", fixup->count);
> +  INFO(" \"FixupEntry\":[\n");
> +  for (unsigned int i = 0; i < fixup->count; i++) {
> + INFO("
> {\"apply\":\"%p\",\t\"ipa\":\"%d\",\t\"reg\":\"%d\",\t\"loc\":\"%d\"}",
> +   fixup->entry[i].apply, fixup->entry[i].ipa, 
> fixup->entry[i].reg, fixup->entry[i].loc);
> +  }
> +  INFO("\n");
> +  INFO(" ]\n");
> +  INFO("  }\n");
> +
> +  INFO("   },\n");
> +   }
> +
> +   if (info_out->numSysVals) {
> +  INFO("   \"sv\":[\n");
> +  for (i = 0; i < info_out->numSysVals; i++) {
> + if (&(info_out->sv[i])) {
> +INFO("  {\"id\":\"%d\", \"sn\":\"%d\", \"si\":\"%d\"}",
> +   info_out->sv[i].id, info_out->sv[i].sn, 
> info_out->sv[i].si);
> + }
> +  }
> +  INFO("\n   ],\n");
> +   }
> +   if (info_out->numInputs) {
> +  INFO("   \"in\":[\n");
> +  for (i = 0; i < info_out->numInputs; i++) {
> + if (&(info_out->in[i])) {
> +INFO("  {\"id\":\"%d\",\t\"sn\":\"%d\",\t\"si\":\"%d\"}",
> +info_out->in[i].id, info_out->in[i].sn, info_out->in[i].si);
> + }
> +  }
> +  INFO("\n   ],\n");
> +   }
> +   if (info_out->numOutputs) {
> +  INFO("   \"out\":[\n");
> +  for (i = 0; i < info_out->numOutputs; i++) {
> + if (&(info_out->out[i])) {
> +INFO("  {\"id\":\"%d\",\t\"sn\":\"%d\",\t\"si\":\"%d\"}",
> +

Re: [Mesa-dev] [PATCH 3/8] nv50/ir: Add nv50_ir_prog_info_out serialize and deserialize

2020-02-17 Thread Karol Herbst

On Mon, Feb 17, 2020 at 6:41 PM Mark Menzynski  wrote:
>
> Adds functions for serializing and deserializing
> nv50_ir_prog_info_out structure, which are needed for shader caching.
>
> Signed-off-by: Mark Menzynski 
> ---
>  .../drivers/nouveau/codegen/nv50_ir_driver.h  |  44 
>  .../nouveau/codegen/nv50_ir_emit_gk110.cpp|  14 +-
>  .../nouveau/codegen/nv50_ir_emit_gm107.cpp|  14 +-
>  .../nouveau/codegen/nv50_ir_emit_nv50.cpp |   6 +-
>  .../nouveau/codegen/nv50_ir_emit_nvc0.cpp |  14 +-
>  .../nouveau/codegen/nv50_ir_serialize.cpp | 196 ++
>  src/gallium/drivers/nouveau/meson.build   |   1 +
>  7 files changed, 265 insertions(+), 24 deletions(-)
>  create mode 100644 src/gallium/drivers/nouveau/codegen/nv50_ir_serialize.cpp
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> index f6b5415bc95..bc92a3bc4ee 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> @@ -25,6 +25,7 @@
>
>  #include "pipe/p_shader_tokens.h"
>
> +#include "util/blob.h"
>  #include "tgsi/tgsi_util.h"
>  #include "tgsi/tgsi_parse.h"
>  #include "tgsi/tgsi_scan.h"
> @@ -242,6 +243,49 @@ nv50_ir_apply_fixups(void *fixupData, uint32_t *code,
>  extern void nv50_ir_get_target_library(uint32_t chipset,
> const uint32_t **code, uint32_t 
> *size);
>
> +
> +#ifdef __cplusplus
> +namespace nv50_ir
> +{
> +   class FixupEntry;
> +   class FixupData;
> +
> +   void
> +   gk110_interpApply(const nv50_ir::FixupEntry *entry, uint32_t *code,
> + const nv50_ir::FixupData& data);
> +   void
> +   gm107_interpApply(const nv50_ir::FixupEntry *entry, uint32_t *code,
> + const nv50_ir::FixupData& data);
> +   void
> +   nv50_interpApply(const nv50_ir::FixupEntry *entry, uint32_t *code,
> +const nv50_ir::FixupData& data);
> +   void
> +   nvc0_interpApply(const nv50_ir::FixupEntry *entry, uint32_t *code,
> +const nv50_ir::FixupData& data);
> +   void
> +   gk110_selpFlip(const nv50_ir::FixupEntry *entry, uint32_t *code,
> +  const nv50_ir::FixupData& data);
> +   void
> +   gm107_selpFlip(const nv50_ir::FixupEntry *entry, uint32_t *code,
> +  const nv50_ir::FixupData& data);
> +   void
> +   nvc0_selpFlip(const nv50_ir::FixupEntry *entry, uint32_t *code,
> + const nv50_ir::FixupData& data);
> +
> +}
> +#endif
> +
> +/* Serialize a nv50_ir_prog_info_out structure and save it into blob */
> +extern bool
> +nv50_ir_prog_info_out_serialize(struct blob *, struct nv50_ir_prog_info_out 
> *);
> +
> +/* Deserialize from data and save into a nv50_ir_prog_info_out structure
> + * using a pointer. Size is a total size of the serialized data.
> + * Offset points to where info_out in data is located. */
> +extern bool
> +nv50_ir_prog_info_out_deserialize(void *data, size_t size, size_t offset,
> + struct nv50_ir_prog_info_out *);

some spaces missing. Also I'd drop the offset argument and require the
caller to pass in an adjusted pointer already.
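
Something along these lines, as a rough sketch. The record struct and both function names are invented for illustration and stand in for nv50_ir_prog_info_out and its deserializer; the byte layout is native-endian and assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical two-field record standing in for nv50_ir_prog_info_out.
struct record {
   uint16_t target;
   uint8_t type;
};

// Deserializer without an offset parameter: data points at the record itself.
static bool
record_deserialize(const void *data, size_t size, record *out)
{
   const size_t needed = sizeof(out->target) + sizeof(out->type);
   if (!data || size < needed)
      return false;

   const uint8_t *p = static_cast<const uint8_t *>(data);
   std::memcpy(&out->target, p, sizeof(out->target));
   std::memcpy(&out->type, p + sizeof(out->target), sizeof(out->type));
   return true;
}

// Caller side: skip the prefix, then hand over the adjusted pointer and size.
static bool
load_after_prefix(const uint8_t *buf, size_t buf_size, size_t prefix,
                  record *out)
{
   if (prefix > buf_size)
      return false;
   return record_deserialize(buf + prefix, buf_size - prefix, out);
}
```

This keeps the deserializer unaware of whatever else lives in the buffer, and the bounds check on the prefix ends up in exactly one place.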

> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> index 2118c3153f7..e651d7fdcb0 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> @@ -1209,8 +1209,8 @@ CodeEmitterGK110::emitSLCT(const CmpInstruction *i)
> }
>  }
>
> -static void
> -selpFlip(const FixupEntry *entry, uint32_t *code, const FixupData& data)
> +void
> +gk110_selpFlip(const FixupEntry *entry, uint32_t *code, const FixupData& 
> data)
>  {
> int loc = entry->loc;
> if (data.force_persample_interp)
> @@ -1227,7 +1227,7 @@ void CodeEmitterGK110::emitSELP(const Instruction *i)
>code[1] |= 1 << 13;
>
> if (i->subOp == 1) {
> -  addInterp(0, 0, selpFlip);
> +  addInterp(0, 0, gk110_selpFlip);
> }
>  }
>
> @@ -2042,8 +2042,8 @@ CodeEmitterGK110::emitInterpMode(const Instruction *i)
> code[1] |= (i->ipa & 0xc) << (19 - 2);
>  }
>
> -static void
> -interpApply(const FixupEntry *entry, uint32_t *code, const FixupData& data)
> +void
> +gk110_interpApply(const struct FixupEntry *entry, uint32_t *code, const 
> FixupData& data)
>  {
> int ipa = entry->ipa;
> int reg = entry->reg;
> @@ -2078,10 +2078,10 @@ CodeEmitterGK110::emitINTERP(const Instruction *i)
>
> if (i->op == OP_PINTERP) {
>srcId(i->src(1), 23);
> -  addInterp(i->ipa, SDATA(i->src(1)).id, interpApply);
> +  addInterp(i->ipa, SDATA(i->src(1)).id, gk110_interpApply);
> } else {
>code[0] |= 0xff << 23;
> -  addInterp(i->ipa, 0xff, interpApply);
> +  addInterp(i->ipa, 0xff, gk110_interpApply);
> }
>
>

Re: [Mesa-dev] [PATCH 2/8] util/blob: Add overwrite function for uint8

2020-02-17 Thread Karol Herbst

On Mon, Feb 17, 2020 at 6:41 PM Mark Menzynski  wrote:
>
> Overwrite function for this type was missing and I needed it for my project.
>
> Signed-off-by: Mark Menzynski 
> ---
>  src/util/blob.c |  9 +
>  src/util/blob.h | 15 +++
>  2 files changed, 24 insertions(+)
>
> diff --git a/src/util/blob.c b/src/util/blob.c
> index 94d5a9dea74..5bf4b924c91 100644
> --- a/src/util/blob.c
> +++ b/src/util/blob.c
> @@ -214,6 +214,15 @@ BLOB_WRITE_TYPE(blob_write_intptr, intptr_t)
>  #define ASSERT_ALIGNED(_offset, _align) \
> assert(ALIGN((_offset), (_align)) == (_offset))
>
> +bool
> +blob_overwrite_uint8 (struct blob *blob,
> +  size_t offset,
> +  uint8_t value)
> +{
> +   ASSERT_ALIGNED(offset, sizeof(value));
> +   return blob_overwrite_bytes(blob, offset, &value, sizeof(value));
> +}
> +

I think it would be better to do the same as with the write functions
and define a macro for the implementation.
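
Roughly what I have in mind, as a standalone sketch. mini_blob and MINI_BLOB_OVERWRITE_TYPE are made-up names so the example compiles on its own; the real code would sit next to BLOB_WRITE_TYPE in blob.c and keep the ASSERT_ALIGNED check, which is left out here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Minimal stand-in for struct blob; only what the sketch needs.
struct mini_blob {
   uint8_t data[64];
   size_t size;      // number of bytes written so far
};

static bool
mini_blob_overwrite_bytes(mini_blob *blob, size_t offset,
                          const void *bytes, size_t to_write)
{
   // Reject ranges that were never written (also avoids size_t overflow).
   if (offset > blob->size || to_write > blob->size - offset)
      return false;
   std::memcpy(blob->data + offset, bytes, to_write);
   return true;
}

// One macro generates every typed overwrite helper.
#define MINI_BLOB_OVERWRITE_TYPE(name, type)                              \
static bool                                                               \
name(mini_blob *blob, size_t offset, type value)                          \
{                                                                         \
   return mini_blob_overwrite_bytes(blob, offset, &value, sizeof(value)); \
}

MINI_BLOB_OVERWRITE_TYPE(mini_blob_overwrite_uint8, uint8_t)
MINI_BLOB_OVERWRITE_TYPE(mini_blob_overwrite_uint32, uint32_t)
```

Adding another width is then a one-line change instead of another copy of the function body.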

>  bool
>  blob_overwrite_uint32 (struct blob *blob,
> size_t offset,
> diff --git a/src/util/blob.h b/src/util/blob.h
> index 9113331254a..d5496fef1cd 100644
> --- a/src/util/blob.h
> +++ b/src/util/blob.h
> @@ -209,6 +209,21 @@ blob_write_uint16(struct blob *blob, uint16_t value);
>  bool
>  blob_write_uint32(struct blob *blob, uint32_t value);
>
> +/**
> + * Overwrite a uint8_t previously written to the blob.
> + *
> + * Writes a uint8_t value to an existing portion of the blob at an offset of
> + * \offset.  This data range must have previously been written to the blob by
> + * one of the blob_write_* calls.
> + *
> + * \return True unless the requested position or position+to_write lie 
> outside
> + * the current blob's size.
> + */
> +bool
> +blob_overwrite_uint8(struct blob *blob,
> + size_t offset,
> + uint8_t value);
> +

following the existing pattern, I think this should be moved after the
blob_write_uint8 declaration.

>  /**
>   * Overwrite a uint32_t previously written to the blob.
>   *
> --
> 2.21.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

Re: [Mesa-dev] [ANNOUNCE] Mesa 20.0 branchpoint planned for 2020/01/29, Milestone opened

2020-01-29 Thread Karol Herbst

On Thu, Jan 30, 2020 at 2:37 AM Dieter Nützel  wrote:
>
> Maybe compilation with '-Dopencl-spirv=true', again.
>
> It is broken, now.
> Even LLVM 10.0 won't compile for me with SPIRV-LLVM-Translator,
> currently.
>

do you have any more details on that? It could be that the
spirv-llvm-translator diverged somewhere as I am only compiling
against llvm 9 right now.

> Greetings,
> Dieter
>
> Am 22.01.2020 19:27, schrieb Dylan Baker:
> > Hi list, due to some last minute changes in plan I'll be managing the
> > 20.0
> > release. The release calendar has been updated, but the gitlab
> > milestone wasn't
> > opened. That has been corrected, and is here
> > https://gitlab.freedesktop.org/mesa/mesa/-/milestones/9, please add any
> > issues
> > or MRs you would like to land before the branchpoint to the milestone.
> >
> > Thanks,
> > Dylan
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

[Mesa-dev] [PATCH v3 2/2] nv50/ir/ra: fix memory corruption when spilling

2020-01-18 Thread Karol Herbst

0 00 fa
  0x0c2a80075460: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c2a80075470:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2a80075480: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2a80075490: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2a800754a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x0c2a800754b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2a800754c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:   fa
  Freed heap region:   fd
  Stack left redzone:  f1
  Stack mid redzone:   f2
  Stack right redzone: f3
  Stack after return:  f5
  Stack use after scope:   f8
  Global redzone:  f9
  Global init order:   f6
  Poisoned by user:f7
  Container overflow:  fc
  Array cookie:ac
  Intra object redzone:bb
  ASan internal:   fe
  Left alloca redzone: ca
  Right alloca redzone:cb
  Shadow gap:  cc
==612087==ABORTING

v2: full rework
v3: manage a full copy instead of recreating new lists on every access

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_ra.cpp| 93 ++-
 1 file changed, 71 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index d6d3e70cce6..dabf0cfacc6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -295,10 +295,53 @@ private:
 
 typedef std::pair<Value *, Value *> ValuePair;
 
+class MergedDefs
+{
+private:
+   std::list<ValueDef *>& entry(Value *val) {
+  auto it = defs.find(val);
+
+  if (it == defs.end()) {
+ std::list<ValueDef *> &res = defs[val];
+ res = val->defs;
+ return res;
+  } else {
+ return (*it).second;
+  }
+   }
+
+   std::unordered_map<Value *, std::list<ValueDef *> > defs;
+
+public:
+   std::list<ValueDef *>& operator()(Value *val) {
+  return entry(val);
+   }
+
+   void add(Value *val, const std::list<ValueDef *> &vals) {
+  assert(val);
+  std::list<ValueDef *> &valdefs = entry(val);
+  valdefs.insert(valdefs.end(), vals.begin(), vals.end());
+   }
+
+   void removeDefsOfInstruction(Instruction *insn) {
+  for (int d = 0; insn->defExists(d); ++d) {
+ ValueDef *def = &insn->def(d);
+ defs.erase(def->get());
+ for (auto &p : defs)
+p.second.remove(def);
+  }
+   }
+
+   void merge() {
+  for (auto &p : defs)
+ p.first->defs = p.second;
+   }
+};
+
 class SpillCodeInserter
 {
 public:
-   SpillCodeInserter(Function *fn) : func(fn), stackSize(0), stackBase(0) { }
+   SpillCodeInserter(Function *fn, MergedDefs &mergedDefs) : func(fn), 
mergedDefs(mergedDefs), stackSize(0), stackBase(0) { }
 
bool run(const std::list<ValuePair>&);
 
@@ -308,6 +351,7 @@ public:
 
 private:
Function *func;
+   MergedDefs &mergedDefs;
 
struct SpillSlot
{
@@ -708,7 +752,7 @@ RegAlloc::BuildIntervalsPass::visit(BasicBlock *bb)
 class GCRA
 {
 public:
-   GCRA(Function *, SpillCodeInserter&);
+   GCRA(Function *, SpillCodeInserter&, MergedDefs&);
~GCRA();
 
bool allocateRegisters(ArrayList& insns);
@@ -825,6 +869,8 @@ private:
 
SpillCodeInserter& spill;
std::list<ValuePair> mustSpill;
+
+   MergedDefs &mergedDefs;
 };
 
 const GCRA::RelDegree GCRA::relDegree;
@@ -954,12 +1000,13 @@ GCRA::coalesceValues(Value *dst, Value *src, bool force)
 rep->id, rep->reg.data.id, val->id);
 
// set join pointer of all values joined with val
-   for (ValueDef *def : val->defs)
+   const std::list<ValueDef *> &defs = mergedDefs(val);
+   for (ValueDef *def : defs)
   def->get()->join = rep;
assert(rep->join == rep && val->join == rep);
 
// add val's definitions to rep and extend the live interval of its RIG node
-   rep->defs.insert(rep->defs.end(), val->defs.begin(), val->defs.end());
+   mergedDefs.add(rep, defs);
nRep->livei.unify(nVal->livei);
nRep->degreeLimit = MIN2(nRep->degreeLimit, nVal->degreeLimit);
nRep->maxReg = MIN2(nRep->maxReg, nVal->maxReg);
@@ -1160,10 +1207,11 @@ GCRA::RIG_Node::addRegPreference(RIG_Node *node)
prefRegs.push_back(node);
 }
 
-GCRA::GCRA(Function *fn, SpillCodeInserter& spill) :
+GCRA::GCRA(Function *fn, SpillCodeInserter& spill, MergedDefs& mergedDefs) :
func(fn),
regs(fn->getProgram()->getTarget()),
-   spill(spill)
+   spill(spill),
+   mergedDefs(mergedDefs)
 {
prog = func->getProgram();
 }
@@ -1258,7 +1306,7 @@ GCRA::calculateSpillWeights()
 
   if (!val->noSpill) {
  int rc = 0;
- for (ValueDef *def : val->defs)
+ for (ValueDef *def : mergedDefs(val))
 rc += def->get()->refCount();
 
  nodes[i].weight =
@@ -1360,15 +1408,15 @@ GCRA::checkInterference(const

[Mesa-dev] [PATCH v3 1/2] nv50/ir/ra: convert some for loops to Range-based for loops

2020-01-18 Thread Karol Herbst

I will touch them in the next commit

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_ra.cpp| 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 6df2664da22..d6d3e70cce6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -954,9 +954,8 @@ GCRA::coalesceValues(Value *dst, Value *src, bool force)
 rep->id, rep->reg.data.id, val->id);
 
// set join pointer of all values joined with val
-   for (Value::DefIterator def = val->defs.begin(); def != val->defs.end();
-++def)
-  (*def)->get()->join = rep;
+   for (ValueDef *def : val->defs)
+  def->get()->join = rep;
assert(rep->join == rep && val->join == rep);
 
// add val's definitions to rep and extend the live interval of its RIG node
@@ -1259,10 +1258,8 @@ GCRA::calculateSpillWeights()
 
   if (!val->noSpill) {
  int rc = 0;
- for (Value::DefIterator it = val->defs.begin();
-  it != val->defs.end();
-  ++it)
-rc += (*it)->get()->refCount();
+ for (ValueDef *def : val->defs)
+rc += def->get()->refCount();
 
  nodes[i].weight =
 (float)rc * (float)rc / (float)nodes[i].livei.extent();
@@ -1370,10 +1367,10 @@ GCRA::checkInterference(const RIG_Node *node, 
Graph::EdgeIterator& ei)
 
if (vA->compound | vB->compound) {
   // NOTE: this only works for >aligned< register tuples !
-  for (Value::DefCIterator D = vA->defs.begin(); D != vA->defs.end(); ++D) 
{
-  for (Value::DefCIterator d = vB->defs.begin(); d != vB->defs.end(); ++d) 
{
- const LValue *vD = (*D)->get()->asLValue();
- const LValue *vd = (*d)->get()->asLValue();
+  for (const ValueDef *D : vA->defs) {
+  for (const ValueDef *d : vB->defs) {
+ const LValue *vD = D->get()->asLValue();
+ const LValue *vd = d->get()->asLValue();
 
  if (!vD->livei.overlaps(vd->livei)) {
 INFO_DBG(prog->dbgFlags, REG_ALLOC, "(%%%i) X (%%%i): no 
overlap\n",
-- 
2.24.1

[Mesa-dev] [PATCH v3 1/2] nv50/ir/ra: convert some for loops to Range-based for loops

2020-01-18 Thread Karol Herbst

I will touch them in the next commit

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_ra.cpp| 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 6df2664da22..d6d3e70cce6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -954,9 +954,8 @@ GCRA::coalesceValues(Value *dst, Value *src, bool force)
 rep->id, rep->reg.data.id, val->id);
 
// set join pointer of all values joined with val
-   for (Value::DefIterator def = val->defs.begin(); def != val->defs.end();
-++def)
-  (*def)->get()->join = rep;
+   for (ValueDef *def : val->defs)
+  def->get()->join = rep;
assert(rep->join == rep && val->join == rep);
 
// add val's definitions to rep and extend the live interval of its RIG node
@@ -1259,10 +1258,8 @@ GCRA::calculateSpillWeights()
 
   if (!val->noSpill) {
  int rc = 0;
- for (Value::DefIterator it = val->defs.begin();
-  it != val->defs.end();
-  ++it)
-rc += (*it)->get()->refCount();
+ for (ValueDef *def : val->defs)
+rc += def->get()->refCount();
 
  nodes[i].weight =
 (float)rc * (float)rc / (float)nodes[i].livei.extent();
@@ -1370,10 +1367,10 @@ GCRA::checkInterference(const RIG_Node *node, 
Graph::EdgeIterator& ei)
 
if (vA->compound | vB->compound) {
   // NOTE: this only works for >aligned< register tuples !
-  for (Value::DefCIterator D = vA->defs.begin(); D != vA->defs.end(); ++D) 
{
-  for (Value::DefCIterator d = vB->defs.begin(); d != vB->defs.end(); ++d) 
{
- const LValue *vD = (*D)->get()->asLValue();
- const LValue *vd = (*d)->get()->asLValue();
+  for (const ValueDef *D : vA->defs) {
+  for (const ValueDef *d : vB->defs) {
+ const LValue *vD = D->get()->asLValue();
+ const LValue *vd = d->get()->asLValue();
 
  if (!vD->livei.overlaps(vd->livei)) {
 INFO_DBG(prog->dbgFlags, REG_ALLOC, "(%%%i) X (%%%i): no 
overlap\n",
-- 
2.24.1
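
For readers skimming the diff above, the conversion can be illustrated outside of Mesa. This is only a minimal sketch: `Value` and `ValueDef` here are hypothetical stand-ins with just enough members to compile, not the real nv50_ir classes. It shows that the iterator loop and the range-based loop visit the same elements, with the range-based form dropping the extra dereference.

```cpp
#include <list>

// Hypothetical stand-ins for the nv50_ir types; only their shape matters here.
struct Value {
   int refs;
   int refCount() const { return refs; }
};

struct ValueDef {
   Value *value;
   Value *get() const { return value; }
};

// Old style: explicit iterator, so every access needs (*it)->...
int sumRefsIterator(const std::list<ValueDef *> &defs)
{
   int rc = 0;
   for (std::list<ValueDef *>::const_iterator it = defs.begin();
        it != defs.end(); ++it)
      rc += (*it)->get()->refCount();
   return rc;
}

// New style: the loop variable is the list element itself (a ValueDef pointer).
int sumRefsRange(const std::list<ValueDef *> &defs)
{
   int rc = 0;
   for (ValueDef *def : defs)
      rc += def->get()->refCount();
   return rc;
}
```

Both functions should return the same sum for any list, which is why the patch can be read as a pure refactoring.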

[Mesa-dev] [PATCH v2 2/2] nv50/ir/ra: fix memory corruption when spilling

2020-01-15 Thread Karol Herbst

0 00 fa
  0x0c2a80075460: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c2a80075470:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2a80075480: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2a80075490: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2a800754a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x0c2a800754b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2a800754c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:   fa
  Freed heap region:   fd
  Stack left redzone:  f1
  Stack mid redzone:   f2
  Stack right redzone: f3
  Stack after return:  f5
  Stack use after scope:   f8
  Global redzone:  f9
  Global init order:   f6
  Poisoned by user:f7
  Container overflow:  fc
  Array cookie:ac
  Intra object redzone:bb
  ASan internal:   fe
  Left alloca redzone: ca
  Right alloca redzone:cb
  Shadow gap:      cc
==612087==ABORTING

v2: full rework

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_ra.cpp| 87 ++-
 1 file changed, 66 insertions(+), 21 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index d6d3e70cce6..9a106eff2d1 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -295,10 +295,48 @@ private:
 
 typedef std::pair<Value *, Value *> ValuePair;
 
+class MergedDefs
+{
+public:
+   std::list<ValueDef *> operator()(Value *val) {
+  std::list<ValueDef *> res;
+  res.insert(res.end(), val->defs.begin(), val->defs.end());
+  res.insert(res.end(), defs[val].begin(), defs[val].end());
+  return res;
+   }
+
+   void add(Value *val, std::list<ValueDef *> &vals) {
+  assert(val);
+  defs[val].insert(defs[val].end(), vals.begin(), vals.end());
+   }
+
+   void remove(Value *val, ValueDef *def) {
+  defs[val].remove(def);
+   }
+
+   void removeDefsOfInstruction(Instruction *insn) {
+  for (int d = 0; insn->defExists(d); ++d) {
+ ValueDef *def = &insn->def(d);
+ defs.erase(def->get());
+ for (auto &p : defs)
+p.second.remove(def);
+  }
+   }
+
+   void merge() {
+  for (auto &p : defs)
+ p.first->defs.insert(p.first->defs.end(), p.second.begin(), 
p.second.end());
+  defs.clear();
+   }
+
+private:
+   std::unordered_map<Value *, std::list<ValueDef *> > defs;
+};
+
 class SpillCodeInserter
 {
 public:
-   SpillCodeInserter(Function *fn) : func(fn), stackSize(0), stackBase(0) { }
+   SpillCodeInserter(Function *fn, MergedDefs &mergedDefs) : func(fn), 
mergedDefs(mergedDefs), stackSize(0), stackBase(0) { }
 
bool run(const std::list<ValuePair>&);
 
@@ -308,6 +346,7 @@ public:
 
 private:
Function *func;
+   MergedDefs &mergedDefs;
 
struct SpillSlot
{
@@ -708,7 +747,7 @@ RegAlloc::BuildIntervalsPass::visit(BasicBlock *bb)
 class GCRA
 {
 public:
-   GCRA(Function *, SpillCodeInserter&);
+   GCRA(Function *, SpillCodeInserter&, MergedDefs&);
~GCRA();
 
bool allocateRegisters(ArrayList& insns);
@@ -825,6 +864,8 @@ private:
 
SpillCodeInserter& spill;
std::list<ValuePair> mustSpill;
+
+   MergedDefs &mergedDefs;
 };
 
 const GCRA::RelDegree GCRA::relDegree;
@@ -954,12 +995,13 @@ GCRA::coalesceValues(Value *dst, Value *src, bool force)
 rep->id, rep->reg.data.id, val->id);
 
// set join pointer of all values joined with val
-   for (ValueDef *def : val->defs)
+   std::list<ValueDef *> defs = mergedDefs(val);
+   for (ValueDef *def : defs)
   def->get()->join = rep;
assert(rep->join == rep && val->join == rep);
 
// add val's definitions to rep and extend the live interval of its RIG node
-   rep->defs.insert(rep->defs.end(), val->defs.begin(), val->defs.end());
+   mergedDefs.add(rep, defs);
nRep->livei.unify(nVal->livei);
nRep->degreeLimit = MIN2(nRep->degreeLimit, nVal->degreeLimit);
nRep->maxReg = MIN2(nRep->maxReg, nVal->maxReg);
@@ -1160,10 +1202,11 @@ GCRA::RIG_Node::addRegPreference(RIG_Node *node)
prefRegs.push_back(node);
 }
 
-GCRA::GCRA(Function *fn, SpillCodeInserter& spill) :
+GCRA::GCRA(Function *fn, SpillCodeInserter& spill, MergedDefs& mergedDefs) :
func(fn),
regs(fn->getProgram()->getTarget()),
-   spill(spill)
+   spill(spill),
+   mergedDefs(mergedDefs)
 {
prog = func->getProgram();
 }
@@ -1258,7 +1301,7 @@ GCRA::calculateSpillWeights()
 
   if (!val->noSpill) {
  int rc = 0;
- for (ValueDef *def : val->defs)
+ for (ValueDef *def : mergedDefs(val))
 rc += def->get()->refCount();
 
  nodes[i].weight =
@@ -1360,15 +1403,15 @@ GCRA::checkInterference(const RIG_Node *node, 
Graph::EdgeIterator& ei)
 
if
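
The idea behind the `MergedDefs` class in the patch above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the patch itself: `Val`, `Def` and `SideTable` are hypothetical names, and the lookup uses `find` instead of `operator[]`. The point it tries to show is that definitions gathered during coalescing live in a side table instead of being appended to each value's own list, so a definition that is about to be deleted only has to be purged from one place.

```cpp
#include <list>
#include <unordered_map>

struct Def;                               // hypothetical stand-in for ValueDef
struct Val { std::list<Def *> defs; };    // hypothetical stand-in for Value
struct Def { Val *owner; };

class SideTable {
public:
   // A value's own defs followed by the defs recorded for it during coalescing.
   std::list<Def *> all(Val *val) const {
      std::list<Def *> res(val->defs.begin(), val->defs.end());
      auto it = extra.find(val);
      if (it != extra.end())
         res.insert(res.end(), it->second.begin(), it->second.end());
      return res;
   }

   // Record additional defs for val without touching val->defs.
   void add(Val *val, const std::list<Def *> &more) {
      std::list<Def *> &dst = extra[val];
      dst.insert(dst.end(), more.begin(), more.end());
   }

   // Drop every trace of a def that is about to be destroyed.
   void forget(Def *def) {
      extra.erase(def->owner);
      for (auto &p : extra)
         p.second.remove(def);
   }

private:
   std::unordered_map<Val *, std::list<Def *>> extra;
};
```

Because the values' own lists are never modified while the table is in use, no list ends up holding a pointer to a definition that was freed in the meantime, which appears to be the kind of stale reference the ASan report points at.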

[Mesa-dev] [PATCH v2 1/2] nv50/ir/ra: convert some for loops to Range-based for loops

2020-01-15 Thread Karol Herbst

I will touch them in the next commit

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_ra.cpp| 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 6df2664da22..d6d3e70cce6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -954,9 +954,8 @@ GCRA::coalesceValues(Value *dst, Value *src, bool force)
 rep->id, rep->reg.data.id, val->id);
 
// set join pointer of all values joined with val
-   for (Value::DefIterator def = val->defs.begin(); def != val->defs.end();
-++def)
-  (*def)->get()->join = rep;
+   for (ValueDef *def : val->defs)
+  def->get()->join = rep;
assert(rep->join == rep && val->join == rep);
 
// add val's definitions to rep and extend the live interval of its RIG node
@@ -1259,10 +1258,8 @@ GCRA::calculateSpillWeights()
 
   if (!val->noSpill) {
  int rc = 0;
- for (Value::DefIterator it = val->defs.begin();
-  it != val->defs.end();
-  ++it)
-rc += (*it)->get()->refCount();
+ for (ValueDef *def : val->defs)
+rc += def->get()->refCount();
 
  nodes[i].weight =
 (float)rc * (float)rc / (float)nodes[i].livei.extent();
@@ -1370,10 +1367,10 @@ GCRA::checkInterference(const RIG_Node *node, 
Graph::EdgeIterator& ei)
 
if (vA->compound | vB->compound) {
   // NOTE: this only works for >aligned< register tuples !
-  for (Value::DefCIterator D = vA->defs.begin(); D != vA->defs.end(); ++D) 
{
-  for (Value::DefCIterator d = vB->defs.begin(); d != vB->defs.end(); ++d) 
{
- const LValue *vD = (*D)->get()->asLValue();
- const LValue *vd = (*d)->get()->asLValue();
+  for (const ValueDef *D : vA->defs) {
+  for (const ValueDef *d : vB->defs) {
+ const LValue *vD = D->get()->asLValue();
+ const LValue *vd = d->get()->asLValue();
 
  if (!vD->livei.overlaps(vd->livei)) {
 INFO_DBG(prog->dbgFlags, REG_ALLOC, "(%%%i) X (%%%i): no 
overlap\n",
-- 
2.24.1

Re: [Mesa-dev] [PATCH] nv50/ir: implement global atomics and handle it for nir

2019-12-05 Thread Karol Herbst

On Thu, Dec 5, 2019 at 11:57 AM Karol Herbst  wrote:
>
> TGSI doesn't have any concept of global memory right now.
>
> Signed-off-by: Karol Herbst 
> ---
>  .../nouveau/codegen/nv50_ir_from_nir.cpp  | 43 +--
>  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp |  2 +
>  2 files changed, 41 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> index 08365988069..31f764d63d4 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> @@ -582,40 +582,47 @@ Converter::getSubOp(nir_intrinsic_op op)
>  {
> switch (op) {
> case nir_intrinsic_bindless_image_atomic_add:
> +   case nir_intrinsic_global_atomic_add:
> case nir_intrinsic_image_atomic_add:
> case nir_intrinsic_image_deref_atomic_add:
> case nir_intrinsic_shared_atomic_add:
> case nir_intrinsic_ssbo_atomic_add:
>return  NV50_IR_SUBOP_ATOM_ADD;
> case nir_intrinsic_bindless_image_atomic_and:
> +   case nir_intrinsic_global_atomic_and:
> case nir_intrinsic_image_atomic_and:
> case nir_intrinsic_image_deref_atomic_and:
> case nir_intrinsic_shared_atomic_and:
> case nir_intrinsic_ssbo_atomic_and:
>return  NV50_IR_SUBOP_ATOM_AND;
> case nir_intrinsic_bindless_image_atomic_comp_swap:
> +   case nir_intrinsic_global_atomic_comp_swap:
> case nir_intrinsic_image_atomic_comp_swap:
> case nir_intrinsic_image_deref_atomic_comp_swap:
> case nir_intrinsic_shared_atomic_comp_swap:
> case nir_intrinsic_ssbo_atomic_comp_swap:
>return  NV50_IR_SUBOP_ATOM_CAS;
> case nir_intrinsic_bindless_image_atomic_exchange:
> +   case nir_intrinsic_global_atomic_exchange:
> case nir_intrinsic_image_atomic_exchange:
> case nir_intrinsic_image_deref_atomic_exchange:
> case nir_intrinsic_shared_atomic_exchange:
> case nir_intrinsic_ssbo_atomic_exchange:
>return  NV50_IR_SUBOP_ATOM_EXCH;
> case nir_intrinsic_bindless_image_atomic_or:
> +   case nir_intrinsic_global_atomic_or:
> case nir_intrinsic_image_atomic_or:
> case nir_intrinsic_image_deref_atomic_or:
> case nir_intrinsic_shared_atomic_or:
> case nir_intrinsic_ssbo_atomic_or:
>return  NV50_IR_SUBOP_ATOM_OR;
> case nir_intrinsic_bindless_image_atomic_imax:
> -   case nir_intrinsic_image_atomic_imax:
> -   case nir_intrinsic_image_deref_atomic_imax:
> case nir_intrinsic_bindless_image_atomic_umax:
> +   case nir_intrinsic_global_atomic_imax:
> +   case nir_intrinsic_global_atomic_umax:
> +   case nir_intrinsic_image_atomic_imax:
> case nir_intrinsic_image_atomic_umax:
> +   case nir_intrinsic_image_deref_atomic_imax:
> case nir_intrinsic_image_deref_atomic_umax:
> case nir_intrinsic_shared_atomic_imax:
> case nir_intrinsic_shared_atomic_umax:
> @@ -623,10 +630,12 @@ Converter::getSubOp(nir_intrinsic_op op)
> case nir_intrinsic_ssbo_atomic_umax:
>return  NV50_IR_SUBOP_ATOM_MAX;
> case nir_intrinsic_bindless_image_atomic_imin:
> -   case nir_intrinsic_image_atomic_imin:
> -   case nir_intrinsic_image_deref_atomic_imin:
> case nir_intrinsic_bindless_image_atomic_umin:
> +   case nir_intrinsic_global_atomic_imin:
> +   case nir_intrinsic_global_atomic_umin:
> +   case nir_intrinsic_image_atomic_imin:
> case nir_intrinsic_image_atomic_umin:
> +   case nir_intrinsic_image_deref_atomic_imin:
> case nir_intrinsic_image_deref_atomic_umin:
> case nir_intrinsic_shared_atomic_imin:
> case nir_intrinsic_shared_atomic_umin:
> @@ -634,6 +643,7 @@ Converter::getSubOp(nir_intrinsic_op op)
> case nir_intrinsic_ssbo_atomic_umin:
>return  NV50_IR_SUBOP_ATOM_MIN;
> case nir_intrinsic_bindless_image_atomic_xor:
> +   case nir_intrinsic_global_atomic_xor:
> case nir_intrinsic_image_atomic_xor:
> case nir_intrinsic_image_deref_atomic_xor:
> case nir_intrinsic_shared_atomic_xor:
> @@ -2379,6 +2389,30 @@ Converter::visit(nir_intrinsic_instr *insn)
>info->io.globalAccess |= 0x2;
>break;
> }
> +   case nir_intrinsic_global_atomic_add:
> +   case nir_intrinsic_global_atomic_and:
> +   case nir_intrinsic_global_atomic_comp_swap:
> +   case nir_intrinsic_global_atomic_exchange:
> +   case nir_intrinsic_global_atomic_or:
> +   case nir_intrinsic_global_atomic_imax:
> +   case nir_intrinsic_global_atomic_imin:
> +   case nir_intrinsic_global_atomic_umax:
> +   case nir_intrinsic_global_atomic_umin:
> +   case nir_intrinsic_global_atomic_xor: {
> +  const DataType dType = getDType(insn);
> +  LValues

[Mesa-dev] [PATCH] nouveau: limit reported compute max memory and allocation size

2019-12-05 Thread Karol Herbst

Otherwise applications (like the OpenCL CTS) will try to allocate more memory
than what the GPU is actually able to provide.

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 7 +--
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 7 +--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index ad35bd8cd42..5942458b0b2 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -472,6 +472,7 @@ nv50_screen_get_compute_param(struct pipe_screen *pscreen,
   enum pipe_compute_cap param, void *data)
 {
struct nv50_screen *screen = nv50_screen(pscreen);
+   struct nouveau_device *dev = screen->base.device;
 
 #define RET(x) do {  \
if (data) \
@@ -489,7 +490,8 @@ nv50_screen_get_compute_param(struct pipe_screen *pscreen,
case PIPE_COMPUTE_CAP_MAX_THREADS_PER_BLOCK:
   RET((uint64_t []) { 512 });
case PIPE_COMPUTE_CAP_MAX_GLOBAL_SIZE: /* g0-15[] */
-  RET((uint64_t []) { 1ULL << 32 });
+  // TODO what to do if vram_size is 0?
+  RET((uint64_t []) { MIN2(1ULL << 32, dev->vram_size) });
case PIPE_COMPUTE_CAP_MAX_LOCAL_SIZE: /* s[] */
   RET((uint64_t []) { 16 << 10 });
case PIPE_COMPUTE_CAP_MAX_PRIVATE_SIZE: /* l[] */
@@ -499,7 +501,8 @@ nv50_screen_get_compute_param(struct pipe_screen *pscreen,
case PIPE_COMPUTE_CAP_SUBGROUP_SIZE:
   RET((uint32_t []) { 32 });
case PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE:
-  RET((uint64_t []) { 1ULL << 40 });
+  // TODO what to do if vram_size is 0?
+  RET((uint64_t []) { dev->vram_size });
case PIPE_COMPUTE_CAP_IMAGES_SUPPORTED:
   RET((uint32_t []) { 0 });
case PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS:
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index f5e1373a37e..57b1c70f7b3 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -533,6 +533,7 @@ nvc0_screen_get_compute_param(struct pipe_screen *pscreen,
 {
struct nvc0_screen *screen = nvc0_screen(pscreen);
const uint16_t obj_class = screen->compute->oclass;
+   struct nouveau_device *dev = screen->base.device;
 
 #define RET(x) do {  \
if (data) \
@@ -560,7 +561,8 @@ nvc0_screen_get_compute_param(struct pipe_screen *pscreen,
  RET((uint64_t []) { 512 });
   }
case PIPE_COMPUTE_CAP_MAX_GLOBAL_SIZE: /* g[] */
-  RET((uint64_t []) { 1ULL << 40 });
+  // TODO what to do when vram_size is 0?
+  RET((uint64_t []) { dev->vram_size });
case PIPE_COMPUTE_CAP_MAX_LOCAL_SIZE: /* s[] */
   switch (obj_class) {
   case GM200_COMPUTE_CLASS:
@@ -580,7 +582,8 @@ nvc0_screen_get_compute_param(struct pipe_screen *pscreen,
case PIPE_COMPUTE_CAP_SUBGROUP_SIZE:
   RET((uint32_t []) { 32 });
case PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE:
-  RET((uint64_t []) { 1ULL << 40 });
+  // TODO what to do when vram_size is 0?
+  RET((uint64_t []) { dev->vram_size });
case PIPE_COMPUTE_CAP_IMAGES_SUPPORTED:
   RET((uint32_t []) { 0 });
case PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS:
-- 
2.23.0
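
The clamping done for nv50 in the patch above boils down to taking the smaller of the addressable window and the installed VRAM. A minimal sketch, assuming a hypothetical helper `maxGlobalSize` whose `vramSize` parameter plays the role of `dev->vram_size`:

```cpp
#include <algorithm>
#include <cstdint>

// Report at most 4 GiB, and never more than the VRAM that is actually present.
// vramSize is a stand-in for dev->vram_size from the patch.
std::uint64_t maxGlobalSize(std::uint64_t vramSize)
{
   const std::uint64_t addressable = 1ULL << 32;
   return std::min(addressable, vramSize);
}
```

Note that a `vramSize` of 0 makes this report 0 as well. That is exactly the case the TODO comments in the patch leave open, and the sketch does not try to answer it.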

[Mesa-dev] [PATCH] nv50/ir: implement global atomics and handle it for nir

2019-12-05 Thread Karol Herbst

TGSI doesn't have any concept of global memory right now.

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 43 +--
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp |  2 +
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 08365988069..31f764d63d4 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -582,40 +582,47 @@ Converter::getSubOp(nir_intrinsic_op op)
 {
switch (op) {
case nir_intrinsic_bindless_image_atomic_add:
+   case nir_intrinsic_global_atomic_add:
case nir_intrinsic_image_atomic_add:
case nir_intrinsic_image_deref_atomic_add:
case nir_intrinsic_shared_atomic_add:
case nir_intrinsic_ssbo_atomic_add:
   return  NV50_IR_SUBOP_ATOM_ADD;
case nir_intrinsic_bindless_image_atomic_and:
+   case nir_intrinsic_global_atomic_and:
case nir_intrinsic_image_atomic_and:
case nir_intrinsic_image_deref_atomic_and:
case nir_intrinsic_shared_atomic_and:
case nir_intrinsic_ssbo_atomic_and:
   return  NV50_IR_SUBOP_ATOM_AND;
case nir_intrinsic_bindless_image_atomic_comp_swap:
+   case nir_intrinsic_global_atomic_comp_swap:
case nir_intrinsic_image_atomic_comp_swap:
case nir_intrinsic_image_deref_atomic_comp_swap:
case nir_intrinsic_shared_atomic_comp_swap:
case nir_intrinsic_ssbo_atomic_comp_swap:
   return  NV50_IR_SUBOP_ATOM_CAS;
case nir_intrinsic_bindless_image_atomic_exchange:
+   case nir_intrinsic_global_atomic_exchange:
case nir_intrinsic_image_atomic_exchange:
case nir_intrinsic_image_deref_atomic_exchange:
case nir_intrinsic_shared_atomic_exchange:
case nir_intrinsic_ssbo_atomic_exchange:
   return  NV50_IR_SUBOP_ATOM_EXCH;
case nir_intrinsic_bindless_image_atomic_or:
+   case nir_intrinsic_global_atomic_or:
case nir_intrinsic_image_atomic_or:
case nir_intrinsic_image_deref_atomic_or:
case nir_intrinsic_shared_atomic_or:
case nir_intrinsic_ssbo_atomic_or:
   return  NV50_IR_SUBOP_ATOM_OR;
case nir_intrinsic_bindless_image_atomic_imax:
-   case nir_intrinsic_image_atomic_imax:
-   case nir_intrinsic_image_deref_atomic_imax:
case nir_intrinsic_bindless_image_atomic_umax:
+   case nir_intrinsic_global_atomic_imax:
+   case nir_intrinsic_global_atomic_umax:
+   case nir_intrinsic_image_atomic_imax:
case nir_intrinsic_image_atomic_umax:
+   case nir_intrinsic_image_deref_atomic_imax:
case nir_intrinsic_image_deref_atomic_umax:
case nir_intrinsic_shared_atomic_imax:
case nir_intrinsic_shared_atomic_umax:
@@ -623,10 +630,12 @@ Converter::getSubOp(nir_intrinsic_op op)
case nir_intrinsic_ssbo_atomic_umax:
   return  NV50_IR_SUBOP_ATOM_MAX;
case nir_intrinsic_bindless_image_atomic_imin:
-   case nir_intrinsic_image_atomic_imin:
-   case nir_intrinsic_image_deref_atomic_imin:
case nir_intrinsic_bindless_image_atomic_umin:
+   case nir_intrinsic_global_atomic_imin:
+   case nir_intrinsic_global_atomic_umin:
+   case nir_intrinsic_image_atomic_imin:
case nir_intrinsic_image_atomic_umin:
+   case nir_intrinsic_image_deref_atomic_imin:
case nir_intrinsic_image_deref_atomic_umin:
case nir_intrinsic_shared_atomic_imin:
case nir_intrinsic_shared_atomic_umin:
@@ -634,6 +643,7 @@ Converter::getSubOp(nir_intrinsic_op op)
case nir_intrinsic_ssbo_atomic_umin:
   return  NV50_IR_SUBOP_ATOM_MIN;
case nir_intrinsic_bindless_image_atomic_xor:
+   case nir_intrinsic_global_atomic_xor:
case nir_intrinsic_image_atomic_xor:
case nir_intrinsic_image_deref_atomic_xor:
case nir_intrinsic_shared_atomic_xor:
@@ -2379,6 +2389,30 @@ Converter::visit(nir_intrinsic_instr *insn)
   info->io.globalAccess |= 0x2;
   break;
}
+   case nir_intrinsic_global_atomic_add:
+   case nir_intrinsic_global_atomic_and:
+   case nir_intrinsic_global_atomic_comp_swap:
+   case nir_intrinsic_global_atomic_exchange:
+   case nir_intrinsic_global_atomic_or:
+   case nir_intrinsic_global_atomic_imax:
+   case nir_intrinsic_global_atomic_imin:
+   case nir_intrinsic_global_atomic_umax:
+   case nir_intrinsic_global_atomic_umin:
+   case nir_intrinsic_global_atomic_xor: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *address;
+  uint32_t offset = getIndirect(&insn->src[0], 0, address);
+
+  Symbol *sym = mkSymbol(FILE_MEMORY_GLOBAL, 0, dType, offset);
+  Instruction *atom =
+ mkOp2(OP_ATOM, dType, newDefs[0], sym, getSrc(&insn->src[1], 0));
+  atom->setIndirect(0, 0, address);
+  atom->subOp = getSubOp(op);
+
+  info->io.globalAccess |= 0x2;
+  break;
+   }
case nir_intrinsic_bindless_image_atomic_add:
case nir_intrinsic_bindless_image_atomic_and:
case nir_intrinsic_bindless_image_atomic_

Re: [Mesa-dev] LLVM + SPIRV-LLVM-Translator - compilation errors

2019-11-14 Thread Karol Herbst

might be that those definitions moved elsewhere or the headers were
never directly included.

In llvm 9 they are in llvm/InitializePasses.h, but maybe that's
changed? And if not, maybe that file needs to be included in
SPIRVLowerSPIRBlocks.cpp?
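
To make the failure mode concrete, here is a small self-contained sketch. The macro has the same shape as the one shown in the quoted error output; everything else (`PassRegistry` as a plain struct, the function body, `initializeDependencies`) is a hypothetical stand-in so that it builds without LLVM. The macro pastes the pass name into a function name, so the call only compiles if a declaration of that function is visible at the point of use.

```cpp
// Stand-in for llvm::PassRegistry; it only counts initializer calls.
struct PassRegistry { int calls = 0; };

// Same shape as the macro quoted from llvm/PassSupport.h in the error output.
#define INITIALIZE_PASS_DEPENDENCY(depName) initialize##depName##Pass(Registry);

// Normally a header provides this declaration. Without it, the macro use below
// fails with "was not declared in this scope".
void initializeCallGraphWrapperPassPass(PassRegistry &Registry)
{
   ++Registry.calls;
}

int initializeDependencies()
{
   PassRegistry Registry;
   // Expands to: initializeCallGraphWrapperPassPass(Registry);
   INITIALIZE_PASS_DEPENDENCY(CallGraphWrapperPass)
   return Registry.calls;
}
```

If the declarations did move to a different header, including that header in the failing file would be the equivalent of keeping the declaration above in scope.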

On Fri, Nov 15, 2019 at 2:34 AM Dieter Nützel  wrote:
>
> Hello Karol and Ilya,
>
> do you have any hints/pointers for me to solve these LLVM +
> SPIRV-LLVM-Translator - compilation errors.
>
> llvm-project git taken 'today'.
>
> [-]
> commit 95c770fbfb14b07e1af7c2d427c16745617d9f1f (HEAD -> master,
> origin/master, origin/HEAD)
> Author: Davide Italiano 
> Date:   Thu Nov 14 15:29:28 2019 -0800
>
>  [Utility] Remove a dead header [PPC64LE_ehframe_Registers.h]
> [-]
>
> opt/llvm-project/llvm/projects/SPIRV-LLVM-Translator/lib/SPIRV/SPIRVLowerSPIRBlocks.cpp:617:1:
> note: in expansion of macro ‘INITIALIZE_PASS_DEPENDENCY’
>617 | INITIALIZE_PASS_DEPENDENCY(CallGraphWrapperPass)
>| ^~
> /opt/llvm-project/llvm/include/llvm/PassSupport.h:50:45: error:
> ‘initializeAssumptionCacheTrackerPass’ was not declared in this scope
> 50 | #define INITIALIZE_PASS_DEPENDENCY(depName)
> initialize##depName##Pass(Registry);
>| ^~
> /opt/llvm-project/llvm/projects/SPIRV-LLVM-Translator/lib/SPIRV/SPIRVLowerSPIRBlocks.cpp:618:1:
> note: in expansion of macro ‘INITIALIZE_PASS_DEPENDENCY’
>618 | INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
>| ^~
> /opt/llvm-project/llvm/include/llvm/PassSupport.h:50:45: error:
> ‘initializeAAResultsWrapperPassPass’ was not declared in this scope
> 50 | #define INITIALIZE_PASS_DEPENDENCY(depName)
> initialize##depName##Pass(Registry);
>| ^~
> /opt/llvm-project/llvm/projects/SPIRV-LLVM-Translator/lib/SPIRV/SPIRVLowerSPIRBlocks.cpp:619:1:
> note: in expansion of macro ‘INITIALIZE_PASS_DEPENDENCY’
>619 | INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
>| ^~
>
>
> Thank you very much in advance.
> Dieter
>

[Mesa-dev] [PATCH] nv50/ir/ra: fix memory corruption when spilling

2019-11-12 Thread Karol Herbst

a80075470:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2a80075480: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2a80075490: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2a800754a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x0c2a800754b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2a800754c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:   fa
  Freed heap region:   fd
  Stack left redzone:  f1
  Stack mid redzone:   f2
  Stack right redzone: f3
  Stack after return:  f5
  Stack use after scope:   f8
  Global redzone:  f9
  Global init order:   f6
  Poisoned by user:f7
  Container overflow:  fc
  Array cookie:ac
  Intra object redzone:bb
  ASan internal:   fe
  Left alloca redzone: ca
  Right alloca redzone:cb
  Shadow gap:  cc
==612087==ABORTING

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_ra.cpp| 34 ++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 6df2664da22..d72932748f1 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -1745,6 +1745,34 @@ value_cmp(ValueRef *a, ValueRef *b) {
return ai->serial < bi->serial;
 }
 
+class RepairSSAAfterSpillPass : public Pass
+{
+public:
+   RepairSSAAfterSpillPass(Instruction *insn) : insn(insn) {}
+private:
+   void removeStaleRefs(Instruction *it, ValueDef *def) {
+  for (int d = 0; it->defExists(d); ++d) {
+ std::list<ValueDef *> &defs = it->getDef(d)->defs;
+ std::list<ValueDef *>::iterator it = std::find(defs.begin(), 
defs.end(), def);
+ if (it != defs.end())
+defs.erase(it);
+  }
+   }
+
+   virtual bool visit(Instruction *it)
+   {
+  if (it == insn)
+ return true;
+
+  for (int d = 0; insn->defExists(d); ++d)
+ removeStaleRefs(it, &insn->def(d));
+
+  return true;
+   }
+
+   Instruction *insn;
+};
+
 // For each value that is to be spilled, go through all its definitions.
 // A value can have multiple definitions if it has been coalesced before.
 // For each definition, first go through all its uses and insert an unspill
@@ -1815,8 +1843,12 @@ SpillCodeInserter::run(const std::list<ValuePair>& lst)
   }
 
   for (unordered_set<Instruction *>::const_iterator it = to_del.begin();
-   it != to_del.end(); ++it)
+   it != to_del.end(); ++it) {
+ Instruction *insn = *it;
+ RepairSSAAfterSpillPass repair(insn);
+ repair.run(insn->bb->getFunction());
  delete_Instruction(func->getProgram(), *it);
+  }
}
 
// TODO: We're not trying to reuse old slots in a potential next iteration.
-- 
2.23.0

[Mesa-dev] [PATCH] nv50/ir: fix crash in isUniform for undefined values

2019-11-02 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
index a181a13a3b1..ae07d967221 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
@@ -274,6 +274,8 @@ LValue::isUniform() const
if (defs.size() > 1)
   return false;
Instruction *insn = getInsn();
+   if (!insn)
+  return false;
// let's not try too hard here for now ...
return !insn->srcExists(1) && insn->getSrc(0)->isUniform();
 }
-- 
2.23.0

Re: [Mesa-dev] [PATCH] nv50/ir: remove DUMMY edge type

2019-10-14 Thread Karol Herbst

isn't that what "UNKNOWN" is for?
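
For context on the classifier mentioned in the quoted message below: a textbook depth-first edge classification starts with every edge marked UNKNOWN and assigns TREE, FORWARD, BACK or CROSS as it goes. The following is a generic sketch of that technique, not the nv50_ir `Graph::classifyDFS` code; `Classifier`, `Edge` and the integer node ids are hypothetical.

```cpp
#include <vector>

enum class EdgeType { UNKNOWN, TREE, FORWARD, BACK, CROSS };

struct Edge {
   int from;
   int to;
   EdgeType type;
};

class Classifier {
public:
   Classifier(int nodeCount, std::vector<Edge> &edges)
      : edges(edges), enter(nodeCount, 0), done(nodeCount, false), clock(0) {}

   void run(int root) { visit(root); }

private:
   void visit(int u) {
      enter[u] = ++clock;               // discovery time, 0 means unvisited
      for (Edge &e : edges) {
         if (e.from != u)
            continue;
         if (enter[e.to] == 0) {
            e.type = EdgeType::TREE;    // first time the target is reached
            visit(e.to);
         } else if (!done[e.to]) {
            e.type = EdgeType::BACK;    // target is still on the DFS stack
         } else if (enter[u] < enter[e.to]) {
            e.type = EdgeType::FORWARD; // target is a finished descendant
         } else {
            e.type = EdgeType::CROSS;   // target finished in another subtree
         }
      }
      done[u] = true;
   }

   std::vector<Edge> &edges;
   std::vector<int> enter;
   std::vector<bool> done;
   int clock;
};
```

In this scheme an edge that is never reached from the root simply stays UNKNOWN, so no separate placeholder type is needed to mark "not classified yet".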

On Mon, Oct 14, 2019 at 11:16 PM Ilia Mirkin  wrote:
>
> The idea was that this type would be used when you're not sure, and
> then run the classifier afterwards. Otherwise the classifier doesn't
> know which edges to classify...
>
> On Mon, Oct 14, 2019 at 5:10 PM Karol Herbst  wrote:
> >
> > it was never used
> >
> > Signed-off-by: Karol Herbst 
> > ---
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp| 3 ---
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp | 8 +---
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h   | 1 -
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp| 2 --
> >  4 files changed, 1 insertion(+), 13 deletions(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> > index 9f0e0733326..76fee8c791e 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> > @@ -536,9 +536,6 @@ Function::printCFGraph(const char *filePath)
> >   case Graph::Edge::BACK:
> >  fprintf(out, "\t%i -> %i;\n", idA, idB);
> >  break;
> > - case Graph::Edge::DUMMY:
> > -fprintf(out, "\t%i -> %i [style=dotted];\n", idA, idB);
> > -break;
> >   default:
> >  assert(0);
> >  break;
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> > index b1076cf4129..e9a9981746a 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> > @@ -77,7 +77,6 @@ const char *Graph::Edge::typeStr() const
> > case FORWARD: return "forward";
> > case BACK:return "back";
> > case CROSS:   return "cross";
> > -   case DUMMY:   return "dummy";
> > case UNKNOWN:
> > default:
> >return "unk";
> > @@ -184,7 +183,7 @@ Graph::Node::reachableBy(const Node *node, const Node 
> > *term) const
> >   continue;
> >
> >for (EdgeIterator ei = pos->outgoing(); !ei.end(); ei.next()) {
> > - if (ei.getType() == Edge::BACK || ei.getType() == Edge::DUMMY)
> > + if (ei.getType() == Edge::BACK)
> >  continue;
> >   if (ei.getNode()->visit(seq))
> >  stack.push(ei.getNode());
> > @@ -301,7 +300,6 @@ private:
> >  switch (ei.getType()) {
> >  case Graph::Edge::TREE:
> >  case Graph::Edge::FORWARD:
> > -case Graph::Edge::DUMMY:
> > if (++(ei.getNode()->tag) == 
> > ei.getNode()->incidentCountFwd())
> >bb.push(ei.getNode());
> > break;
> > @@ -371,8 +369,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
> >
> > for (edge = curr->out; edge; edge = edge->next[0]) {
> >node = edge->target;
> > -  if (edge->type == Edge::DUMMY)
> > - continue;
> >
> >if (node->getSequence() == 0) {
> >   edge->type = Edge::TREE;
> > @@ -387,8 +383,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
> >
> > for (edge = curr->in; edge; edge = edge->next[1]) {
> >node = edge->origin;
> > -  if (edge->type == Edge::DUMMY)
> > - continue;
> >
> >if (node->getSequence() == 0) {
> >   edge->type = Edge::TREE;
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> > index 115f20e5e99..fc85e78a50c 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> > @@ -47,7 +47,6 @@ public:
> >   FORWARD,
> >   BACK,
> >   CROSS, // e.g. loop break
> > - DUMMY
> >};
> >
> >Edge(Node *dst, Node *src, Type kind);
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> > index f25bce00884..6df2664da22 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> > @@ -624,8 +624,6 @@ 
> > RegAlloc::BuildInte

[Mesa-dev] [PATCH] nv50/ir: remove DUMMY edge type

2019-10-14 Thread Karol Herbst

it was never used

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp| 3 ---
 src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp | 8 +---
 src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h   | 1 -
 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp| 2 --
 4 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
index 9f0e0733326..76fee8c791e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
@@ -536,9 +536,6 @@ Function::printCFGraph(const char *filePath)
  case Graph::Edge::BACK:
 fprintf(out, "\t%i -> %i;\n", idA, idB);
 break;
- case Graph::Edge::DUMMY:
-fprintf(out, "\t%i -> %i [style=dotted];\n", idA, idB);
-break;
  default:
 assert(0);
 break;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
index b1076cf4129..e9a9981746a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
@@ -77,7 +77,6 @@ const char *Graph::Edge::typeStr() const
case FORWARD: return "forward";
case BACK:return "back";
case CROSS:   return "cross";
-   case DUMMY:   return "dummy";
case UNKNOWN:
default:
   return "unk";
@@ -184,7 +183,7 @@ Graph::Node::reachableBy(const Node *node, const Node 
*term) const
  continue;
 
   for (EdgeIterator ei = pos->outgoing(); !ei.end(); ei.next()) {
- if (ei.getType() == Edge::BACK || ei.getType() == Edge::DUMMY)
+ if (ei.getType() == Edge::BACK)
 continue;
  if (ei.getNode()->visit(seq))
 stack.push(ei.getNode());
@@ -301,7 +300,6 @@ private:
 switch (ei.getType()) {
 case Graph::Edge::TREE:
 case Graph::Edge::FORWARD:
-case Graph::Edge::DUMMY:
if (++(ei.getNode()->tag) == ei.getNode()->incidentCountFwd())
   bb.push(ei.getNode());
break;
@@ -371,8 +369,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
 
for (edge = curr->out; edge; edge = edge->next[0]) {
   node = edge->target;
-  if (edge->type == Edge::DUMMY)
- continue;
 
   if (node->getSequence() == 0) {
  edge->type = Edge::TREE;
@@ -387,8 +383,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
 
for (edge = curr->in; edge; edge = edge->next[1]) {
   node = edge->origin;
-  if (edge->type == Edge::DUMMY)
- continue;
 
   if (node->getSequence() == 0) {
  edge->type = Edge::TREE;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
index 115f20e5e99..fc85e78a50c 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
@@ -47,7 +47,6 @@ public:
  FORWARD,
  BACK,
  CROSS, // e.g. loop break
- DUMMY
   };
 
   Edge(Node *dst, Node *src, Type kind);
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index f25bce00884..6df2664da22 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -624,8 +624,6 @@ RegAlloc::BuildIntervalsPass::collectLiveValues(BasicBlock 
*bb)
   // trickery to save a loop of OR'ing liveSets
   // aliasing works fine with BitSet::setOr
   for (Graph::EdgeIterator ei = bb->cfg.outgoing(); !ei.end(); ei.next()) {
- if (ei.getType() == Graph::Edge::DUMMY)
-continue;
  if (bbA) {
 bb->liveSet.setOr(>liveSet, >liveSet);
 bbA = bb;
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Nouveau] [PATCH] nv50/ir: mark STORE destination inputs as used

2019-10-14 Thread Karol Herbst

Reviewed-by: Karol Herbst 

On Mon, Oct 14, 2019 at 8:47 AM Ilia Mirkin  wrote:
>
> Observed an issue when looking at the code generated by the
> image-vertex-attrib-input-output piglit test. Even though the test
> itself worked fine (due to TIC 0 being used for the image), this needs
> to be fixed.
>
> Signed-off-by: Ilia Mirkin 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index d62d36008e6..8c429026452 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -1591,6 +1591,12 @@ bool Source::scanInstruction(const struct 
> tgsi_full_instruction *inst)
>if (insn.getOpcode() == TGSI_OPCODE_STORE &&
>dst.getFile() != TGSI_FILE_MEMORY) {
>   info->io.globalAccess |= 0x2;
> +
> + if (dst.getFile() == TGSI_FILE_INPUT) {
> +// TODO: Handle indirect somehow?
> +const int i = dst.getIndex(0);
> +info->in[i].mask |= 1;
> + }
>}
>
>if (dst.getFile() == TGSI_FILE_OUTPUT) {
> --
> 2.21.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau

Re: [Mesa-dev] [Nouveau] [PATCH] gm107/ir: fix loading z offset for layered 3d image bindings

2019-10-14 Thread Karol Herbst

I don't think this is a good idea overall.

A much simpler solution would be to disable tiling on the z axis for
3d images, so that we don't hurt the most common case, 2d images. That
is also what I was seeing nvidia do anyway.

With this patch we would end up adding a bunch of instructions that hurt
the 2d image case, just to support something no user will care about
anyway.

On Mon, Oct 14, 2019 at 7:22 AM Ilia Mirkin  wrote:
>
> Unfortunately we don't know if a particular load is a real 2d image (as
> would be a cube face or 2d array element), or a layer of a 3d image.
> Since we pass in the TIC reference, the instruction's type has to match
> what's in the TIC (experimentally). In order to properly support
> bindless images, this also can't be done by looking at the current
> bindings and generating appropriate code.
>
> As a result all plain 2d loads are converted into a pair of 2d/3d loads,
> with appropriate predicates to ensure only one of those actually
> executes, and the values are all merged in.
>
> This goes somewhat against the current flow, so for GM107 we do the OOB
> handling directly in the surface processing logic. Perhaps the other
> gens should do something similar, but that is left to another change.
>
> This fixes dEQP tests like image_load_store.3d.*_single_layer and GL-CTS
> tests like shader_image_load_store.non-layered_binding without breaking
> anything else.
>
> Signed-off-by: Ilia Mirkin 
> ---
>
> OK, first of all -- to whoever thought that binding single layers of a 3d
> image and telling the shader they were regular 2d images was a good idea --
> I disagree.
>
> This change feels super super dirty, but I honestly don't see a materially
> cleaner way of handling it. Instead of being able to reuse the OOB
> handling, it's put in with the coord processing (!), and the surface
> conversion function is seriously hacked up.
>
> But splitting it up is harder, since a lot of information has to flow
> from stage to stage, like when to do what kind of access, and cloning
> the surface op is much easier in the coord processing stage.
>
>  .../nouveau/codegen/nv50_ir_emit_gm107.cpp|  34 ++-
>  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 206 +-
>  .../nouveau/codegen/nv50_ir_lowering_nvc0.h   |   4 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_tex.c   |  10 +-
>  4 files changed, 201 insertions(+), 53 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 6eefe8f0025..e244bd0d610 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -122,6 +122,8 @@ private:
> void emitSAM();
> void emitRAM();
>
> +   void emitPSETP();
> +
> void emitMOV();
> void emitS2R();
> void emitCS2R();
> @@ -690,6 +692,31 @@ CodeEmitterGM107::emitRAM()
>   * predicate/cc
>   
> **/
>
> +void
> +CodeEmitterGM107::emitPSETP()
> +{
> +
> +   emitInsn(0x5090);
> +
> +   switch (insn->op) {
> +   case OP_AND: emitField(0x18, 3, 0); break;
> +   case OP_OR:  emitField(0x18, 3, 1); break;
> +   case OP_XOR: emitField(0x18, 3, 2); break;
> +   default:
> +  assert(!"unexpected operation");
> +  break;
> +   }
> +
> +   // emitINV (0x2a);
> +   emitPRED(0x27); // TODO: support 3-arg
> +   emitINV (0x20, insn->src(1));
> +   emitPRED(0x1d, insn->src(1));
> +   emitINV (0x0f, insn->src(0));
> +   emitPRED(0x0c, insn->src(0));
> +   emitPRED(0x03, insn->def(0));
> +   emitPRED(0x00);
> +}
> +
>  
> /***
>   * movement / conversion
>   
> **/
> @@ -3557,7 +3584,12 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
> case OP_AND:
> case OP_OR:
> case OP_XOR:
> -  emitLOP();
> +  switch (insn->def(0).getFile()) {
> +  case FILE_GPR: emitLOP(); break;
> +  case FILE_PREDICATE: emitPSETP(); break;
> +  default:
> + assert(!"invalid bool op");
> +  }
>break;
> case OP_NOT:
>emitNOT();
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> index 1f702a987d8..0f68a9a229f 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> @@ -1802,6 +1802,9 @@ NVC0LoweringPass::loadSuInfo32(Value *ptr, int slot, 
> uint32_t off, bool bindless
>  {
> uint32_t base = slot * NVC0_SU_INFO__STRIDE;
>
> +   // We don't upload surface info for bindless for GM107+
> +   assert(!bindless || targ->getChipset() < NVISA_GM107_CHIPSET);
> +
> if (ptr) {
>ptr = bld.mkOp2v(OP_ADD, TYPE_U32, bld.getSSA(), ptr, bld.mkImm(slot));
>

Re: [Mesa-dev] [clover/spirv] radeonsi/NIR (with Nine) - final linking failed on libOpenCL.so.1.0.0

2019-09-26 Thread Karol Herbst

I think you only need to recompile the translator with -fPIC enabled.
At least that's what the error is saying.

On Thu, Sep 26, 2019 at 6:53 AM Aaron Watry  wrote:
>
> Pretty sure I'm running into the same thing trying to build clover
> with llvm-spirv enabled.  If it's a known solution, I wouldn't mind
> having some time saved :)
>
> --Aaron
>
> On Wed, Sep 25, 2019 at 10:30 AM Dieter Nützel  wrote:
> >
> > Hello Karol and Pierre,
> >
> > tried it on radeonsi/NIR with Nine and OpenCL enabled
> > (-Dgallium-nine=true -Dopencl-spirv=true -Dgallium-opencl=standalone).
> >
> > I think I have all SPIRV-LLVM-Translator stuff in place
> > (/opt/llvm/projects/SPIRV-LLVM-Translator/). Resulting lib is installed
> > at /usr/local/lib/libLLVMSPIRVLib.a.
> >
> > Do I need a shared version (*.so ) of it? 'ld' output point at this
> > (relocation R_X86_64_32 against symbol `_ZTVN4SPIR13PrimitiveTypeE' can
> > not be used when making a shared object; recompile with -fPIC).
> >
> > Thanks,
> > Dieter
> >
> > [1384/1384] Linking target
> > src/gallium/targets/opencl/libOpenCL.so.1.0.0.
> > FAILED: src/gallium/targets/opencl/libOpenCL.so.1.0.0
> > ccache c++  -o src/gallium/targets/opencl/libOpenCL.so.1.0.0
> > -Wl,--no-undefined -Wl,--as-needed -Wl,-O1 -shared -fPIC
> > -Wl,--start-group -Wl,-soname,libOpenCL.so.1 -Wl,--whole-archive
> > src/gallium/state_trackers/clover/libclover.a -Wl,--no-whole-archive
> > src/gallium/auxiliary/pipe-loader/libpipe_loader_dynamic.a
> > src/loader/libloader.a src/util/libxmlconfig.a src/util/libmesa_util.a
> > src/gallium/auxiliary/libgallium.a src/compiler/nir/libnir.a
> > src/compiler/libcompiler.a src/gallium/state_trackers/clover/libclllvm.a
> > src/gallium/state_trackers/clover/libclspirv.a
> > src/gallium/state_trackers/clover/libclnir.a -Wl,--gc-sections
> > -Wl,--version-script /opt/mesa/src/gallium/targets/opencl/opencl.sym
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libz.so -pthread
> > -lm -ldl /usr/lib64/libunwind.so /usr/lib64/libelf.so
> > /usr/local/lib/libclangCodeGen.a /usr/local/lib/libclangFrontendTool.a
> > /usr/local/lib/libclangFrontend.a /usr/local/lib/libclangDriver.a
> > /usr/local/lib/libclangSerialization.a /usr/local/lib/libclangParse.a
> > /usr/local/lib/libclangSema.a /usr/local/lib/libclangAnalysis.a
> > /usr/local/lib/libclangAST.a /usr/local/lib/libclangASTMatchers.a
> > /usr/local/lib/libclangEdit.a /usr/local/lib/libclangLex.a
> > /usr/local/lib/libclangBasic.a /usr/lib64/libdrm.so
> > /usr/lib64/libexpat.so -L/usr/local/lib -lLLVM-10svn -lsensors
> > -L/usr/local/lib -lLLVM-10svn /usr/local/lib/libLLVMSPIRVLib.a
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libSPIRV-Tools.so
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libSPIRV-Tools-link.so
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libSPIRV-Tools-opt.so
> > -Wl,--end-group
> > '-Wl,-rpath,$ORIGIN/../../auxiliary/pipe-loader:$ORIGIN/../../../loader:$ORIGIN/../../../util:$ORIGIN/../../auxiliary:$ORIGIN/../../../compiler/nir:$ORIGIN/../../../compiler'
> > -Wl,-rpath-link,/opt/mesa/build/src/gallium/auxiliary/pipe-loader
> > -Wl,-rpath-link,/opt/mesa/build/src/loader
> > -Wl,-rpath-link,/opt/mesa/build/src/util
> > -Wl,-rpath-link,/opt/mesa/build/src/gallium/auxiliary
> > -Wl,-rpath-link,/opt/mesa/build/src/compiler/nir
> > -Wl,-rpath-link,/opt/mesa/build/src/compiler
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVWriter.cpp.o): relocation
> > R_X86_64_32 against symbol `__pthread_key_create@@GLIBC_2.2.5' can not
> > be used when making a shared object; recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(PreprocessMetadata.cpp.o): relocation
> > R_X86_64_32 against `.rodata' can not be used when making a shared
> > object; recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVDebug.cpp.o): relocation
> > R_X86_64_32 against `.bss' can not be used when making a shared object;
> > recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVDecorate.cpp.o): relocation
> > R_X86_64_32 against symbol `_ZTVN5SPIRV20SPIRVDecorateGenericE' can not
> > be used when making a shared object; recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVEntry.cpp.o): relocation
> > R_X86_64_32 against symbol `__pthread_key_create@@GLIBC_2.2.5' can not
> > be used when making a shared object; recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVFunction.cpp.o): relocation
> > R_X86_64_32 against symbol `_ZTVN5SPIRV22SPIRVFunctionParameterE' can
> > not be used when making a shared object;

[Mesa-dev] [PATCH 4/4] nv50, nvc0: fix must_check warning of util_dynarray_resize_bytes

2019-09-20 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/nv50/nv50_state.c | 10 +++---
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 10 +++---
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index a4163aa1713..9390b61b748 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -1267,9 +1267,13 @@ nv50_set_global_bindings(struct pipe_context *pipe,
 
if (nv50->global_residents.size <= (end * sizeof(struct pipe_resource *))) {
   const unsigned old_size = nv50->global_residents.size;
-  util_dynarray_resize(&nv50->global_residents, struct pipe_resource *, end);
-  memset((uint8_t *)nv50->global_residents.data + old_size, 0,
- nv50->global_residents.size - old_size);
+  if (util_dynarray_resize(&nv50->global_residents, struct pipe_resource *, end)) {
+ memset((uint8_t *)nv50->global_residents.data + old_size, 0,
+nv50->global_residents.size - old_size);
+  } else {
+ NOUVEAU_ERR("Could not resize global residents array\n");
+ return;
+  }
}
 
if (resources) {
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index 60dcbe3ec39..956bd78defa 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1374,9 +1374,13 @@ nvc0_set_global_bindings(struct pipe_context *pipe,
 
if (nvc0->global_residents.size <= (end * sizeof(struct pipe_resource *))) {
   const unsigned old_size = nvc0->global_residents.size;
-  util_dynarray_resize(&nvc0->global_residents, struct pipe_resource *, end);
-  memset((uint8_t *)nvc0->global_residents.data + old_size, 0,
- nvc0->global_residents.size - old_size);
+  if (util_dynarray_resize(&nvc0->global_residents, struct pipe_resource *, end)) {
+ memset((uint8_t *)nvc0->global_residents.data + old_size, 0,
+nvc0->global_residents.size - old_size);
+  } else {
+ NOUVEAU_ERR("Could not resize global residents array\n");
+ return;
+  }
}
 
if (resources) {
-- 
2.21.0
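
The pattern in this patch (run the memset only when the resize succeeded, otherwise report and bail out) can be sketched in isolation. The following is a minimal illustration, not Mesa code: `ByteArray`, `byte_array_resize` and `grow_zeroed` are hypothetical names, and `[[nodiscard]]` is assumed here as a stand-in for the must_check attribute the warning refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a growable byte array; this is not util_dynarray.
struct ByteArray {
   uint8_t *data = nullptr;
   size_t size = 0;
};

// Resizes the array to new_size bytes. Returns false and leaves the array
// untouched if the allocation fails. [[nodiscard]] makes the compiler
// complain when a caller ignores the result.
[[nodiscard]] static bool
byte_array_resize(ByteArray *arr, size_t new_size)
{
   if (new_size == 0) {
      std::free(arr->data);
      arr->data = nullptr;
      arr->size = 0;
      return true;
   }
   void *p = std::realloc(arr->data, new_size);
   if (!p)
      return false;
   arr->data = static_cast<uint8_t *>(p);
   arr->size = new_size;
   return true;
}

// Grows the array and zero-fills only the newly added tail. The memset is
// reached only after a successful resize, so it never touches stale memory.
static bool
grow_zeroed(ByteArray *arr, size_t new_size)
{
   const size_t old_size = arr->size;
   if (new_size <= old_size)
      return true; // nothing to grow
   if (!byte_array_resize(arr, new_size))
      return false; // caller is expected to report the error and return
   std::memset(arr->data + old_size, 0, arr->size - old_size);
   return true;
}
```

Existing contents survive the grow; only the bytes between the old and the new size are cleared.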

[Mesa-dev] [PATCH 2/4] nv50ir: fix unnecessary parentheses warning

2019-09-20 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_util.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
index 307c23d5e03..b1766f48205 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
@@ -145,7 +145,7 @@ public:
 #define DLLIST_EMPTY(__list) ((__list)->next == (__list))
 
 #define DLLIST_FOR_EACH(list, it) \
-   for (DLList::Iterator (it) = (list)->iterator(); !(it).end(); (it).next())
+   for (DLList::Iterator it = (list)->iterator(); !(it).end(); (it).next())
 
 class DLList
 {
-- 
2.21.0
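
For readers unfamiliar with the warning: wrapping the declared name in parentheses, as in `Type (it) = ...`, still declares `it`, but newer GCC versions flag the parentheses as unnecessary. Below is a small self-contained sketch of the fixed macro shape. `IntList` and `INTLIST_FOR_EACH` are hypothetical names made up for this illustration; only the `end()`/`next()` iterator style is taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal list with an end()/next()/get() style iterator.
class IntList {
public:
   class Iterator {
   public:
      explicit Iterator(const std::vector<int> &v) : vec(v), pos(0) { }
      bool end() const { return pos >= vec.size(); }
      void next() { ++pos; }
      int get() const { return vec[pos]; }
   private:
      const std::vector<int> &vec;
      size_t pos;
   };

   void push(int v) { items.push_back(v); }
   Iterator iterator() const { return Iterator(items); }

private:
   std::vector<int> items;
};

// The declarator is the bare macro argument `it`. Uses of `it` in
// expressions keep their parentheses, which is ordinary macro hygiene.
#define INTLIST_FOR_EACH(list, it) \
   for (IntList::Iterator it = (list)->iterator(); !(it).end(); (it).next())

// Sums all elements using the macro.
static int
sum(const IntList *list)
{
   int total = 0;
   INTLIST_FOR_EACH(list, it)
      total += it.get();
   return total;
}
```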

[Mesa-dev] [PATCH 3/4] nv50ir/nir: comparison of integer expressions of different signedness warning

2019-09-20 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 4e86ab8f8cc..95b60d2c7d0 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1957,7 +1957,7 @@ Converter::visit(nir_intrinsic_instr *insn)
  }
  case Program::TYPE_GEOMETRY:
  case Program::TYPE_VERTEX: {
-if (info->io.genUserClip > 0 && idx == clipVertexOutput) {
+if (info->io.genUserClip > 0 && idx == (uint32_t)clipVertexOutput) 
{
mkMov(clipVtx[i], src);
src = clipVtx[i];
 }
-- 
2.21.0
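
A short sketch of what the cast does, assuming (this is not visible in the diff) that the signed variable uses a negative value such as -1 to mean "not assigned". `matches_slot` is a hypothetical helper written only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Compares an unsigned index against a signed slot number.
static bool
matches_slot(uint32_t idx, int slot)
{
   // The explicit cast spells out the conversion the compiler would
   // otherwise perform implicitly (and warn about with -Wsign-compare).
   // A slot of -1 converts to 0xffffffff, so it never equals a small,
   // valid index.
   return idx == static_cast<uint32_t>(slot);
}
```

The behaviour is the same as the implicit comparison; the cast only makes the conversion visible and silences the warning.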

[Mesa-dev] [PATCH 1/4] nv50ir: fix memset on non trivial types warning

2019-09-20 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp| 4 +---
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  | 2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp | 2 +-
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
index a181a13a3b1..45ee95bb103 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
@@ -903,10 +903,8 @@ Instruction::isCommutationLegal(const Instruction *i) const
 }
 
 TexInstruction::TexInstruction(Function *fn, operation op)
-   : Instruction(fn, op, TYPE_F32)
+   : Instruction(fn, op, TYPE_F32), tex()
 {
-   memset(&tex, 0, sizeof(tex));
-
tex.rIndirectSrc = -1;
tex.sIndirectSrc = -1;
 
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index b19751ab372..5163e1a7ec2 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -957,7 +957,7 @@ public:
class Target
{
public:
-  Target(TexTarget targ = TEX_TARGET_2D) : target(targ) { }
+  Target(TexTarget targ = TEX_TARGET_1D) : target(targ) { }
 
   const char *getName() const { return descTable[target].name; }
   unsigned int getArgCount() const { return descTable[target].argc; }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
index 5c6d0570ae2..609e7b89290 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
@@ -455,7 +455,7 @@ CodeEmitter::addInterp(int ipa, int reg, FixupApply apply)
   if (!fixupInfo)
  return false;
   if (n == 0)
- memset(fixupInfo, 0, sizeof(FixupInfo));
+ fixupInfo->count = 0;
}
++fixupInfo->count;
 
-- 
2.21.0
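
The first hunk replaces a memset with value-initialization in the constructor's initializer list. The sketch below shows why that is equivalent for plain members while still running member constructors. All names ending in `Sketch` are hypothetical, and the assumption that the 1D target has the value 0 (which would explain the changed default argument in nv50_ir.h) is a guess, not something the patch states.

```cpp
#include <cassert>

// Illustrative enum; 1D is assumed to be the zero value.
enum TexTargetSketch { TARGET_1D = 0, TARGET_2D = 1 };

class TargetSketch {
public:
   // A user-provided constructor makes any struct containing this class
   // non-trivial, which is what makes memset() on it questionable.
   TargetSketch(TexTargetSketch t = TARGET_1D) : target(t) { }
   TexTargetSketch get() const { return target; }
private:
   TexTargetSketch target;
};

struct TexInfoSketch {
   TargetSketch target;
   int rIndirectSrc;
   int sIndirectSrc;
   unsigned mask;
};

class TexInsnSketch {
public:
   // `tex()` value-initializes the member: plain fields are zeroed and the
   // TargetSketch default constructor runs, so no memset is needed.
   TexInsnSketch() : tex()
   {
      tex.rIndirectSrc = -1;
      tex.sIndirectSrc = -1;
   }
   TexInfoSketch tex;
};
```

With the default argument set to the zero-valued enumerator, the constructed object matches what the old memset produced.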

Re: [Mesa-dev] [PATCH 3/4] nvc0: allow a non-user buffer to be bound at position 0

2019-07-26 Thread Karol Herbst

On Fri, Jul 26, 2019 at 2:59 PM Ilia Mirkin  wrote:
>
> Thanks! I had to make a small update to the asserts:
>
> assert(nvc0->constbuf[5][0].user || !nvc0->constbuf[5][0].u.buf);
>
> u.buf is not valid to check when .user is set. (in fact it aliases
> with the "data" pointer)
>
> Let me know if you want me to resend.
>

No, that's fine.

> On Fri, Jul 26, 2019 at 5:51 AM Karol Herbst  wrote:
> >
> > Reviewed-by: Karol Herbst 
> >
> > On Fri, Jul 26, 2019 at 5:31 AM Ilia Mirkin  wrote:
> > >
> > > Previously the code only handled it for positions 1 and up (as would be
> > > for UBO's in GL). It's not a lot of trouble to handle this, and vl or
> > > vdpau want this.
> > >
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
> > > Signed-off-by: Ilia Mirkin 
> > > Cc: mesa-sta...@lists.freedesktop.org
> > > ---
> > >  .../drivers/nouveau/nvc0/nve4_compute.c   | 45 +++
> > >  1 file changed, 27 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c 
> > > b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
> > > index c5e4dec20bd..a1c40d1e6b9 100644
> > > --- a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
> > > +++ b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
> > > @@ -393,23 +393,24 @@ nve4_compute_validate_constbufs(struct nvc0_context 
> > > *nvc0)
> > >  uint64_t address
> > > = nvc0->screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s);
> > >
> > > -assert(i > 0); /* we really only want uniform buffer objects 
> > > */
> > > -
> > > -BEGIN_NVC0(push, NVE4_CP(UPLOAD_DST_ADDRESS_HIGH), 2);
> > > -PUSH_DATAh(push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> > > -PUSH_DATA (push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> > > -BEGIN_NVC0(push, NVE4_CP(UPLOAD_LINE_LENGTH_IN), 2);
> > > -PUSH_DATA (push, 4 * 4);
> > > -PUSH_DATA (push, 0x1);
> > > -BEGIN_1IC0(push, NVE4_CP(UPLOAD_EXEC), 1 + 4);
> > > -PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 
> > > 1));
> > > -
> > > -PUSH_DATA (push, res->address + nvc0->constbuf[s][i].offset);
> > > -PUSH_DATAh(push, res->address + nvc0->constbuf[s][i].offset);
> > > -PUSH_DATA (push, nvc0->constbuf[5][i].size);
> > > -PUSH_DATA (push, 0);
> > > -BCTX_REFN(nvc0->bufctx_cp, CP_CB(i), res, RD);
> > > +/* constbufs above 0 will are fetched via ubo info in the 
> > > shader */
> > > +if (i > 0) {
> > > +   BEGIN_NVC0(push, NVE4_CP(UPLOAD_DST_ADDRESS_HIGH), 2);
> > > +   PUSH_DATAh(push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> > > +   PUSH_DATA (push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> > > +   BEGIN_NVC0(push, NVE4_CP(UPLOAD_LINE_LENGTH_IN), 2);
> > > +   PUSH_DATA (push, 4 * 4);
> > > +   PUSH_DATA (push, 0x1);
> > > +   BEGIN_1IC0(push, NVE4_CP(UPLOAD_EXEC), 1 + 4);
> > > +   PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 
> > > << 1));
> > > +
> > > +   PUSH_DATA (push, res->address + 
> > > nvc0->constbuf[s][i].offset);
> > > +   PUSH_DATAh(push, res->address + 
> > > nvc0->constbuf[s][i].offset);
> > > +   PUSH_DATA (push, nvc0->constbuf[s][i].size);
> > > +   PUSH_DATA (push, 0);
> > > +}
> > >
> > > +BCTX_REFN(nvc0->bufctx_cp, CP_CB(i), res, RD);
> > >  res->cb_bindings[s] |= 1 << i;
> > >   }
> > >}
> > > @@ -554,9 +555,9 @@ nve4_compute_derive_cache_split(struct nvc0_context 
> > > *nvc0, uint32_t shared_size)
> > >  static void
> > >  nve4_compute_setup_buf_cb(struct nvc0_context *nvc0, bool gp100, void 
> > > *desc)
> > >  {
> > > -   // only user constant buffers 1-6 can be put in the descriptor, the 
> > > rest are
> > > +   // only user constant buffers 0-6 can be put in the descriptor, the 
> > > rest are
> > > // loaded through global memory
> > > -   for (int i = 1; i <= 6; i++) {
> > > +   for

Re: [Mesa-dev] [PATCH] nv50/ir: don't consider the main compute function as taking arguments

2019-07-26 Thread Karol Herbst

I think this was there for generic support for functions actually and
that for OpenCL + TGSI the idea was to not inline everything by
default, so return values were handled there as well.

The proper way to handle this is to declare kernel inputs as real inputs,
because kernel inputs are fundamentally different from function
arguments, and trying to handle them exactly the same will just result
in pain and fun issues like the VDPAU/VA one.

The correct way to handle that on the TGSI side is to never generate a
MAIN function out of the actual source, and instead add a "main" function
which reads the shader IN variables and passes them as arguments to
the entry point being called (which should be a named function inside the
TGSI). This way the new main function has no parameters and no
return value, the world becomes sane and everybody is happy.

That's also how I implemented that for the OpenCL nir path and that
works out quite nicely as now you can just call different entry points
without having to deal with this "if this function is the entry point,
args get passed differently than being a called function" situation.
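
The wrapper idea described above can be sketched in plain C++ as an analogy. This is not TGSI or NIR; `KernelInputs`, `copy_kernel`, `g_inputs` and `main_wrapper` are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical kernel inputs, standing in for the shader IN variables.
struct KernelInputs {
   const uint32_t *src;
   uint32_t *dst;
   uint32_t count;
};

// A named entry point: an ordinary function taking ordinary arguments.
// It can be called from the wrapper or from any other function.
static void
copy_kernel(const uint32_t *src, uint32_t *dst, uint32_t count)
{
   for (uint32_t i = 0; i < count; ++i)
      dst[i] = src[i];
}

// Inputs are set up by the launcher before the wrapper runs.
static KernelInputs g_inputs;

// Generated wrapper: no parameters and no return value. It reads the
// inputs and forwards them to the selected entry point as arguments.
static void
main_wrapper()
{
   copy_kernel(g_inputs.src, g_inputs.dst, g_inputs.count);
}
```

Selecting a different entry point then only means generating a different wrapper; the entry point itself is compiled the same way whether it is launched or called.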

Anyway, for the patch itself:
Reviewed-by: Karol Herbst 

On Fri, Jul 26, 2019 at 7:20 AM Ilia Mirkin  wrote:
>
> With OpenCL, kernels can take arguments and return values (?). However
> in practice, there is no more TGSI compute implementation, and even if
> there were, it would probably have named functions and no explicit main.
>
> This improves RA considerably for compute shaders, since temps are not
> kept around as return values.
>
> Signed-off-by: Ilia Mirkin 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index 9d0ab336c75..2dd13e70d0e 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -4298,7 +4298,7 @@ Converter::BindArgumentsPass::visit(Function *f)
>}
> }
>
> -   if (func == prog->main && prog->getType() != Program::TYPE_COMPUTE)
> +   if (func == prog->main /* && prog->getType() != Program::TYPE_COMPUTE */)
>return true;
> updatePrototype(::get(f->cfg.getRoot())->liveSet,
> ::buildLiveSets, ::ins);
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50/ir: handle insn not being there for definition of CVT arg

2019-07-26 Thread Karol Herbst

Reviewed-by: Karol Herbst 

On Fri, Jul 26, 2019 at 7:03 AM Ilia Mirkin  wrote:
>
> This can happen if it's e.g. a uniform or a function argument.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
> Signed-off-by: Ilia Mirkin 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 0b3220903b9..bfdb923379b 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -2080,14 +2080,15 @@ void
>  AlgebraicOpt::handleCVT_CVT(Instruction *cvt)
>  {
> Instruction *insn = cvt->getSrc(0)->getInsn();
> -   RoundMode rnd = insn->rnd;
>
> -   if (insn->saturate ||
> +   if (!insn ||
> +   insn->saturate ||
> insn->subOp ||
> insn->dType != insn->sType ||
> insn->dType != cvt->sType)
>return;
>
> +   RoundMode rnd = insn->rnd;
> switch (insn->op) {
> case OP_CEIL:
>rnd = ROUND_PI;
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
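
The shape of the fix, stripped of the nv50_ir types: check the defining instruction for null before reading any of its fields, and move the read of the rounding mode below the guard. All names ending in `Sketch` and `foldable_round_mode` are hypothetical stand-ins for this illustration.

```cpp
#include <cassert>

enum RoundModeSketch { ROUND_N_SKETCH, ROUND_PI_SKETCH };

struct InsnSketch {
   bool saturate;
   RoundModeSketch rnd;
};

struct ValueSketch {
   // Null when the value has no defining instruction, for example a
   // uniform or a function argument.
   const InsnSketch *def;
   const InsnSketch *getInsn() const { return def; }
};

// Returns true and writes the rounding mode to *out when the defining
// instruction exists and does not saturate; returns false otherwise.
static bool
foldable_round_mode(const ValueSketch &src, RoundModeSketch *out)
{
   const InsnSketch *insn = src.getInsn();
   if (!insn || insn->saturate)
      return false;
   *out = insn->rnd; // read only after the null check
   return true;
}
```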

Re: [Mesa-dev] [PATCH 4/4] nouveau: flip DEBUG -> !NDEBUG

2019-07-26 Thread Karol Herbst

Reviewed-by: Karol Herbst 

On Fri, Jul 26, 2019 at 5:31 AM Ilia Mirkin  wrote:
>
> The meson conversion chose to change the meaning of DEBUG from "used for
> debugging" to "used for expensive things for debugging", primarily
> for nir_validate. Flip things over so that we get nice things with
> optimizations enabled.
>
> While we're at it, also kill off nouveau_statebuf.h which is unused (and
> has a mention of DEBUG which is how I found it).
>
> Signed-off-by: Ilia Mirkin 
> ---
>  src/gallium/drivers/nouveau/Makefile.sources  |  1 -
>  .../drivers/nouveau/codegen/nv50_ir_driver.h  |  2 +-
>  .../drivers/nouveau/codegen/nv50_ir_inlines.h |  2 +-
>  .../drivers/nouveau/codegen/nv50_ir_util.h|  8 ++---
>  src/gallium/drivers/nouveau/meson.build   |  1 -
>  src/gallium/drivers/nouveau/nouveau_screen.h  |  2 +-
>  .../drivers/nouveau/nouveau_statebuf.h| 32 ---
>  .../drivers/nouveau/nv50/nv50_program.c   |  2 +-
>  .../drivers/nouveau/nvc0/nvc0_program.c   |  8 ++---
>  .../drivers/nouveau/nvc0/nve4_compute.c   |  6 ++--
>  10 files changed, 15 insertions(+), 49 deletions(-)
>  delete mode 100644 src/gallium/drivers/nouveau/nouveau_statebuf.h
>
> diff --git a/src/gallium/drivers/nouveau/Makefile.sources 
> b/src/gallium/drivers/nouveau/Makefile.sources
> index c6a1aff7110..6c360992a53 100644
> --- a/src/gallium/drivers/nouveau/Makefile.sources
> +++ b/src/gallium/drivers/nouveau/Makefile.sources
> @@ -12,7 +12,6 @@ C_SOURCES := \
> nouveau_mm.h \
> nouveau_screen.c \
> nouveau_screen.h \
> -   nouveau_statebuf.h \
> nouveau_video.c \
> nouveau_video.h \
> nouveau_vp3_video_bsp.c \
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> index 95b3d633ee6..322bdd02557 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> @@ -54,7 +54,7 @@ struct nv50_ir_varying
> ubyte si; /* TGSI semantic index */
>  };
>
> -#ifdef DEBUG
> +#ifndef NDEBUG
>  # define NV50_IR_DEBUG_BASIC (1 << 0)
>  # define NV50_IR_DEBUG_VERBOSE   (2 << 0)
>  # define NV50_IR_DEBUG_REG_ALLOC (1 << 2)
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h
> index 4cb53ab42ed..b4ca5ed8215 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h
> @@ -222,7 +222,7 @@ Instruction *Value::getUniqueInsn() const
>  return (*it)->getInsn();
>// should be unreachable and trigger assertion at the end
> }
> -#ifdef DEBUG
> +#ifndef NDEBUG
> if (reg.data.id < 0) {
>int n = 0;
>for (DefCIterator it = defs.begin(); n < 2 && it != defs.end(); ++it)
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
> index affe04a2dd9..307c23d5e03 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
> @@ -36,14 +36,14 @@
>  #include "util/u_inlines.h"
>  #include "util/u_memory.h"
>
> -#define ERROR(args...) debug_printf("ERROR: " args)
> -#define WARN(args...) debug_printf("WARNING: " args)
> -#define INFO(args...) debug_printf(args)
> +#define ERROR(args...) _debug_printf("ERROR: " args)
> +#define WARN(args...) _debug_printf("WARNING: " args)
> +#define INFO(args...) _debug_printf(args)
>
>  #define INFO_DBG(m, f, args...)  \
> do {  \
>if (m & NV50_IR_DEBUG_##f) \
> - debug_printf(args); \
> + _debug_printf(args); \
> } while(0)
>
>  #define FATAL(args...)  \
> diff --git a/src/gallium/drivers/nouveau/meson.build 
> b/src/gallium/drivers/nouveau/meson.build
> index 64138212b5b..b3e79bf7089 100644
> --- a/src/gallium/drivers/nouveau/meson.build
> +++ b/src/gallium/drivers/nouveau/meson.build
> @@ -32,7 +32,6 @@ files_libnouveau = files(
>'nouveau_mm.h',
>'nouveau_screen.c',
>'nouveau_screen.h',
> -  'nouveau_statebuf.h',
>'nouveau_video.c',
>'nouveau_video.h',
>'nouveau_vp3_video_bsp.c',
> diff --git a/src/gallium/drivers/nouveau/nouveau_screen.h 
> b/src/gallium/drivers/nouveau/nouveau_screen.h
> index 1302c608bec..450c7c466be 100644
> --- a/src/gallium/drivers/nouveau/nouveau_screen.h
> +++ b/src/gallium/drivers/nouveau/nou

Re: [Mesa-dev] [PATCH 3/4] nvc0: allow a non-user buffer to be bound at position 0

2019-07-26 Thread Karol Herbst

Reviewed-by: Karol Herbst 

On Fri, Jul 26, 2019 at 5:31 AM Ilia Mirkin  wrote:
>
> Previously the code only handled it for positions 1 and up (as would be
> for UBOs in GL). It's not a lot of trouble to handle this, and vl or
> vdpau want this.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
> Signed-off-by: Ilia Mirkin 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  .../drivers/nouveau/nvc0/nve4_compute.c   | 45 +++
>  1 file changed, 27 insertions(+), 18 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c 
> b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
> index c5e4dec20bd..a1c40d1e6b9 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
> @@ -393,23 +393,24 @@ nve4_compute_validate_constbufs(struct nvc0_context 
> *nvc0)
>  uint64_t address
> = nvc0->screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s);
>
> -assert(i > 0); /* we really only want uniform buffer objects */
> -
> -BEGIN_NVC0(push, NVE4_CP(UPLOAD_DST_ADDRESS_HIGH), 2);
> -PUSH_DATAh(push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> -PUSH_DATA (push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> -BEGIN_NVC0(push, NVE4_CP(UPLOAD_LINE_LENGTH_IN), 2);
> -PUSH_DATA (push, 4 * 4);
> -PUSH_DATA (push, 0x1);
> -BEGIN_1IC0(push, NVE4_CP(UPLOAD_EXEC), 1 + 4);
> -PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 1));
> -
> -PUSH_DATA (push, res->address + nvc0->constbuf[s][i].offset);
> -PUSH_DATAh(push, res->address + nvc0->constbuf[s][i].offset);
> -PUSH_DATA (push, nvc0->constbuf[5][i].size);
> -PUSH_DATA (push, 0);
> -BCTX_REFN(nvc0->bufctx_cp, CP_CB(i), res, RD);
> +/* constbufs above 0 are fetched via ubo info in the shader */
> +if (i > 0) {
> +   BEGIN_NVC0(push, NVE4_CP(UPLOAD_DST_ADDRESS_HIGH), 2);
> +   PUSH_DATAh(push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> +   PUSH_DATA (push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> +   BEGIN_NVC0(push, NVE4_CP(UPLOAD_LINE_LENGTH_IN), 2);
> +   PUSH_DATA (push, 4 * 4);
> +   PUSH_DATA (push, 0x1);
> +   BEGIN_1IC0(push, NVE4_CP(UPLOAD_EXEC), 1 + 4);
> +   PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 
> 1));
> +
> +   PUSH_DATA (push, res->address + nvc0->constbuf[s][i].offset);
> +   PUSH_DATAh(push, res->address + nvc0->constbuf[s][i].offset);
> +   PUSH_DATA (push, nvc0->constbuf[s][i].size);
> +   PUSH_DATA (push, 0);
> +}
>
> +BCTX_REFN(nvc0->bufctx_cp, CP_CB(i), res, RD);
>  res->cb_bindings[s] |= 1 << i;
>   }
>}
> @@ -554,9 +555,9 @@ nve4_compute_derive_cache_split(struct nvc0_context 
> *nvc0, uint32_t shared_size)
>  static void
>  nve4_compute_setup_buf_cb(struct nvc0_context *nvc0, bool gp100, void *desc)
>  {
> -   // only user constant buffers 1-6 can be put in the descriptor, the rest 
> are
> +   // only user constant buffers 0-6 can be put in the descriptor, the rest 
> are
> // loaded through global memory
> -   for (int i = 1; i <= 6; i++) {
> +   for (int i = 0; i <= 6; i++) {
>if (nvc0->constbuf[5][i].user || !nvc0->constbuf[5][i].u.buf)
>   continue;
>
> @@ -609,6 +610,10 @@ nve4_compute_setup_launch_desc(struct nvc0_context *nvc0,
> if (nvc0->constbuf[5][0].user || cp->parm_size) {
>nve4_cp_launch_desc_set_cb(desc, 0, screen->uniform_bo,
>   NVC0_CB_USR_INFO(5), 1 << 16);
> +
> +  // Later logic will attempt to bind a real buffer at position 0. That
> +  // should not happen if we've bound a user buffer.
> +  assert(!nvc0->constbuf[5][0].u.buf);
> }
> nve4_cp_launch_desc_set_cb(desc, 7, screen->uniform_bo,
>NVC0_CB_AUX_INFO(5), 1 << 11);
> @@ -649,6 +654,10 @@ gp100_compute_setup_launch_desc(struct nvc0_context 
> *nvc0,
> if (nvc0->constbuf[5][0].user || cp->parm_size) {
>gp100_cp_launch_desc_set_cb(desc, 0, screen->uniform_bo,
>NVC0_CB_USR_INFO(5), 1 << 16);
> +
> +  // Later logic will attempt to bind a real buffer at position 0. That
> +  // should not happen if we've bound a user buffer.
> +  assert(!nvc0->c

Re: [Mesa-dev] [PATCH 2/4] nv50, nvc0: update sampler/view bind functions to accept NULL array

2019-07-26 Thread Karol Herbst

Reviewed-by: Karol Herbst 
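
For readers skimming the quoted diff: the change boils down to treating a
NULL array as "every slot is empty" instead of dereferencing it. A minimal
standalone sketch of that pattern follows. toy_context, toy_bind_samplers
and TOY_MAX_SAMPLERS are made-up names for illustration only, not nouveau
or gallium symbols, and the return value is an invention of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SAMPLERS 16

struct toy_context {
   void *samplers[TOY_MAX_SAMPLERS];
};

/* Bind `nr` sampler states. A NULL `states` array means "unbind the first
 * nr slots", mirroring the `hwcsos ? ... : NULL` guard in the patch.
 * Returns the index of the last non-NULL slot plus one, or 0 if every
 * slot ended up empty. */
unsigned
toy_bind_samplers(struct toy_context *ctx, unsigned nr, void **states)
{
   unsigned count = 0;
   unsigned i;

   assert(nr <= TOY_MAX_SAMPLERS);
   for (i = 0; i < nr; ++i) {
      /* Only index the array when the caller actually passed one. */
      void *state = states ? states[i] : NULL;

      ctx->samplers[i] = state;
      if (state)
         count = i + 1;
   }
   return count;
}
```

Calling toy_bind_samplers(&ctx, 3, NULL) then clears three slots without
ever touching the missing array.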

On Fri, Jul 26, 2019 at 5:31 AM Ilia Mirkin  wrote:
>
> Apparently vl (or vdpau) wants to pass that in now. Handle it.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
> Signed-off-by: Ilia Mirkin 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_state.c | 14 --
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 18 ++
>  2 files changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> index 8b294be6d86..a4163aa1713 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> @@ -599,19 +599,20 @@ nv50_sampler_state_delete(struct pipe_context *pipe, 
> void *hwcso)
>
>  static inline void
>  nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s,
> -   unsigned nr, void **hwcso)
> +   unsigned nr, void **hwcsos)
>  {
> unsigned highest_found = 0;
> unsigned i;
>
> assert(nr <= PIPE_MAX_SAMPLERS);
> for (i = 0; i < nr; ++i) {
> +  struct nv50_tsc_entry *hwcso = hwcsos ? nv50_tsc_entry(hwcsos[i]) : 
> NULL;
>struct nv50_tsc_entry *old = nv50->samplers[s][i];
>
> -  if (hwcso[i])
> +  if (hwcso)
>   highest_found = i;
>
> -  nv50->samplers[s][i] = nv50_tsc_entry(hwcso[i]);
> +  nv50->samplers[s][i] = hwcso;
>if (old)
>   nv50_screen_tsc_unlock(nv50->screen, old);
> }
> @@ -685,12 +686,13 @@ nv50_stage_set_sampler_views(struct nv50_context *nv50, 
> int s,
>
> assert(nr <= PIPE_MAX_SAMPLERS);
> for (i = 0; i < nr; ++i) {
> +  struct pipe_sampler_view *view = views ? views[i] : NULL;
>struct nv50_tic_entry *old = nv50_tic_entry(nv50->textures[s][i]);
>if (old)
>   nv50_screen_tic_unlock(nv50->screen, old);
>
> -  if (views[i] && views[i]->texture) {
> - struct pipe_resource *res = views[i]->texture;
> +  if (view && view->texture) {
> + struct pipe_resource *res = view->texture;
>   if (res->target == PIPE_BUFFER &&
>   (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT))
>  nv50->textures_coherent[s] |= 1 << i;
> @@ -700,7 +702,7 @@ nv50_stage_set_sampler_views(struct nv50_context *nv50, 
> int s,
>   nv50->textures_coherent[s] &= ~(1 << i);
>}
>
> -  pipe_sampler_view_reference(&nv50->textures[s][i], views[i]);
> +  pipe_sampler_view_reference(&nv50->textures[s][i], view);
> }
>
> assert(nv50->num_textures[s] <= PIPE_MAX_SAMPLERS);
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index a9ee7b784bd..60dcbe3ec39 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -463,22 +463,23 @@ nvc0_sampler_state_delete(struct pipe_context *pipe, 
> void *hwcso)
>  static inline void
>  nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0,
> unsigned s,
> -   unsigned nr, void **hwcso)
> +   unsigned nr, void **hwcsos)
>  {
> unsigned highest_found = 0;
> unsigned i;
>
> for (i = 0; i < nr; ++i) {
> +  struct nv50_tsc_entry *hwcso = hwcsos ? nv50_tsc_entry(hwcsos[i]) : 
> NULL;
>struct nv50_tsc_entry *old = nvc0->samplers[s][i];
>
> -  if (hwcso[i])
> +  if (hwcso)
>   highest_found = i;
>
> -  if (hwcso[i] == old)
> +  if (hwcso == old)
>   continue;
>nvc0->samplers_dirty[s] |= 1 << i;
>
> -  nvc0->samplers[s][i] = nv50_tsc_entry(hwcso[i]);
> +  nvc0->samplers[s][i] = hwcso;
>if (old)
>   nvc0_screen_tsc_unlock(nvc0->screen, old);
> }
> @@ -523,14 +524,15 @@ nvc0_stage_set_sampler_views(struct nvc0_context *nvc0, 
> int s,
> unsigned i;
>
> for (i = 0; i < nr; ++i) {
> +  struct pipe_sampler_view *view = views ? views[i] : NULL;
>struct nv50_tic_entry *old = nv50_tic_entry(nvc0->textures[s][i]);
>
> -  if (views[i] == nvc0->textures[s][i])
> +  if (view == nvc0->textures[s][i])
>   continue;
>nvc0->textures_dirty[s] |= 1 << i;
>
> -  if (views[i] && views[i]->texture) {
> - struct pipe_resource *res = views[i]->texture;
> +

Re: [Mesa-dev] [PATCH] nvc0/ir: Fix assert accessing null pointer

2019-07-24 Thread Karol Herbst

It only fixes a crash in builds with asserts enabled, but if
somebody wants to apply it to stable, then go ahead.
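
To spell out why this only bites assert-enabled builds: the old assert
dereferenced ld before the !ld check, so a source without a defining
instruction crashed inside the assert itself. With NDEBUG defined the
assert disappears and the later null check does its job. A small C sketch
of the fixed ordering, where toy_insn and toy_pick_source are stand-ins
and not the real nv50_ir classes:

```c
#include <assert.h>
#include <stddef.h>

struct toy_insn {
   int is_mov;        /* stand-in for the OP_LOAD/OP_MOV check */
   const int *src0;   /* stand-in for ld->getSrc(0) */
};

/* Returns the value to forward: the mov's own source when `ld` is a
 * usable mov, otherwise `fallback`. */
const int *
toy_pick_source(const struct toy_insn *ld, const int *fallback)
{
   /* The null check has to come first ... */
   if (!ld || !ld->is_mov)
      return fallback;

   /* ... so that the assert only ever sees a valid pointer. */
   assert(ld->src0 != NULL);
   return ld->src0;
}
```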

On Wed, Jul 24, 2019 at 12:48 PM Juan A. Suarez Romero
 wrote:
>
> On Fri, 2019-07-19 at 13:56 +0200, Mark Menzynski wrote:
> > Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=111007
> > Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=67
> > Signed-off-by: Mark Menzynski 
> > ---
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
>
>
> Looks like a good candidate for 19.1 stable. WDYT?
>
> J.A.
>
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > index aca3b0afb1e..1f702a987d8 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > @@ -51,12 +51,12 @@ NVC0LegalizeSSA::handleDIV(Instruction *i)
> > // Generate movs to the input regs for the call we want to generate
> > for (int s = 0; i->srcExists(s); ++s) {
> >Instruction *ld = i->getSrc(s)->getInsn();
> > -  assert(ld->getSrc(0) != NULL);
> >// check if we are moving an immediate, propagate it in that case
> >if (!ld || ld->fixed || (ld->op != OP_LOAD && ld->op != OP_MOV) ||
> >  !(ld->src(0).getFile() == FILE_IMMEDIATE))
> >   bld.mkMovToReg(s, i->getSrc(s));
> >else {
> > + assert(ld->getSrc(0) != NULL);
> >   bld.mkMovToReg(s, ld->getSrc(0));
> >   // Clear the src, to make code elimination possible here before we
> >   // delete the instruction i later
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50/ir: Add mul and mod constant optimizations

2019-07-23 Thread Karol Herbst

Yeah, I am not quite sure about it myself. Skipping the div
emulation seems like a good idea in general, but the case is also not
common enough to care all that much about.

On Tue, Jul 23, 2019 at 5:18 PM Ilia Mirkin  wrote:
>
> On Tue, Jul 23, 2019 at 11:15 AM Karol Herbst  wrote:
> >
> > On Tue, Jul 23, 2019 at 4:50 PM Ilia Mirkin  wrote:
> > >
> > > You handle 1/n but not 1%n? TBH, the 1/n code isn't 100% obvious to
> > > me... 1/n = |n|-1 > 0 ?  i forget how SLCT works, but I can't
> > > think of a way to finish that expression in terms of |n|-1 and n. And
> > > what about n == 0. I'd just as soon drop that case.
> > >
> >
> > Is 1/0 actually defined in GLSL? I thought that the result is
> > undefined and we can basically do whatever, no? At least Intel seems
> > to return INT_MAX for 1/0.
>
> If you guys really like it, just add more comments that cover my questions.

Re: [Mesa-dev] [PATCH] nv50/ir: Add mul and mod constant optimizations

2019-07-23 Thread Karol Herbst

On Tue, Jul 23, 2019 at 4:50 PM Ilia Mirkin  wrote:
>
> You handle 1/n but not 1%n? TBH, the 1/n code isn't 100% obvious to
> me... 1/n = |n|-1 > 0 ?  i forget how SLCT works, but I can't
> think of a way to finish that expression in terms of |n|-1 and n. And
> what about n == 0. I'd just as soon drop that case.
>

Is 1/0 actually defined in GLSL? I thought that the result is
undefined and we can basically do whatever, no? At least Intel seems
to return INT_MAX for 1/0.
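
A plain C rendering of the 1/n sequence may help here. It assumes that
SLCT with CC_GT yields its first source when the third source is greater
than zero and its second source otherwise; that reading is an assumption,
not something confirmed in this thread. toy_one_div and
toy_one_div_matches are illustrative names.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* 1/n for signed 32-bit n, following the patch: tA = |n|, tB = tA - 1,
 * result = (tB > 0) ? 0 : n.  INT_MIN is excluded because abs(INT_MIN)
 * overflows in C. */
int
toy_one_div(int n)
{
   assert(n != INT_MIN);
   int tA = abs(n);
   int tB = tA - 1;
   return (tB > 0) ? 0 : n;
}

/* Compare against real integer division for every non-zero n in
 * [lo, hi]. Returns 1 when all values agree, 0 otherwise. */
int
toy_one_div_matches(int lo, int hi)
{
   for (int n = lo; n <= hi; ++n) {
      if (n != 0 && toy_one_div(n) != 1 / n)
         return 0;
   }
   return 1;
}
```

For n == 0 the sequence simply yields 0, which is only acceptable if
integer division by zero really is undefined as suggested above.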

> On Tue, Jul 23, 2019 at 10:20 AM Mark Menzynski  wrote:
> >
> > Optimizations for 0/n, 1/n and 0%n.
> > No changes in shader db tests, because it is never used here, but it
> > should become handy.
> >
> > Signed-off-by: Mark Menzynski 
> > ---
> >  .../nouveau/codegen/nv50_ir_peephole.cpp  | 30 +--
> >  1 file changed, 28 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> > index 0b3220903b9..12069e19808 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> > @@ -1177,10 +1177,28 @@ ConstantFolding::opnd(Instruction *i, 
> > ImmediateValue &imm0, int s)
> >break;
> >
> > case OP_DIV:
> > -  if (s != 1 || (i->dType != TYPE_S32 && i->dType != TYPE_U32))
> > +  if (i->dType != TYPE_S32 && i->dType != TYPE_U32)
> >   break;
> > +
> >bld.setPosition(i, false);
> > -  if (imm0.reg.data.u32 == 0) {
> > +  if (s == 0) {
> > + if (imm0.reg.data.u32 == 0) {
> > +i->op = OP_MOV;
> > +i->setSrc(1, NULL);
> > + }
> > + else if (imm0.reg.data.u32 == 1) {
> > +Value *tA, *tB;
> > +Instruction *slct;
> > +
> > +tA = bld.mkOp1v(OP_ABS, TYPE_U32, bld.getSSA(), i->getSrc(1));
> > +tB = bld.mkOp2v(OP_ADD, TYPE_S32, bld.getSSA(), tA, 
> > bld.loadImm(NULL, -1));
> > +slct = bld.mkCmp(OP_SLCT, CC_GT, i->dType, bld.getSSA(), 
> > TYPE_U32, bld.loadImm(NULL, 0), i->getSrc(1), tB);
> > +i->def(0).replace(slct->getDef(0), false);
> > + }
> > + break;
> > +  }
> > +
> > +  if (s != 1 || imm0.reg.data.u32 == 0) {
> >   break;
> >} else
> >if (imm0.reg.data.u32 == 1) {
> > @@ -1259,6 +1277,14 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
> > &imm0, int s)
> >break;
> >
> > case OP_MOD:
> > +  if (s == 0) {
> > + if (imm0.reg.data.u32 == 0) {
> > +i->op = OP_MOV;
> > +i->setSrc(1, NULL);
> > + }
> > + break;
> > +  }
> > +
> >if (s == 1 && imm0.isPow2()) {
> >   bld.setPosition(i, false);
> >   if (i->sType == TYPE_U32) {
> > --
> > 2.21.0
> >

[Mesa-dev] [PATCH] nvc0: remove nvc0_program.tp.input_patch_size

2019-07-08 Thread Karol Herbst

right now that's dead code

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h | 1 -
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c  | 4 
 src/gallium/drivers/nouveau/nvc0/nvc0_program.h  | 1 -
 3 files changed, 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
index 7c835ceab8d..95b3d633ee6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
@@ -123,7 +123,6 @@ struct nv50_ir_prog_info
  bool usesDrawParameters;
   } vp;
   struct {
- uint8_t inputPatchSize;
  uint8_t outputPatchSize;
  uint8_t partitioning;/* PIPE_TESS_PART */
  int8_t winding;  /* +1 (clockwise) / -1 (counter-clockwise) */
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index 1ff9f19f139..180b31ea893 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -343,8 +343,6 @@ nvc0_tcp_gen_header(struct nvc0_program *tcp, struct 
nv50_ir_prog_info *info)
 {
unsigned opcs = 6; /* output patch constants (at least the TessFactors) */
 
-   tcp->tp.input_patch_size = info->prop.tp.inputPatchSize;
-
if (info->numPatchConstants)
   opcs = 8 + info->numPatchConstants * 4;
 
@@ -374,8 +372,6 @@ nvc0_tcp_gen_header(struct nvc0_program *tcp, struct 
nv50_ir_prog_info *info)
 static int
 nvc0_tep_gen_header(struct nvc0_program *tep, struct nv50_ir_prog_info *info)
 {
-   tep->tp.input_patch_size = ~0;
-
tep->hdr[0] = 0x20061 | (3 << 10);
tep->hdr[4] = 0xff000;
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_program.h
index b73822ea9f7..183b14a42c2 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.h
@@ -54,7 +54,6 @@ struct nvc0_program {
} fp;
struct {
   uint32_t tess_mode; /* ~0 if defined by the other stage */
-  uint32_t input_patch_size;
} tp;
struct {
   uint32_t lmem_size; /* local memory (TGSI PRIVATE resource) size */
-- 
2.21.0


Re: [Mesa-dev] [PATCH] nouveau: handle new CAPS

2019-07-02 Thread Karol Herbst

On Tue, Jul 2, 2019 at 5:54 PM Ilia Mirkin  wrote:
>
> Can you check on PIPE_CAP_COMPUTE_SHADER_DERIVATIVES ? I think we
> should be able to just flip that on for nvc0. Also the
> CS_DERIVED_SYSTEM_VALUES thing might be useful -- I had wanted to do
> that a while ago but laziness defeated me. Now that it's there though
> ... we have sysvals for many of those derived things.
>
> Or at least add commentary about each one, like "should be enabled
> when we get to it" sort of thing.
>

I added a trello card for the PIPE_CAP_COMPUTE_SHADER_DERIVATIVES one,
but I could add another one for CS_DERIVED_SYSTEM_VALUES

> On Tue, Jul 2, 2019 at 11:49 AM Karol Herbst  wrote:
> >
> > Signed-off-by: Karol Herbst 
> > ---
> >  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 13 +
> >  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 13 +
> >  2 files changed, 26 insertions(+)
> >
> > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
> > b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> > index b84330b4b38..24796aff1ce 100644
> > --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> > +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> > @@ -320,6 +320,13 @@ nv50_screen_get_param(struct pipe_screen *pscreen, 
> > enum pipe_cap param)
> > case PIPE_CAP_NIR_COMPACT_ARRAYS:
> > case PIPE_CAP_COMPUTE:
> > case PIPE_CAP_IMAGE_LOAD_FORMATTED:
> > +   case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
> > +   case PIPE_CAP_ATOMIC_FLOAT_MINMAX:
> > +   case PIPE_CAP_CONSERVATIVE_RASTER_INNER_COVERAGE:
> > +   case PIPE_CAP_FRAGMENT_SHADER_INTERLOCK:
> > +   case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
> > +   case PIPE_CAP_FBFETCH_COHERENT:
> > +   case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
> >return 0;
> >
> > case PIPE_CAP_VENDOR_ID:
> > @@ -338,8 +345,14 @@ nv50_screen_get_param(struct pipe_screen *pscreen, 
> > enum pipe_cap param)
> >return dev->vram_size >> 20;
> > case PIPE_CAP_UMA:
> >return 0;
> > +
> > default:
> >debug_printf("%s: unhandled cap %d\n", __func__, param);
> > +  /* fallthrough */
> > +   /* caps where we want the default value */
> > +   case PIPE_CAP_DMABUF:
> > +   case PIPE_CAP_ESSL_FEATURE_LEVEL:
> > +   case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
> >return u_pipe_screen_get_param_defaults(pscreen, param);
> > }
> >  }
> > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
> > b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> > index 3a543e54d1f..bf883631b86 100644
> > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> > @@ -355,6 +355,13 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, 
> > enum pipe_cap param)
> > case PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS:
> > case PIPE_CAP_NIR_COMPACT_ARRAYS:
> > case PIPE_CAP_IMAGE_LOAD_FORMATTED:
> > +   case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
> > +   case PIPE_CAP_ATOMIC_FLOAT_MINMAX:
> > +   case PIPE_CAP_CONSERVATIVE_RASTER_INNER_COVERAGE:
> > +   case PIPE_CAP_FRAGMENT_SHADER_INTERLOCK:
> > +   case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
> > +   case PIPE_CAP_FBFETCH_COHERENT:
> > +   case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
> >return 0;
> >
> > case PIPE_CAP_VENDOR_ID:
> > @@ -373,8 +380,14 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, 
> > enum pipe_cap param)
> >return dev->vram_size >> 20;
> > case PIPE_CAP_UMA:
> >return 0;
> > +
> > default:
> >debug_printf("%s: unhandled cap %d\n", __func__, param);
> > +  /* fallthrough */
> > +   /* caps where we want the default value */
> > +   case PIPE_CAP_DMABUF:
> > +   case PIPE_CAP_ESSL_FEATURE_LEVEL:
> > +   case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
> >return u_pipe_screen_get_param_defaults(pscreen, param);
> > }
> >  }
> > --
> > 2.21.0
> >

[Mesa-dev] [PATCH] nouveau: handle new CAPS

2019-07-02 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 13 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 13 +
 2 files changed, 26 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index b84330b4b38..24796aff1ce 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -320,6 +320,13 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_NIR_COMPACT_ARRAYS:
case PIPE_CAP_COMPUTE:
case PIPE_CAP_IMAGE_LOAD_FORMATTED:
+   case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
+   case PIPE_CAP_ATOMIC_FLOAT_MINMAX:
+   case PIPE_CAP_CONSERVATIVE_RASTER_INNER_COVERAGE:
+   case PIPE_CAP_FRAGMENT_SHADER_INTERLOCK:
+   case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
+   case PIPE_CAP_FBFETCH_COHERENT:
+   case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
@@ -338,8 +345,14 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
   return dev->vram_size >> 20;
case PIPE_CAP_UMA:
   return 0;
+
default:
   debug_printf("%s: unhandled cap %d\n", __func__, param);
+  /* fallthrough */
+   /* caps where we want the default value */
+   case PIPE_CAP_DMABUF:
+   case PIPE_CAP_ESSL_FEATURE_LEVEL:
+   case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
   return u_pipe_screen_get_param_defaults(pscreen, param);
}
 }
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index 3a543e54d1f..bf883631b86 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -355,6 +355,13 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS:
case PIPE_CAP_NIR_COMPACT_ARRAYS:
case PIPE_CAP_IMAGE_LOAD_FORMATTED:
+   case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
+   case PIPE_CAP_ATOMIC_FLOAT_MINMAX:
+   case PIPE_CAP_CONSERVATIVE_RASTER_INNER_COVERAGE:
+   case PIPE_CAP_FRAGMENT_SHADER_INTERLOCK:
+   case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
+   case PIPE_CAP_FBFETCH_COHERENT:
+   case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
@@ -373,8 +380,14 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
   return dev->vram_size >> 20;
case PIPE_CAP_UMA:
   return 0;
+
default:
   debug_printf("%s: unhandled cap %d\n", __func__, param);
+  /* fallthrough */
+   /* caps where we want the default value */
+   case PIPE_CAP_DMABUF:
+   case PIPE_CAP_ESSL_FEATURE_LEVEL:
+   case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
   return u_pipe_screen_get_param_defaults(pscreen, param);
}
 }
-- 
2.21.0
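
A standalone sketch of the switch layout used above, for readers not used
to seeing `default` ahead of other labels: unknown caps are logged and then
fall through into the group that returns the shared default, while caps
listed after `default` take the same return without the log line. All
TOY_* names, toy_defaults and the counter are invented for the example.

```c
#include <assert.h>
#include <stdio.h>

enum toy_cap {
   TOY_CAP_COMPUTE,       /* explicitly reported as unsupported */
   TOY_CAP_UMA,           /* explicitly reported as unsupported */
   TOY_CAP_DMABUF,        /* wants the shared default, silently */
   TOY_CAP_NEW_FEATURE,   /* not listed in the switch at all */
   TOY_CAP_COUNT
};

static int toy_unhandled_logged;

/* Stand-in for a shared helper that knows the default of every cap. */
static int
toy_defaults(enum toy_cap cap)
{
   assert((int)cap < TOY_CAP_COUNT);
   return cap == TOY_CAP_DMABUF ? 1 : 0;
}

int
toy_get_param(enum toy_cap cap)
{
   switch (cap) {
   case TOY_CAP_COMPUTE:
   case TOY_CAP_UMA:
      return 0;

   default:
      fprintf(stderr, "%s: unhandled cap %d\n", __func__, (int)cap);
      toy_unhandled_logged++;
      /* fallthrough */
   /* caps where we want the default value */
   case TOY_CAP_DMABUF:
      return toy_defaults(cap);
   }
}
```

The upside of this layout is that a newly added cap still gets a sane
value but shows up in the debug output until somebody lists it.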


Re: [Mesa-dev] [PATCH] nouveau: fix frees in unsupported IR error paths.

2019-06-18 Thread Karol Herbst

On Tue, Jun 18, 2019 at 11:14 PM Dave Airlie  wrote:
>
> From: Dave Airlie 
>
> This is pointless in that we won't ever hit those paths in real life,
> but coverity complains.
>

what does it actually complain about?

> Fixes: f014ae3c7cce ("nouveau: add support for nir")
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_program.c | 1 +
>  src/gallium/drivers/nouveau/nv50/nv50_state.c   | 2 ++
>  src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 2 ++
>  4 files changed, 6 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> index 940fb9ce25c..a725aedcd8e 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> @@ -346,6 +346,7 @@ nv50_program_translate(struct nv50_program *prog, 
> uint16_t chipset,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(info);
>return false;
> }
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> index 228feced5d1..89558ee442f 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> @@ -768,6 +768,7 @@ nv50_sp_state_create(struct pipe_context *pipe,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(prog);
>return NULL;
> }
>
> @@ -864,6 +865,7 @@ nv50_cp_state_create(struct pipe_context *pipe,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(prog);
>return NULL;
> }
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> index c81d8952c98..1ff9f19f139 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> @@ -594,6 +594,7 @@ nvc0_program_translate(struct nvc0_program *prog, 
> uint16_t chipset,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(info);
>return false;
> }
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index 2ab51c8529e..7c0f605dc16 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -607,6 +607,7 @@ nvc0_sp_state_create(struct pipe_context *pipe,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(prog);
>return NULL;
> }
>
> @@ -739,6 +740,7 @@ nvc0_cp_state_create(struct pipe_context *pipe,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(prog);
>return NULL;
> }
>
> --
> 2.21.0
>

Re: [Mesa-dev] [PATCH] nouveau: fix frees in unsupported IR error paths.

2019-06-18 Thread Karol Herbst

ohh, nvm... I already know...
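
For anyone else wondering: the report is presumably a resource leak, since
the default branch returns without releasing the structure allocated
earlier in the function. A toy reproduction of the fixed shape follows.
toy_prog, toy_state_create and the TOY_IR_* values are hypothetical, a
counter stands in for what a leak checker would track, and the assert on
the unsupported path is left out so that the path can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

enum toy_ir { TOY_IR_TGSI, TOY_IR_NIR, TOY_IR_UNKNOWN };

struct toy_prog {
   enum toy_ir type;
};

static int toy_live_allocs;

struct toy_prog *
toy_state_create(enum toy_ir type)
{
   struct toy_prog *prog = calloc(1, sizeof(*prog));
   if (!prog)
      return NULL;
   toy_live_allocs++;

   switch (type) {
   case TOY_IR_TGSI:
   case TOY_IR_NIR:
      prog->type = type;
      break;
   default:
      /* Without this free, `prog` would be lost on the early return. */
      free(prog);
      toy_live_allocs--;
      return NULL;
   }

   return prog;
}

void
toy_state_destroy(struct toy_prog *prog)
{
   if (prog) {
      assert(toy_live_allocs > 0);
      free(prog);
      toy_live_allocs--;
   }
}
```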

On Tue, Jun 18, 2019 at 11:18 PM Karol Herbst  wrote:
>
> On Tue, Jun 18, 2019 at 11:14 PM Dave Airlie  wrote:
> >
> > From: Dave Airlie 
> >
> > This is pointless in that we won't ever hit those paths in real life,
> > but coverity complains.
> >
>
> what does it actually complain about?
>
> > Fixes: f014ae3c7cce ("nouveau: add support for nir")
> > ---
> >  src/gallium/drivers/nouveau/nv50/nv50_program.c | 1 +
> >  src/gallium/drivers/nouveau/nv50/nv50_state.c   | 2 ++
> >  src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 1 +
> >  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 2 ++
> >  4 files changed, 6 insertions(+)
> >
> > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c 
> > b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> > index 940fb9ce25c..a725aedcd8e 100644
> > --- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
> > +++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> > @@ -346,6 +346,7 @@ nv50_program_translate(struct nv50_program *prog, 
> > uint16_t chipset,
> >break;
> > default:
> >assert(!"unsupported IR!");
> > +  free(info);
> >return false;
> > }
> >
> > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> > b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> > index 228feced5d1..89558ee442f 100644
> > --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> > +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> > @@ -768,6 +768,7 @@ nv50_sp_state_create(struct pipe_context *pipe,
> >break;
> > default:
> >assert(!"unsupported IR!");
> > +  free(prog);
> >return NULL;
> > }
> >
> > @@ -864,6 +865,7 @@ nv50_cp_state_create(struct pipe_context *pipe,
> >break;
> > default:
> >assert(!"unsupported IR!");
> > +  free(prog);
> >return NULL;
> > }
> >
> > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
> > b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> > index c81d8952c98..1ff9f19f139 100644
> > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> > @@ -594,6 +594,7 @@ nvc0_program_translate(struct nvc0_program *prog, 
> > uint16_t chipset,
> >break;
> > default:
> >assert(!"unsupported IR!");
> > +  free(info);
> >return false;
> > }
> >
> > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> > b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> > index 2ab51c8529e..7c0f605dc16 100644
> > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> > @@ -607,6 +607,7 @@ nvc0_sp_state_create(struct pipe_context *pipe,
> >break;
> > default:
> >assert(!"unsupported IR!");
> > +  free(prog);
> >return NULL;
> > }
> >
> > @@ -739,6 +740,7 @@ nvc0_cp_state_create(struct pipe_context *pipe,
> >break;
> > default:
> >assert(!"unsupported IR!");
> > +  free(prog);
> >return NULL;
> > }
> >
> > --
> > 2.21.0
> >

Re: [Mesa-dev] undefined behaviour in spirv_to_nir.c

2019-05-17 Thread Karol Herbst

Well, the code was required for the old-style load_const, as we unioned
the arrays. But now that the load_const data is just one 64-bit value
and we zero out untouched bits, I am quite sure we don't have to adjust
the bit size of the shift anymore. I would still feel better if we
had some explicit handling for it, even if the compiler just
optimizes it away.
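
As an illustration of what such explicit handling could look like, the
sketch below widens a value stored in a union member by going through a
temporary, so that the store never reads from an object overlapping its
destination (the pattern Coverity flagged). toy_const_value and
toy_widen_u8 are illustrative names; this is not the code from
spirv_to_nir.c, and the zeroed 64-bit layout is only taken from the
description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

union toy_const_value {
   uint8_t  u8;
   uint16_t u16;
   uint32_t u32;
   uint64_t u64;
};

/* Widen the 8-bit member to 32 bits in place. */
void
toy_widen_u8(union toy_const_value *v)
{
   assert(v != NULL);

   /* Writing `v->u32 = v->u8;` directly would assign between overlapping
    * objects of different types. Copy the value out first, then store. */
   uint8_t tmp = v->u8;

   memset(v, 0, sizeof(*v));   /* keep the untouched bits zeroed */
   v->u32 = tmp;
}
```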

On Fri, May 17, 2019 at 8:55 PM Jason Ekstrand  wrote:
>
> I'm not convinced that code is correct.  In particular, the bit_size value is 
> for the destination and not necessarily that one source.  As Karol points 
> out, it probably is safe to just delete.  However, I'd feel slightly better 
> about it if we figured out the right bit size and just called 
> nir_eval_const_opcode to do a u2u32 on the value.
>
> --Jason
>
> On Fri, May 17, 2019 at 1:24 AM Karol Herbst  wrote:
>>
>> Ohhh, yeah. I think we can actually just remove that code, as it
>> shouldn't have any effect on the constant's value.
>>
>> On Fri, May 17, 2019 at 4:07 AM Jason Ekstrand  wrote:
>> >
>> > I think it's fine but I'm not at my computer right now.
>> >
>> > --Jason
>> >
>> > On May 16, 2019 20:58:03 Dave Airlie  wrote:
>> >
>> > > Coverity gave me this:
>> > >
>> > > mesa-19.1.0-rc2/src/compiler/spirv/spirv_to_nir.c:1987:
>> > > overlapping_assignment: Assigning "src[1][i].u8" to "src[1][i].u32",
>> > > which have overlapping memory locations and different types.
>> > >
>> > > and the following lines, I think it's actually undefined behaviour wrt
>> > > the C spec.
>> > >
>> > > Dave.
>> >
>> >
>> >

Re: [Mesa-dev] undefined behaviour in spirv_to_nir.c

2019-05-17 Thread Karol Herbst

ohhh, yeah.. I think we can actually just remove that code, as it
shouldn't have any effect on the constant's value.

On Fri, May 17, 2019 at 4:07 AM Jason Ekstrand  wrote:
>
> I think it's fine but I'm not at my computer right now.
>
> --Jason
>
> On May 16, 2019 20:58:03 Dave Airlie  wrote:
>
> > Coverity gave me this:
> >
> > mesa-19.1.0-rc2/src/compiler/spirv/spirv_to_nir.c:1987:
> > overlapping_assignment: Assigning "src[1][i].u8" to "src[1][i].u32",
> > which have overlapping memory locations and different types.
> >
> > and the following lines, I think it's actually undefined behaviour wrt
> > the C spec.
> >
> > Dave.
>
>
>

[Mesa-dev] [PATCH] nv50/ir/nir: make use of SYSTEM_VALUE_MAX when iterating read sysvals

2019-05-12 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 7e59b83e8fc..cce9357cec7 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1341,7 +1341,7 @@ bool Converter::assignSlots() {
}
 
info->numSysVals = 0;
-   for (uint8_t i = 0; i < 64; ++i) {
+   for (uint8_t i = 0; i < SYSTEM_VALUE_MAX; ++i) {
   if (!(nir->info.system_values_read & 1ull << i))
  continue;
 
-- 
2.21.0


[Mesa-dev] [PATCH] nv50/ir/nir: prefer to shift 1ull instead of 1ll

2019-05-12 Thread Karol Herbst

Signed-off-by: Karol Herbst 
Suggested-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index a9a24267245..7e59b83e8fc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1334,7 +1334,7 @@ bool Converter::assignSlots() {
  else
 info->out[vary].mask |= ((1 << comp) - 1) << frac;
 
- if (nir->info.outputs_read & 1ll << slot)
+ if (nir->info.outputs_read & 1ull << slot)
 info->out[vary].oread = 1;
   }
   info->numOutputs = std::max(info->numOutputs, vary);
@@ -1342,7 +1342,7 @@ bool Converter::assignSlots() {
 
info->numSysVals = 0;
for (uint8_t i = 0; i < 64; ++i) {
-  if (!(nir->info.system_values_read & 1ll << i))
+  if (!(nir->info.system_values_read & 1ull << i))
  continue;
 
   system_val_to_tgsi_semantic(i, , );
-- 
2.21.0
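
A small sketch of why the unsigned literal matters when walking a 64-bit
mask. The bound and the function name are hypothetical; the real
SYSTEM_VALUE_MAX comes from Mesa's headers and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bound; the real SYSTEM_VALUE_MAX lives in Mesa's headers. */
#define EXAMPLE_SYSTEM_VALUE_MAX 64

/* Count the slots set in a 64-bit "values read" mask. */
static unsigned
example_count_read_slots(uint64_t read_mask)
{
   unsigned count = 0;

   /* The mask only has 64 bits, so the loop bound must not exceed that. */
   assert(EXAMPLE_SYSTEM_VALUE_MAX <= 64);
   for (unsigned i = 0; i < EXAMPLE_SYSTEM_VALUE_MAX; ++i) {
      /* 1ull keeps the shift unsigned; in C, 1ll << 63 is not
       * representable in a signed 64-bit type. */
      if (!(read_mask & (1ull << i)))
         continue;
      ++count;
   }
   return count;
}
```

Using a named bound instead of a literal 64 keeps the loop in step with the
enum, as long as the enum never grows past the width of the mask.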


[Mesa-dev] [PATCH 14/15] clover: add support for consuming spirv

2019-05-11 Thread Karol Herbst

v2: rework arguments to compiler::compile_program
add assert to device::ir_format

Signed-off-by: Karol Herbst 
Reviewed-by: Francisco Jerez 
---
 src/gallium/include/pipe/p_defines.h  |  1 +
 .../state_trackers/clover/core/compiler.hpp   | 68 +++
 .../state_trackers/clover/core/device.cpp | 21 --
 .../state_trackers/clover/core/program.cpp| 10 ++-
 src/gallium/state_trackers/clover/meson.build |  1 +
 5 files changed, 89 insertions(+), 12 deletions(-)
 create mode 100644 src/gallium/state_trackers/clover/core/compiler.hpp

diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index e59a92ea529..90ee1427eb3 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -986,6 +986,7 @@ enum pipe_shader_ir
PIPE_SHADER_IR_TGSI = 0,
PIPE_SHADER_IR_NATIVE,
PIPE_SHADER_IR_NIR,
+   PIPE_SHADER_IR_SPIRV,
 };
 
 /**
diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp 
b/src/gallium/state_trackers/clover/core/compiler.hpp
new file mode 100644
index 000..96004459e14
--- /dev/null
+++ b/src/gallium/state_trackers/clover/core/compiler.hpp
@@ -0,0 +1,68 @@
+//
+// Copyright 2019 Red Hat, Inc.
+//
+// Permission is hereby granted, free of charge, to any person obtaining a
+// copy of this software and associated documentation files (the "Software"),
+// to deal in the Software without restriction, including without limitation
+// the rights to use, copy, modify, merge, publish, distribute, sublicense,
+// and/or sell copies of the Software, and to permit persons to whom the
+// Software is furnished to do so, subject to the following conditions:
+//
+// The above copyright notice and this permission notice shall be included in
+// all copies or substantial portions of the Software.
+//
+// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+// THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+// OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+// ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+// OTHER DEALINGS IN THE SOFTWARE.
+//
+
+#ifndef CLOVER_CORE_COMPILER_HPP
+#define CLOVER_CORE_COMPILER_HPP
+
+#include "core/device.hpp"
+#include "core/module.hpp"
+#include "llvm/invocation.hpp"
+#include "spirv/invocation.hpp"
+
+namespace clover {
+   namespace compiler {
+  static inline module
+  compile_program(const std::string &source, const header_map &headers,
+  const device &dev, const std::string &opts,
+  std::string &log) {
+ switch (dev.ir_format()) {
+#ifdef CLOVER_ALLOW_SPIRV
+ case PIPE_SHADER_IR_SPIRV:
+return llvm::compile_to_spirv(source, headers, dev, opts, log);
+#endif
+ case PIPE_SHADER_IR_NATIVE:
+return llvm::compile_program(source, headers, dev, opts, log);
+ default:
+unreachable("device with unsupported IR");
+throw error(CL_INVALID_VALUE);
+ }
+  }
+
+  static inline module
+  link_program(const std::vector<module> &ms, const device &dev,
+   const std::string &opts, std::string &log) {
+ switch (dev.ir_format()) {
+#ifdef CLOVER_ALLOW_SPIRV
+ case PIPE_SHADER_IR_SPIRV:
+return spirv::link_program(ms, dev, opts, log);
+#endif
+ case PIPE_SHADER_IR_NATIVE:
+return llvm::link_program(ms, dev, opts, log);
+ default:
+unreachable("device with unsupported IR");
+throw error(CL_INVALID_VALUE);
+ }
+  }
+   }
+}
+
+#endif
diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
b/src/gallium/state_trackers/clover/core/device.cpp
index de635454857..417a048b4db 100644
--- a/src/gallium/state_trackers/clover/core/device.cpp
+++ b/src/gallium/state_trackers/clover/core/device.cpp
@@ -45,12 +45,17 @@ namespace {
 device::device(clover::platform &platform, pipe_loader_device *ldev) :
platform(platform), ldev(ldev) {
pipe = pipe_loader_create_screen(ldev);
-   if (!pipe || !pipe->get_param(pipe, PIPE_CAP_COMPUTE) ||
-   !supports_ir(PIPE_SHADER_IR_NATIVE)) {
-  if (pipe)
- pipe->destroy(pipe);
-  throw error(CL_INVALID_DEVICE);
+   if (pipe && pipe->get_param(pipe, PIPE_CAP_COMPUTE)) {
+  if (supports_ir(PIPE_SHADER_IR_NATIVE))
+ return;
+#ifdef CLOVER_ALLOW_SPIRV
+  if (supports_ir(PIPE_SHADER_IR_SPIRV))
+ return;
+#endif
}
+   if (pipe)
+  pipe->destroy(pipe);
+   throw error(CL_INVALID_DEVICE);
 }
 
 device::~device() {
@@ -245,7 +250,11 @@ device::vendor_name() const {
 
 enum pipe_shader_ir
 device::ir_format() const {
-   return PIPE_SHADER_IR_NATIVE;
+   i

[Mesa-dev] [PATCH 11/15] rename pipe_llvm_program_header to pipe_binary_program_header

2019-05-11 Thread Karol Herbst

We want to use it for other formats as well, so give it a more generic name

Signed-off-by: Karol Herbst 
Reviewed-by: Francisco Jerez 
---
 src/gallium/drivers/r600/evergreen_compute.c  | 2 +-
 src/gallium/drivers/radeonsi/si_compute.c | 2 +-
 src/gallium/include/pipe/p_state.h| 2 +-
 src/gallium/state_trackers/clover/llvm/codegen/common.cpp | 2 +-
 src/gallium/state_trackers/clover/spirv/invocation.cpp| 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index 34e5755696f..2f4d84405db 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -410,7 +410,7 @@ static void *evergreen_create_compute_state(struct 
pipe_context *ctx,
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_pipe_compute *shader = CALLOC_STRUCT(r600_pipe_compute);
 #ifdef HAVE_OPENCL
-   const struct pipe_llvm_program_header *header;
+   const struct pipe_binary_program_header *header;
void *p;
boolean use_kill;
 #endif
diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index ae10709f2f1..72fc3a197e7 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -232,7 +232,7 @@ static void *si_create_compute_state(
>compiler_ctx_state,
program, 
si_create_compute_state_async);
} else {
-   const struct pipe_llvm_program_header *header;
+   const struct pipe_binary_program_header *header;
header = cso->prog;
 
ac_elf_read(header->blob, header->num_bytes, 
>shader.binary);
diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index 27350091b82..c94dfb0ba78 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -881,7 +881,7 @@ struct pipe_grid_info
 /**
  * Structure used as a header for serialized LLVM programs.
  */
-struct pipe_llvm_program_header
+struct pipe_binary_program_header
 {
uint32_t num_bytes; /**< Number of bytes in the LLVM bytecode program. */
char blob[];
diff --git a/src/gallium/state_trackers/clover/llvm/codegen/common.cpp 
b/src/gallium/state_trackers/clover/llvm/codegen/common.cpp
index 98a9d5ffb5e..3879fb61a02 100644
--- a/src/gallium/state_trackers/clover/llvm/codegen/common.cpp
+++ b/src/gallium/state_trackers/clover/llvm/codegen/common.cpp
@@ -177,7 +177,7 @@ namespace {
 
module::section
make_text_section(const std::vector ) {
-  const pipe_llvm_program_header header { uint32_t(code.size()) };
+  const pipe_binary_program_header header { uint32_t(code.size()) };
   module::section text { 0, module::section::text_executable,
  header.num_bytes, {} };
 
diff --git a/src/gallium/state_trackers/clover/spirv/invocation.cpp 
b/src/gallium/state_trackers/clover/spirv/invocation.cpp
index 2fd5a876a32..5f71e94bf42 100644
--- a/src/gallium/state_trackers/clover/spirv/invocation.cpp
+++ b/src/gallium/state_trackers/clover/spirv/invocation.cpp
@@ -103,7 +103,7 @@ namespace {
module::section
make_text_section(const std::vector ,
  enum module::section::type section_type) {
-  const pipe_llvm_program_header header { uint32_t(code.size()) };
+  const pipe_binary_program_header header { uint32_t(code.size()) };
   module::section text { 0, section_type, header.num_bytes, {} };
 
   text.data.insert(text.data.end(), reinterpret_cast(),
@@ -649,7 +649,7 @@ clover::spirv::link_program(const std::vector 
,
  assert(false);
   }
 
-  const auto c_il = ((struct 
pipe_llvm_program_header*)msec->data.data())->blob;
+  const auto c_il = ((struct 
pipe_binary_program_header*)msec->data.data())->blob;
   const auto length = msec->size;
 
   sections.push_back(reinterpret_cast(c_il));
-- 
2.21.0
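
As an illustration of what the renamed header describes, here is a sketch of
a length-prefixed blob using a C flexible array member. The struct and
function names are hypothetical mirrors, not the Gallium definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the header: a byte count followed by the payload. */
struct example_binary_program_header {
   uint32_t num_bytes;
   char blob[];
};

/* Allocate header and payload in one block; the caller frees the result. */
static struct example_binary_program_header *
example_wrap_binary(const void *code, uint32_t num_bytes)
{
   assert(code != NULL || num_bytes == 0);

   struct example_binary_program_header *hdr =
      malloc(sizeof(*hdr) + num_bytes);

   if (!hdr)
      return NULL;

   hdr->num_bytes = num_bytes;
   if (num_bytes)
      memcpy(hdr->blob, code, num_bytes);
   return hdr;
}
```

Nothing in this layout is specific to LLVM bytecode, which is presumably why
a more generic name fits once SPIR-V binaries travel through the same header.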


[Mesa-dev] [PATCH 04/15] nv50/ir/nir: implement load/store_global

2019-05-11 Thread Karol Herbst

required by OpenCL

v2: fix setting globalAccess

Signed-off-by: Karol Herbst 
Reviewed-by: Pierre Moreau 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 36 +++
 1 file changed, 36 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index b18a984eeec..a9b5ba990f3 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -2624,6 +2624,42 @@ Converter::visit(nir_intrinsic_instr *insn)
   mkOp1(OP_RDSV, dType, newDefs[1], mkSysVal(SV_CLOCK, 0))->fixed = 1;
   break;
}
+   case nir_intrinsic_load_global: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *indirectOffset;
+  uint32_t offset = getIndirect(&insn->src[0], 0, indirectOffset);
+
+  for (auto i = 0u; i < insn->num_components; ++i)
+ loadFrom(FILE_MEMORY_GLOBAL, 0, dType, newDefs[i], offset, i, 
indirectOffset);
+
+  info->io.globalAccess |= 0x1;
+  break;
+   }
+   case nir_intrinsic_store_global: {
+  DataType sType = getSType(insn->src[0], false, false);
+
+  for (auto i = 0u; i < insn->num_components; ++i) {
+ if (!((1u << i) & nir_intrinsic_write_mask(insn)))
+continue;
+ if (typeSizeof(sType) == 8) {
+Value *split[2];
+mkSplit(split, 4, getSrc(&insn->src[0], i));
+
+Symbol *sym = mkSymbol(FILE_MEMORY_GLOBAL, 0, TYPE_U32, i * typeSizeof(sType));
+mkStore(OP_STORE, TYPE_U32, sym, getSrc(&insn->src[1], 0), split[0]);
+
+sym = mkSymbol(FILE_MEMORY_GLOBAL, 0, TYPE_U32, i * typeSizeof(sType) + 4);
+mkStore(OP_STORE, TYPE_U32, sym, getSrc(&insn->src[1], 0), split[1]);
+ } else {
+Symbol *sym = mkSymbol(FILE_MEMORY_GLOBAL, 0, sType, i * typeSizeof(sType));
+mkStore(OP_STORE, sType, sym, getSrc(&insn->src[1], 0), getSrc(&insn->src[0], i));
+ }
+  }
+
+  info->io.globalAccess |= 0x2;
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.21.0
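
The store path above skips components masked out by the write mask and
splits 64-bit values into two 32-bit stores. A plain C sketch of the same
idea on a byte buffer follows; the function name is hypothetical and the
low-word-first order is an assumption of the sketch, not something the patch
states.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Store up to four 64-bit components into a byte buffer, skipping
 * components whose write-mask bit is clear.  Each enabled component is
 * written as two 32-bit stores, low word first (an assumption here).
 */
static void
example_store_u64_components(uint8_t *dst, const uint64_t *src,
                             unsigned num_components, unsigned write_mask)
{
   assert(num_components <= 4);
   for (unsigned i = 0; i < num_components; ++i) {
      if (!((1u << i) & write_mask))
         continue;

      const uint32_t lo = (uint32_t)(src[i] & 0xffffffffu);
      const uint32_t hi = (uint32_t)(src[i] >> 32);

      /* Component i starts at byte offset i * 8; the high word sits 4 bytes later. */
      memcpy(dst + i * 8, &lo, sizeof(lo));
      memcpy(dst + i * 8 + 4, &hi, sizeof(hi));
   }
}
```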


[Mesa-dev] [PATCH 12/15] nir/spirv: add spirv_to_nir_cl

2019-05-11 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/compiler/Makefile.sources|   1 +
 src/compiler/nir/meson.build |   1 +
 src/compiler/spirv/nir_spirv.h   |   4 +
 src/compiler/spirv/spirv_to_nir_cl.c | 124 +++
 4 files changed, 130 insertions(+)
 create mode 100644 src/compiler/spirv/spirv_to_nir_cl.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 1b6dc25f1ed..6265bdca359 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -346,6 +346,7 @@ SPIRV_FILES = \
spirv/spirv.h \
spirv/spirv_info.h \
spirv/spirv_to_nir.c \
+   spirv/spirv_to_nir_cl.c \
spirv/vtn_alu.c \
spirv/vtn_amd.c \
spirv/vtn_cfg.c \
diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
index 69de4121a73..d115022d134 100644
--- a/src/compiler/nir/meson.build
+++ b/src/compiler/nir/meson.build
@@ -221,6 +221,7 @@ files_libnir = files(
   '../spirv/spirv.h',
   '../spirv/spirv_info.h',
   '../spirv/spirv_to_nir.c',
+  '../spirv/spirv_to_nir_cl.c',
   '../spirv/vtn_alu.c',
   '../spirv/vtn_amd.c',
   '../spirv/vtn_cfg.c',
diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
index 7a16422b291..1ce7cbaf998 100644
--- a/src/compiler/spirv/nir_spirv.h
+++ b/src/compiler/spirv/nir_spirv.h
@@ -101,6 +101,10 @@ nir_function *spirv_to_nir(const uint32_t *words, size_t 
word_count,
const struct spirv_to_nir_options *options,
const nir_shader_compiler_options *nir_options);
 
+nir_shader * spirv_to_nir_cl(const uint32_t *words, size_t word_count,
+ const char *entry_point_name,
+ const nir_shader_compiler_options *nir_options);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/compiler/spirv/spirv_to_nir_cl.c 
b/src/compiler/spirv/spirv_to_nir_cl.c
new file mode 100644
index 000..de9bcaf0d20
--- /dev/null
+++ b/src/compiler/spirv/spirv_to_nir_cl.c
@@ -0,0 +1,124 @@
+/*
+ * Copyright © 2019 Red Hat, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Karol Herbst (kher...@redhat.com)
+ *
+ */
+
+#include "util/u_math.h"
+
+#include "nir/nir.h"
+#include "spirv/nir_spirv.h"
+
+nir_shader *
+spirv_to_nir_cl(const uint32_t *words, size_t word_count,
+const char *entry_point_name,
+const nir_shader_compiler_options *nir_options)
+{
+   struct spirv_to_nir_options spirv_options = {
+  .caps = {
+ .address = true,
+ .float64 = true,
+ .int8 = true,
+ .int16 = true,
+ .int64 = true,
+ .kernel = true,
+  },
+   };
+
+   nir_function *entry_point =
+  spirv_to_nir(words, word_count, NULL, 0, MESA_SHADER_KERNEL,
+   entry_point_name, &spirv_options, nir_options);
+
+   if (!entry_point)
+  return NULL;
+
+   nir_shader *nir = entry_point->shader;
+   nir->info.cs.local_size_variable = true;
+
+   nir_validate_shader(nir, "clover");
+
+   /* calculate input offsets */
+   unsigned offset = 0;
+   nir_foreach_variable_safe(var, &nir->inputs) {
+  offset = align(offset, glsl_get_cl_alignment(var->type));
+  var->data.driver_location = offset;
+  offset += glsl_get_cl_size(var->type);
+   }
+
+   /* inline all functions first */
+   NIR_PASS_V(nir, nir_lower_constant_initializers,
+  (nir_variable_mode)(nir_var_function_temp));
+   NIR_PASS_V(nir, nir_lower_returns);
+   NIR_PASS_V(nir, nir_inline_functions);
+   NIR_PASS_V(nir, nir_copy_prop);
+
+   /* Pick off the single entrypoint that we want */
+   foreach_list_typed_safe(nir_function, func, node, &nir->functions) {
+  if (func != entry_point)
+ exec_node_remove(&func->node);
+   }
+   assert(exec_list_leng
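
The "calculate input offsets" loop in this patch packs kernel arguments by
aligning a running offset before each one. A standalone sketch of that
layout rule is below; the struct and helpers are hypothetical and take sizes
and alignments as plain numbers instead of querying GLSL types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one kernel argument. */
struct example_kernel_arg {
   unsigned size;      /* size in bytes */
   unsigned alignment; /* power-of-two alignment in bytes */
   unsigned location;  /* filled in by example_assign_arg_offsets() */
};

/* Round value up to the next multiple of a power-of-two alignment. */
static unsigned
example_align(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Lay the arguments out back to back and return the total buffer size. */
static unsigned
example_assign_arg_offsets(struct example_kernel_arg *args, size_t count)
{
   unsigned offset = 0;

   for (size_t i = 0; i < count; ++i) {
      offset = example_align(offset, args[i].alignment);
      args[i].location = offset;
      offset += args[i].size;
   }
   return offset;
}
```

For example, arguments of sizes 4, 8, 1 and 4 bytes with natural alignment
would land at offsets 0, 8, 16 and 20.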

[Mesa-dev] [PATCH 08/15] clover/llvm: Add options for dumping SPIR-V binaries

2019-05-11 Thread Karol Herbst

From: Pierre Moreau 

Reviewed-by: Karol Herbst 
---
 .../state_trackers/clover/llvm/util.hpp   |  4 ++-
 .../clover/spirv/invocation.cpp   | 30 +++
 .../clover/spirv/invocation.hpp   |  4 +++
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/llvm/util.hpp 
b/src/gallium/state_trackers/clover/llvm/util.hpp
index 222becd614e..02e73e65071 100644
--- a/src/gallium/state_trackers/clover/llvm/util.hpp
+++ b/src/gallium/state_trackers/clover/llvm/util.hpp
@@ -101,7 +101,8 @@ namespace clover {
  enum flag {
 clc = 1 << 0,
 llvm = 1 << 1,
-native = 1 << 2
+native = 1 << 2,
+spirv = 1 << 3,
  };
 
  inline bool
@@ -111,6 +112,7 @@ namespace clover {
{ "llvm", llvm, "Dump the generated LLVM IR for all kernels." },
{ "native", native, "Dump kernel assembly code for targets "
  "specifying PIPE_SHADER_IR_NATIVE" },
+   { "spirv", spirv, "Dump the generated SPIR-V for all kernels." 
},
DEBUG_NAMED_VALUE_END
 };
 static const unsigned flags =
diff --git a/src/gallium/state_trackers/clover/spirv/invocation.cpp 
b/src/gallium/state_trackers/clover/spirv/invocation.cpp
index 62886e77495..6100fca0065 100644
--- a/src/gallium/state_trackers/clover/spirv/invocation.cpp
+++ b/src/gallium/state_trackers/clover/spirv/invocation.cpp
@@ -709,6 +709,30 @@ clover::spirv::is_valid_spirv(const uint32_t *binary, 
size_t length,
 
return spvTool.Validate(binary, length);
 }
+
+std::string
+clover::spirv::print_module(const std::vector ,
+const std::string _version) {
+   const spv_target_env target_env =
+  convert_opencl_str_to_target_env(opencl_version);
+   spvtools::SpirvTools spvTool(target_env);
+   spv_context spvContext = spvContextCreate(target_env);
+   if (!spvContext)
+  return "Failed to create an spv_context for disassembling the module.";
+
+   spv_text disassembly;
+   spvBinaryToText(spvContext,
+   reinterpret_cast<const uint32_t *>(binary.data()),
+   binary.size() / 4u, SPV_BINARY_TO_TEXT_OPTION_NONE,
+   &disassembly, nullptr);
+   spvContextDestroy(spvContext);
+
+   const std::string disassemblyStr = disassembly->str;
+   spvTextDestroy(disassembly);
+
+   return disassemblyStr;
+}
+
 #else
 module
 clover::spirv::link_program(const std::vector &/*modules*/,
@@ -724,4 +748,10 @@ clover::spirv::is_valid_spirv(const uint32_t * /*binary*/, 
size_t /*length*/,
   const context::notify_action &/*notify*/) {
return false;
 }
+
+std::string
+clover::spirv::print_module(const std::vector ,
+const std::string _version) {
+   return std::string();
+}
 #endif
diff --git a/src/gallium/state_trackers/clover/spirv/invocation.hpp 
b/src/gallium/state_trackers/clover/spirv/invocation.hpp
index 37cd1377cb2..4818ab5daf4 100644
--- a/src/gallium/state_trackers/clover/spirv/invocation.hpp
+++ b/src/gallium/state_trackers/clover/spirv/invocation.hpp
@@ -53,6 +53,10 @@ namespace clover {
   // link dependencies between them.
   module link_program(const std::vector , const device 
,
   const std::string , std::string _log);
+
+  // Returns a textual representation of the given binary.
+  std::string print_module(const std::vector ,
+   const std::string _version);
}
 }
 
-- 
2.21.0


[Mesa-dev] [PATCH 03/15] nv50/ir/nir: handle kernel inputs

2019-05-11 Thread Karol Herbst

required by OpenCL

Signed-off-by: Karol Herbst 
Reviewed-by: Pierre Moreau 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 21 ---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 1bf6b471b31..b18a984eeec 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -96,7 +96,10 @@ private:
// If the found value has not a constant part, the Value gets returned
// through the Value parameter.
uint32_t getIndirect(nir_src *, uint8_t, Value *&);
-   uint32_t getIndirect(nir_intrinsic_instr *, uint8_t s, uint8_t c, Value *&);
+   // isScalar indicates that the addressing is scalar, vec4 addressing is
+   // assumed otherwise
+   uint32_t getIndirect(nir_intrinsic_instr *, uint8_t s, uint8_t c, Value *&,
+bool isScalar = false);
 
uint32_t getSlotAddress(nir_intrinsic_instr *, uint8_t idx, uint8_t slot);
 
@@ -789,10 +792,10 @@ Converter::getIndirect(nir_src *src, uint8_t idx, Value 
*)
 }
 
 uint32_t
-Converter::getIndirect(nir_intrinsic_instr *insn, uint8_t s, uint8_t c, Value *&indirect)
+Converter::getIndirect(nir_intrinsic_instr *insn, uint8_t s, uint8_t c, Value *&indirect, bool isScalar)
 {
int32_t idx = nir_intrinsic_base(insn) + getIndirect(&insn->src[s], c, indirect);
-   if (indirect)
+   if (indirect && !isScalar)
   indirect = mkOp2v(OP_SHL, TYPE_U32, getSSA(4, FILE_ADDRESS), indirect, 
loadImm(NULL, 4));
return idx;
 }
@@ -2059,6 +2062,18 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_load_kernel_input: {
+  assert(prog->getType() == Program::TYPE_COMPUTE);
+  assert(insn->num_components == 1);
+
+  LValues &newDefs = convert(&insn->dest);
+  const DataType dType = getDType(insn);
+  Value *indirect;
+  uint32_t idx = getIndirect(insn, 0, 0, indirect, true);
+
+  mkLoad(dType, newDefs[0], mkSymbol(FILE_SHADER_INPUT, 0, dType, idx), 
indirect);
+  break;
+   }
case nir_intrinsic_load_barycentric_at_offset:
case nir_intrinsic_load_barycentric_at_sample:
case nir_intrinsic_load_barycentric_centroid:
-- 
2.21.0
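
The isScalar flag added here decides whether the indirect part of an address
is scaled. A tiny sketch of that decision, with a hypothetical helper name:
vec4 addressing shifts the index left by 4 (16-byte slots), scalar
addressing leaves it alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Turn a dynamic index into a byte offset.  With vec4 addressing the index
 * counts 16-byte slots and is shifted left by 4; with scalar addressing it
 * is already a byte offset and is used unchanged.
 */
static uint32_t
example_indirect_byte_offset(uint32_t indirect, bool is_scalar)
{
   /* Guard against losing high bits in the shift. */
   assert(is_scalar || indirect <= UINT32_MAX >> 4);
   return is_scalar ? indirect : indirect << 4;
}
```

Kernel inputs take the scalar path here, presumably because their offsets
are laid out in bytes rather than in vec4 slots.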


[Mesa-dev] [PATCH 00/15] Clover: support CL through SPIR-V

2019-05-11 Thread Karol Herbst

MR on gitlab: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/563

Because the series touches nouveau and a few other unrelated bits, I thought
it made sense to post it on the mailing list as well.

In short, this series adds support for providing OpenCL through SPIR-V by
using the SPIRV-LLVM-Translator library to convert LLVM IR to SPIR-V.

Karol Herbst (10):
  nv50/ir/nir: parse system values first and stop for compute shaders
  nv50/ir/nir: don't assert on !main
  nv50/ir/nir: handle kernel inputs
  nv50/ir/nir: implement load/store_global
  gallium: add blob field to pipe_llvm_program_header
  rename pipe_llvm_program_header to pipe_binary_program_header
  nir/spirv: add spirv_to_nir_cl
  gallium: add entry_point field to pipe_compute_state
  clover: add support for consuming spirv
  nvc0: expose spirv support

Pierre Moreau (5):
  meson: Check for SPIRV-Tools and llvm-spirv
  clover/spirv: Add functions for validating SPIR-V binaries
  clover/spirv: Add functions for parsing arguments, linking programs,
etc.
  clover/llvm: Add options for dumping SPIR-V binaries
  clover/llvm: Add functions for compiling from source to SPIR-V

 meson.build   |  13 +
 meson_options.txt |   6 +
 src/compiler/Makefile.sources |   1 +
 src/compiler/nir/meson.build  |   1 +
 src/compiler/spirv/nir_spirv.h|   4 +
 src/compiler/spirv/spirv_to_nir_cl.c  | 124 +++
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 120 ++-
 src/gallium/drivers/nouveau/nouveau_screen.c  |   1 +
 src/gallium/drivers/nouveau/nouveau_screen.h  |   1 +
 .../drivers/nouveau/nvc0/nvc0_screen.c|  14 +-
 .../drivers/nouveau/nvc0/nvc0_screen.h|   2 +
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  10 +
 src/gallium/drivers/r600/evergreen_compute.c  |   6 +-
 src/gallium/drivers/radeonsi/si_compute.c |   6 +-
 src/gallium/include/pipe/p_defines.h  |   1 +
 src/gallium/include/pipe/p_state.h|   4 +-
 .../state_trackers/clover/Makefile.sources|   4 +
 .../state_trackers/clover/core/compiler.hpp   |  68 ++
 .../state_trackers/clover/core/device.cpp |  21 +-
 .../state_trackers/clover/core/kernel.cpp |   1 +
 .../state_trackers/clover/core/program.cpp|  10 +-
 .../clover/llvm/codegen/common.cpp|   2 +-
 .../state_trackers/clover/llvm/invocation.cpp | 100 ++-
 .../state_trackers/clover/llvm/invocation.hpp |   8 +
 .../state_trackers/clover/llvm/util.hpp   |   4 +-
 src/gallium/state_trackers/clover/meson.build |  23 +-
 .../clover/spirv/invocation.cpp   | 756 ++
 .../clover/spirv/invocation.hpp   |  63 ++
 28 files changed, 1291 insertions(+), 83 deletions(-)
 create mode 100644 src/compiler/spirv/spirv_to_nir_cl.c
 create mode 100644 src/gallium/state_trackers/clover/core/compiler.hpp
 create mode 100644 src/gallium/state_trackers/clover/spirv/invocation.cpp
 create mode 100644 src/gallium/state_trackers/clover/spirv/invocation.hpp

-- 
2.21.0


[Mesa-dev] [PATCH 15/15] nvc0: expose spirv support

2019-05-11 Thread Karol Herbst

required for OpenCL

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/nouveau_screen.c   |  1 +
 src/gallium/drivers/nouveau/nouveau_screen.h   |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 14 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.h |  2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c  | 10 ++
 5 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_screen.c 
b/src/gallium/drivers/nouveau/nouveau_screen.c
index cbd45a1dc35..34d81e43893 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.c
+++ b/src/gallium/drivers/nouveau/nouveau_screen.c
@@ -187,6 +187,7 @@ nouveau_screen_init(struct nouveau_screen *screen, struct 
nouveau_device *dev)
   nouveau_mesa_debug = atoi(nv_dbg);
 
screen->prefer_nir = debug_get_bool_option("NV50_PROG_USE_NIR", false);
+   screen->force_enable_cl = debug_get_bool_option("NOUVEAU_ENABLE_CL", false);
 
/* These must be set before any failure is possible, as the cleanup
 * paths assume they're responsible for deleting them.
diff --git a/src/gallium/drivers/nouveau/nouveau_screen.h 
b/src/gallium/drivers/nouveau/nouveau_screen.h
index 1302c608bec..5f74b6b8f72 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.h
+++ b/src/gallium/drivers/nouveau/nouveau_screen.h
@@ -69,6 +69,7 @@ struct nouveau_screen {
struct disk_cache *disk_shader_cache;
 
bool prefer_nir;
+   bool force_enable_cl;
 
 #ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS
union {
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index fe80c7e9103..f1352548280 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -400,9 +400,13 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen,
switch (param) {
case PIPE_SHADER_CAP_PREFERRED_IR:
   return screen->prefer_nir ? PIPE_SHADER_IR_NIR : PIPE_SHADER_IR_TGSI;
-   case PIPE_SHADER_CAP_SUPPORTED_IRS:
-  return 1 << PIPE_SHADER_IR_TGSI |
- 1 << PIPE_SHADER_IR_NIR;
+   case PIPE_SHADER_CAP_SUPPORTED_IRS: {
+  uint32_t irs = 1 << PIPE_SHADER_IR_TGSI |
+ 1 << PIPE_SHADER_IR_NIR;
+  if (screen->force_enable_cl)
+ irs |= 1 << PIPE_SHADER_IR_SPIRV;
+  return irs;
+   }
case PIPE_SHADER_CAP_MAX_INSTRUCTIONS:
case PIPE_SHADER_CAP_MAX_ALU_INSTRUCTIONS:
case PIPE_SHADER_CAP_MAX_TEX_INSTRUCTIONS:
@@ -895,7 +899,7 @@ nvc0_screen_bind_cb_3d(struct nvc0_screen *screen, bool 
*can_serialize,
IMMED_NVC0(push, NVC0_3D(CB_BIND(stage)), (index << 4) | (size >= 0));
 }
 
-static const nir_shader_compiler_options nir_options = {
+const nir_shader_compiler_options nvc0_nir_options = {
.lower_fdiv = false,
.lower_ffma = false,
.fuse_ffma = false, /* nir doesn't track mad vs fma */
@@ -963,7 +967,7 @@ nvc0_screen_get_compiler_options(struct pipe_screen 
*pscreen,
  enum pipe_shader_type shader)
 {
if (ir == PIPE_SHADER_IR_NIR)
-  return &nir_options;
+  return &nvc0_nir_options;
return NULL;
 }
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
index 392980562bd..6ae70e5b88f 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
@@ -241,4 +241,6 @@ nvc0_screen_tsc_free(struct nvc0_screen *screen, struct 
nv50_tsc_entry *tsc)
}
 }
 
+extern const struct nir_shader_compiler_options nvc0_nir_options;
+
 #endif
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index 12e21862ee0..817c11de537 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -28,6 +28,7 @@
 
 #include "tgsi/tgsi_parse.h"
 #include "compiler/nir/nir.h"
+#include "compiler/spirv/nir_spirv.h"
 
 #include "nvc0/nvc0_stateobj.h"
 #include "nvc0/nvc0_context.h"
@@ -737,6 +738,15 @@ nvc0_cp_state_create(struct pipe_context *pipe,
case PIPE_SHADER_IR_NIR:
   prog->pipe.ir.nir = (nir_shader *)cso->prog;
   break;
+   case PIPE_SHADER_IR_SPIRV: {
+  const struct pipe_binary_program_header *hdr =
+ (const struct pipe_binary_program_header*)cso->prog;
+  prog->pipe.type = PIPE_SHADER_IR_NIR;
+  prog->pipe.ir.nir = spirv_to_nir_cl((uint32_t*)hdr->blob, hdr->num_bytes 
/ 4,
+  cso->entry_point,
+  &nvc0_nir_options);
+  break;
+   }
default:
   assert(!"unsupported IR!");
   return NULL;
-- 
2.21.0
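
The SUPPORTED_IRS change builds a bitmask with one bit per IR and only adds
SPIR-V when the debug option is set. A self-contained sketch of that pattern
follows; the enum and function names are hypothetical and only mirror the
order of the values shown in the p_defines.h hunk of patch 14.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical IR identifiers; the values only mirror the order in the patch. */
enum example_shader_ir {
   EXAMPLE_IR_TGSI = 0,
   EXAMPLE_IR_NATIVE,
   EXAMPLE_IR_NIR,
   EXAMPLE_IR_SPIRV,
};

/* Build the supported-IR bitmask, adding SPIR-V only when CL is force-enabled. */
static uint32_t
example_supported_irs(bool force_enable_cl)
{
   uint32_t irs = 1u << EXAMPLE_IR_TGSI | 1u << EXAMPLE_IR_NIR;

   if (force_enable_cl)
      irs |= 1u << EXAMPLE_IR_SPIRV;
   return irs;
}

/* Query a single bit of the mask. */
static bool
example_supports_ir(uint32_t irs, enum example_shader_ir ir)
{
   assert(ir <= EXAMPLE_IR_SPIRV);
   return (irs & (1u << ir)) != 0;
}
```

A state tracker would test the SPIR-V bit of the returned mask before
handing the driver a SPIR-V binary.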


[Mesa-dev] [PATCH 13/15] gallium: add entry_point field to pipe_compute_state

2019-05-11 Thread Karol Herbst

For binaries containing multiple entry points, the driver can, if this field is
set, optimize the binary in order to save memory.

For SPIR-V, it can be used to compile against a specific entry point. It is
guaranteed that the pc field can be ignored by the driver later if it decides
to do so.
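
As a rough illustration of the first paragraph (this is not code from the series), a driver handed a binary with several kernels could use the new field to pick out the one it needs and drop the rest. The `kernel_desc` table and the `find_entry_point` helper are hypothetical names invented for this sketch; only the idea of matching on the entry point name comes from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one kernel inside a multi-kernel binary. */
struct kernel_desc {
   const char *name;   /* symbol name of the kernel */
   unsigned offset;    /* byte offset of its code in the binary */
};

/* Return the descriptor matching entry_point, or NULL if there is none.
 * A NULL entry_point means "no hint", so the caller keeps the whole binary. */
static const struct kernel_desc *
find_entry_point(const struct kernel_desc *kernels, size_t count,
                 const char *entry_point)
{
   assert(kernels != NULL || count == 0);

   if (!entry_point)
      return NULL;

   for (size_t i = 0; i < count; i++) {
      if (strcmp(kernels[i].name, entry_point) == 0)
         return &kernels[i];
   }
   return NULL;
}
```

Treating a NULL name as "no hint" keeps callers that never set the field working as before.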

Signed-off-by: Karol Herbst 
---
 src/gallium/include/pipe/p_state.h| 1 +
 src/gallium/state_trackers/clover/core/kernel.cpp | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
index c94dfb0ba78..d043f0d19af 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -894,6 +894,7 @@ struct pipe_compute_state
unsigned req_local_mem; /**< Required size of the LOCAL resource. */
unsigned req_private_mem; /**< Required size of the PRIVATE resource. */
unsigned req_input_mem; /**< Required size of the INPUT resource. */
+   const char *entry_point; /**< name of the entry point. */
 };
 
 /**
diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp
index 7fe66ae4ea2..5ac66ab91c7 100644
--- a/src/gallium/state_trackers/clover/core/kernel.cpp
+++ b/src/gallium/state_trackers/clover/core/kernel.cpp
@@ -230,6 +230,7 @@ kernel::exec_context::bind(intrusive_ptr _q,
   cs.prog = &(msec.data[0]);
   cs.req_local_mem = mem_local;
   cs.req_input_mem = input.size();
+  cs.entry_point = kern.name().c_str();
   st = q->pipe->create_compute_state(q->pipe, );
   if (!st) {
  unbind(); // Cleanup
-- 
2.21.0


[Mesa-dev] [PATCH 09/15] clover/llvm: Add functions for compiling from source to SPIR-V

2019-05-11 Thread Karol Herbst

From: Pierre Moreau 

Reviewed-by: Karol Herbst 
---
 .../state_trackers/clover/llvm/invocation.cpp | 100 +++---
 .../state_trackers/clover/llvm/invocation.hpp |   8 ++
 src/gallium/state_trackers/clover/meson.build |   2 +-
 3 files changed, 92 insertions(+), 18 deletions(-)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 0a677ce2eaa..b4f59821323 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -30,6 +30,9 @@
 #include 
 #include 
 #include 
+#ifdef CLOVER_ALLOW_SPIRV
+#include 
+#endif
 
 #include 
 #include 
@@ -51,6 +54,9 @@
 #include "llvm/invocation.hpp"
 #include "llvm/metadata.hpp"
 #include "llvm/util.hpp"
+#ifdef CLOVER_ALLOW_SPIRV
+#include "spirv/invocation.hpp"
+#endif
 #include "util/algorithm.hpp"
 
 
@@ -182,7 +188,7 @@ namespace {
}
 
std::unique_ptr
-   create_compiler_instance(const device ,
+   create_compiler_instance(const device , const std::string& ir_target,
 const std::vector ,
 std::string _log) {
   std::unique_ptr c { new clang::CompilerInstance 
};
@@ -196,7 +202,7 @@ namespace {
   const std::vector copts =
  map(std::mem_fn(::string::c_str), opts);
 
-  const target  = dev.ir_target();
+  const target  = ir_target;
   const std::string _clc_version = dev.device_clc_version();
 
   if (!clang::CompilerInvocation::CreateFromArgs(
@@ -235,19 +241,29 @@ namespace {
compile(LLVMContext , clang::CompilerInstance ,
const std::string , const std::string ,
const header_map , const device ,
-   const std::string , std::string _log) {
+   const std::string , bool use_libclc, std::string _log) {
   c.getFrontendOpts().ProgramAction = clang::frontend::EmitLLVMOnly;
   c.getHeaderSearchOpts().UseBuiltinIncludes = true;
   c.getHeaderSearchOpts().UseStandardSystemIncludes = true;
   c.getHeaderSearchOpts().ResourceDir = CLANG_RESOURCE_DIR;
 
-  // Add libclc generic search path
-  c.getHeaderSearchOpts().AddPath(LIBCLC_INCLUDEDIR,
-  clang::frontend::Angled,
-  false, false);
+  if (use_libclc) {
+ // Add libclc generic search path
+ c.getHeaderSearchOpts().AddPath(LIBCLC_INCLUDEDIR,
+ clang::frontend::Angled,
+ false, false);
 
-  // Add libclc include
-  c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
+ // Add libclc include
+ c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
+  } else {
+ // Add opencl-c generic search path
+ c.getHeaderSearchOpts().AddPath(CLANG_RESOURCE_DIR,
+ clang::frontend::Angled,
+ false, false);
+
+ // Add opencl include
+ c.getPreprocessorOpts().Includes.push_back("opencl-c.h");
+  }
 
   // Add definition for the OpenCL version
   c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=" +
@@ -279,8 +295,9 @@ namespace {
   // attribute will prevent Clang from creating illegal uses of
   // barrier() (e.g. Moving barrier() inside a conditional that is
   // no executed by all threads) during its optimizaton passes.
-  compat::add_link_bitcode_file(c.getCodeGenOpts(),
-LIBCLC_LIBEXECDIR + dev.ir_target() + ".bc");
+  if (use_libclc)
+ compat::add_link_bitcode_file(c.getCodeGenOpts(),
+   LIBCLC_LIBEXECDIR + dev.ir_target() + ".bc");
 
   // Compile the code
   clang::EmitLLVMOnlyAction act();
@@ -301,8 +318,10 @@ clover::llvm::compile_program(const std::string ,
   debug::log(".cl", "// Options: " + opts + '\n' + source);
 
auto ctx = create_context(r_log);
-   auto c = create_compiler_instance(dev, tokenize(opts + " input.cl"), r_log);
-   auto mod = compile(*ctx, *c, "input.cl", source, headers, dev, opts, r_log);
+   auto c = create_compiler_instance(dev, dev.ir_target(),
+ tokenize(opts + " input.cl"), r_log);
+   auto mod = compile(*ctx, *c, "input.cl", source, headers, dev, opts, true,
+  r_log);
 
if (has_flag(debug::llvm))
   debug::log(".ll", print_module_bitcode(*mod));
@@ -363,14 +382,14 @@ namespace {
 
 module
 clover::llvm::link_program(const std::vector ,
-   const device ,
-   const std::string , std::string _log) {
+   const device , cons

[Mesa-dev] [PATCH 07/15] clover/spirv: Add functions for parsing arguments, linking programs, etc.

2019-05-11 Thread Karol Herbst

From: Pierre Moreau 

v2 (Karol Herbst):
  silence warnings about unhandled enum values
---
 .../clover/spirv/invocation.cpp   | 598 ++
 .../clover/spirv/invocation.hpp   |  12 +
 2 files changed, 610 insertions(+)

diff --git a/src/gallium/state_trackers/clover/spirv/invocation.cpp b/src/gallium/state_trackers/clover/spirv/invocation.cpp
index b874f2f061c..62886e77495 100644
--- a/src/gallium/state_trackers/clover/spirv/invocation.cpp
+++ b/src/gallium/state_trackers/clover/spirv/invocation.cpp
@@ -22,10 +22,24 @@
 
 #include "invocation.hpp"
 
+#include 
+#include 
+#include 
+#include 
+#include 
+
 #ifdef CLOVER_ALLOW_SPIRV
 #include 
+#include 
 #endif
 
+#include "core/error.hpp"
+#include "core/platform.hpp"
+#include "invocation.hpp"
+#include "llvm/util.hpp"
+#include "pipe/p_state.h"
+#include "util/algorithm.hpp"
+#include "util/functional.hpp"
 #include "util/u_math.h"
 
 #include "compiler/spirv/spirv.h"
@@ -34,6 +48,472 @@ using namespace clover;
 
 namespace {
 
+   template
+   T get(const char *source, size_t index) {
+  const uint32_t *word_ptr = reinterpret_cast(source);
+  return static_cast(word_ptr[index]);
+   }
+
+   enum module::argument::type
+   convertStorageClass(SpvStorageClass storage_class, std::string ) {
+  switch (storage_class) {
+  case SpvStorageClassFunction:
+ return module::argument::scalar;
+  case SpvStorageClassUniformConstant:
+ return module::argument::constant;
+  case SpvStorageClassWorkgroup:
+ return module::argument::local;
+  case SpvStorageClassCrossWorkgroup:
+ return module::argument::global;
+  default:
+ err += "Invalid storage type " + std::to_string(storage_class) + "\n";
+ throw build_error();
+  }
+   }
+
+   enum module::argument::type
+   convertImageType(SpvId id, SpvDim dim, SpvAccessQualifier access,
+std::string ) {
+#define APPEND_DIM(d) \
+  switch(access) { \
+  case SpvAccessQualifierReadOnly: \
+ return module::argument::image##d##_rd; \
+  case SpvAccessQualifierWriteOnly: \
+ return module::argument::image##d##_wr; \
+  default: \
+ err += "Unsupported access qualifier " #d " for image " + \
+std::to_string(id); \
+ throw build_error(); \
+  }
+
+  switch (dim) {
+  case SpvDim2D:
+ APPEND_DIM(2d)
+  case SpvDim3D:
+ APPEND_DIM(3d)
+  default:
+ err += "Unsupported dimension " + std::to_string(dim) + " for image " +
+std::to_string(id);
+ throw build_error();
+  }
+
+#undef APPEND_DIM
+   }
+
+   module::section
+   make_text_section(const std::vector ,
+ enum module::section::type section_type) {
+  const pipe_llvm_program_header header { uint32_t(code.size()) };
+  module::section text { 0, section_type, header.num_bytes, {} };
+
+  text.data.insert(text.data.end(), reinterpret_cast(),
+   reinterpret_cast() + sizeof(header));
+  text.data.insert(text.data.end(), code.begin(), code.end());
+
+  return text;
+   }
+
+   module
+   create_module_from_spirv(const std::vector ,
+size_t pointer_byte_size,
+std::string ) {
+  const size_t length = source.size() / sizeof(uint32_t);
+  size_t i = 5u; // Skip header
+
+  std::string kernel_name;
+  size_t kernel_nb = 0u;
+  std::vector args;
+
+  module m;
+
+  std::unordered_map kernels;
+  std::unordered_map types;
+  std::unordered_map pointer_types;
+  std::unordered_map constants;
+  std::unordered_set packed_structures;
+  std::unordered_map>
+ func_param_attr_map;
+
+#define GET_OPERAND(type, operand_id) get(source.data(), i + operand_id)
+
+  while (i < length) {
+ const auto desc_word = get(source.data(), i);
+ const auto opcode = static_cast(desc_word & SpvOpCodeMask);
+ const unsigned int num_operands = desc_word >> SpvWordCountShift;
+
+ switch (opcode) {
+ case SpvOpEntryPoint:
+if (GET_OPERAND(SpvExecutionModel, 1) == SpvExecutionModelKernel)
+   kernels.emplace(GET_OPERAND(SpvId, 2),
+   source.data() + (i + 3u) * sizeof(uint32_t));
+break;
+
+ case SpvOpDecorate: {
+const auto id = GET_OPERAND(SpvId, 1);
+const auto decoration = GET_OPERAND(SpvDecoration, 2);
+if (decoration == SpvDecorationCPacked)
+   packed_structures.emplace(id);
+else if (decoration == SpvDecorationFuncParamAttr)
+   func_param_attr_map[id].push_back(GET_OPERAND(SpvFunctionParameterAttribute,
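
The parsing loop in this patch relies on the SPIR-V binary layout: a five-word header, then instructions whose first word holds the opcode in its low 16 bits and the total word count in its high 16 bits. A minimal standalone sketch of that walk follows. `count_opcode` and the `SPIRV_*` macros are names made up for this example (the patch itself uses `SpvOpCodeMask` and `SpvWordCountShift` from spirv.h), and the bounds check is an addition the quoted code does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPIRV_HEADER_WORDS    5u      /* magic, version, generator, bound, schema */
#define SPIRV_OPCODE_MASK     0xffffu /* same value as SpvOpCodeMask */
#define SPIRV_WORDCOUNT_SHIFT 16u     /* same value as SpvWordCountShift */

/* Count instructions with the given opcode in a SPIR-V word stream.
 * Returns -1 if an instruction has a zero or out-of-range word count. */
static int
count_opcode(const uint32_t *words, size_t num_words, uint32_t opcode)
{
   assert(words != NULL);
   int hits = 0;
   size_t i = SPIRV_HEADER_WORDS; /* skip the module header */

   while (i < num_words) {
      const uint32_t desc = words[i];
      const uint32_t op = desc & SPIRV_OPCODE_MASK;
      const uint32_t len = desc >> SPIRV_WORDCOUNT_SHIFT;

      /* The word count includes the descriptor word itself. */
      if (len == 0 || len > num_words - i)
         return -1;
      if (op == opcode)
         hits++;
      i += len;
   }
   return hits;
}
```

Rejecting a zero word count matters because it would otherwise leave the index unchanged and loop forever on a malformed module.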

[Mesa-dev] [PATCH 06/15] clover/spirv: Add functions for validating SPIR-V binaries

2019-05-11 Thread Karol Herbst

From: Pierre Moreau 

Changes since:
* v12: remove autotools (Karol Herbst)
* v11: Fix compilation error introduced in v11.
* v10:
  - Reuse format_validation_msg in is_valid_spirv.
  - Remove LVL2STR macro in format_validation_msg.
* v9: Add `clover_cpp_std` to the overrides of the `libclspirv` target
  in Meson.
* v7: Add DEFINES to libclspirv and libclover, in autotools, as they
  would otherwise never know whether CLOVER_ALLOW_SPIRV has been
  defined (Dave Airlie)
* v6: Update the dependency name (meson) and the libs variable
  (Makefile) due to the replacement of llvm-spirv to the new
  official SPIRV-LLVM-Translator.
* v5: Changed to match the updated “clover/llvm: Allow translating from
  SPIR-V to LLVM IR” in the v6.

Reviewed-by: Karol Herbst 
---
 .../state_trackers/clover/Makefile.sources|   4 +
 src/gallium/state_trackers/clover/meson.build |  11 +-
 .../clover/spirv/invocation.cpp   | 129 ++
 .../clover/spirv/invocation.hpp   |  47 +++
 4 files changed, 190 insertions(+), 1 deletion(-)
 create mode 100644 src/gallium/state_trackers/clover/spirv/invocation.cpp
 create mode 100644 src/gallium/state_trackers/clover/spirv/invocation.hpp

diff --git a/src/gallium/state_trackers/clover/Makefile.sources b/src/gallium/state_trackers/clover/Makefile.sources
index 5167ca75af4..38f94981fb6 100644
--- a/src/gallium/state_trackers/clover/Makefile.sources
+++ b/src/gallium/state_trackers/clover/Makefile.sources
@@ -62,3 +62,7 @@ LLVM_SOURCES := \
llvm/invocation.hpp \
llvm/metadata.hpp \
llvm/util.hpp
+
+SPIRV_SOURCES := \
+   spirv/invocation.cpp \
+   spirv/invocation.hpp
diff --git a/src/gallium/state_trackers/clover/meson.build b/src/gallium/state_trackers/clover/meson.build
index 311dcb69a6b..461c69f54c0 100644
--- a/src/gallium/state_trackers/clover/meson.build
+++ b/src/gallium/state_trackers/clover/meson.build
@@ -57,6 +57,15 @@ libclllvm = static_library(
   override_options : clover_cpp_std,
 )
 
+libclspirv = static_library(
+  'clspirv',
+  files('spirv/invocation.cpp', 'spirv/invocation.hpp'),
+  include_directories : clover_incs,
+  cpp_args : [clover_spirv_cpp_args, cpp_vis_args],
+  dependencies : [dep_spirv_tools],
+  override_options : clover_cpp_std,
+)
+
 clover_files = files(
   'api/context.cpp',
   'api/device.cpp',
@@ -117,6 +126,6 @@ libclover = static_library(
   [clover_files, sha1_h],
   include_directories : clover_incs,
   cpp_args : [clover_spirv_cpp_args, clover_cpp_args, cpp_vis_args],
-  link_with : [libclllvm],
+  link_with : [libclllvm, libclspirv],
   override_options : clover_cpp_std,
 )
diff --git a/src/gallium/state_trackers/clover/spirv/invocation.cpp b/src/gallium/state_trackers/clover/spirv/invocation.cpp
new file mode 100644
index 000..b874f2f061c
--- /dev/null
+++ b/src/gallium/state_trackers/clover/spirv/invocation.cpp
@@ -0,0 +1,129 @@
+//
+// Copyright 2018 Pierre Moreau
+//
+// Permission is hereby granted, free of charge, to any person obtaining a
+// copy of this software and associated documentation files (the "Software"),
+// to deal in the Software without restriction, including without limitation
+// the rights to use, copy, modify, merge, publish, distribute, sublicense,
+// and/or sell copies of the Software, and to permit persons to whom the
+// Software is furnished to do so, subject to the following conditions:
+//
+// The above copyright notice and this permission notice shall be included in
+// all copies or substantial portions of the Software.
+//
+// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+// THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+// OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+// ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+// OTHER DEALINGS IN THE SOFTWARE.
+//
+
+#include "invocation.hpp"
+
+#ifdef CLOVER_ALLOW_SPIRV
+#include 
+#endif
+
+#include "util/u_math.h"
+
+#include "compiler/spirv/spirv.h"
+
+using namespace clover;
+
+namespace {
+
+#ifdef CLOVER_ALLOW_SPIRV
+   std::string
+   format_validator_msg(spv_message_level_t level, const char * /* source */,
+const spv_position_t , const char *message) {
+  auto const level_to_string = [](spv_message_level_t level){
+ switch (level) {
+case SPV_MSG_FATAL:
+   return std::string("Fatal");
+case SPV_MSG_INTERNAL_ERROR:
+   return std::string("Internal error");
+case SPV_MSG_ERROR:
+   return std::string("Error");
+case SPV_MSG_WARNING:
+   return std::string("Warning");
+case

[Mesa-dev] [PATCH 05/15] meson: Check for SPIRV-Tools and llvm-spirv

2019-05-11 Thread Karol Herbst

From: Pierre Moreau 

Changes since:
* v11 (Karol Herbst):
  - only set new defines for clover to speed up recompilation
  - remove autotools
* v10:
  - Add a new flag (`--enable-opencl-spirv` for autotools, and
`-Dopencl-spirv=true` for meson) for enabling SPIR-V support in
clover, and never automagically enable it without that flag. (Dylan Baker)
  - When enabling the SPIR-V support, the SPIRV-Tools and
SPIRV-LLVM-Translator libraries are now required dependencies.
* v7:
  - Properly align LLVMSPIRVLib comment (Dylan Baker)
  - Only define CLOVER_ALLOW_SPIRV when **both** dependencies are found:
autotools was only requiring one or the other.
* v6: Replace the llvm-spirv repository by the new official
  SPIRV-LLVM-Translator.
* v4: Add a comment saying where to find llvm-spirv (Karol Herbst).
* v3:
  - make SPIRV-Tools and llvm-spirv optional (Francisco Jerez);
  - bump requirement for llvm-spirv to version 0.2
* v2:
  - Bump the required version of SPIRV-Tools to the latest release;
  - Add a dependency on llvm-spirv.

Reviewed-by: Dylan Baker  (v10)
Reviewed-by: Karol Herbst 
---
 meson.build   | 13 +
 meson_options.txt |  6 ++
 src/gallium/state_trackers/clover/meson.build |  9 +++--
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 2cefbb3f204..dba9f35b28b 100644
--- a/meson.build
+++ b/meson.build
@@ -693,6 +693,16 @@ if _opencl != 'disabled'
   with_gallium_opencl = true
   with_opencl_icd = _opencl == 'icd'
 
+  with_opencl_spirv = get_option('opencl-spirv')
+  if with_opencl_spirv
+dep_spirv_tools = dependency('SPIRV-Tools', required : true, version : '>= 2018.0')
+# LLVMSPIRVLib is available at https://github.com/KhronosGroup/SPIRV-LLVM-Translator
+dep_llvmspirvlib = dependency('LLVMSPIRVLib', required : true, version : '>= 0.2.1')
+  else
+dep_spirv_tools = null_dep
+dep_llvmspirvlib = null_dep
+  endif
+
   if host_machine.cpu_family().startswith('ppc') and cpp.compiles('''
   #if !defined(__VEC__) || !defined(__ALTIVEC__)
   #error "AltiVec not enabled"
@@ -702,8 +712,11 @@ if _opencl != 'disabled'
   endif
 else
   dep_clc = null_dep
+  dep_spirv_tools = null_dep
+  dep_llvmspirvlib = null_dep
   with_gallium_opencl = false
   with_opencl_icd = false
+  with_opencl_spirv = false
 endif
 
 gl_pkgconfig_c_flags = []
diff --git a/meson_options.txt b/meson_options.txt
index 1f72faabee8..00f2e7bc949 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -142,6 +142,12 @@ option(
   value : 'disabled',
   description : 'build gallium "clover" OpenCL state tracker.',
 )
+option(
+  'opencl-spirv',
+  type : 'boolean',
+  value : false,
+  description : 'build gallium "clover" OpenCL state tracker with SPIR-V binary support.',
+)
 option(
   'd3d-drivers-path',
   type : 'string',
diff --git a/src/gallium/state_trackers/clover/meson.build b/src/gallium/state_trackers/clover/meson.build
index 2ff060bf35b..311dcb69a6b 100644
--- a/src/gallium/state_trackers/clover/meson.build
+++ b/src/gallium/state_trackers/clover/meson.build
@@ -19,12 +19,17 @@
 # SOFTWARE.
 
 clover_cpp_args = []
+clover_spirv_cpp_args = []
 clover_incs = [inc_include, inc_src, inc_gallium, inc_gallium_aux]
 
 if with_opencl_icd
   clover_cpp_args += '-DHAVE_CLOVER_ICD'
 endif
 
+if with_opencl_spirv
+  clover_spirv_cpp_args += '-DCLOVER_ALLOW_SPIRV'
+endif
+
 libclllvm = static_library(
   'clllvm',
   files(
@@ -40,7 +45,7 @@ libclllvm = static_library(
   ),
   include_directories : clover_incs,
   cpp_args : [
-cpp_vis_args,
+clover_spirv_cpp_args, cpp_vis_args,
 
'-DLIBCLC_INCLUDEDIR="@0@/"'.format(dep_clc.get_pkgconfig_variable('includedir')),
 
'-DLIBCLC_LIBEXECDIR="@0@/"'.format(dep_clc.get_pkgconfig_variable('libexecdir')),
 '-DCLANG_RESOURCE_DIR="@0@"'.format(join_paths(
@@ -111,7 +116,7 @@ libclover = static_library(
   'clover',
   [clover_files, sha1_h],
   include_directories : clover_incs,
-  cpp_args : [clover_cpp_args, cpp_vis_args],
+  cpp_args : [clover_spirv_cpp_args, clover_cpp_args, cpp_vis_args],
   link_with : [libclllvm],
   override_options : clover_cpp_std,
 )
-- 
2.21.0


[Mesa-dev] [PATCH 10/15] gallium: add blob field to pipe_llvm_program_header

2019-05-11 Thread Karol Herbst

makes it easier to consume an IR_NATIVE binary
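
To illustrate why the new field helps, here is a small sketch of a header that ends in a C99 flexible array member, so the payload can be reached as `hdr->blob` instead of through pointer arithmetic on `sizeof(header)`. `struct program_header` is a stand-in for `pipe_llvm_program_header`, and `pack_program` is a hypothetical helper that does not exist in Mesa.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for pipe_llvm_program_header after this patch. */
struct program_header {
   uint32_t num_bytes; /* size of the payload that follows */
   char blob[];        /* C99 flexible array member: the payload itself */
};

/* Pack a payload behind a header; the caller frees the result.
 * Returns NULL on allocation failure. */
static struct program_header *
pack_program(const void *code, uint32_t num_bytes)
{
   assert(code != NULL || num_bytes == 0);

   /* sizeof(*hdr) does not include the flexible array, so add it here. */
   struct program_header *hdr = malloc(sizeof(*hdr) + num_bytes);
   if (!hdr)
      return NULL;

   hdr->num_bytes = num_bytes;
   if (num_bytes)
      memcpy(hdr->blob, code, num_bytes);
   return hdr;
}
```

With this layout, `hdr->blob` points at the same byte as `(const char *)hdr + sizeof(struct program_header)`, which is the expression the drivers computed by hand before the change.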

Signed-off-by: Karol Herbst 
Reviewed-by: Francisco Jerez 
---
 src/gallium/drivers/r600/evergreen_compute.c   | 4 +---
 src/gallium/drivers/radeonsi/si_compute.c  | 4 +---
 src/gallium/include/pipe/p_state.h | 1 +
 src/gallium/state_trackers/clover/spirv/invocation.cpp | 3 +--
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
index 1536210c7ef..34e5755696f 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -411,7 +411,6 @@ static void *evergreen_create_compute_state(struct pipe_context *ctx,
struct r600_pipe_compute *shader = CALLOC_STRUCT(r600_pipe_compute);
 #ifdef HAVE_OPENCL
const struct pipe_llvm_program_header *header;
-   const char *code;
void *p;
boolean use_kill;
 #endif
@@ -430,9 +429,8 @@ static void *evergreen_create_compute_state(struct pipe_context *ctx,
 #ifdef HAVE_OPENCL
COMPUTE_DBG(rctx->screen, "*** evergreen_create_compute_state\n");
header = cso->prog;
-   code = cso->prog + sizeof(struct pipe_llvm_program_header);
radeon_shader_binary_init(>binary);
-   r600_elf_read(code, header->num_bytes, >binary);
+   r600_elf_read(header->blob, header->num_bytes, >binary);
r600_create_shader(>bc, >binary, _kill);
 
/* Upload code + ROdata */
diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index f1a433b72df..ae10709f2f1 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -233,11 +233,9 @@ static void *si_create_compute_state(
program, 
si_create_compute_state_async);
} else {
const struct pipe_llvm_program_header *header;
-   const char *code;
header = cso->prog;
-   code = cso->prog + sizeof(struct pipe_llvm_program_header);
 
-   ac_elf_read(code, header->num_bytes, >shader.binary);
+   ac_elf_read(header->blob, header->num_bytes, 
>shader.binary);
if (program->use_code_object_v2) {
const amd_kernel_code_t *code_object =
si_compute_get_code_object(program, 0);
diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
index b7fa76a803a..27350091b82 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -884,6 +884,7 @@ struct pipe_grid_info
 struct pipe_llvm_program_header
 {
uint32_t num_bytes; /**< Number of bytes in the LLVM bytecode program. */
+   char blob[];
 };
 
 struct pipe_compute_state
diff --git a/src/gallium/state_trackers/clover/spirv/invocation.cpp b/src/gallium/state_trackers/clover/spirv/invocation.cpp
index 6100fca0065..2fd5a876a32 100644
--- a/src/gallium/state_trackers/clover/spirv/invocation.cpp
+++ b/src/gallium/state_trackers/clover/spirv/invocation.cpp
@@ -649,8 +649,7 @@ clover::spirv::link_program(const std::vector ,
  assert(false);
   }
 
-  const auto c_il = msec->data.data() +
-sizeof(struct pipe_llvm_program_header);
+  const auto c_il = ((struct pipe_llvm_program_header*)msec->data.data())->blob;
   const auto length = msec->size;
 
   sections.push_back(reinterpret_cast(c_il));
-- 
2.21.0


[Mesa-dev] [PATCH 01/15] nv50/ir/nir: parse system values first and stop for compute shaders

2019-05-11 Thread Karol Herbst

required by OpenCL
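
The block this patch moves walks the 64-bit `system_values_read` mask one bit at a time and records an entry for each set bit. A minimal sketch of that pattern is below. `collect_set_bits` is a hypothetical helper, not part of Mesa, and it uses an unsigned 64-bit constant for the shift where the patch uses `1ll`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the indices of all set bits in mask to out (in ascending order)
 * and return how many were written. out must have room for 64 entries. */
static size_t
collect_set_bits(uint64_t mask, uint8_t out[64])
{
   assert(out != NULL);
   size_t n = 0;

   for (uint8_t i = 0; i < 64; ++i) {
      /* Use an unsigned 64-bit one so that shifting by 63 is well defined. */
      if (!(mask & (UINT64_C(1) << i)))
         continue;
      out[n++] = i;
   }
   return n;
}
```

Running such a loop before the input and output handling is what lets the compute path return early while still having its system values recorded.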

Signed-off-by: Karol Herbst 
Reviewed-by: Pierre Moreau 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 61 ++-
 1 file changed, 32 insertions(+), 29 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 02ae5d73b99..8dded71f461 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1169,6 +1169,7 @@ bool Converter::assignSlots() {
 
info->io.viewportId = -1;
info->numInputs = 0;
+   info->numOutputs = 0;
 
// we have to fixup the uniform locations for arrays
unsigned numImages = 0;
@@ -1180,6 +1181,37 @@ bool Converter::assignSlots() {
   numImages += type->is_array() ? type->arrays_of_arrays_size() : 1;
}
 
+   info->numSysVals = 0;
+   for (uint8_t i = 0; i < 64; ++i) {
+  if (!(nir->info.system_values_read & 1ll << i))
+ continue;
+
+  system_val_to_tgsi_semantic(i, , );
+  info->sv[info->numSysVals].sn = name;
+  info->sv[info->numSysVals].si = index;
+  info->sv[info->numSysVals].input = 0; // TODO inferSysValDirection(sn);
+
+  switch (i) {
+  case SYSTEM_VALUE_INSTANCE_ID:
+ info->io.instanceId = info->numSysVals;
+ break;
+  case SYSTEM_VALUE_TESS_LEVEL_INNER:
+  case SYSTEM_VALUE_TESS_LEVEL_OUTER:
+ info->sv[info->numSysVals].patch = 1;
+ break;
+  case SYSTEM_VALUE_VERTEX_ID:
+ info->io.vertexId = info->numSysVals;
+ break;
+  default:
+ break;
+  }
+
+  info->numSysVals += 1;
+   }
+
+   if (prog->getType() == Program::TYPE_COMPUTE)
+  return true;
+
nir_foreach_variable(var, >inputs) {
   const glsl_type *type = var->type;
   int slot = var->data.location;
@@ -1244,7 +1276,6 @@ bool Converter::assignSlots() {
   info->numInputs = std::max(info->numInputs, vary);
}
 
-   info->numOutputs = 0;
nir_foreach_variable(var, >outputs) {
   const glsl_type *type = var->type;
   int slot = var->data.location;
@@ -1336,34 +1367,6 @@ bool Converter::assignSlots() {
   info->numOutputs = std::max(info->numOutputs, vary);
}
 
-   info->numSysVals = 0;
-   for (uint8_t i = 0; i < 64; ++i) {
-  if (!(nir->info.system_values_read & 1ll << i))
- continue;
-
-  system_val_to_tgsi_semantic(i, , );
-  info->sv[info->numSysVals].sn = name;
-  info->sv[info->numSysVals].si = index;
-  info->sv[info->numSysVals].input = 0; // TODO inferSysValDirection(sn);
-
-  switch (i) {
-  case SYSTEM_VALUE_INSTANCE_ID:
- info->io.instanceId = info->numSysVals;
- break;
-  case SYSTEM_VALUE_TESS_LEVEL_INNER:
-  case SYSTEM_VALUE_TESS_LEVEL_OUTER:
- info->sv[info->numSysVals].patch = 1;
- break;
-  case SYSTEM_VALUE_VERTEX_ID:
- info->io.vertexId = info->numSysVals;
- break;
-  default:
- break;
-  }
-
-  info->numSysVals += 1;
-   }
-
if (info->io.genUserClip > 0) {
   info->io.clipDistances = info->io.genUserClip;
 
-- 
2.21.0


[Mesa-dev] [PATCH 02/15] nv50/ir/nir: don't assert on !main

2019-05-11 Thread Karol Herbst

required for OpenCL

Signed-off-by: Karol Herbst 
Reviewed-by: Pierre Moreau 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 8dded71f461..1bf6b471b31 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1560,8 +1560,6 @@ Converter::parseNIR()
 bool
 Converter::visit(nir_function *function)
 {
-   // we only support emiting the main function for now
-   assert(!strcmp(function->name, "main"));
assert(function->impl);
 
// usually the blocks will set everything up, but main is special
-- 
2.21.0


Re: [Mesa-dev] [PATCH] nir: Constant values are per-column not per-component

2019-03-20 Thread Karol Herbst

Reviewed-by: Karol Herbst 

On Wed, Mar 20, 2019 at 1:22 PM Lionel Landwerlin
 wrote:
>
> Reviewed-by: Lionel Landwerlin 
>
> On 19/03/2019 19:15, Jason Ekstrand wrote:
> > ---
> >   src/compiler/nir/nir.h | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index 67304af1d64..e4f012809e5 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -59,6 +59,7 @@ extern "C" {
> >   #define NIR_FALSE 0u
> >   #define NIR_TRUE (~0u)
> >   #define NIR_MAX_VEC_COMPONENTS 4
> > +#define NIR_MAX_MATRIX_COLUMNS 4
> >   typedef uint8_t nir_component_mask_t;
> >
> >   /** Defines a cast function
> > @@ -141,7 +142,7 @@ typedef struct nir_constant {
> >   * by the type associated with the \c nir_variable.  Constants may be
> >   * scalars, vectors, or matrices.
> >   */
> > -   nir_const_value values[NIR_MAX_VEC_COMPONENTS];
> > +   nir_const_value values[NIR_MAX_MATRIX_COLUMNS];
> >
> >  /* we could get this from the var->type but makes clone *much* easier to
> >   * not have to care about the type.
>
>

Re: [Mesa-dev] [PATCH] android: nouveau: add support for nir

2019-03-17 Thread Karol Herbst

On Sun, Mar 17, 2019 at 11:56 PM Mauro Rossi  wrote:
>
> Hi Karol,
>
> On Sun, Mar 17, 2019 at 11:25 PM Karol Herbst  wrote:
> >
> > On Sun, Mar 17, 2019 at 10:52 PM Mauro Rossi  wrote:
> > >
> > > Add the necessary build rules for android, to avoid building errors.
> > >
> > > Fixes: f014ae3 ("nouveau: add support for nir")
> > > Signed-off-by: Mauro Rossi 
> > > ---
> > >  src/gallium/drivers/nouveau/Android.mk | 7 ++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/gallium/drivers/nouveau/Android.mk b/src/gallium/drivers/nouveau/Android.mk
> > > index cd2dd0938f..49a341c831 100644
> > > --- a/src/gallium/drivers/nouveau/Android.mk
> > > +++ b/src/gallium/drivers/nouveau/Android.mk
> > > @@ -37,8 +37,13 @@ LOCAL_SRC_FILES := \
> > > $(NVC0_C_SOURCES)
> > >
> > >  LOCAL_C_INCLUDES := \
> > > -   $(MESA_TOP)/include
> > > +   $(MESA_TOP)/include \
> > > +   $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \
> > > +   $(MESA_TOP)/src/compiler/nir \
> > > +   $(MESA_TOP)/src/mapi \
> > > +   $(MESA_TOP)/src/mesa
> >
> > do we really have to add all those includes? freedreno doesn't seem to
> > add those either and just has the libmesa_nir dependency
>
> Hi Karol,
>
> the first build error "main/config.h" not found
> was caused by $(MESA_TOP)/src/mesa missing,
> then I did not chased the others,
> I replicated with fidelity your Autotools build rules
> as I've always been instructed this way by Emil Velikov
>
> The patch is working and Android booting on GTX950 with the patch
> KR
>
> Mauro

yeah, I was mainly wondering if only some of those includes would be
sufficient.. anyway, I've pushed the patch.

>
> >
> > >
> > > +LOCAL_STATIC_LIBRARIES := libmesa_nir
> > >  LOCAL_SHARED_LIBRARIES := libdrm_nouveau
> > >  LOCAL_MODULE := libmesa_pipe_nouveau
> > >
> > > --
> > > 2.19.1
> > >

Re: [Mesa-dev] [PATCH] android: nouveau: add support for nir

2019-03-17 Thread Karol Herbst

On Sun, Mar 17, 2019 at 10:52 PM Mauro Rossi  wrote:
>
> Add the necessary build rules for android, to avoid building errors.
>
> Fixes: f014ae3 ("nouveau: add support for nir")
> Signed-off-by: Mauro Rossi 
> ---
>  src/gallium/drivers/nouveau/Android.mk | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/Android.mk b/src/gallium/drivers/nouveau/Android.mk
> index cd2dd0938f..49a341c831 100644
> --- a/src/gallium/drivers/nouveau/Android.mk
> +++ b/src/gallium/drivers/nouveau/Android.mk
> @@ -37,8 +37,13 @@ LOCAL_SRC_FILES := \
> $(NVC0_C_SOURCES)
>
>  LOCAL_C_INCLUDES := \
> -   $(MESA_TOP)/include
> +   $(MESA_TOP)/include \
> +   $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \
> +   $(MESA_TOP)/src/compiler/nir \
> +   $(MESA_TOP)/src/mapi \
> +   $(MESA_TOP)/src/mesa

do we really have to add all those includes? freedreno doesn't seem to
add those either and just has the libmesa_nir dependency

>
> +LOCAL_STATIC_LIBRARIES := libmesa_nir
>  LOCAL_SHARED_LIBRARIES := libdrm_nouveau
>  LOCAL_MODULE := libmesa_pipe_nouveau
>
> --
> 2.19.1
>

Re: [Mesa-dev] [PATCH 33/34] nv50/ir/nir: handle user clip planes for each emitted vertex

2019-03-11 Thread Karol Herbst

On Tue, Mar 12, 2019 at 1:09 AM Ilia Mirkin  wrote:
>
> On Mon, Mar 11, 2019 at 8:05 PM Karol Herbst  wrote:
> >
> > v9: convert to C++ style comments
> > Signed-off-by: Karol Herbst 
> > ---
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> > index 627848a457f..fdc6eaf759a 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> > @@ -1561,7 +1561,7 @@ Converter::visit(nir_function *function)
> > > bb->cfg.attach(&exit->cfg, Graph::Edge::TREE);
> > setPosition(exit, true);
> >
> > -   if (info->io.genUserClip > 0)
> > +   if (prog->getType() == Program::TYPE_VERTEX && info->io.genUserClip > 0)
>
> What about TES? Did you mean && !TYPE_GEOMETRY perhaps?
>

yeah, that's missing. Thanks for pointing it out! Apparently we have
no piglit test testing that.

> >handleUserClipPlanes();
> >
> > // TODO: for non main function this needs to be a OP_RETURN
> > @@ -1889,6 +1889,7 @@ Converter::visit(nir_intrinsic_instr *insn)
> >  }
> >  break;
> >   }
> > + case Program::TYPE_GEOMETRY:
> >   case Program::TYPE_VERTEX: {
> >  if (info->io.genUserClip > 0 && idx == clipVertexOutput) {
> > mkMov(clipVtx[i], src);
> > @@ -2187,6 +2188,9 @@ Converter::visit(nir_intrinsic_instr *insn)
> >break;
> > }
> > case nir_intrinsic_emit_vertex:
> > +  if (info->io.genUserClip > 0)
> > + handleUserClipPlanes();
> > +  // fallthrough
> > case nir_intrinsic_end_primitive: {
> >uint32_t idx = nir_intrinsic_stream_id(insn);
> >mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1;
> > --
> > 2.20.1
> >
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
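
The review point in this reply can be sketched outside of the codegen. This is a minimal, hypothetical model and not the code that landed: the enum and the function name handlesClipPlanesAtExit are made up for illustration, and it assumes the "&& !TYPE_GEOMETRY" suggestion is taken literally, i.e. every stage except geometry handles user clip planes once at function exit, while geometry shaders handle them at each emit_vertex.

```cpp
#include <cassert>

// Hypothetical stand-in for Program::Type; only the stages relevant here.
enum ProgramType {
   TYPE_VERTEX,
   TYPE_TESSELLATION_EVAL,
   TYPE_GEOMETRY,
   TYPE_FRAGMENT
};

// Returns true when user clip planes should be handled once at function
// exit. Geometry shaders are excluded because the patch handles them at
// every emit_vertex instead.
bool handlesClipPlanesAtExit(ProgramType type, int genUserClip)
{
   return type != TYPE_GEOMETRY && genUserClip > 0;
}
```

With this condition a tessellation evaluation shader with clip planes enabled is covered, which is the case the TYPE_VERTEX-only check in the quoted hunk misses.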

[Mesa-dev] [PATCH 17/34] nv50/ir/nir: implement nir_intrinsic_store_(per_vertex_)output

2019-03-11 Thread Karol Herbst

v3: add workaround for RA issues
indirects have to be multiplied by 0x10
fix indirect access
v4: use smarter getIndirect helper
use storeTo helper
v5: don't use const_offset directly
v8: don't require C++11 features
v9: convert to C++ style comments
handle clip planes correctly

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 57 ++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index dc8dbcfb48b..6e26e00d91f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -145,6 +145,8 @@ private:
BasicBlock *exit;
Value *zero;
 
+   int clipVertexOutput;
+
union {
   struct {
  Value *position;
@@ -155,7 +157,8 @@ private:
 Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
: ConverterCommon(prog, info),
  nir(nir),
- curLoopDepth(0)
+ curLoopDepth(0),
+ clipVertexOutput(-1)
 {
zero = mkImm((uint32_t)0);
 }
@@ -1082,9 +1085,16 @@ bool Converter::assignSlots() {
  case TGSI_SEMANTIC_CLIPDIST:
 info->io.genUserClip = -1;
 break;
+ case TGSI_SEMANTIC_CLIPVERTEX:
+clipVertexOutput = vary;
+break;
  case TGSI_SEMANTIC_EDGEFLAG:
 info->io.edgeFlagOut = vary;
 break;
+ case TGSI_SEMANTIC_POSITION:
+if (clipVertexOutput < 0)
+   clipVertexOutput = vary;
+break;
  default:
 break;
  }
@@ -1346,6 +1356,11 @@ Converter::visit(nir_function *function)
 
setPosition(entry, true);
 
+   if (info->io.genUserClip > 0) {
+  for (int c = 0; c < 4; ++c)
+ clipVtx[c] = getScratch();
+   }
+
switch (prog->getType()) {
case Program::TYPE_TESSELLATION_CONTROL:
   outBase = mkOp2v(
@@ -1372,6 +1387,9 @@ Converter::visit(nir_function *function)
bb->cfg.attach(&exit->cfg, Graph::Edge::TREE);
setPosition(exit, true);
 
+   if (info->io.genUserClip > 0)
+  handleUserClipPlanes();
+
// TODO: for non main function this needs to be a OP_RETURN
mkOp(OP_EXIT, TYPE_NONE, NULL)->terminator = 1;
return true;
@@ -1542,6 +1560,43 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_store_output:
+   case nir_intrinsic_store_per_vertex_output: {
+  Value *indirect;
+  DataType dType = getSType(insn->src[0], false, false);
+  uint32_t idx = getIndirect(insn, op == nir_intrinsic_store_output ? 1 : 2, 0, indirect);
+
+  for (uint8_t i = 0u; i < insn->num_components; ++i) {
+ if (!((1u << i) & nir_intrinsic_write_mask(insn)))
+continue;
+
+ uint8_t offset = 0;
+ Value *src = getSrc(&insn->src[0], i);
+ switch (prog->getType()) {
+ case Program::TYPE_FRAGMENT: {
+if (info->out[idx].sn == TGSI_SEMANTIC_POSITION) {
+   // TGSI uses a different interface than NIR, TGSI stores that
+   // value in the z component, NIR in X
+   offset += 2;
+   src = mkOp1v(OP_SAT, TYPE_F32, getScratch(), src);
+}
+break;
+ }
+ case Program::TYPE_VERTEX: {
+if (info->io.genUserClip > 0 && idx == clipVertexOutput) {
+   mkMov(clipVtx[i], src);
+   src = clipVtx[i];
+}
+break;
+ }
+ default:
+break;
+ }
+
+ storeTo(insn, FILE_SHADER_OUTPUT, OP_EXPORT, dType, src, idx, i + offset, indirect);
+  }
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
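
A small standalone sketch of two details from this patch: the write-mask loop that decides which components get exported, and the 0x10 scaling of indirect indices mentioned in the v3 notes. The helper names are hypothetical, and the reading that 0x10 is the byte size of one vec4 slot (4 components of 4 bytes each) is an assumption, not something the patch states.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Collect the component indices enabled in a NIR-style write mask,
// using the same test as the loop in the patch.
std::vector<uint8_t> writtenComponents(uint8_t numComponents, uint32_t writeMask)
{
   std::vector<uint8_t> comps;
   for (uint8_t i = 0u; i < numComponents; ++i) {
      if (!((1u << i) & writeMask))
         continue;
      comps.push_back(i);
   }
   return comps;
}

// Scale an indirect slot index to a byte offset, assuming 0x10 bytes
// per slot (an assumption made for this sketch).
uint32_t indirectByteOffset(uint32_t slotIndex)
{
   return slotIndex * 0x10;
}
```

For a write mask of 0xa on a four-component store, only components 1 and 3 would be exported.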

[Mesa-dev] [PATCH 23/34] nv50/ir/nir: add skeleton getOperation for intrinsics

2019-03-11 Thread Karol Herbst

v7: don't assert in default case for getSubOp

Signed-off-by: Karol Herbst 
Reviewed-by: Pierre Moreau 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 22 +++
 1 file changed, 22 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 2c4513aad02..ab3bf7f843a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -116,10 +116,12 @@ private:
std::vector getSTypes(nir_alu_instr *);
DataType getSType(nir_src &, bool isFloat, bool isSigned);
 
+   operation getOperation(nir_intrinsic_op);
operation getOperation(nir_op);
operation getOperation(nir_texop);
operation preOperationNeeded(nir_op);
 
+   int getSubOp(nir_intrinsic_op);
int getSubOp(nir_op);
 
CondCode getCondCode(nir_op);
@@ -457,6 +459,17 @@ Converter::getOperation(nir_texop op)
}
 }
 
+operation
+Converter::getOperation(nir_intrinsic_op op)
+{
+   switch (op) {
+   default:
+  ERROR("couldn't get operation for nir_intrinsic_op %u\n", op);
+  assert(false);
+  return OP_NOP;
+   }
+}
+
 operation
 Converter::preOperationNeeded(nir_op op)
 {
@@ -481,6 +494,15 @@ Converter::getSubOp(nir_op op)
}
 }
 
+int
+Converter::getSubOp(nir_intrinsic_op op)
+{
+   switch (op) {
+   default:
+  return 0;
+   }
+}
+
 CondCode
 Converter::getCondCode(nir_op op)
 {
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 20/34] nv50/ir/nir: implement loading system values

2019-03-11 Thread Karol Herbst

v2: support more sys values
fixed a bug where for multi component reads all values ended up in x
v3: add load_patch_vertices_in
v4: add subgroup stuff
v5: add helper invocation
v6: fix loading 64 bit system values
v8: don't require C++11 features
v9: convert to C++ style comments

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 122 ++
 1 file changed, 122 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 5c372794e02..43c9a468f5a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -70,6 +70,7 @@ private:
LValues& convert(nir_alu_dest *);
BasicBlock* convert(nir_block *);
LValues& convert(nir_dest *);
+   SVSemantic convert(nir_intrinsic_op);
LValues& convert(nir_register *);
LValues& convert(nir_ssa_def *);
 
@@ -1544,6 +1545,70 @@ Converter::visit(nir_instr *insn)
return true;
 }
 
+SVSemantic
+Converter::convert(nir_intrinsic_op intr)
+{
+   switch (intr) {
+   case nir_intrinsic_load_base_vertex:
+  return SV_BASEVERTEX;
+   case nir_intrinsic_load_base_instance:
+  return SV_BASEINSTANCE;
+   case nir_intrinsic_load_draw_id:
+  return SV_DRAWID;
+   case nir_intrinsic_load_front_face:
+  return SV_FACE;
+   case nir_intrinsic_load_helper_invocation:
+  return SV_THREAD_KILL;
+   case nir_intrinsic_load_instance_id:
+  return SV_INSTANCE_ID;
+   case nir_intrinsic_load_invocation_id:
+  return SV_INVOCATION_ID;
+   case nir_intrinsic_load_local_group_size:
+  return SV_NTID;
+   case nir_intrinsic_load_local_invocation_id:
+  return SV_TID;
+   case nir_intrinsic_load_num_work_groups:
+  return SV_NCTAID;
+   case nir_intrinsic_load_patch_vertices_in:
+  return SV_VERTEX_COUNT;
+   case nir_intrinsic_load_primitive_id:
+  return SV_PRIMITIVE_ID;
+   case nir_intrinsic_load_sample_id:
+  return SV_SAMPLE_INDEX;
+   case nir_intrinsic_load_sample_mask_in:
+  return SV_SAMPLE_MASK;
+   case nir_intrinsic_load_sample_pos:
+  return SV_SAMPLE_POS;
+   case nir_intrinsic_load_subgroup_eq_mask:
+  return SV_LANEMASK_EQ;
+   case nir_intrinsic_load_subgroup_ge_mask:
+  return SV_LANEMASK_GE;
+   case nir_intrinsic_load_subgroup_gt_mask:
+  return SV_LANEMASK_GT;
+   case nir_intrinsic_load_subgroup_le_mask:
+  return SV_LANEMASK_LE;
+   case nir_intrinsic_load_subgroup_lt_mask:
+  return SV_LANEMASK_LT;
+   case nir_intrinsic_load_subgroup_invocation:
+  return SV_LANEID;
+   case nir_intrinsic_load_tess_coord:
+  return SV_TESS_COORD;
+   case nir_intrinsic_load_tess_level_inner:
+  return SV_TESS_INNER;
+   case nir_intrinsic_load_tess_level_outer:
+  return SV_TESS_OUTER;
+   case nir_intrinsic_load_vertex_id:
+  return SV_VERTEX_ID;
+   case nir_intrinsic_load_work_group_id:
+  return SV_CTAID;
+   default:
+  ERROR("unknown SVSemantic for nir_intrinsic_op %s\n",
+nir_intrinsic_infos[intr].name);
+  assert(false);
+  return SV_LAST;
+   }
+}
+
 bool
 Converter::visit(nir_intrinsic_instr *insn)
 {
@@ -1746,6 +1811,63 @@ Converter::visit(nir_intrinsic_instr *insn)
   mkOp(OP_DISCARD, TYPE_NONE, NULL)->setPredicate(CC_P, pred);
   break;
}
+   case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_base_instance:
+   case nir_intrinsic_load_draw_id:
+   case nir_intrinsic_load_front_face:
+   case nir_intrinsic_load_helper_invocation:
+   case nir_intrinsic_load_instance_id:
+   case nir_intrinsic_load_invocation_id:
+   case nir_intrinsic_load_local_group_size:
+   case nir_intrinsic_load_local_invocation_id:
+   case nir_intrinsic_load_num_work_groups:
+   case nir_intrinsic_load_patch_vertices_in:
+   case nir_intrinsic_load_primitive_id:
+   case nir_intrinsic_load_sample_id:
+   case nir_intrinsic_load_sample_mask_in:
+   case nir_intrinsic_load_sample_pos:
+   case nir_intrinsic_load_subgroup_eq_mask:
+   case nir_intrinsic_load_subgroup_ge_mask:
+   case nir_intrinsic_load_subgroup_gt_mask:
+   case nir_intrinsic_load_subgroup_le_mask:
+   case nir_intrinsic_load_subgroup_lt_mask:
+   case nir_intrinsic_load_subgroup_invocation:
+   case nir_intrinsic_load_tess_coord:
+   case nir_intrinsic_load_tess_level_inner:
+   case nir_intrinsic_load_tess_level_outer:
+   case nir_intrinsic_load_vertex_id:
+   case nir_intrinsic_load_work_group_id: {
+  const DataType dType = getDType(insn);
+  SVSemantic sv = convert(op);
+  LValues &newDefs = convert(&insn->dest);
+
+  for (uint8_t i = 0u; i < insn->num_components; ++i) {
+ Value *def;
+ if (typeSizeof(dType) == 8)
+def = getSSA();
+ else
+def = newDefs[i];
+
+ if (sv == SV_TID && info->prop.cp.numThreads[i] == 1) {
+loadImm(def, 0u);
+

[Mesa-dev] [PATCH 29/34] nv50/ir/nir: implement images

2019-03-11 Thread Karol Herbst

v3: fix compiler warnings
v4: use loadFrom helper
v5: fix signed min/max
v6: set tex mask
add support for indirect image access
set cache mode
v7: make compatible with 884d27bcf688d36c3bbe01bceca525595add3b33
rework the whole deref thing to prepare for bindless
v8: port to deref instructions
don't require C++11 features
v9: implement MS images
rebase on master (image modifiers)
fix regressions due to variable src components
replace '(*it).' with 'it->'
convert to C++ style comments

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 390 +-
 1 file changed, 380 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 320f90329ef..ecdc667b25a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -36,6 +36,7 @@
 #else
 #include 
 #endif
+#include 
 #include 
 
 namespace {
@@ -76,6 +77,8 @@ private:
LValues& convert(nir_register *);
LValues& convert(nir_ssa_def *);
 
+   ImgFormat convertGLImgFormat(GLuint);
+
Value* getSrc(nir_alu_src *, uint8_t component = 0);
Value* getSrc(nir_register *, uint8_t);
Value* getSrc(nir_src *, uint8_t, bool indirect = false);
@@ -112,6 +115,7 @@ private:
 
DataType getDType(nir_alu_instr *);
DataType getDType(nir_intrinsic_instr *);
+   DataType getDType(nir_intrinsic_instr *, bool isSigned);
DataType getDType(nir_op, uint8_t);
 
std::vector getSTypes(nir_alu_instr *);
@@ -133,6 +137,7 @@ private:
bool visit(nir_alu_instr *);
bool visit(nir_block *);
bool visit(nir_cf_node *);
+   bool visit(nir_deref_instr *);
bool visit(nir_function *);
bool visit(nir_if *);
bool visit(nir_instr *);
@@ -145,6 +150,11 @@ private:
 
// tex stuff
Value* applyProjection(Value *src, Value *proj);
+   unsigned int getNIRArgCount(TexInstruction::Target&);
+
+   // image stuff
+   uint16_t handleDeref(nir_deref_instr *, Value * & indirect, const nir_variable * &);
+   CacheMode getCacheModeFromVar(const nir_variable *);
 
nir_shader *nir;
 
@@ -240,11 +250,30 @@ Converter::getDType(nir_alu_instr *insn)
 
 DataType
 Converter::getDType(nir_intrinsic_instr *insn)
+{
+   bool isSigned;
+   switch (insn->intrinsic) {
+   case nir_intrinsic_shared_atomic_imax:
+   case nir_intrinsic_shared_atomic_imin:
+   case nir_intrinsic_ssbo_atomic_imax:
+   case nir_intrinsic_ssbo_atomic_imin:
+  isSigned = true;
+  break;
+   default:
+  isSigned = false;
+  break;
+   }
+
+   return getDType(insn, isSigned);
+}
+
+DataType
+Converter::getDType(nir_intrinsic_instr *insn, bool isSigned)
 {
if (insn->dest.is_ssa)
-  return typeOfSize(insn->dest.ssa.bit_size / 8, false, false);
+  return typeOfSize(insn->dest.ssa.bit_size / 8, false, isSigned);
else
-  return typeOfSize(insn->dest.reg.reg->bit_size / 8, false, false);
+  return typeOfSize(insn->dest.reg.reg->bit_size / 8, false, isSigned);
 }
 
 DataType
@@ -469,6 +498,22 @@ Converter::getOperation(nir_intrinsic_op op)
   return OP_EMIT;
case nir_intrinsic_end_primitive:
   return OP_RESTART;
+   case nir_intrinsic_image_deref_atomic_add:
+   case nir_intrinsic_image_deref_atomic_and:
+   case nir_intrinsic_image_deref_atomic_comp_swap:
+   case nir_intrinsic_image_deref_atomic_exchange:
+   case nir_intrinsic_image_deref_atomic_max:
+   case nir_intrinsic_image_deref_atomic_min:
+   case nir_intrinsic_image_deref_atomic_or:
+   case nir_intrinsic_image_deref_atomic_xor:
+  return OP_SUREDP;
+   case nir_intrinsic_image_deref_load:
+  return OP_SULDP;
+   case nir_intrinsic_image_deref_samples:
+   case nir_intrinsic_image_deref_size:
+  return OP_SUQ;
+   case nir_intrinsic_image_deref_store:
+  return OP_SUSTP;
default:
   ERROR("couldn't get operation for nir_intrinsic_op %u\n", op);
   assert(false);
@@ -504,24 +549,42 @@ int
 Converter::getSubOp(nir_intrinsic_op op)
 {
switch (op) {
+   case nir_intrinsic_image_deref_atomic_add:
+   case nir_intrinsic_shared_atomic_add:
case nir_intrinsic_ssbo_atomic_add:
-  return NV50_IR_SUBOP_ATOM_ADD;
+  return  NV50_IR_SUBOP_ATOM_ADD;
+   case nir_intrinsic_image_deref_atomic_and:
+   case nir_intrinsic_shared_atomic_and:
case nir_intrinsic_ssbo_atomic_and:
-  return NV50_IR_SUBOP_ATOM_AND;
+  return  NV50_IR_SUBOP_ATOM_AND;
+   case nir_intrinsic_image_deref_atomic_comp_swap:
+   case nir_intrinsic_shared_atomic_comp_swap:
case nir_intrinsic_ssbo_atomic_comp_swap:
-  return NV50_IR_SUBOP_ATOM_CAS;
+  return  NV50_IR_SUBOP_ATOM_CAS;
+   case nir_intrinsic_image_deref_atomic_exchange:
+   case nir_intrinsic_shared_atomic_exchange:
case nir_intrinsic_ssbo_atomic_exchange:
-  return NV50_IR_SUBOP_AT

[Mesa-dev] [PATCH 34/34] nv50ir/nir: move immediates before use

2019-03-11 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 59 +--
 1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index fdc6eaf759a..a16c014c01c 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -66,6 +66,7 @@ public:
 private:
typedef std::vector LValues;
typedef unordered_map NirDefMap;
+   typedef unordered_map ImmediateMap;
typedef unordered_map NirArrayLMemOffsets;
typedef unordered_map NirBlockMap;
 
@@ -74,6 +75,7 @@ private:
BasicBlock* convert(nir_block *);
LValues& convert(nir_dest *);
SVSemantic convert(nir_intrinsic_op);
+   Value* convert(nir_load_const_instr*, uint8_t);
LValues& convert(nir_register *);
LValues& convert(nir_ssa_def *);
 
@@ -160,12 +162,14 @@ private:
 
NirDefMap ssaDefs;
NirDefMap regDefs;
+   ImmediateMap immediates;
NirArrayLMemOffsets regToLmemOffset;
NirBlockMap blocks;
unsigned int curLoopDepth;
 
BasicBlock *exit;
Value *zero;
+   Instruction *immInsertPos;
 
int clipVertexOutput;
 
@@ -715,6 +719,10 @@ Converter::getSrc(nir_src *src, uint8_t idx, bool indirect)
 Value*
 Converter::getSrc(nir_ssa_def *src, uint8_t idx)
 {
+   ImmediateMap::iterator iit = immediates.find(src->index);
+   if (iit != immediates.end())
+  return convert((*iit).second, idx);
+
NirDefMap::iterator it = ssaDefs.find(src->index);
if (it == ssaDefs.end()) {
   ERROR("SSA value %u not found\n", src->index);
@@ -1702,6 +1710,8 @@ Converter::visit(nir_loop *loop)
 bool
 Converter::visit(nir_instr *insn)
 {
+   // we need an insertion point for on the fly generated immediate loads
+   immInsertPos = bb->getExit();
switch (insn->type) {
case nir_instr_type_alu:
   return visit(nir_instr_as_alu(insn));
@@ -2491,28 +2501,41 @@ Converter::visit(nir_jump_instr *insn)
return true;
 }
 
+Value*
+Converter::convert(nir_load_const_instr *insn, uint8_t idx)
+{
+   Value *val;
+
+   if (immInsertPos)
+  setPosition(immInsertPos, true);
+   else
+  setPosition(bb, false);
+
+   switch (insn->def.bit_size) {
+   case 64:
+  val = loadImm(getSSA(8), insn->value.u64[idx]);
+  break;
+   case 32:
+  val = loadImm(getSSA(4), insn->value.u32[idx]);
+  break;
+   case 16:
+  val = loadImm(getSSA(2), insn->value.u16[idx]);
+  break;
+   case 8:
+  val = loadImm(getSSA(1), insn->value.u8[idx]);
+  break;
+   default:
+  unreachable("unhandled bit size!\n");
+   }
+   setPosition(bb, true);
+   return val;
+}
+
 bool
 Converter::visit(nir_load_const_instr *insn)
 {
assert(insn->def.bit_size <= 64);
-
-   LValues &newDefs = convert(&insn->def);
-   for (int i = 0; i < insn->def.num_components; i++) {
-  switch (insn->def.bit_size) {
-  case 64:
- loadImm(newDefs[i], insn->value.u64[i]);
- break;
-  case 32:
- loadImm(newDefs[i], insn->value.u32[i]);
- break;
-  case 16:
- loadImm(newDefs[i], insn->value.u16[i]);
- break;
-  case 8:
- loadImm(newDefs[i], insn->value.u8[i]);
- break;
-  }
-   }
+   immediates[insn->def.index] = insn;
return true;
 }
 
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
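
The idea of this patch can be modelled without the nv50 IR. The following is a toy sketch under stated assumptions, not the converter itself: the class LazyImmediates and its members are hypothetical names, constants are plain 64-bit integers, and "emitting a load" is reduced to appending to a vector. It shows the two halves of the change: visiting a load_const only records the constant, and reading it through getSrc materializes a load at that use.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Toy model: constants are remembered when visited and only turned into
// "load immediate" instructions when something reads them.
class LazyImmediates {
public:
   // Mirrors visit(nir_load_const_instr *): record the value, emit nothing.
   void visitLoadConst(unsigned ssaIndex, uint64_t value)
   {
      immediates[ssaIndex] = value;
   }

   // Mirrors the new check in getSrc(): if the SSA index is a recorded
   // constant, emit a fresh load for this use and return true.
   bool getSrc(unsigned ssaIndex)
   {
      std::map<unsigned, uint64_t>::iterator it = immediates.find(ssaIndex);
      if (it == immediates.end())
         return false;
      emittedLoads.push_back(it->second);
      return true;
   }

   std::vector<uint64_t> emittedLoads; // one entry per materialized load

private:
   std::map<unsigned, uint64_t> immediates; // SSA index -> constant value
};
```

In this model every use gets its own load right before it, so a constant read twice produces two loads. Whether later optimization passes merge them again is outside the scope of the sketch.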

[Mesa-dev] [PATCH 12/34] nv50/ir/nir: parse NIR shader info

2019-03-11 Thread Karol Herbst

v2: parse a few more fields
v3: add special handling for GL_ISOLINES
v8: set info->prop.fp.readsSampleLocations
don't require C++11 features
v9: replace '(*it).' with 'it->'
convert to C++ style comments

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 323 +-
 1 file changed, 320 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index d3cba9a63c3..3c5eac17cf9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -63,10 +63,12 @@ public:
 
bool run();
 private:
-   typedef std::vector LValues;
+   typedef std::vector LValues;
typedef unordered_map NirDefMap;
+   typedef unordered_map NirBlockMap;
 
LValues& convert(nir_alu_dest *);
+   BasicBlock* convert(nir_block *);
LValues& convert(nir_dest *);
LValues& convert(nir_register *);
LValues& convert(nir_ssa_def *);
@@ -113,16 +115,48 @@ private:
DataType getSType(nir_src &, bool isFloat, bool isSigned);
 
bool assignSlots();
+   bool parseNIR();
+
+   bool visit(nir_block *);
+   bool visit(nir_cf_node *);
+   bool visit(nir_function *);
+   bool visit(nir_if *);
+   bool visit(nir_instr *);
+   bool visit(nir_jump_instr *);
+   bool visit(nir_loop *);
 
nir_shader *nir;
 
NirDefMap ssaDefs;
NirDefMap regDefs;
+   NirBlockMap blocks;
+   unsigned int curLoopDepth;
+
+   BasicBlock *exit;
+
+   union {
+  struct {
+ Value *position;
+  } fp;
+   };
 };
 
 Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
: ConverterCommon(prog, info),
- nir(nir) {}
+ nir(nir),
+ curLoopDepth(0) {}
+
+BasicBlock *
+Converter::convert(nir_block *block)
+{
+   NirBlockMap::iterator it = blocks.find(block->index);
+   if (it != blocks.end())
+  return it->second;
+
+   BasicBlock *bb = new BasicBlock(func);
+   blocks[block->index] = bb;
+   return bb;
+}
 
 bool
 Converter::isFloatType(nir_alu_type type)
@@ -1041,6 +1075,279 @@ Converter::storeTo(nir_intrinsic_instr *insn, DataFile file, operation op,
}
 }
 
+bool
+Converter::parseNIR()
+{
+   info->io.clipDistances = nir->info.clip_distance_array_size;
+   info->io.cullDistances = nir->info.cull_distance_array_size;
+
+   switch(prog->getType()) {
+   case Program::TYPE_COMPUTE:
+  info->prop.cp.numThreads[0] = nir->info.cs.local_size[0];
+  info->prop.cp.numThreads[1] = nir->info.cs.local_size[1];
+  info->prop.cp.numThreads[2] = nir->info.cs.local_size[2];
+  info->bin.smemSize = nir->info.cs.shared_size;
+  break;
+   case Program::TYPE_FRAGMENT:
+  info->prop.fp.earlyFragTests = nir->info.fs.early_fragment_tests;
+  info->prop.fp.persampleInvocation =
+ (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_ID) ||
+ (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_POS);
+  info->prop.fp.postDepthCoverage = nir->info.fs.post_depth_coverage;
+  info->prop.fp.readsSampleLocations =
+ (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_POS);
+  info->prop.fp.usesDiscard = nir->info.fs.uses_discard;
+  info->prop.fp.usesSampleMaskIn =
+ !!(nir->info.system_values_read & SYSTEM_BIT_SAMPLE_MASK_IN);
+  break;
+   case Program::TYPE_GEOMETRY:
+  info->prop.gp.inputPrim = nir->info.gs.input_primitive;
+  info->prop.gp.instanceCount = nir->info.gs.invocations;
+  info->prop.gp.maxVertices = nir->info.gs.vertices_out;
+  info->prop.gp.outputPrim = nir->info.gs.output_primitive;
+  break;
+   case Program::TYPE_TESSELLATION_CONTROL:
+   case Program::TYPE_TESSELLATION_EVAL:
+  if (nir->info.tess.primitive_mode == GL_ISOLINES)
+ info->prop.tp.domain = GL_LINES;
+  else
+ info->prop.tp.domain = nir->info.tess.primitive_mode;
+  info->prop.tp.outputPatchSize = nir->info.tess.tcs_vertices_out;
+  info->prop.tp.outputPrim =
+ nir->info.tess.point_mode ? PIPE_PRIM_POINTS : PIPE_PRIM_TRIANGLES;
+  info->prop.tp.partitioning = (nir->info.tess.spacing + 1) % 3;
+  info->prop.tp.winding = !nir->info.tess.ccw;
+  break;
+   case Program::TYPE_VERTEX:
+  info->prop.vp.usesDrawParameters =
+ (nir->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) ||
+ (nir->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE)) ||
+ (nir->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID));
+  break;
+   default:
+  break;
+   }
+
+   return true;
+}
+
+bool
+Converter::visit(nir_function *function)
+{
+   // we only support emiting the main function for now
+   assert(!st

[Mesa-dev] [PATCH 32/34] nv50/ir/nir: implement intrinsic shader_clock

2019-03-11 Thread Karol Herbst

v9: mark as fixed

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index c379eb72c1e..627848a457f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -2444,6 +2444,14 @@ Converter::visit(nir_intrinsic_instr *insn)
   bar->subOp = getSubOp(op);
   break;
}
+   case nir_intrinsic_shader_clock: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+
+  loadImm(newDefs[0], 0u);
+  mkOp1(OP_RDSV, dType, newDefs[1], mkSysVal(SV_CLOCK, 0))->fixed = 1;
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 25/34] nv50/ir/nir: implement variable indexing

2019-03-11 Thread Karol Herbst

We store those arrays in local memory and reserve some space for each of the
arrays. With NIR we could store those arrays packed, but we don't do that yet
as it causes MemoryOpt to generate unaligned memory accesses.

v3: use fixed size vec4 arrays until we fix MemoryOpt
v4: fix for 64 bit types
v5: use loadFrom helper
v8: don't require C++11 features
v9: convert to C++ style comments

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 58 +++
 1 file changed, 58 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 7a10a408b70..5b7a3303e78 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -65,6 +65,7 @@ public:
 private:
typedef std::vector LValues;
typedef unordered_map NirDefMap;
+   typedef unordered_map NirArrayLMemOffsets;
typedef unordered_map NirBlockMap;
 
TexTarget convert(glsl_sampler_dim, bool isArray, bool isShadow);
@@ -149,6 +150,7 @@ private:
 
NirDefMap ssaDefs;
NirDefMap regDefs;
+   NirArrayLMemOffsets regToLmemOffset;
NirBlockMap blocks;
unsigned int curLoopDepth;
 
@@ -1353,6 +1355,7 @@ Converter::storeTo(nir_intrinsic_instr *insn, DataFile file, operation op,
 bool
 Converter::parseNIR()
 {
+   info->bin.tlsSpace = 0;
info->io.clipDistances = nir->info.clip_distance_array_size;
info->io.cullDistances = nir->info.cull_distance_array_size;
 
@@ -1444,6 +1447,16 @@ Converter::visit(nir_function *function)
   break;
}
 
+   nir_foreach_register(reg, &function->impl->registers) {
+  if (reg->num_array_elems) {
+ // TODO: packed variables would be nice, but MemoryOpt fails
+ // replace 4 with reg->num_components
+ uint32_t size = 4 * reg->num_array_elems * (reg->bit_size / 8);
+ regToLmemOffset[reg->index] = info->bin.tlsSpace;
+ info->bin.tlsSpace += size;
+  }
+   }
+
nir_index_ssa_defs(function->impl);
foreach_list_typed(nir_cf_node, node, node, &function->impl->body) {
   if (!visit(node))
@@ -2199,6 +2212,51 @@ Converter::visit(nir_alu_instr *insn)
//   2. they basically just merge multiple values into one data type
case nir_op_imov:
case nir_op_fmov:
+  if (!insn->dest.dest.is_ssa && insn->dest.dest.reg.reg->num_array_elems) {
+ nir_reg_dest& reg = insn->dest.dest.reg;
+ uint32_t goffset = regToLmemOffset[reg.reg->index];
+ uint8_t comps = reg.reg->num_components;
+ uint8_t size = reg.reg->bit_size / 8;
+ uint8_t csize = 4 * size; // TODO after fixing MemoryOpts: comps * size;
+ uint32_t aoffset = csize * reg.base_offset;
+ Value *indirect = NULL;
+
+ if (reg.indirect)
+indirect = mkOp2v(OP_MUL, TYPE_U32, getSSA(4, FILE_ADDRESS),
+  getSrc(reg.indirect, 0), mkImm(csize));
+
+ for (uint8_t i = 0u; i < comps; ++i) {
+if (!((1u << i) & insn->dest.write_mask))
+   continue;
+
+Symbol *sym = mkSymbol(FILE_MEMORY_LOCAL, 0, dType, goffset + aoffset + i * size);
+mkStore(OP_STORE, dType, sym, indirect, getSrc(&insn->src[0], i));
+ }
+ break;
+  } else if (!insn->src[0].src.is_ssa && insn->src[0].src.reg.reg->num_array_elems) {
+ LValues &newDefs = convert(&insn->dest);
+ nir_reg_src& reg = insn->src[0].src.reg;
+ uint32_t goffset = regToLmemOffset[reg.reg->index];
+ // uint8_t comps = reg.reg->num_components;
+ uint8_t size = reg.reg->bit_size / 8;
+ uint8_t csize = 4 * size; // TODO after fixing MemoryOpts: comps * size;
+ uint32_t aoffset = csize * reg.base_offset;
+ Value *indirect = NULL;
+
+ if (reg.indirect)
+indirect = mkOp2v(OP_MUL, TYPE_U32, getSSA(4, FILE_ADDRESS), getSrc(reg.indirect, 0), mkImm(csize));
+
+ for (uint8_t i = 0u; i < newDefs.size(); ++i)
+loadFrom(FILE_MEMORY_LOCAL, 0, dType, newDefs[i], goffset + aoffset, i, indirect);
+
+ break;
+  } else {
+ LValues &newDefs = convert(&insn->dest);
+ for (LValues::size_type c = 0u; c < newDefs.size(); ++c) {
+mkMov(newDefs[c], getSrc(&insn->src[0], c), dType);
+ }
+  }
+  break;
case nir_op_vec2:
case nir_op_vec3:
case nir_op_vec4: {
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
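
The local-memory layout described in this patch's commit message can be sketched in isolation. This is a minimal model under stated assumptions: RegArray, LmemLayout, assignLmemOffsets and componentAddress are hypothetical names, and RegArray only carries the three register fields the patch reads. The size and address formulas are copied from the hunks (fixed vec4 slots, so the factor 4 stands in for num_components).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for the nir_register fields used by the patch.
struct RegArray {
   unsigned index;
   unsigned num_array_elems;
   unsigned bit_size;
};

struct LmemLayout {
   std::map<unsigned, uint32_t> offsets; // register index -> base offset in bytes
   uint32_t tlsSpace;                     // total bytes reserved
};

// Reserve one fixed-size vec4 slot per array element, back to back.
LmemLayout assignLmemOffsets(const std::vector<RegArray> &regs)
{
   LmemLayout layout;
   layout.tlsSpace = 0;
   for (std::size_t i = 0; i < regs.size(); ++i) {
      const RegArray &reg = regs[i];
      if (!reg.num_array_elems)
         continue; // not an array, stays out of local memory
      uint32_t size = 4 * reg.num_array_elems * (reg.bit_size / 8);
      layout.offsets[reg.index] = layout.tlsSpace;
      layout.tlsSpace += size;
   }
   return layout;
}

// Byte address of one component: goffset + aoffset + comp * size,
// with aoffset = csize * base_offset and csize = 4 * size.
uint32_t componentAddress(uint32_t goffset, unsigned bitSize,
                          unsigned baseOffset, unsigned comp)
{
   uint32_t size = bitSize / 8;
   uint32_t csize = 4 * size;
   return goffset + csize * baseOffset + comp * size;
}
```

Because every element is padded to four components, a two-component array wastes half of its slot. That is the cost of the unpacked layout the commit message mentions.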

[Mesa-dev] [PATCH 24/34] nv50/ir/nir: implement vote and ballot

2019-03-11 Thread Karol Herbst

v2: add vote_eq support
use the new subop intrinsic helper
add ballot
v3: add read_(first_)invocation
v8: handle vectorized intrinsics
don't require C++11 features
v9: lower_subgroups to 32 bit (produces less instructions)
use getSSA and getScratch instead of new_LValue

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 48 +++
 1 file changed, 48 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index ab3bf7f843a..7a10a408b70 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -498,6 +498,12 @@ int
 Converter::getSubOp(nir_intrinsic_op op)
 {
switch (op) {
+   case nir_intrinsic_vote_all:
+  return NV50_IR_SUBOP_VOTE_ALL;
+   case nir_intrinsic_vote_any:
+  return NV50_IR_SUBOP_VOTE_ANY;
+   case nir_intrinsic_vote_ieq:
+  return NV50_IR_SUBOP_VOTE_UNI;
default:
   return 0;
}
@@ -1931,6 +1937,42 @@ Converter::visit(nir_intrinsic_instr *insn)
   loadImm(newDefs[0], 32u);
   break;
}
+   case nir_intrinsic_vote_all:
+   case nir_intrinsic_vote_any:
+   case nir_intrinsic_vote_ieq: {
+  LValues &newDefs = convert(&insn->dest);
+  Value *pred = getScratch(1, FILE_PREDICATE);
+  mkCmp(OP_SET, CC_NE, TYPE_U32, pred, TYPE_U32, getSrc(&insn->src[0], 0), zero);
+  mkOp1(OP_VOTE, TYPE_U32, pred, pred)->subOp = getSubOp(op);
+  mkCvt(OP_CVT, TYPE_U32, newDefs[0], TYPE_U8, pred);
+  break;
+   }
+   case nir_intrinsic_ballot: {
+  LValues &newDefs = convert(&insn->dest);
+  Value *pred = getSSA(1, FILE_PREDICATE);
+  mkCmp(OP_SET, CC_NE, TYPE_U32, pred, TYPE_U32, getSrc(&insn->src[0], 0), zero);
+  mkOp1(OP_VOTE, TYPE_U32, newDefs[0], pred)->subOp = NV50_IR_SUBOP_VOTE_ANY;
+  break;
+   }
+   case nir_intrinsic_read_first_invocation:
+   case nir_intrinsic_read_invocation: {
+  LValues &newDefs = convert(&insn->dest);
+  const DataType dType = getDType(insn);
+  Value *tmp = getScratch();
+
+  if (op == nir_intrinsic_read_first_invocation) {
+ mkOp1(OP_VOTE, TYPE_U32, tmp, mkImm(1))->subOp = NV50_IR_SUBOP_VOTE_ANY;
+ mkOp2(OP_EXTBF, TYPE_U32, tmp, tmp, mkImm(0x2000))->subOp = NV50_IR_SUBOP_EXTBF_REV;
+ mkOp1(OP_BFIND, TYPE_U32, tmp, tmp)->subOp = NV50_IR_SUBOP_BFIND_SAMT;
+  } else
+ tmp = getSrc(&insn->src[1], 0);
+
+  for (uint8_t i = 0; i < insn->num_components; ++i) {
+ mkOp3(OP_SHFL, dType, newDefs[i], getSrc(>src[0], i), tmp, 
mkImm(0x1f))
+->subOp = NV50_IR_SUBOP_SHFL_IDX;
+  }
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
@@ -2566,7 +2608,13 @@ Converter::run()
if (prog->dbgFlags & NV50_IR_DEBUG_VERBOSE)
   nir_print_shader(nir, stderr);
 
+   struct nir_lower_subgroups_options subgroup_options = {
+  .subgroup_size = 32,
+  .ballot_bit_size = 32,
+   };
+
NIR_PASS_V(nir, nir_lower_io, nir_var_all, type_size, (nir_lower_io_options)0);
+   NIR_PASS_V(nir, nir_lower_subgroups, &subgroup_options);
NIR_PASS_V(nir, nir_lower_regs_to_ssa);
NIR_PASS_V(nir, nir_lower_load_const_to_scalar);
NIR_PASS_V(nir, nir_lower_vars_to_ssa);
-- 
2.20.1
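
For readers following the read_first_invocation case in the patch above: the VOTE/EXTBF/BFIND sequence appears to compute the index of the lowest active lane from a 32-bit ballot mask, by reversing the bits and then locating the highest set bit. Below is a minimal host-side sketch of that arithmetic. It assumes that BFIND with the SAMT sub-op behaves like a count of leading zeros, which is my reading of the sequence and not something the patch states. All function names are hypothetical and not part of the codegen.

```cpp
#include <cstdint>

// Reverse the bit order of a 32-bit value (bit 0 becomes bit 31, and so on).
static uint32_t reverse_bits32(uint32_t v)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; ++i) {
        r = (r << 1) | (v & 1u);
        v >>= 1;
    }
    return r;
}

// Count leading zero bits; returns 32 for an input of 0.
static uint32_t leading_zeros32(uint32_t v)
{
    uint32_t n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && !(v & bit); bit >>= 1)
        ++n;
    return n;
}

// Index of the lowest set bit of the ballot, i.e. the first active lane.
// Assumption: this mirrors ballot -> bit reverse -> find-MSB-as-shift-amount.
static uint32_t first_active_lane(uint32_t ballot)
{
    return leading_zeros32(reverse_bits32(ballot));
}
```

With a ballot of 0x8 (only lane 3 active) the sketch yields 3, which is the lane index the SHFL in the patch would then read from.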

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 26/34] nv50/ir/nir: implement geometry shader nir_intrinsics

2019-03-11 Thread Karol Herbst

v4: use smarter getIndirect helper
use new getSlotAddress helper
use loadFrom helper
v8: don't require C++11 features

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 5b7a3303e78..991c1283a0f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -465,6 +465,10 @@ operation
 Converter::getOperation(nir_intrinsic_op op)
 {
switch (op) {
+   case nir_intrinsic_emit_vertex:
+  return OP_EMIT;
+   case nir_intrinsic_end_primitive:
+  return OP_RESTART;
default:
   ERROR("couldn't get operation for nir_intrinsic_op %u\n", op);
   assert(false);
@@ -1986,6 +1990,29 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_load_per_vertex_input: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *indirectVertex;
+  Value *indirectOffset;
+  uint32_t baseVertex = getIndirect(&insn->src[0], 0, indirectVertex);
+  uint32_t idx = getIndirect(insn, 1, 0, indirectOffset);
+
+  Value *vtxBase = mkOp2v(OP_PFETCH, TYPE_U32, getSSA(4, FILE_ADDRESS),
+  mkImm(baseVertex), indirectVertex);
+  for (uint8_t i = 0u; i < insn->num_components; ++i) {
+ uint32_t address = getSlotAddress(insn, idx, i);
+ loadFrom(FILE_SHADER_INPUT, 0, dType, newDefs[i], address, 0,
+  indirectOffset, vtxBase, info->in[idx].patch);
+  }
+  break;
+   }
+   case nir_intrinsic_emit_vertex:
+   case nir_intrinsic_end_primitive: {
+  uint32_t idx = nir_intrinsic_stream_id(insn);
+  mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1;
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.20.1

[Mesa-dev] [PATCH 21/34] nv50/ir/nir: implement nir_ssa_undef_instr

2019-03-11 Thread Karol Herbst

v2: use mkOp
v8: don't require C++11 features

Signed-off-by: Karol Herbst 
Reviewed-by: Pierre Moreau 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp| 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 43c9a468f5a..2ed508bbc2d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -135,6 +135,7 @@ private:
bool visit(nir_jump_instr *);
bool visit(nir_load_const_instr*);
bool visit(nir_loop *);
+   bool visit(nir_ssa_undef_instr *);
 
nir_shader *nir;
 
@@ -1538,6 +1539,8 @@ Converter::visit(nir_instr *insn)
   return visit(nir_instr_as_jump(insn));
case nir_instr_type_load_const:
   return visit(nir_instr_as_load_const(insn));
+   case nir_instr_type_ssa_undef:
+  return visit(nir_instr_as_ssa_undef(insn));
default:
   ERROR("unknown nir_instr type %u\n", insn->type);
   return false;
@@ -2289,6 +2292,16 @@ Converter::visit(nir_alu_instr *insn)
 }
 #undef DEFAULT_CHECKS
 
+bool
+Converter::visit(nir_ssa_undef_instr *insn)
+{
+   LValues &newDefs = convert(&insn->def);
+   for (uint8_t i = 0u; i < insn->def.num_components; ++i) {
+  mkOp(OP_NOP, TYPE_NONE, newDefs[i]);
+   }
+   return true;
+}
+
 bool
 Converter::run()
 {
-- 
2.20.1

[Mesa-dev] [PATCH 13/34] nv50/ir/nir: implement nir_load_const_instr

2019-03-11 Thread Karol Herbst

v8: fix loading 8/16 bit constants

Signed-off-by: Karol Herbst 
Reviewed-by: Pierre Moreau 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 28 +++
 1 file changed, 28 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 3c5eac17cf9..3fa590a4655 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -123,6 +123,7 @@ private:
bool visit(nir_if *);
bool visit(nir_instr *);
bool visit(nir_jump_instr *);
+   bool visit(nir_load_const_instr*);
bool visit(nir_loop *);
 
nir_shader *nir;
@@ -1314,6 +1315,8 @@ Converter::visit(nir_instr *insn)
switch (insn->type) {
case nir_instr_type_jump:
   return visit(nir_instr_as_jump(insn));
+   case nir_instr_type_load_const:
+  return visit(nir_instr_as_load_const(insn));
default:
   ERROR("unknown nir_instr type %u\n", insn->type);
   return false;
@@ -1348,6 +1351,31 @@ Converter::visit(nir_jump_instr *insn)
return true;
 }
 
+bool
+Converter::visit(nir_load_const_instr *insn)
+{
+   assert(insn->def.bit_size <= 64);
+
+   LValues &newDefs = convert(&insn->def);
+   for (int i = 0; i < insn->def.num_components; i++) {
+  switch (insn->def.bit_size) {
+  case 64:
+ loadImm(newDefs[i], insn->value.u64[i]);
+ break;
+  case 32:
+ loadImm(newDefs[i], insn->value.u32[i]);
+ break;
+  case 16:
+ loadImm(newDefs[i], insn->value.u16[i]);
+ break;
+  case 8:
+ loadImm(newDefs[i], insn->value.u8[i]);
+ break;
+  }
+   }
+   return true;
+}
+
 bool
 Converter::run()
 {
-- 
2.20.1

[Mesa-dev] [PATCH 22/34] nv50/ir/nir: implement nir_instr_type_tex

2019-03-11 Thread Karol Herbst

a lot of those fields are not valid for a lot of tex ops. Not quite sure if
it's worth the effort to check for those or just keep it like that. It seems
to kind of work.

v2: reworked offset handling
add tex support with indirect R/S arguments
handle GLSL_SAMPLER_DIM_EXTERNAL
drop reference in convert(glsl_sampler_dim&, bool, bool)
fix tg4 component selection
v5: fill up coords args with scratch values if coords provided is less than 
TexTarget.getArgCount()
v7: prepare for bindless_texture support
v8: don't require C++11 features
v9: convert to C++ style comments
fix txf with a uniform constant 0 lod

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 234 ++
 1 file changed, 234 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 2ed508bbc2d..2c4513aad02 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -67,6 +67,7 @@ private:
typedef unordered_map NirDefMap;
typedef unordered_map NirBlockMap;
 
+   TexTarget convert(glsl_sampler_dim, bool isArray, bool isShadow);
LValues& convert(nir_alu_dest *);
BasicBlock* convert(nir_block *);
LValues& convert(nir_dest *);
@@ -116,6 +117,7 @@ private:
DataType getSType(nir_src &, bool isFloat, bool isSigned);
 
operation getOperation(nir_op);
+   operation getOperation(nir_texop);
operation preOperationNeeded(nir_op);
 
int getSubOp(nir_op);
@@ -136,6 +138,10 @@ private:
bool visit(nir_load_const_instr*);
bool visit(nir_loop *);
bool visit(nir_ssa_undef_instr *);
+   bool visit(nir_tex_instr *);
+
+   // tex stuff
+   Value* applyProjection(Value *src, Value *proj);
 
nir_shader *nir;
 
@@ -421,6 +427,36 @@ Converter::getOperation(nir_op op)
}
 }
 
+operation
+Converter::getOperation(nir_texop op)
+{
+   switch (op) {
+   case nir_texop_tex:
+  return OP_TEX;
+   case nir_texop_lod:
+  return OP_TXLQ;
+   case nir_texop_txb:
+  return OP_TXB;
+   case nir_texop_txd:
+  return OP_TXD;
+   case nir_texop_txf:
+   case nir_texop_txf_ms:
+  return OP_TXF;
+   case nir_texop_tg4:
+  return OP_TXG;
+   case nir_texop_txl:
+  return OP_TXL;
+   case nir_texop_query_levels:
+   case nir_texop_texture_samples:
+   case nir_texop_txs:
+  return OP_TXQ;
+   default:
+  ERROR("couldn't get operation for nir_texop %u\n", op);
+  assert(false);
+  return OP_NOP;
+   }
+}
+
 operation
 Converter::preOperationNeeded(nir_op op)
 {
@@ -1541,6 +1577,8 @@ Converter::visit(nir_instr *insn)
   return visit(nir_instr_as_load_const(insn));
case nir_instr_type_ssa_undef:
   return visit(nir_instr_as_ssa_undef(insn));
+   case nir_instr_type_tex:
+  return visit(nir_instr_as_tex(insn));
default:
   ERROR("unknown nir_instr type %u\n", insn->type);
   return false;
@@ -2302,6 +2340,202 @@ Converter::visit(nir_ssa_undef_instr *insn)
return true;
 }
 
+#define CASE_SAMPLER(ty) \
+   case GLSL_SAMPLER_DIM_ ## ty : \
+  if (isArray && !isShadow) \
+ return TEX_TARGET_ ## ty ## _ARRAY; \
+  else if (!isArray && isShadow) \
+ return TEX_TARGET_## ty ## _SHADOW; \
+  else if (isArray && isShadow) \
+ return TEX_TARGET_## ty ## _ARRAY_SHADOW; \
+  else \
+ return TEX_TARGET_ ## ty
+
+TexTarget
+Converter::convert(glsl_sampler_dim dim, bool isArray, bool isShadow)
+{
+   switch (dim) {
+   CASE_SAMPLER(1D);
+   CASE_SAMPLER(2D);
+   CASE_SAMPLER(CUBE);
+   case GLSL_SAMPLER_DIM_3D:
+  return TEX_TARGET_3D;
+   case GLSL_SAMPLER_DIM_MS:
+  if (isArray)
+ return TEX_TARGET_2D_MS_ARRAY;
+  return TEX_TARGET_2D_MS;
+   case GLSL_SAMPLER_DIM_RECT:
+  if (isShadow)
+ return TEX_TARGET_RECT_SHADOW;
+  return TEX_TARGET_RECT;
+   case GLSL_SAMPLER_DIM_BUF:
+  return TEX_TARGET_BUFFER;
+   case GLSL_SAMPLER_DIM_EXTERNAL:
+  return TEX_TARGET_2D;
+   default:
+  ERROR("unknown glsl_sampler_dim %u\n", dim);
+  assert(false);
+  return TEX_TARGET_COUNT;
+   }
+}
+#undef CASE_SAMPLER
+
+Value*
+Converter::applyProjection(Value *src, Value *proj)
+{
+   if (!proj)
+  return src;
+   return mkOp2v(OP_MUL, TYPE_F32, getScratch(), src, proj);
+}
+
+bool
+Converter::visit(nir_tex_instr *insn)
+{
+   switch (insn->op) {
+   case nir_texop_lod:
+   case nir_texop_query_levels:
+   case nir_texop_tex:
+   case nir_texop_texture_samples:
+   case nir_texop_tg4:
+   case nir_texop_txb:
+   case nir_texop_txd:
+   case nir_texop_txf:
+   case nir_texop_txf_ms:
+   case nir_texop_txl:
+   case nir_texop_txs: {
+  LValues &newDefs = convert(&insn->dest);
+  std::vector srcs;
+  std::vector defs;
+  std::vector offsets;
+  uint8_t mask = 0;
+  bool lz = f

[Mesa-dev] [PATCH 09/34] nv50/ir/nir: add nir type helper functions

2019-03-11 Thread Karol Herbst

v4: treat imul as unsigned
v5: remove pointless !!
v7: inot is unsigned as well
v8: don't require C++11 features
v9: convert to C++ style comments
improve formatting
print error in all cases where codegen doesn't support a given type

Signed-off-by: Karol Herbst 
Acked-by: Pierre Moreau 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 127 ++
 1 file changed, 127 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index f7908876e96..2ac6d8c1d07 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -86,6 +86,18 @@ private:
uint32_t getIndirect(nir_src *, uint8_t, Value *&);
uint32_t getIndirect(nir_intrinsic_instr *, uint8_t s, uint8_t c, Value *&);
 
+   bool isFloatType(nir_alu_type);
+   bool isSignedType(nir_alu_type);
+   bool isResultFloat(nir_op);
+   bool isResultSigned(nir_op);
+
+   DataType getDType(nir_alu_instr *);
+   DataType getDType(nir_intrinsic_instr *);
+   DataType getDType(nir_op, uint8_t);
+
+   std::vector<DataType> getSTypes(nir_alu_instr *);
+   DataType getSType(nir_src &, bool isFloat, bool isSigned);
+
nir_shader *nir;
 
NirDefMap ssaDefs;
@@ -96,6 +108,121 @@ Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
: ConverterCommon(prog, info),
  nir(nir) {}
 
+bool
+Converter::isFloatType(nir_alu_type type)
+{
+   return nir_alu_type_get_base_type(type) == nir_type_float;
+}
+
+bool
+Converter::isSignedType(nir_alu_type type)
+{
+   return nir_alu_type_get_base_type(type) == nir_type_int;
+}
+
+bool
+Converter::isResultFloat(nir_op op)
+{
+   const nir_op_info &info = nir_op_infos[op];
+   if (info.output_type != nir_type_invalid)
+  return isFloatType(info.output_type);
+
+   ERROR("isResultFloat not implemented for %s\n", nir_op_infos[op].name);
+   assert(false);
+   return true;
+}
+
+bool
+Converter::isResultSigned(nir_op op)
+{
+   switch (op) {
+   // there is no umul and we get wrong results if we treat all muls as signed
+   case nir_op_imul:
+   case nir_op_inot:
+  return false;
+   default:
+  const nir_op_info &info = nir_op_infos[op];
+  if (info.output_type != nir_type_invalid)
+ return isSignedType(info.output_type);
+  ERROR("isResultSigned not implemented for %s\n", nir_op_infos[op].name);
+  assert(false);
+  return true;
+   }
+}
+
+DataType
+Converter::getDType(nir_alu_instr *insn)
+{
+   if (insn->dest.dest.is_ssa)
+  return getDType(insn->op, insn->dest.dest.ssa.bit_size);
+   else
+  return getDType(insn->op, insn->dest.dest.reg.reg->bit_size);
+}
+
+DataType
+Converter::getDType(nir_intrinsic_instr *insn)
+{
+   if (insn->dest.is_ssa)
+  return typeOfSize(insn->dest.ssa.bit_size / 8, false, false);
+   else
+  return typeOfSize(insn->dest.reg.reg->bit_size / 8, false, false);
+}
+
+DataType
+Converter::getDType(nir_op op, uint8_t bitSize)
+{
+   DataType ty = typeOfSize(bitSize / 8, isResultFloat(op), isResultSigned(op));
+   if (ty == TYPE_NONE) {
+  ERROR("couldn't get Type for op %s with bitSize %u\n", nir_op_infos[op].name, bitSize);
+  assert(false);
+   }
+   return ty;
+}
+
+std::vector<DataType>
+Converter::getSTypes(nir_alu_instr *insn)
+{
+   const nir_op_info &info = nir_op_infos[insn->op];
+   std::vector<DataType> res(info.num_inputs);
+
+   for (uint8_t i = 0; i < info.num_inputs; ++i) {
+  if (info.input_types[i] != nir_type_invalid) {
+ res[i] = getSType(insn->src[i].src, isFloatType(info.input_types[i]), isSignedType(info.input_types[i]));
+  } else {
+ ERROR("getSType not implemented for %s idx %u\n", info.name, i);
+ assert(false);
+ res[i] = TYPE_NONE;
+ break;
+  }
+   }
+
+   return res;
+}
+
+DataType
+Converter::getSType(nir_src &src, bool isFloat, bool isSigned)
+{
+   uint8_t bitSize;
+   if (src.is_ssa)
+  bitSize = src.ssa->bit_size;
+   else
+  bitSize = src.reg.reg->bit_size;
+
+   DataType ty = typeOfSize(bitSize / 8, isFloat, isSigned);
+   if (ty == TYPE_NONE) {
+  const char *str;
+  if (isFloat)
+ str = "float";
+  else if (isSigned)
+ str = "int";
+  else
+ str = "uint";
+  ERROR("couldn't get Type for %s with bitSize %u\n", str, bitSize);
+  assert(false);
+   }
+   return ty;
+}
+
 Converter::LValues&
 Converter::convert(nir_dest *dest)
 {
-- 
2.20.1
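
The helpers in the patch above all funnel into typeOfSize(bitSize / 8, isFloat, isSigned). As an illustration of that selection step only, here is a self-contained sketch with a hypothetical enum and hypothetical function names. The real DataType enum and typeOfSize live in the nouveau codegen and may differ in detail, for example in which sizes have float variants.

```cpp
#include <cstdint>

// Hypothetical stand-in for the codegen DataType enum; only the shape matters.
enum SketchType {
    SK_NONE,
    SK_U8, SK_S8,
    SK_U16, SK_S16, SK_F16,
    SK_U32, SK_S32, SK_F32,
    SK_U64, SK_S64, SK_F64
};

// Map (size in bytes, float?, signed?) to a type; SK_NONE means "unsupported".
static SketchType sketchTypeOfSize(unsigned bytes, bool isFloat, bool isSigned)
{
    switch (bytes) {
    case 1: return isFloat ? SK_NONE : (isSigned ? SK_S8 : SK_U8);
    case 2: return isFloat ? SK_F16 : (isSigned ? SK_S16 : SK_U16);
    case 4: return isFloat ? SK_F32 : (isSigned ? SK_S32 : SK_U32);
    case 8: return isFloat ? SK_F64 : (isSigned ? SK_S64 : SK_U64);
    default: return SK_NONE;
    }
}

// Bit size in, type out, mirroring the "bitSize / 8" step in getDType/getSType.
static SketchType sketchTypeForBits(uint8_t bitSize, bool isFloat, bool isSigned)
{
    return sketchTypeOfSize(bitSize / 8u, isFloat, isSigned);
}
```

A 1-bit value divides down to 0 bytes and lands in the SK_NONE branch, which corresponds to the case where the patch prints an error and asserts.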

[Mesa-dev] [PATCH 19/34] nv50/ir/nir: implement intrinsic_discard(_if)

2019-03-11 Thread Karol Herbst

v9: use getSSA instead of new_LValue

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 70c4aecd699..5c372794e02 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1732,6 +1732,20 @@ Converter::visit(nir_intrinsic_instr *insn)
   loadImm(newDefs[1], mode);
   break;
}
+   case nir_intrinsic_discard:
+  mkOp(OP_DISCARD, TYPE_NONE, NULL);
+  break;
+   case nir_intrinsic_discard_if: {
+  Value *pred = getSSA(1, FILE_PREDICATE);
+  if (insn->num_components > 1) {
+ ERROR("nir_intrinsic_discard_if only with 1 component supported!\n");
+ assert(false);
+ return false;
+  }
+  mkCmp(OP_SET, CC_NE, TYPE_U8, pred, TYPE_U32, getSrc(&insn->src[0], 0), zero);
+  mkOp(OP_DISCARD, TYPE_NONE, NULL)->setPredicate(CC_P, pred);
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.20.1

[Mesa-dev] [PATCH 18/34] nv50/ir/nir: implement load_(interpolated_)input/output

2019-03-11 Thread Karol Herbst

v3: and load_output
v4: use smarter getIndirect helper
use new getSlotAddress helper
v5: don't use const_offset directly
fix for indirects
v6: add support for interpolateAt
v7: fix compiler warnings
add load_barycentric_sample
handle load_output for fragment shaders
v8: set info->prop.fp.readsSampleLocations for at_sample interpolation
don't require C++11 features
v9: convert to C++ style comments

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 135 ++
 1 file changed, 135 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 6e26e00d91f..70c4aecd699 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1597,6 +1597,141 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_load_input:
+   case nir_intrinsic_load_interpolated_input:
+   case nir_intrinsic_load_output: {
+  LValues &newDefs = convert(&insn->dest);
+
+  // FBFetch
+  if (prog->getType() == Program::TYPE_FRAGMENT &&
+  op == nir_intrinsic_load_output) {
+ std::vector defs, srcs;
+ uint8_t mask = 0;
+
+ srcs.push_back(getSSA());
+ srcs.push_back(getSSA());
+ Value *x = mkOp1v(OP_RDSV, TYPE_F32, getSSA(), mkSysVal(SV_POSITION, 0));
+ Value *y = mkOp1v(OP_RDSV, TYPE_F32, getSSA(), mkSysVal(SV_POSITION, 1));
+ mkCvt(OP_CVT, TYPE_U32, srcs[0], TYPE_F32, x)->rnd = ROUND_Z;
+ mkCvt(OP_CVT, TYPE_U32, srcs[1], TYPE_F32, y)->rnd = ROUND_Z;
+
+ srcs.push_back(mkOp1v(OP_RDSV, TYPE_U32, getSSA(), mkSysVal(SV_LAYER, 0)));
+ srcs.push_back(mkOp1v(OP_RDSV, TYPE_U32, getSSA(), mkSysVal(SV_SAMPLE_INDEX, 0)));
+
+ for (uint8_t i = 0u; i < insn->num_components; ++i) {
+defs.push_back(newDefs[i]);
+mask |= 1 << i;
+ }
+
+ TexInstruction *texi = mkTex(OP_TXF, TEX_TARGET_2D_MS_ARRAY, 0, 0, defs, srcs);
+ texi->tex.levelZero = 1;
+ texi->tex.mask = mask;
+ texi->tex.useOffsets = 0;
+ texi->tex.r = 0x;
+ texi->tex.s = 0x;
+
+ info->prop.fp.readsFramebuffer = true;
+ break;
+  }
+
+  const DataType dType = getDType(insn);
+  Value *indirect;
+  bool input = op != nir_intrinsic_load_output;
+  operation nvirOp;
+  uint32_t mode = 0;
+
+  uint32_t idx = getIndirect(insn, op == nir_intrinsic_load_interpolated_input ? 1 : 0, 0, indirect);
+  nv50_ir_varying& vary = input ? info->in[idx] : info->out[idx];
+
+  // see load_barycentric_* handling
+  if (prog->getType() == Program::TYPE_FRAGMENT) {
+ mode = translateInterpMode(&vary, nvirOp);
+ if (op == nir_intrinsic_load_interpolated_input) {
+ImmediateValue immMode;
+if (getSrc(&insn->src[0], 1)->getUniqueInsn()->src(0).getImmediate(immMode))
+   mode |= immMode.reg.data.u32;
+ }
+  }
+
+  for (uint8_t i = 0u; i < insn->num_components; ++i) {
+ uint32_t address = getSlotAddress(insn, idx, i);
+ Symbol *sym = mkSymbol(input ? FILE_SHADER_INPUT : FILE_SHADER_OUTPUT, 0, dType, address);
+ if (prog->getType() == Program::TYPE_FRAGMENT) {
+int s = 1;
+if (typeSizeof(dType) == 8) {
+   Value *lo = getSSA();
+   Value *hi = getSSA();
+   Instruction *interp;
+
+   interp = mkOp1(nvirOp, TYPE_U32, lo, sym);
+   if (nvirOp == OP_PINTERP)
+  interp->setSrc(s++, fp.position);
+   if (mode & NV50_IR_INTERP_OFFSET)
+  interp->setSrc(s++, getSrc(&insn->src[0], 0));
+   interp->setInterpolate(mode);
+   interp->setIndirect(0, 0, indirect);
+
+   Symbol *sym1 = mkSymbol(input ? FILE_SHADER_INPUT : FILE_SHADER_OUTPUT, 0, dType, address + 4);
+   interp = mkOp1(nvirOp, TYPE_U32, hi, sym1);
+   if (nvirOp == OP_PINTERP)
+  interp->setSrc(s++, fp.position);
+   if (mode & NV50_IR_INTERP_OFFSET)
+  interp->setSrc(s++, getSrc(&insn->src[0], 0));
+   interp->setInterpolate(mode);
+   interp->setIndirect(0, 0, indirect);
+
+   mkOp2(OP_MERGE, dType, newDefs[i], lo, hi);
+} else {
+   Instruction *interp = mkOp1(nvirOp, dType, newDefs[i], sym);
+   if (nvirOp == OP_PINTERP)
+  interp->setSrc(s++, fp.position);
+   if (mode & NV50_IR_INTERP_OFFSET)
+  interp->setSrc(s++, getSrc(&insn->src[0], 0));
+   interp->setInterpolate(mode);
+   interp-&g

[Mesa-dev] [PATCH 16/34] nv50/ir/nir: implement nir_intrinsic_load_uniform

2019-03-11 Thread Karol Herbst

v2: use new getIndirect helper
fixes symbols for 64 bit types
v4: use smarter getIndirect helper
simplify address calculation
use loadFrom helper
v8: don't require C++11 features

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp   | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index a553e42e08a..dc8dbcfb48b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1532,6 +1532,16 @@ Converter::visit(nir_intrinsic_instr *insn)
nir_intrinsic_op op = insn->intrinsic;
 
switch (op) {
+   case nir_intrinsic_load_uniform: {
+  LValues &newDefs = convert(&insn->dest);
+  const DataType dType = getDType(insn);
+  Value *indirect;
+  uint32_t coffset = getIndirect(insn, 0, 0, indirect);
+  for (uint8_t i = 0; i < insn->num_components; ++i) {
+ loadFrom(FILE_MEMORY_CONST, 0, dType, newDefs[i], 16 * coffset, i, indirect);
+  }
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.20.1
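
The load_uniform case above passes 16 * coffset as the base and the component index i to loadFrom, which (per the loadFrom helper elsewhere in this series) addresses base + c * typeSize. A small sketch of the resulting byte address follows. The function name is hypothetical, and the reading of the factor 16 as one vec4 slot of 32-bit values is my assumption, not something the patch states.

```cpp
#include <cstdint>

// Byte address of component `c` within uniform slot `slot`, for elements of
// `typeSize` bytes. Assumption: each slot occupies 16 bytes, matching the
// "16 * coffset" base used in the patch.
static uint32_t uniformComponentAddress(uint32_t slot, uint8_t c, uint32_t typeSize)
{
    const uint32_t base = 16u * slot;
    return base + c * typeSize;
}
```

For example, the fourth 32-bit component (c = 3) of slot 2 ends up at byte 44, and the second 64-bit component of slot 1 at byte 24.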

[Mesa-dev] [PATCH 14/34] nv50/ir/nir: add skeleton for nir_intrinsic_instr

2019-03-11 Thread Karol Herbst

Signed-off-by: Karol Herbst 
Reviewed-by: Pierre Moreau 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp| 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 3fa590a4655..a99f3bbbc05 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -122,6 +122,7 @@ private:
bool visit(nir_function *);
bool visit(nir_if *);
bool visit(nir_instr *);
+   bool visit(nir_intrinsic_instr *);
bool visit(nir_jump_instr *);
bool visit(nir_load_const_instr*);
bool visit(nir_loop *);
@@ -1313,6 +1314,8 @@ bool
 Converter::visit(nir_instr *insn)
 {
switch (insn->type) {
+   case nir_instr_type_intrinsic:
+  return visit(nir_instr_as_intrinsic(insn));
case nir_instr_type_jump:
   return visit(nir_instr_as_jump(insn));
case nir_instr_type_load_const:
@@ -1324,6 +1327,20 @@ Converter::visit(nir_instr *insn)
return true;
 }
 
+bool
+Converter::visit(nir_intrinsic_instr *insn)
+{
+   nir_intrinsic_op op = insn->intrinsic;
+
+   switch (op) {
+   default:
+  ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
+  return false;
+   }
+
+   return true;
+}
+
 bool
 Converter::visit(nir_jump_instr *insn)
 {
-- 
2.20.1

[Mesa-dev] [PATCH 30/34] nv50/ir/nir: add memory barriers

2019-03-11 Thread Karol Herbst

v5: add more barrier intrinsics

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 21 +++
 1 file changed, 21 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index ecdc667b25a..ad68fb4505f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -585,6 +585,16 @@ Converter::getSubOp(nir_intrinsic_op op)
case nir_intrinsic_shared_atomic_xor:
case nir_intrinsic_ssbo_atomic_xor:
   return  NV50_IR_SUBOP_ATOM_XOR;
+
+   case nir_intrinsic_group_memory_barrier:
+   case nir_intrinsic_memory_barrier:
+   case nir_intrinsic_memory_barrier_atomic_counter:
+   case nir_intrinsic_memory_barrier_buffer:
+   case nir_intrinsic_memory_barrier_image:
+  return NV50_IR_SUBOP_MEMBAR(M, GL);
+   case nir_intrinsic_memory_barrier_shared:
+  return NV50_IR_SUBOP_MEMBAR(M, CTA);
+
case nir_intrinsic_vote_all:
   return NV50_IR_SUBOP_VOTE_ALL;
case nir_intrinsic_vote_any:
@@ -2400,6 +2410,17 @@ Converter::visit(nir_intrinsic_instr *insn)
   bar->subOp = NV50_IR_SUBOP_BAR_SYNC;
   break;
}
+   case nir_intrinsic_group_memory_barrier:
+   case nir_intrinsic_memory_barrier:
+   case nir_intrinsic_memory_barrier_atomic_counter:
+   case nir_intrinsic_memory_barrier_buffer:
+   case nir_intrinsic_memory_barrier_image:
+   case nir_intrinsic_memory_barrier_shared: {
+  Instruction *bar = mkOp(OP_MEMBAR, TYPE_NONE, NULL);
+  bar->fixed = 1;
+  bar->subOp = getSubOp(op);
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.20.1
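
To make the mapping in the patch above easier to see at a glance: every barrier intrinsic except memory_barrier_shared is given the GL scope, and only the shared-memory barrier gets the CTA scope. The sketch below restates that with hypothetical enums. It does not use the real nir_intrinsic_op values or the NV50_IR_SUBOP_MEMBAR macro.

```cpp
// Hypothetical stand-ins for the barrier intrinsics handled by the patch.
enum SketchBarrier {
    BAR_GROUP_MEMORY,
    BAR_MEMORY,
    BAR_ATOMIC_COUNTER,
    BAR_BUFFER,
    BAR_IMAGE,
    BAR_SHARED
};

// Hypothetical stand-ins for the two membar scopes used by the patch.
enum SketchScope { SCOPE_GL, SCOPE_CTA };

// Same grouping as the getSubOp() cases: shared -> CTA, everything else -> GL.
static SketchScope sketchBarrierScope(SketchBarrier b)
{
    switch (b) {
    case BAR_SHARED:
        return SCOPE_CTA;
    case BAR_GROUP_MEMORY:
    case BAR_MEMORY:
    case BAR_ATOMIC_COUNTER:
    case BAR_BUFFER:
    case BAR_IMAGE:
    default:
        return SCOPE_GL;
    }
}
```

Note that group_memory_barrier is grouped with the GL-scope barriers in the patch, which the sketch keeps as is.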

[Mesa-dev] [PATCH 33/34] nv50/ir/nir: handle user clip planes for each emitted vertex

2019-03-11 Thread Karol Herbst

v9: convert to C++ style comments
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 627848a457f..fdc6eaf759a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -1561,7 +1561,7 @@ Converter::visit(nir_function *function)
bb->cfg.attach(&exit->cfg, Graph::Edge::TREE);
setPosition(exit, true);
 
-   if (info->io.genUserClip > 0)
+   if (prog->getType() == Program::TYPE_VERTEX && info->io.genUserClip > 0)
   handleUserClipPlanes();
 
// TODO: for non main function this needs to be a OP_RETURN
@@ -1889,6 +1889,7 @@ Converter::visit(nir_intrinsic_instr *insn)
 }
 break;
  }
+ case Program::TYPE_GEOMETRY:
  case Program::TYPE_VERTEX: {
 if (info->io.genUserClip > 0 && idx == clipVertexOutput) {
mkMov(clipVtx[i], src);
@@ -2187,6 +2188,9 @@ Converter::visit(nir_intrinsic_instr *insn)
   break;
}
case nir_intrinsic_emit_vertex:
+  if (info->io.genUserClip > 0)
+ handleUserClipPlanes();
+  // fallthrough
case nir_intrinsic_end_primitive: {
   uint32_t idx = nir_intrinsic_stream_id(insn);
   mkOp1(getOperation(op), TYPE_U32, NULL, mkImm(idx))->fixed = 1;
-- 
2.20.1

[Mesa-dev] [PATCH 28/34] nv50/ir/nir: implement ssbo intrinsics

2019-03-11 Thread Karol Herbst

v4: use loadFrom helper
v5: support indirect buffer access
v8: don't require C++11 features

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 90 +++
 1 file changed, 90 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index 11403bea674..320f90329ef 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -504,6 +504,24 @@ int
 Converter::getSubOp(nir_intrinsic_op op)
 {
switch (op) {
+   case nir_intrinsic_ssbo_atomic_add:
+  return NV50_IR_SUBOP_ATOM_ADD;
+   case nir_intrinsic_ssbo_atomic_and:
+  return NV50_IR_SUBOP_ATOM_AND;
+   case nir_intrinsic_ssbo_atomic_comp_swap:
+  return NV50_IR_SUBOP_ATOM_CAS;
+   case nir_intrinsic_ssbo_atomic_exchange:
+  return NV50_IR_SUBOP_ATOM_EXCH;
+   case nir_intrinsic_ssbo_atomic_or:
+  return NV50_IR_SUBOP_ATOM_OR;
+   case nir_intrinsic_ssbo_atomic_imax:
+   case nir_intrinsic_ssbo_atomic_umax:
+  return NV50_IR_SUBOP_ATOM_MAX;
+   case nir_intrinsic_ssbo_atomic_imin:
+   case nir_intrinsic_ssbo_atomic_umin:
+  return NV50_IR_SUBOP_ATOM_MIN;
+   case nir_intrinsic_ssbo_atomic_xor:
+  return NV50_IR_SUBOP_ATOM_XOR;
case nir_intrinsic_vote_all:
   return NV50_IR_SUBOP_VOTE_ALL;
case nir_intrinsic_vote_any:
@@ -2027,6 +2045,78 @@ Converter::visit(nir_intrinsic_instr *insn)
   }
   break;
}
+   case nir_intrinsic_get_buffer_size: {
+  LValues &newDefs = convert(&insn->dest);
+  const DataType dType = getDType(insn);
+  Value *indirectBuffer;
+  uint32_t buffer = getIndirect(&insn->src[0], 0, indirectBuffer);
+
+  Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, dType, 0);
+  mkOp1(OP_BUFQ, dType, newDefs[0], sym)->setIndirect(0, 0, indirectBuffer);
+  break;
+   }
+   case nir_intrinsic_store_ssbo: {
+  DataType sType = getSType(insn->src[0], false, false);
+  Value *indirectBuffer;
+  Value *indirectOffset;
+  uint32_t buffer = getIndirect(&insn->src[1], 0, indirectBuffer);
+  uint32_t offset = getIndirect(&insn->src[2], 0, indirectOffset);
+
+  for (uint8_t i = 0u; i < insn->num_components; ++i) {
+ if (!((1u << i) & nir_intrinsic_write_mask(insn)))
+continue;
+ Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, sType,
+offset + i * typeSizeof(sType));
+ mkStore(OP_STORE, sType, sym, indirectOffset, getSrc(&insn->src[0], i))
+->setIndirect(0, 1, indirectBuffer);
+  }
+  info->io.globalAccess |= 0x2;
+  break;
+   }
+   case nir_intrinsic_load_ssbo: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *indirectBuffer;
+  Value *indirectOffset;
+  uint32_t buffer = getIndirect(&insn->src[0], 0, indirectBuffer);
+  uint32_t offset = getIndirect(&insn->src[1], 0, indirectOffset);
+
+  for (uint8_t i = 0u; i < insn->num_components; ++i)
+ loadFrom(FILE_MEMORY_BUFFER, buffer, dType, newDefs[i], offset, i,
+  indirectOffset, indirectBuffer);
+
+  info->io.globalAccess |= 0x1;
+  break;
+   }
+   case nir_intrinsic_ssbo_atomic_add:
+   case nir_intrinsic_ssbo_atomic_and:
+   case nir_intrinsic_ssbo_atomic_comp_swap:
+   case nir_intrinsic_ssbo_atomic_exchange:
+   case nir_intrinsic_ssbo_atomic_or:
+   case nir_intrinsic_ssbo_atomic_imax:
+   case nir_intrinsic_ssbo_atomic_imin:
+   case nir_intrinsic_ssbo_atomic_umax:
+   case nir_intrinsic_ssbo_atomic_umin:
+   case nir_intrinsic_ssbo_atomic_xor: {
+  const DataType dType = getDType(insn);
+  LValues &newDefs = convert(&insn->dest);
+  Value *indirectBuffer;
+  Value *indirectOffset;
+  uint32_t buffer = getIndirect(&insn->src[0], 0, indirectBuffer);
+  uint32_t offset = getIndirect(&insn->src[1], 0, indirectOffset);
+
+  Symbol *sym = mkSymbol(FILE_MEMORY_BUFFER, buffer, dType, offset);
+  Instruction *atom = mkOp2(OP_ATOM, dType, newDefs[0], sym,
+getSrc(&insn->src[2], 0));
+  if (op == nir_intrinsic_ssbo_atomic_comp_swap)
+ atom->setSrc(2, getSrc(&insn->src[3], 0));
+  atom->setIndirect(0, 0, indirectOffset);
+  atom->setIndirect(0, 1, indirectBuffer);
+  atom->subOp = getSubOp(op);
+
+  info->io.globalAccess |= 0x2;
+  break;
+   }
default:
   ERROR("unknown nir_intrinsic_op %s\n", nir_intrinsic_infos[op].name);
   return false;
-- 
2.20.1
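
The store_ssbo case above walks the components, skips those not enabled in the write mask, and stores each remaining one at offset + i * typeSizeof(sType). The sketch below isolates that offset computation. The function name and the plain integer parameters are hypothetical stand-ins for the values the patch reads from the NIR instruction.

```cpp
#include <cstdint>
#include <vector>

// Byte offsets that would be written for a store of `numComponents` elements
// of `typeSize` bytes each, starting at `offset`, honouring `writeMask`
// (bit i set means component i is stored).
static std::vector<uint32_t> storedComponentOffsets(uint32_t offset,
                                                    uint8_t numComponents,
                                                    uint32_t writeMask,
                                                    uint32_t typeSize)
{
    std::vector<uint32_t> out;
    for (uint8_t i = 0u; i < numComponents; ++i) {
        if (!((1u << i) & writeMask))
            continue; // component masked out, nothing is stored for it
        out.push_back(offset + i * typeSize);
    }
    return out;
}
```

With offset 16, four 32-bit components and a write mask of 0x5, only components 0 and 2 are stored, at bytes 16 and 24.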

[Mesa-dev] [PATCH 11/34] nv50/ir/nir: add loadFrom and storeTo helper

2019-03-11 Thread Karol Herbst

v8: don't require C++11 features

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 72 +++
 1 file changed, 72 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index a0a36d95b41..d3cba9a63c3 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -93,6 +93,13 @@ private:
bool centroid,
unsigned semantics);
 
+   Instruction *loadFrom(DataFile, uint8_t, DataType, Value *def, uint32_t base,
+ uint8_t c, Value *indirect0 = NULL,
+ Value *indirect1 = NULL, bool patch = false);
+   void storeTo(nir_intrinsic_instr *, DataFile, operation, DataType,
+Value *src, uint8_t idx, uint8_t c, Value *indirect0 = NULL,
+Value *indirect1 = NULL);
+
bool isFloatType(nir_alu_type);
bool isSignedType(nir_alu_type);
bool isResultFloat(nir_op);
@@ -969,6 +976,71 @@ Converter::getSlotAddress(nir_intrinsic_instr *insn, uint8_t idx, uint8_t slot)
return vary[idx].slot[slot] * 4;
 }
 
+Instruction *
+Converter::loadFrom(DataFile file, uint8_t i, DataType ty, Value *def,
+uint32_t base, uint8_t c, Value *indirect0,
+Value *indirect1, bool patch)
+{
+   unsigned int tySize = typeSizeof(ty);
+
+   if (tySize == 8 &&
+   (file == FILE_MEMORY_CONST || file == FILE_MEMORY_BUFFER || indirect0)) {
+  Value *lo = getSSA();
+  Value *hi = getSSA();
+
+  Instruction *loi =
+ mkLoad(TYPE_U32, lo,
+mkSymbol(file, i, TYPE_U32, base + c * tySize),
+indirect0);
+  loi->setIndirect(0, 1, indirect1);
+  loi->perPatch = patch;
+
+  Instruction *hii =
+ mkLoad(TYPE_U32, hi,
+mkSymbol(file, i, TYPE_U32, base + c * tySize + 4),
+indirect0);
+  hii->setIndirect(0, 1, indirect1);
+  hii->perPatch = patch;
+
+  return mkOp2(OP_MERGE, ty, def, lo, hi);
+   } else {
+  Instruction *ld =
+ mkLoad(ty, def, mkSymbol(file, i, ty, base + c * tySize), indirect0);
+  ld->setIndirect(0, 1, indirect1);
+  ld->perPatch = patch;
+  return ld;
+   }
+}
+
+void
+Converter::storeTo(nir_intrinsic_instr *insn, DataFile file, operation op,
+   DataType ty, Value *src, uint8_t idx, uint8_t c,
+   Value *indirect0, Value *indirect1)
+{
+   uint8_t size = typeSizeof(ty);
+   uint32_t address = getSlotAddress(insn, idx, c);
+
+   if (size == 8 && indirect0) {
+  Value *split[2];
+  mkSplit(split, 4, src);
+
+  if (op == OP_EXPORT) {
+ split[0] = mkMov(getSSA(), split[0], ty)->getDef(0);
+ split[1] = mkMov(getSSA(), split[1], ty)->getDef(0);
+  }
+
+  mkStore(op, TYPE_U32, mkSymbol(file, 0, TYPE_U32, address), indirect0,
+  split[0])->perPatch = info->out[idx].patch;
+  mkStore(op, TYPE_U32, mkSymbol(file, 0, TYPE_U32, address + 4), indirect0,
+  split[1])->perPatch = info->out[idx].patch;
+   } else {
+  if (op == OP_EXPORT)
+ src = mkMov(getSSA(size), src, ty)->getDef(0);
+  mkStore(op, ty, mkSymbol(file, 0, ty, address), indirect0,
+  src)->perPatch = info->out[idx].patch;
+   }
+}
+
 bool
 Converter::run()
 {
-- 
2.20.1
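
For 8-byte types read from const or buffer memory (or through an indirect address), loadFrom above emits two TYPE_U32 loads at base + c * tySize and base + c * tySize + 4 and joins them with OP_MERGE. The addressing can be sketched outside the compiler as below. load32() and load64Split() are hypothetical helpers, a byte vector stands in for the memory file, and treating the first OP_MERGE source as the low half is an assumption rather than something this patch states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a TYPE_U32 load at a byte offset.
inline uint32_t load32(const std::vector<uint8_t> &mem, size_t offset)
{
   assert(offset + 4 <= mem.size());
   uint32_t v;
   std::memcpy(&v, mem.data() + offset, sizeof(v));
   return v;
}

// Load component c of a 64-bit vector that starts at byte offset base,
// using two 32-bit loads like the tySize == 8 branch of loadFrom.
inline uint64_t load64Split(const std::vector<uint8_t> &mem, uint32_t base,
                            uint8_t c)
{
   const uint32_t tySize = 8;
   uint32_t lo = load32(mem, base + c * tySize);     // low word
   uint32_t hi = load32(mem, base + c * tySize + 4); // high word
   // Assumed OP_MERGE behaviour: first source becomes the low 32 bits.
   return (uint64_t(hi) << 32) | lo;
}
```

Component 1 of a vector at base 16 is therefore read from byte offsets 24 and 28.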


[Mesa-dev] [PATCH 15/34] nv50/ir/nir: implement nir_alu_instr handling

2019-03-11 Thread Karol Herbst

v2: use bitfield_insert instead of bfi
rework switch helper macros
remove some lowering code (LoweringHelper is now used for this)
v3: add pack_half_2x16_split
add unpack_half_2x16_split_x/y
v5: replace first argument with nullptr in loadImm calls
prefer getSSA over getScratch
v8: fix setting precise modifier for first instruction inside a block
add guard in case no instruction gets inserted into an empty block
don't require C++11 features
v9: use CC_NE for integer compares
convert to C++ style comments
fix b2f for doubles
remove macros around nir ops to make it easier to grep them
add handling for fpow

Signed-off-by: Karol Herbst 
---
 .../nouveau/codegen/nv50_ir_from_nir.cpp  | 562 +-
 1 file changed, 561 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
index a99f3bbbc05..a553e42e08a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
@@ -114,9 +114,17 @@ private:
std::vector<DataType> getSTypes(nir_alu_instr *);
DataType getSType(nir_src &, bool isFloat, bool isSigned);
 
+   operation getOperation(nir_op);
+   operation preOperationNeeded(nir_op);
+
+   int getSubOp(nir_op);
+
+   CondCode getCondCode(nir_op);
+
bool assignSlots();
bool parseNIR();
 
+   bool visit(nir_alu_instr *);
bool visit(nir_block *);
bool visit(nir_cf_node *);
bool visit(nir_function *);
@@ -135,6 +143,7 @@ private:
unsigned int curLoopDepth;
 
BasicBlock *exit;
+   Value *zero;
 
union {
   struct {
@@ -146,7 +155,10 @@ private:
 Converter::Converter(Program *prog, nir_shader *nir, nv50_ir_prog_info *info)
: ConverterCommon(prog, info),
  nir(nir),
- curLoopDepth(0) {}
+ curLoopDepth(0)
+{
+   zero = mkImm((uint32_t)0);
+}
 
 BasicBlock *
 Converter::convert(nir_block *block)
@@ -275,6 +287,191 @@ Converter::getSType(nir_src &src, bool isFloat, bool isSigned)
return ty;
 }
 
+operation
+Converter::getOperation(nir_op op)
+{
+   switch (op) {
+   // basic ops with float and int variants
+   case nir_op_fabs:
+   case nir_op_iabs:
+  return OP_ABS;
+   case nir_op_fadd:
+   case nir_op_iadd:
+  return OP_ADD;
+   case nir_op_fand:
+   case nir_op_iand:
+  return OP_AND;
+   case nir_op_ifind_msb:
+   case nir_op_ufind_msb:
+  return OP_BFIND;
+   case nir_op_fceil:
+  return OP_CEIL;
+   case nir_op_fcos:
+  return OP_COS;
+   case nir_op_f2f32:
+   case nir_op_f2f64:
+   case nir_op_f2i32:
+   case nir_op_f2i64:
+   case nir_op_f2u32:
+   case nir_op_f2u64:
+   case nir_op_i2f32:
+   case nir_op_i2f64:
+   case nir_op_i2i32:
+   case nir_op_i2i64:
+   case nir_op_u2f32:
+   case nir_op_u2f64:
+   case nir_op_u2u32:
+   case nir_op_u2u64:
+  return OP_CVT;
+   case nir_op_fddx:
+   case nir_op_fddx_coarse:
+   case nir_op_fddx_fine:
+  return OP_DFDX;
+   case nir_op_fddy:
+   case nir_op_fddy_coarse:
+   case nir_op_fddy_fine:
+  return OP_DFDY;
+   case nir_op_fdiv:
+   case nir_op_idiv:
+   case nir_op_udiv:
+  return OP_DIV;
+   case nir_op_fexp2:
+  return OP_EX2;
+   case nir_op_ffloor:
+  return OP_FLOOR;
+   case nir_op_ffma:
+  return OP_FMA;
+   case nir_op_flog2:
+  return OP_LG2;
+   case nir_op_fmax:
+   case nir_op_imax:
+   case nir_op_umax:
+  return OP_MAX;
+   case nir_op_pack_64_2x32_split:
+  return OP_MERGE;
+   case nir_op_fmin:
+   case nir_op_imin:
+   case nir_op_umin:
+  return OP_MIN;
+   case nir_op_fmod:
+   case nir_op_imod:
+   case nir_op_umod:
+   case nir_op_frem:
+   case nir_op_irem:
+  return OP_MOD;
+   case nir_op_fmul:
+   case nir_op_imul:
+   case nir_op_imul_high:
+   case nir_op_umul_high:
+  return OP_MUL;
+   case nir_op_fneg:
+   case nir_op_ineg:
+  return OP_NEG;
+   case nir_op_fnot:
+   case nir_op_inot:
+  return OP_NOT;
+   case nir_op_for:
+   case nir_op_ior:
+  return OP_OR;
+   case nir_op_fpow:
+  return OP_POW;
+   case nir_op_frcp:
+  return OP_RCP;
+   case nir_op_frsq:
+  return OP_RSQ;
+   case nir_op_fsat:
+  return OP_SAT;
+   case nir_op_feq32:
+   case nir_op_ieq32:
+   case nir_op_fge32:
+   case nir_op_ige32:
+   case nir_op_uge32:
+   case nir_op_flt32:
+   case nir_op_ilt32:
+   case nir_op_ult32:
+   case nir_op_fne32:
+   case nir_op_ine32:
+  return OP_SET;
+   case nir_op_ishl:
+  return OP_SHL;
+   case nir_op_ishr:
+   case nir_op_ushr:
+  return OP_SHR;
+   case nir_op_fsin:
+  return OP_SIN;
+   case nir_op_fsqrt:
+  return OP_SQRT;
+   case nir_op_fsub:
+   case nir_op_isub:
+  return OP_SUB;
+   case nir_op_ftrunc:
+  return OP_TRUNC;
+   case nir_op_fxor:
+   case nir_op_ixor:
+  return OP_XOR;
+   default:
+  ERROR("couldn't get operation for op %s\n", nir_op_infos[op].name);
+  assert(false);
