[Mesa-dev] meson doesn't build gallium drivers in parallel with src/mesa

2020-03-20 Thread Marek Olšák
Hi,

I think the problem is that src/mesa is linked using "link_with", while
gallium drivers are linked using "dependencies" in meson, but I might be
wrong.

It looks like meson only compiles the dependencies in "link_with" in
parallel, then waits for completion, and then meson looks at
"dependencies", which triggers python scripts like sid_tables_h WHICH RUN
SINGLE-THREADED UNTIL COMPLETION, and then meson starts another wave of
compilation.

Any idea how to fix this?

Thanks,
Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH v2 1/6] nv50/ir: add nv50_ir_prog_info_out

2020-03-20 Thread Karol Herbst
On Fri, Mar 20, 2020 at 10:20 AM Juan A. Suarez Romero
 wrote:
>
> On Thu, 2020-03-19 at 21:57 +0100, Mark Menzynski wrote:
> > From: Karol Herbst 
> >
> > Split out the output relevant fields from the nv50_ir_prog_info struct
> > in order to have a cleaner separation between the input and output of
> > the compilation.
> >
>
>
> Please, submit the series through GitLab (
> https://www.mesa3d.org/submittingpatches.html#submit)
>
> Thanks!
>
> J.A.
>

it's fine for nouveau patches, but yeah, I know it makes sense to do
that on gitlab, but if not everybody feels fine in a community doing
stuff on gitlab, I also don't want to enforce that, and that's the
case for nouveau right now.

>
> > gned-off-by: Karol Herbst 
> > ---
> >  .../drivers/nouveau/codegen/nv50_ir.cpp   |  49 ++--
> >  src/gallium/drivers/nouveau/codegen/nv50_ir.h |   9 +-
> >  .../drivers/nouveau/codegen/nv50_ir_driver.h  | 117 +---
> >  .../nouveau/codegen/nv50_ir_from_common.cpp   |  14 +-
> >  .../nouveau/codegen/nv50_ir_from_common.h |   3 +-
> >  .../nouveau/codegen/nv50_ir_from_nir.cpp  | 204 +++---
> >  .../nouveau/codegen/nv50_ir_from_tgsi.cpp | 256 +-
> >  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp |   6 +-
> >  .../nouveau/codegen/nv50_ir_target.cpp|   2 +-
> >  .../drivers/nouveau/codegen/nv50_ir_target.h  |   5 +-
> >  .../nouveau/codegen/nv50_ir_target_nv50.cpp   |  17 +-
> >  .../nouveau/codegen/nv50_ir_target_nv50.h |   3 +-
> >  .../drivers/nouveau/nouveau_compiler.c|   9 +-
> >  .../drivers/nouveau/nv50/nv50_program.c   |  62 +++--
> >  .../drivers/nouveau/nvc0/nvc0_program.c   |  87 +++---
> >  15 files changed, 449 insertions(+), 394 deletions(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> > index c65853578f6..c2c5956874a 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> > @@ -1241,15 +1241,18 @@ void Program::releaseValue(Value *value)
> >  extern "C" {
> >
> >  static void
> > -nv50_ir_init_prog_info(struct nv50_ir_prog_info *info)
> > +nv50_ir_init_prog_info(struct nv50_ir_prog_info *info,
> > +   struct nv50_ir_prog_info_out *info_out)
> >  {
> > +   info_out->target = info->target;
> > +   info_out->type = info->type;
> > if (info->type == PIPE_SHADER_TESS_CTRL || info->type == 
> > PIPE_SHADER_TESS_EVAL) {
> > -  info->prop.tp.domain = PIPE_PRIM_MAX;
> > -  info->prop.tp.outputPrim = PIPE_PRIM_MAX;
> > +  info_out->prop.tp.domain = PIPE_PRIM_MAX;
> > +  info_out->prop.tp.outputPrim = PIPE_PRIM_MAX;
> > }
> > if (info->type == PIPE_SHADER_GEOMETRY) {
> > -  info->prop.gp.instanceCount = 1;
> > -  info->prop.gp.maxVertices = 1;
> > +  info_out->prop.gp.instanceCount = 1;
> > +  info_out->prop.gp.maxVertices = 1;
> > }
> > if (info->type == PIPE_SHADER_COMPUTE) {
> >info->prop.cp.numThreads[0] =
> > @@ -1257,23 +1260,26 @@ nv50_ir_init_prog_info(struct nv50_ir_prog_info 
> > *info)
> >info->prop.cp.numThreads[2] = 1;
> > }
> > info->io.pointSize = 0xff;
> > -   info->io.instanceId = 0xff;
> > -   info->io.vertexId = 0xff;
> > -   info->io.edgeFlagIn = 0xff;
> > -   info->io.edgeFlagOut = 0xff;
> > -   info->io.fragDepth = 0xff;
> > -   info->io.sampleMask = 0xff;
> > +   info_out->bin.smemSize = info->bin.smemSize;
> > +   info_out->io.genUserClip = info->io.genUserClip;
> > +   info_out->io.instanceId = 0xff;
> > +   info_out->io.vertexId = 0xff;
> > +   info_out->io.edgeFlagIn = 0xff;
> > +   info_out->io.edgeFlagOut = 0xff;
> > +   info_out->io.fragDepth = 0xff;
> > +   info_out->io.sampleMask = 0xff;
> > info->io.backFaceColor[0] = info->io.backFaceColor[1] = 0xff;
> >  }
> >
> >  int
> > -nv50_ir_generate_code(struct nv50_ir_prog_info *info)
> > +nv50_ir_generate_code(struct nv50_ir_prog_info *info,
> > +  struct nv50_ir_prog_info_out *info_out)
> >  {
> > int ret = 0;
> >
> > nv50_ir::Program::Type type;
> >
> > -   nv50_ir_init_prog_info(info);
> > +   nv50_ir_init_prog_info(info, info_out);
> >
> >  #define PROG_TYPE_CASE(a, b)  \
> > case PIPE_SHADER_##a: type = nv50_ir::Program::TYPE_##b; break
> > @@ -1301,15 +1307,16 @@ nv50_ir_generate_code(struct nv50_ir_prog_info 
> > *info)
> >return -1;
> > }
> > prog->driver = info;
> > +   prog->driver_out = info_out;
> > prog->dbgFlags = info->dbgFlags;
> > prog->optLevel = info->optLevel;
> >
> > switch (info->bin.sourceRep) {
> > case PIPE_SHADER_IR_NIR:
> > -  ret = prog->makeFromNIR(info) ? 0 : -2;
> > +  ret = prog->makeFromNIR(info, info_out) ? 0 : -2;
> >break;
> > case PIPE_SHADER_IR_TGSI:
> > -  ret = prog->makeFromTGSI(info) ? 0 : -2;
> > +  ret = prog->makeFromTGSI(info, info_out) ? 0 : -2;
> >

Re: [Mesa-dev] [RFC PATCH v2 1/6] nv50/ir: add nv50_ir_prog_info_out

2020-03-20 Thread Juan A. Suarez Romero
On Thu, 2020-03-19 at 21:57 +0100, Mark Menzynski wrote:
> From: Karol Herbst 
> 
> Split out the output relevant fields from the nv50_ir_prog_info struct
> in order to have a cleaner separation between the input and output of
> the compilation.
> 


Please, submit the series through GitLab (
https://www.mesa3d.org/submittingpatches.html#submit)

Thanks!

J.A.


> gned-off-by: Karol Herbst 
> ---
>  .../drivers/nouveau/codegen/nv50_ir.cpp   |  49 ++--
>  src/gallium/drivers/nouveau/codegen/nv50_ir.h |   9 +-
>  .../drivers/nouveau/codegen/nv50_ir_driver.h  | 117 +---
>  .../nouveau/codegen/nv50_ir_from_common.cpp   |  14 +-
>  .../nouveau/codegen/nv50_ir_from_common.h |   3 +-
>  .../nouveau/codegen/nv50_ir_from_nir.cpp  | 204 +++---
>  .../nouveau/codegen/nv50_ir_from_tgsi.cpp | 256 +-
>  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp |   6 +-
>  .../nouveau/codegen/nv50_ir_target.cpp|   2 +-
>  .../drivers/nouveau/codegen/nv50_ir_target.h  |   5 +-
>  .../nouveau/codegen/nv50_ir_target_nv50.cpp   |  17 +-
>  .../nouveau/codegen/nv50_ir_target_nv50.h |   3 +-
>  .../drivers/nouveau/nouveau_compiler.c|   9 +-
>  .../drivers/nouveau/nv50/nv50_program.c   |  62 +++--
>  .../drivers/nouveau/nvc0/nvc0_program.c   |  87 +++---
>  15 files changed, 449 insertions(+), 394 deletions(-)
> 
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> index c65853578f6..c2c5956874a 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> @@ -1241,15 +1241,18 @@ void Program::releaseValue(Value *value)
>  extern "C" {
>  
>  static void
> -nv50_ir_init_prog_info(struct nv50_ir_prog_info *info)
> +nv50_ir_init_prog_info(struct nv50_ir_prog_info *info,
> +   struct nv50_ir_prog_info_out *info_out)
>  {
> +   info_out->target = info->target;
> +   info_out->type = info->type;
> if (info->type == PIPE_SHADER_TESS_CTRL || info->type == 
> PIPE_SHADER_TESS_EVAL) {
> -  info->prop.tp.domain = PIPE_PRIM_MAX;
> -  info->prop.tp.outputPrim = PIPE_PRIM_MAX;
> +  info_out->prop.tp.domain = PIPE_PRIM_MAX;
> +  info_out->prop.tp.outputPrim = PIPE_PRIM_MAX;
> }
> if (info->type == PIPE_SHADER_GEOMETRY) {
> -  info->prop.gp.instanceCount = 1;
> -  info->prop.gp.maxVertices = 1;
> +  info_out->prop.gp.instanceCount = 1;
> +  info_out->prop.gp.maxVertices = 1;
> }
> if (info->type == PIPE_SHADER_COMPUTE) {
>info->prop.cp.numThreads[0] =
> @@ -1257,23 +1260,26 @@ nv50_ir_init_prog_info(struct nv50_ir_prog_info *info)
>info->prop.cp.numThreads[2] = 1;
> }
> info->io.pointSize = 0xff;
> -   info->io.instanceId = 0xff;
> -   info->io.vertexId = 0xff;
> -   info->io.edgeFlagIn = 0xff;
> -   info->io.edgeFlagOut = 0xff;
> -   info->io.fragDepth = 0xff;
> -   info->io.sampleMask = 0xff;
> +   info_out->bin.smemSize = info->bin.smemSize;
> +   info_out->io.genUserClip = info->io.genUserClip;
> +   info_out->io.instanceId = 0xff;
> +   info_out->io.vertexId = 0xff;
> +   info_out->io.edgeFlagIn = 0xff;
> +   info_out->io.edgeFlagOut = 0xff;
> +   info_out->io.fragDepth = 0xff;
> +   info_out->io.sampleMask = 0xff;
> info->io.backFaceColor[0] = info->io.backFaceColor[1] = 0xff;
>  }
>  
>  int
> -nv50_ir_generate_code(struct nv50_ir_prog_info *info)
> +nv50_ir_generate_code(struct nv50_ir_prog_info *info,
> +  struct nv50_ir_prog_info_out *info_out)
>  {
> int ret = 0;
>  
> nv50_ir::Program::Type type;
>  
> -   nv50_ir_init_prog_info(info);
> +   nv50_ir_init_prog_info(info, info_out);
>  
>  #define PROG_TYPE_CASE(a, b)  \
> case PIPE_SHADER_##a: type = nv50_ir::Program::TYPE_##b; break
> @@ -1301,15 +1307,16 @@ nv50_ir_generate_code(struct nv50_ir_prog_info *info)
>return -1;
> }
> prog->driver = info;
> +   prog->driver_out = info_out;
> prog->dbgFlags = info->dbgFlags;
> prog->optLevel = info->optLevel;
>  
> switch (info->bin.sourceRep) {
> case PIPE_SHADER_IR_NIR:
> -  ret = prog->makeFromNIR(info) ? 0 : -2;
> +  ret = prog->makeFromNIR(info, info_out) ? 0 : -2;
>break;
> case PIPE_SHADER_IR_TGSI:
> -  ret = prog->makeFromTGSI(info) ? 0 : -2;
> +  ret = prog->makeFromTGSI(info, info_out) ? 0 : -2;
>break;
> default:
>ret = -1;
> @@ -1320,7 +1327,7 @@ nv50_ir_generate_code(struct nv50_ir_prog_info *info)
> if (prog->dbgFlags & NV50_IR_DEBUG_VERBOSE)
>prog->print();
>  
> -   targ->parseDriverInfo(info);
> +   targ->parseDriverInfo(info, info_out);
> prog->getTarget()->runLegalizePass(prog, nv50_ir::CG_STAGE_PRE_SSA);
>  
> prog->convertToSSA();
> @@ -1342,7 +1349,7 @@ nv50_ir_generate_code(struct nv50_ir_prog_info *info)
>  
> prog->optimizePostRA(info->optLevel);
>  
> -   

Re: [Mesa-dev] [RFC PATCH 1/2] nv50: Add separate functions for varying bits

2020-03-20 Thread Juan A. Suarez Romero
On Thu, 2020-03-19 at 21:20 +0100, Mark Menzynski wrote:
> This separation will be needed for shader disk caching. The reason for it
> is that when loading shaders from cache, data in info structure already gets
> loaded. That means varying bits for info is needed only when compiling
> shaders and not needed when loading from cache. Varying bits for prog are
> needed in both cases.
> 
> Unfortunately, I don't know how most of the code works, I have separated
> this manually, only by looking at the original code. That means that this
> patch is experimental. Together with following commit it works
> (there seem to be no regressions at all in VK-GL-CTS
> [openglcts/data/mustpass/gl/khronos_mustpass/4.6.1.x/gl33-master.txt]
> and all benchmarks behaved normally). Unfortunately, I cannot test in
> Piglit because of technical problems, so there might be still some
> work needed.
> 
> I am mainly asking to help with the function names,
> look for bugs and pointing out useless code. I will be glad for every
> review.
> 
> Signed-off-by: Mark Menzynski 


Could you submit the patch through GitLab?

Instructions at https://www.mesa3d.org/submittingpatches.html#submit


J.A.



> ---
>  .../drivers/nouveau/nv50/nv50_program.c   | 344 ++
>  1 file changed, 344 insertions(+)
> 
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> index a3f3054cbaa..bf63b20f613 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> @@ -139,6 +139,130 @@ nv50_vertprog_assign_slots(struct nv50_ir_prog_info_out 
> *info)
> return 0;
>  }
>  
> +static int
> +nv50_vertprog_assign_slots_info(struct nv50_ir_prog_info_out *info)
> +{
> +   unsigned i, n, c;
> +
> +   n = 0;
> +   for (i = 0; i < info->numInputs; ++i) {
> +   for (c = 0; c < 4; ++c)
> + if (info->in[i].mask & (1 << c))
> +info->in[i].slot[c] = n++;
> +   }
> +
> +   /* VertexID before InstanceID */
> +   if (info->io.vertexId < info->numSysVals)
> +  info->sv[info->io.vertexId].slot[0] = n++;
> +   if (info->io.instanceId < info->numSysVals)
> +  info->sv[info->io.instanceId].slot[0] = n++;
> +
> +   n = 0;
> +   for (i = 0; i < info->numOutputs; ++i) {
> +  for (c = 0; c < 4; ++c)
> + if (info->out[i].mask & (1 << c))
> +info->out[i].slot[c] = n++;
> +   }
> +
> +   return 0;
> +}
> +
> +static int
> +nv50_vertprog_assign_slots_prog(struct nv50_ir_prog_info_out *info)
> +{
> +   struct nv50_program *prog = (struct nv50_program *)info->driverPriv;
> +   unsigned i, n, c;
> +
> +   n = 0;
> +   for (i = 0; i < info->numInputs; ++i) {
> +  prog->in[i].id = i;
> +  prog->in[i].sn = info->in[i].sn;
> +  prog->in[i].si = info->in[i].si;
> +  prog->in[i].hw = n;
> +  prog->in[i].mask = info->in[i].mask;
> +
> +  prog->vp.attrs[(4 * i) / 32] |= info->in[i].mask << ((4 * i) % 32);
> +
> +  for (c = 0; c < 4; ++c)
> + if (info->in[i].mask & (1 << c))
> +n++;
> +
> +  if (info->in[i].sn == TGSI_SEMANTIC_PRIMID)
> + prog->vp.attrs[2] |= NV50_3D_VP_GP_BUILTIN_ATTR_EN_PRIMITIVE_ID;
> +   }
> +   prog->in_nr = info->numInputs;
> +
> +   for (i = 0; i < info->numSysVals; ++i) {
> +  switch (info->sv[i].sn) {
> +  case TGSI_SEMANTIC_INSTANCEID:
> + prog->vp.attrs[2] |= NV50_3D_VP_GP_BUILTIN_ATTR_EN_INSTANCE_ID;
> + continue;
> +  case TGSI_SEMANTIC_VERTEXID:
> + prog->vp.attrs[2] |= NV50_3D_VP_GP_BUILTIN_ATTR_EN_VERTEX_ID;
> + prog->vp.attrs[2] |= 
> NV50_3D_VP_GP_BUILTIN_ATTR_EN_VERTEX_ID_DRAW_ARRAYS_ADD_START;
> + continue;
> +  default:
> + break;
> +  }
> +   }
> +
> +   /*
> +* Corner case: VP has no inputs, but we will still need to submit data to
> +* draw it. HW will shout at us and won't draw anything if we don't enable
> +* any input, so let's just pretend it's the first one.
> +*/
> +   if (prog->vp.attrs[0] == 0 &&
> +   prog->vp.attrs[1] == 0 &&
> +   prog->vp.attrs[2] == 0)
> +  prog->vp.attrs[0] |= 0xf;
> +
> +   n = 0;
> +   for (i = 0; i < info->numOutputs; ++i) {
> +  switch (info->out[i].sn) {
> +  case TGSI_SEMANTIC_PSIZE:
> + prog->vp.psiz = i;
> + break;
> +  case TGSI_SEMANTIC_CLIPDIST:
> + prog->vp.clpd[info->out[i].si] = n;
> + break;
> +  case TGSI_SEMANTIC_EDGEFLAG:
> + prog->vp.edgeflag = i;
> + break;
> +  case TGSI_SEMANTIC_BCOLOR:
> + prog->vp.bfc[info->out[i].si] = i;
> + break;
> +  case TGSI_SEMANTIC_LAYER:
> + prog->gp.has_layer = true;
> + prog->gp.layerid = n;
> + break;
> +  case TGSI_SEMANTIC_VIEWPORT_INDEX:
> + prog->gp.has_viewport = true;
> + prog->gp.viewportid = n;
> + break;
> +  default:
> + break;
> +  }
> +  

Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-20 Thread Michel Dänzer
On 2020-03-19 8:54 p.m., Marek Olšák wrote:
> On Thu., Mar. 19, 2020, 06:51 Daniel Vetter, 
> wrote:
>> 
>> Yeah, this is entirely about the programming model visible to
>> userspace. There shouldn't be any impact on the driver's choice of
>> a top vs. bottom of the gpu pipeline used for synchronization,
>> that's entirely up to what you're hw/driver/scheduler can pull
>> off.
>> 
>> Doing a full gfx pipeline flush for shared buffers, when your hw
>> can do be, sounds like an issue to me that's not related to this
>> here at all. It might be intertwined with amdgpu's special
>> interpretation of dma_resv fences though, no idea. We might need to
>> revamp all that. But for a userspace client that does nothing fancy
>> (no multiple render buffer targets in one bo, or vk style "I write
>> to everything all the time, perhaps" stuff) there should be 0 perf
>> difference between implicit sync through dma_resv and explicit sync
>> through sync_file/syncobj/dma_fence directly.
>> 
>> If there is I'd consider that a bit a driver bug.
> 
> Last time I checked, there was no fence sync in gnome shell and
> compiz after an app passes a buffer to it.

They are not required (though encouraged) to do that.


> So drivers have to invent hacks to work around it and decrease
> performance. It's not a driver bug.
> 
> Implicit sync really means that apps and compositors don't sync, so
> the driver has to guess when it should sync.

Making implicit sync work correctly is ultimately the kernel driver's
responsibility. It sounds like radeonsi is having to work around the
amdgpu/radeon kernel driver(s) not fully living up to this responsibility.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev